  • 02/06/15--16:00: BetaCode for Arabic
  • The latest version is on GitHub @ github.com/maximromanov/ArabicBetacode

    Minor update to the scheme (2015-03-09:10-21)

    Done to avoid issues with Alpheios translation alignment, which automatically splits supplied texts into words. Essentially, combinations with “.” and “:” are replaced with “*” and “=” respectively.

    • =t is tāʾ marbūṭaŧ
    • *s is ṣād (and the same for other letters transliterated with dots)

    Why BetaCode?

    Although both Windows and Mac OS now support Arabic, it is still quite difficult to type and edit Arabic texts. It is particularly frustrating to edit and manipulate fully vocalized texts, since most fonts either render “short vowels” (ḥarakāt) invisible, or do not render them properly. Because of the “stacking,” i.e. “short vowels” being placed on top of letters and on top of each other, it becomes impossible to edit texts and one is often forced to go into delete-and-retype mode (and there is still no guarantee, because of visual issues, that all the letters and “short vowels” will actually be in the right order). betaCode can make it easy to type fully-vocalized Arabic texts on any machine through the use of simple character combinations and automatic rendering into various transliteration schemes and the Arabic script (scroll below for examples).

    betaCode is first converted into a one-to-one transliteration scheme, which combines conventions from various academic transliteration schemes. Such a scheme is necessary, since none of the existing academic schemes (American/Library of Congress, British, French, German, etc.) can represent Arabic text unambiguously for computational purposes. Arabic betaCode transliteration can then be converted into any transliteration convention. At the moment the following schemes are implemented:

    • Library of Congress Romanization of Arabic
    • Simplified transliteration (LOC without diacritics)
    • Arabic script (the rules of hamzaŧ orthography are implemented, but may require some additional testing)

    NB: The idea of betaCode is borrowed from the Classicists who developed “a method of representing, using only ASCII characters, characters and formatting found in ancient Greek texts”. The current betaCode is inspired by, and is therefore quite similar to, the arabTex scheme. Linguists working with Arabic commonly use Buckwalter transliteration, which is very similar to the current betaCode, but less readable.

    betaCode and One-To-One Transliteration

    : betacode : : translit : : Arabic letter :
    _a ā alif
    b b bāʾ
    t t tāʾ
    _t ṯ thāʾ
    ^g, j ǧ jīm
    *h ḥ ḥāʾ
    _h ḫ khāʾ
    d d dāl
    _d ḏ dhāl
    r r rāʾ
    z z zayn
    s s sīn
    ^s š shīn
    *s ṣ ṣād
    *d ḍ ḍād
    *t ṭ ṭāʾ
    *z ẓ ẓāʾ
    ` ʿ ʿayn
    *g ġ ghayn
    f f fāʾ
    *k, q ḳ qāf
    k k kāf
    l l lām
    m m mīm
    n n nūn
    h h hāʾ
    w w wāw
    _u ū wāw
    y y yāʾ
    _i ī yāʾ

    Non-alphabetic letters

    : betacode : : translit : : Arabic :
    ' ʾ hamzaŧ
    /a á alif maqṣūraŧ
    =t ŧ tāʾ marbūṭaŧ

    Vowels

    : betacode : : translit : : Arabic :
    ~a ã dagger alif
    u u ḍammaŧ
    i i kasraŧ
    a a fatḥaŧ
    *n ȵ n of tanwīn
    *a å silent alif
    *w ů silent wāw
    ?u ủ final ḍammaŧ *
    ?i ỉ final kasraŧ *
    ?a ả final fatḥaŧ *

    Basic principles:

    Every Arabic letter is betaCoded with its one-letter equivalent, preceded (if necessary) by a technical character that resembles the diacritical mark in the transliterated version. The most common symbols are as follows:

    General

    • _ (underscore), if a letter can be transliterated with macron/breve below or above (ā, ṯ, ḫ, ḏ, ū, ī)
    • . (period), or * (asterisk), if a letter can be transliterated with dot below or above (ḥ, ṣ, ḍ, ṭ, ẓ, ġ, ḳ)
    • ^ (caret), if a letter can be transliterated with caron (ǧ, š)

    Specifics

    • attached prepositions/conjunctions and pronominal suffixes must be separated with “-” (mostly relevant for text alignment, treebanking, and general readability):
      • bi-Llah?i
      • fa-_dahaba
    • add “?” before “optional” final vowels that are usually dropped in transliteration and pronunciation (mostly relevant for transliteration):
      • bi-Llah?i , but not:
      • fa-_dahaba
    • tāʾ marbūṭaŧ: add “+” after tāʾ marbūṭaŧ if it belongs to the first word of an iḍāfaŧ (mostly relevant for transliteration):
      • `_amma:t+u Ba.gd_ada, but:
      • al-`_amma:tu f_i Ba.gd_ada
    • transliterating tanwīn:
      • .n
        • ?u.n
        • ?i.n
        • ?a.n
    • silent wāw and alif:
      • .w (`Amr?u.n.w, for عَمْرٌو)
      • .a (wa-fa`al_u.a, for وَفَعَلُوا)

    Running the converter

    • (Python 3.xx must be installed on the machine)
    • clone git repository @ github.com/maximromanov/ArabicBetacode
    • save texts that must be transliterated (i.e., the text is in English, but has some Arabic terms that must be transliterated) into ./to_translit/ (follow the format given in the example file).
    • save texts that must be fully transliterated and/or converted into Arabic script (i.e., the entire text is in Arabic) into ./to_arabic/ (follow the format given in the example file).
    • run the script _generateBetaCode.py (in Mac terminal: python3 _generateBetaCode.py; on Windows: double-click on the script should work).
    • converted texts (in all available modes of conversion) will be appended to the file.
    • if you need to make any changes, edit your initial betaCode text and run the script again; the converted results will be replaced with updated versions.
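    The first conversion step, from betaCode into the one-to-one transliteration, can be illustrated with a minimal Python sketch. This is not the code of _generateBetaCode.py: the names BETACODE_TO_TRANSLIT and to_translit are hypothetical, and the mapping only restates the tables above (the updated scheme with “*” and “=”). It assumes that the input contains nothing but betaCoded Arabic, since plain “j” and “q” are converted as well.

```python
import re

# Mapping restated from the tables above; illustrative, not the converter's own data.
BETACODE_TO_TRANSLIT = {
    "_a": "ā", "_u": "ū", "_i": "ī",
    "_t": "ṯ", "_h": "ḫ", "_d": "ḏ",
    "^g": "ǧ", "^s": "š",
    "*h": "ḥ", "*s": "ṣ", "*d": "ḍ", "*t": "ṭ", "*z": "ẓ", "*g": "ġ", "*k": "ḳ",
    "=t": "ŧ", "/a": "á", "~a": "ã",
    "*n": "ȵ", "*a": "å", "*w": "ů",
    "?u": "ủ", "?i": "ỉ", "?a": "ả",
    "`": "ʿ", "'": "ʾ",
    "j": "ǧ", "q": "ḳ",
}

# A token is either "technical character + letter" or one of the single-character codes.
TOKEN = re.compile(r"[_^*=/~?][A-Za-z]|[`'jqJQ]")


def to_translit(text):
    """Replace every known betaCode token with its one-to-one equivalent."""
    def replace(match):
        token = match.group(0)
        mapped = BETACODE_TO_TRANSLIT.get(token.lower())
        if mapped is None:
            return token  # unknown combination: leave it untouched
        # Preserve capitalization: "_D" becomes the upper-case form of "_d".
        return mapped.upper() if token[-1].isupper() else mapped
    return TOKEN.sub(replace, text)
```

    For instance, to_translit("*hadda_ta-n_a") walks through the tokens “*h”, “_t” and “_a” and leaves all other characters as they are. Texts typed in the older scheme (with “.” and “:”) would first need those characters replaced with “*” and “=”.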

    Examples

    betaCode Example

    NB: These are examples of converting betaCode to full transliteration and Arabic script. The very last paragraph showcases conversion of hamzaŧ in different positions.

    q_ala ‘ab_u Mas`_ud?i.n :: ‘an_a qad sami`tu h~a_d_a min ras_ul?i All~ah?i ( .sl`m )

    .hadda_ta-n_a `Amr?u.w bn?u R_afi`?i.n , .hadda_ta-n_a `Abd?u All~ah?i bn?u al-Mub_arak?i , `an Mu.hammad?i bn?i ‘Is.h_aq?a , `an Mu.hammad?i bn?i ^Ga`far?i.n , `an `Ubayd?i All~ah?i bn?i `Abd?i All~ah?i bn?i `Umar?a , `an ‘Ab_i-hi , `an?i al-Nabiyy?i ( .sl`m ) na.hwa-hu

    ‘a_hbara-n_a Qutayba:t?u q_ala , .hadda_ta-n_a Sufy_an?u , `an Ya.hy/a bn?i Sa`_id?i.n , `an ‘Ab_i Bakr?i bn?i Mu.hammad?i.n , `an `Umar?a bn?i `Abd?i al-`Az_iz?i , `an ‘Ab_i Bakr?i bn?i `Abd?i al-Ra.hm~an?i bn?i al-.H_ari_t?i bn?i Hi^s_am?i.n , `an ‘Ab_i Hurayra:t?a mi_tla-hu

    Ta.hw_il?u al-hamza:t?i ( kalim_at?u.n mufrada:t?u.n )

    ‘amr?u.n ‘uns?u.n ‘ins?u.n ‘_im_an?u.n ‘_aya:t?u.n ‘_amana mas’ala:t?u.n sa’ala ra’s?u.n qur’_an?u.n ta’_amara _di’b?u.n as’ila:t?u.n q_ari’i-hi su’l?u.n mas’_ul?u.n tak_afu’u-hu su’ila q_ari’i-hi _di’_ab?u.n ra’_is?u.n bu’isa ru’_uf?u.n ra’_uf?u.n su’_al?u.n mu’arri_h?u.n abn_a’a-hu abn_a’u-hu abn_a’i-hi ^say’?a.n _ha.t_i’a:t?u.n .daw’u-hu .d_u’u-hu .daw’a-hu .daw’i-hi mur_u’a:t?u.n ‘abn_a’i-hi bar_i’u-hu s_u’ila f_il?u.n f_ann?u.n f_unn?u.n s_a’ala fu’_ad?u.n ^surak_a’u-hu ri’_asa:t?u.n tahni’a:t?u.n daf_a’a:t?u.n .taff_a’a:t?u.n ta’r_i_h?u.n fa’r?u.n ^say’?u.n ^say’?i.n ^say’?a.n .daw’?u.n .daw’?i.n .daw’?a.n juz’?u.n juz’?i.n juz’?a.n mabda’?u.n mabda’?i.n mabda’?a.n naba’a q_ari’?u.n tak_afu’?u.n tak_afu’?i.n tak_afu’?a.n abn_a’u abn_a’i abn_a’a jar_i’?u.n maqr_u’?u.n .daw’?u.n ^say’?u.n juz’?u.n `ulam_a’u al-`ulam_a’i al-`ulam_a’a `Amr?u.n.w wa-fa`al_u.a

    betaCode converted into one-to-one translit

    ḳāla ʾabū Masʿūdỉȵ :: ʾanā ḳad samiʿtu hãḏā min rasūlỉ Allãhỉ ( ṣlʿm )

    ḥaddaṯa-nā ʿAmrủů bnủ Rāfiʿỉȵ , ḥaddaṯa-nā ʿAbdủ Allãhỉ bnủ al-Mubārakỉ , ʿan Muḥammadỉ bnỉ ʾIsḥāḳả , ʿan Muḥammadỉ bnỉ Ǧaʿfarỉȵ , ʿan ʿUbaydỉ Allãhỉ bnỉ ʿAbdỉ Allãhỉ bnỉ ʿUmarả , ʿan ʾAbī-hi , ʿanỉ al-Nabiyyỉ ( ṣlʿm ) naḥwa-hu

    ʾaḫbara-nā Ḳutaybaŧủ ḳāla , ḥaddaṯa-nā Sufyānủ , ʿan Yaḥyá bnỉ Saʿīdỉȵ , ʿan ʾAbī Bakrỉ bnỉ Muḥammadỉȵ , ʿan ʿUmarả bnỉ ʿAbdỉ al-ʿAzīzỉ , ʿan ʾAbī Bakrỉ bnỉ ʿAbdỉ al-Raḥmãnỉ bnỉ al-Ḥāriṯỉ bnỉ Hišāmỉȵ , ʿan ʾAbī Hurayraŧả miṯla-hu

    Taḥwīlủ al-hamzaŧỉ ( kalimātủȵ mufradaŧủȵ )

    ʾamrủȵ ʾunsủȵ ʾinsủȵ ʾīmānủȵ ʾāyaŧủȵ ʾāmana masʾalaŧủȵ saʾala raʾsủȵ ḳurʾānủȵ taʾāmara ḏiʾbủȵ asʾilaŧủȵ ḳāriʾi-hi suʾlủȵ masʾūlủȵ takāfuʾu-hu suʾila ḳāriʾi-hi ḏiʾābủȵ raʾīsủȵ buʾisa ruʾūfủȵ raʾūfủȵ suʾālủȵ muʾarriḫủȵ abnāʾa-hu abnāʾu-hu abnāʾi-hi šayʾảȵ ḫaṭīʾaŧủȵ ḍawʾu-hu ḍūʾu-hu ḍawʾa-hu ḍawʾi-hi murūʾaŧủȵ ʾabnāʾi-hi barīʾu-hu sūʾila fīlủȵ fānnủȵ fūnnủȵ sāʾala fuʾādủȵ šurakāʾu-hu riʾāsaŧủȵ tahniʾaŧủȵ dafāʾaŧủȵ ṭaffāʾaŧủȵ taʾrīḫủȵ faʾrủȵ šayʾủȵ šayʾỉȵ šayʾảȵ ḍawʾủȵ ḍawʾỉȵ ḍawʾảȵ ǧuzʾủȵ ǧuzʾỉȵ ǧuzʾảȵ mabdaʾủȵ mabdaʾỉȵ mabdaʾảȵ nabaʾa ḳāriʾủȵ takāfuʾủȵ takāfuʾỉȵ takāfuʾảȵ abnāʾu abnāʾi abnāʾa ǧarīʾủȵ maḳrūʾủȵ ḍawʾủȵ šayʾủȵ ǧuzʾủȵ ʿulamāʾu al-ʿulamāʾi al-ʿulamāʾa ʿAmrủȵů wa-faʿalūå

    betaCode converted into Arabic script

    قَالَ أَبُو مَسْعُودٍ :: أَنَا قَدْ سَمِعْتُ هٰذَا مِنْ رَسُولِ الـلّٰـهِ ( صْلْعْمْ )

    حَدَّثَنَا عَمْرُو بْنُ رَافِعٍ ، حَدَّثَنَا عَبْدُ الـلّٰـهِ بْنُ الْمُبَارَكِ ، عَنْ مُحَمَّدِ بْنِ إِسْحَاقَ ، عَنْ مُحَمَّدِ بْنِ جَعْفَرٍ ، عَنْ عُبَيْدِ الـلّٰـهِ بْنِ عَبْدِ الـلّٰـهِ بْنِ عُمَرَ ، عَنْ أَبِيهِ ، عَنِ النَّبِيِّ ( صْلْعْمْ ) نَحْوَهُ

    أَخْبَرَنَا قُتَيْبَةُ قَالَ ، حَدَّثَنَا سُفْيَانُ ، عَنْ يَحْيٰى بْنِ سَعِيدٍ ، عَنْ أَبِي بَكْرِ بْنِ مُحَمَّدٍ ، عَنْ عُمَرَ بْنِ عَبْدِ الْعَزِيزِ ، عَنْ أَبِي بَكْرِ بْنِ عَبْدِ الرَّحْمٰنِ بْنِ الْحَارِثِ بْنِ هِشَامٍ ، عَنْ أَبِي هُرَيْرَةَ مِثْلَهُ

    تَحْوِيلُ الْهَمْزَةِ ( كَلِمَاتٌ مُفْرَدَةٌ )

    أَمْرٌ أُنْسٌ إِنْسٌ إِيمَانٌ آيَةٌ آمَنَ مَسْأَلَةٌ سَأَلَ رَأْسٌ قُرْآنٌ تَآمَرَ ذِئْبٌ أَسْئِلَةٌ قَارِئِهِ سُؤْلٌ مَسْؤُولٌ تَكَافُؤُهُ سُئِلَ قَارِئِهِ ذِئَابٌ رَئِيسٌ بُئِسَ رُؤُوفٌ رَؤُوفٌ سُؤَالٌ مُؤَرِّخٌ أَبْنَاءَهُ أَبْناؤُهُ أَبْنائِهِ شَيْئًا خَطِيئَةٌ ضَوْءُهُ ضُوؤُهُ ضَوْءَهُ ضَوْئِهِ مُرُوءَةٌ أَبْنائِهِ بَرِيؤُهُ سُوئِلَ فِيلٌ فَانٌّ فُونٌّ سَاءَلَ فُؤَادٌ شُرَكاؤُهُ رِئَاسَةٌ تَهْنِئَةٌ دَفَاءَةٌ طَفّاءَةٌ تَأْرِيخٌ فَأْرٌ شَيْءٌ شَيْءٍ شَيْئًا ضَوْءٌ ضَوْءٍ ضَوْءًا جُزْءٌ جُزْءٍ جُزْءًا مَبْدَأٌ مَبْدَأٍ مَبْدَأً نَبَأَ قَارِئٌ تَكَافُؤٌ تَكَافُؤٍ تَكَافُؤًا أَبْناءُ أَبْناءِ أَبْناءَ جَريءٌ مَقْروءٌ ضَوْءٌ شَيْءٌ جُزْءٌ عُلَماءُ الْعُلَماءِ الْعُلَماءَ عَمْرٌو وَفَعَلُوا

    betaCode into Translit

    betaCode in English text

    NB: This is an example of an English text with terms, names and toponyms given in betaCode and automatically converted into different transliteration flavors (excerpts are from Brill’s Encyclopaedia of Islam).

    Dima^s.k, Dima^s.k al-^S_am or simply al-^S_am , (Lat. Damascus, Fr. Damas) is the largest city of Syria. It is situated … very much at the same latitude as Ba.gd_ad and F_as, at an altitude of nearly 700 metres, on the edge of the desert at the foot of ^Gabal .K_asiy_un.

    al-_Dahab_i, ^Sams al-D_in Ab_u `Abd All~ah Mu.hammad b. `U_tm_an b. .K_aym_a.z b. `Abd All~ah al-Turkum_an_i al-F_ari.k_i al-Dima^s.k_i al-^S_afi`_i, an Arab historian and theologian, was born at Damascus or at Mayy_afari.k_in on 1 or 3 Rab_i` II (according to al-Kutub_i, in Rab_i` I) 673/5 or 7 October 1274, and died at Damascus, according to al-Subk_i and al-Suy_u.t_i, in the night of Sunday-Monday on 3 _D_u al-.Ka`da:t 748/4 February 1348, or, according to A.hmad b. `Iy_as, in 753/1352-3. He was buried at the B_ab al-.Sa.g_ir.

    betaCode converted into one-to-one translit

    Dimašḳ, Dimašḳ al-Šām or simply al-Šām , (Lat. Damascus, Fr. Damas) is the largest city of Syria. It is situated … very much at the same latitude as Baġdād and Fās, at an altitude of nearly 700 metres, on the edge of the desert at the foot of Ǧabal Ḳāsiyūn.

    al-Ḏahabī, Šams al-Dīn Abū ʿAbd Allãh Muḥammad b. ʿUṯmān b. Ḳāymāẓ b. ʿAbd Allãh al-Turkumānī al-Fāriḳī al-Dimašḳī al-Šāfiʿī, an Arab historian and theologian, was born at Damascus or at Mayyāfariḳīn on 1 or 3 Rabīʿ II (according to al-Kutubī, in Rabīʿ I) 673/5 or 7 October 1274, and died at Damascus, according to al-Subkī and al-Suyūṭī, in the night of Sunday-Monday on 3 Ḏū al-Ḳaʿdaŧ 748/4 February 1348, or, according to Aḥmad b. ʿIyās, in 753/1352-3. He was buried at the Bāb al-Ṣaġīr.

    betaCode converted into the Library of Congress scheme

    Dimashq, Dimashq al-Shām or simply al-Shām , (Lat. Damascus, Fr. Damas) is the largest city of Syria. It is situated … very much at the same latitude as Baghdād and Fās, at an altitude of nearly 700 metres, on the edge of the desert at the foot of Jabal Qāsiyūn.

    al-Dhahabī, Shams al-Dīn Abū ʿAbd Allāh Muḥammad b. ʿUthmān b. Qāymāẓ b. ʿAbd Allāh al-Turkumānī al-Fāriqī al-Dimashqī al-Shāfiʿī, an Arab historian and theologian, was born at Damascus or at Mayyāfariqīn on 1 or 3 Rabīʿ II (according to al-Kutubī, in Rabīʿ I) 673/5 or 7 October 1274, and died at Damascus, according to al-Subkī and al-Suyūṭī, in the night of Sunday-Monday on 3 Dhū al-Qaʿda 748/4 February 1348, or, according to Aḥmad b. ʿIyās, in 753/1352-3. He was buried at the Bāb al-Ṣaghīr.

    betaCode converted into a searchable string (diacritics removed)

    Dimashq, Dimashq al-Sham or simply al-Sham , (Lat. Damascus, Fr. Damas) is the largest city of Syria. It is situated … very much at the same latitude as Baghdad and Fas, at an altitude of nearly 700 metres, on the edge of the desert at the foot of Jabal Qasiyun.

    al-Dhahabi, Shams al-Din Abu Abd Allah Muhammad b. Uthman b. Qaymaz b. Abd Allah al-Turkumani al-Fariqi al-Dimashqi al-Shafii, an Arab historian and theologian, was born at Damascus or at Mayyafariqin on 1 or 3 Rabi II (according to al-Kutubi, in Rabi I) 673/5 or 7 October 1274, and died at Damascus, according to al-Subki and al-Suyuti, in the night of Sunday-Monday on 3 Dhu al-Qada 748/4 February 1348, or, according to Ahmad b. Iyas, in 753/1352-3. He was buried at the Bab al-Saghir.
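    The last step above, turning the Library of Congress version into a searchable string, can be approximated with Unicode normalization. The sketch below is an assumption about how such a step could work, not the converter's actual code; to_searchable and DROPPED are hypothetical names. It expects Library of Congress style input, because one-to-one characters such as ŧ or ȵ do not decompose into a base letter plus a mark.

```python
import unicodedata

# Modifier letters used for ʿayn (U+02BF) and hamzaŧ (U+02BE). They are not
# combining marks, so they have to be dropped explicitly.
DROPPED = {"\u02bf", "\u02be"}


def to_searchable(text):
    """Strip diacritics from Library of Congress style transliteration."""
    # NFD splits a letter like "ā" into "a" + combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in DROPPED
    ]
    return unicodedata.normalize("NFC", "".join(kept))
```

    Applied to “Dimashq al-Shām”, the function removes the macron and leaves everything else unchanged.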



    “Envy is not a very good thing. Yet envy is precisely what an early Islamicist feels when he reads Roger Bagnall and Bruce Frier’s The Demography of Roman Egypt.” 1 These words have stuck in my head ever since I read them, and over the past two years of working among and with the classicists my classics envy has only grown. Beyond the 300 original census declarations that were at the disposal of the above-mentioned scholars, there are far too many things to envy, especially when it comes to all things digital.

    The Pleiades Gazetteer is a particularly interesting case: with almost 35,000 places, it offers several well-populated categories of geographical objects. The categories include settlements, forts, temples, villas, stations, [amphi]theaters, churches, bridges, baths, cemeteries, plazas, and arches. What makes it even more interesting is that most of these objects have chronological markers, i.e. they belong to one or more of the following periods: archaic (750–550BC), classical (550–330BC), hellenistic-republican (330–30BC), roman (30BC–300CE), late-antique (300–640CE).

    This data offers an opportunity for an interesting digital exercise with historical data. I assigned it to my students as part of an introduction to R (within my “Introduction to Text Mining for the Students of Humanities”, Tufts University, Spring 2015). The task was to explore the Pleiades data set and find out what is what and what can be done with it. The goal was to discover that 1) geographical objects are categorized, and that 2) they also have chronological markers, which can be used 3) to map the geography of the Greco-Roman world over time.

    The map of forts turned out to be particularly interesting.

    Below is the code and some of the resulting visualizations.

    # R
    # You might need to install the necessary packages, which you can do by running the following lines
    # NB: uncomment them first
    # install.packages("ggplot2")
    # install.packages("maps")
    # install.packages("mapdata")
    # install.packages("rgeos")
    # install.packages("maptools")
    # install.packages("mapproj")
    # install.packages("PBSmapping")
    # install.packages("data.table")

    library(ggplot2)
    library(maps)
    library(mapdata)
    library(rgeos)
    library(maptools)
    library(mapproj)
    library(PBSmapping)
    library(data.table)

    xlim = c(-12, 55); ylim = c(20, 60)
    worldmap = map_data("world")
    setnames(worldmap, c("X", "Y", "PID", "POS", "region", "subregion"))
    worldmap = clipPolys(worldmap, xlim = xlim, ylim = ylim, keepExtra = TRUE)

    # setwd("") # set your working folder here
    dataFolder = "" # ideally, full path to the folder
    csvName = paste0(dataFolder, "pleiades-locations-20150316.csv")
    locsRaw = read.csv(csvName, stringsAsFactors = F, header = T, sep = ',')
    # url: http://atlantides.org/downloads/pleiades/dumps/
    # ---: download the latest csv, unzip

    periods = rbind(c("archaic", "750-550BC"),
                    c("classical", "550-330BC"),
                    c("hellenistic-republican", "330-30BC"),
                    c("roman", "30BC-300CE"),
                    c("late-antique", "300-640CE"))

    features = rbind(c("", "locations"),
                     c("settlement", "settlements"),
                     c("fort", "forts"),
                     c("temple", "temples"),
                     c("villa", "villas"),
                     c("station", "stations"),
                     c("theatre", "theatres"),
                     c("amphitheatre", "amphitheatres"),
                     c("church", "churches"),
                     c("bridge", "bridges"),
                     c("bath", "baths"),
                     c("cemetery", "cemeteries"),
                     c("plaza", "plazas"),
                     c("arch", "archs"))

    land = "grey"; water = "grey80"; bgColor = "grey80"

    # all Pleiades locations, drawn in grey as a background layer
    locPleiades = geom_point(data = locsRaw, color = "grey70", alpha = .75, size = 1,
                             aes(y = reprLat, x = reprLong))

    # one map per feature type and period
    for (i in 1:nrow(features)) {
      locs = locsRaw[with(locsRaw, grepl(features[i, 1], featureType)), ]
      for (ii in 1:nrow(periods)) {
        locPer = locs[with(locs, grepl(periods[ii, 1], timePeriodsKeys)), ]
        locPer = geom_point(data = locPer, color = "red", alpha = .75, size = 1,
                            aes(y = reprLat, x = reprLong))
        dataLabel = "Data: Pleiades Project"
        fName = paste0(dataFolder, "Pleiades_", features[i, 2], sprintf("%02d", ii), ".png")
        header = paste0(features[i, 2], " in the ", periods[ii, 1], " period (", periods[ii, 2], ")")
        p = ggplot() +
          coord_map(xlim = xlim, ylim = ylim) +
          geom_polygon(data = worldmap, aes(X, Y, group = PID), size = 0.1,
                       colour = land, fill = water, alpha = 1) +
          annotate("text", x = -11, y = 21, hjust = 0, label = dataLabel, size = 3, color = "grey40") +
          annotate("text", x = 54, y = 59, hjust = 1, label = header, size = 5, color = "grey40") +
          locPleiades + locPer +
          labs(y = "", x = "") +
          theme_grey()
        ggsave(file = fName, plot = p, dpi = 600, width = 7, height = 6)
      }
    }

    Using ImageMagick to animate maps

    The fastest and easiest way to animate the results is to use ImageMagick, a free command-line utility. The following command will take all .png files whose names begin with Pleiades_Settle and convert them into an animated GIF file Pleiades_Settlements.gif, which will play continuously (-loop 0), with each frame downsized (-resize 1200x900) and paused for .75 of a second (-delay 75).

    convert -resize 1200x900 -delay 75 -loop 0 Pleiades_Settle*.png Pleiades_Settlements.gif

    Chronological Cartograms

    All Locations

    Settlements

    Forts

    All categories

    Amphitheaters, arches, baths, bridges, cemeteries, churches, forts, locations, plazas, settlements, stations, temples, theaters, villas.

    Footnotes

    1. al-Qāḍī, Wadād. “Population Census and Land Surveys under the Umayyads (41-132/661-750).” Der Islam 83, no. 2 (2006), p. 341.



    On October 16, 2015, the Digital Islamic Humanities Program at Brown University held its third annual scholarly gathering, a symposium on the subject “Distant Reading & the Islamic Archive.”



    TEI XML has long been the standard for tagging humanistic texts for research purposes. It is the standard in most digital libraries (including the Perseus Digital Library). Having texts in a TEI XML format that conforms to the standards of a long-standing library allows one to take advantage of libraries’ infrastructure and analytical tools that have been developed since the appearance of TEI XML. Converting texts into XML, however, is a rather long and complicated process.

    Texts in Arabic make things even more complicated. Mixing right-to-left (RTL) and left-to-right (LTR) text in one file is one of the major challenges. Since the cursor changes the direction of its movement when crossing the boundary between RTL and LTR text, it is difficult to place the cursor properly, and one often ends up changing the wrong part of the text. The direction of paired characters is visually confusing, and it is often next to impossible to say whether a given angle bracket (perhaps the most important XML character) is an opening character or a closing one. Moreover, the shapes of Arabic letters in a text file change dynamically as one types or edits Arabic text, and many text editors do not handle this properly (particularly on Mac). In addition to these technical challenges, there are too many Arabic texts to convert, most of them multivolume titles, and too few people who have both the training and the willingness to do that.

    At the beginning of my digital research I considered TEI XML as a working format, but I had to give up on this option, since converting a 50-volume book (~3.4 million words) would have taken forever. After reviewing existing approaches, I came up with a rather simple tagging system that allowed me to create a structured, machine-readable text without sacrificing years of my life. In many ways, this system was inspired by markdown, “a text-to-HTML conversion tool … that allows [one] to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).”

    The main goal of mARkdown is to provide a simple system for tagging structural elements in Arabic texts that would facilitate algorithmic analysis in the same way as more complex TEI XML does. In principle, mARkdown does not require any special editor, but my current workflow relies on EditPad Pro, which supports right-to-left languages, Unicode, and large files. However, it is the support of custom highlighting and navigation schemes that makes this text editor particularly convenient for mARkdown.

    Since I have been using my mARkdown for my own research purposes, it has not yet been developed into an easily reusable system. This is my first attempt to provide a detailed description and explain how it can be used. I expect that mARkdown will undergo some minor changes in the upcoming months. The most recent description can be accessed from the main menu above.

    mARkdown in EditPad Pro activated with the “magic value” in test_textFile


    While looking for a way to identify all biographical collections and chronicles (and, by extension, all other texts that offer data for time-series analysis) in a collection of over 10,000 texts, it occurred to me that all these texts share a common feature: they are teeming with dates. So, what if we try to identify such texts computationally?! Not only will this help us find all relevant titles in the sea of text without overlooking anything; we can, arguably, also gain insight into the chronological coverage of each of those titles, the chronological focus of individual historians, and the chronological coverage of the entire collection of historical texts, and identify texts that focus on particular periods. The blogpost begins with an overview of several digital collections and then explains the methodology of the experiment. The appendices allow one to explore the chronological coverage of about 1,000 individual texts as well as the coverage of particular periods (here, hijri centuries, i.e., which texts focus on particular periods).


    Introduction

    Digital collections of classical Arabic texts have mushroomed over the past decade and a half. The three major libraries—al-Ǧāmiʿ al-kabīr (HDD), Shamela.ws, ShiaOnlineLibrary.com—include over 10,000 titles. There is probably another dozen collections that offer texts in hundreds and thousands (for example, Alwaraq.net, Waqfeya.com, NoorLib.ir, GhBook.ir, Lib.Eshia.ir, Library.Tebyan.net, HathiTrust.org, Archive.org).

    [Venn diagram: overlap among ShiaOnlineLibrary.com (1,810 titles), Shamela.ws (5,999 titles), and al-Ǧāmiʿ al-kabīr (2,364 titles); unique: 7,895 titles (~1.1 billion words)]
    Overlap among collections. There is significant overlap among the available digital collections. Thus, while their cumulative volume may run into tens of thousands, the count of unique titles (excluding exact copies and texts based on different editions) is significantly lower. Additionally, it is very difficult to identify duplicates among the collections. The Venn diagram above shows the overlap of over 2,000 titles among the three major collections (the count is still a work in progress). NB: The diagram was generated with Ben Frederickson’s code.

    The number of these collections appears to be growing and their content expanding. This new research environment offers scholars an opportunity to check whether a particular text is included in a certain collection, to browse and read it (often page by page), and to search for particular bits of information. These collections work well for looking for something that we know or expect to find: a book, a person, an event, a term. What we cannot do is look into how books are related, how they overlap and complement each other; how each individual fits among his contemporaries as well as his predecessors and successors; how different historical events are intertwined; how terms, notions and concepts are related to each other and evolve across time and space. Yet, having full texts of our sources at our disposal, we can definitely go beyond simplistic linear searches. By asking a series of interconnected questions, and relying on digital methods of text analysis, we can move toward a new understanding of the entire Arabic written tradition (starting, of course, with what is digitally available in one form or another).

    The question of chronology is one such foundational question. What I offer in this experiment is to explore the content of three such collections in order to understand better the chronological coverage of each collection, each author, and each book. To gain insight into these issues we can turn to different kinds of data. To get a perspective on the scope of each collection we shall start by looking into descriptions of books and their authors, and more specifically into when the authors died.

    Metadata

    While metadata in most collections is not complete, it can still be quite useful. The major digital collections, al-Ǧāmiʿ al-kabīr (HDD), Shamela.ws, and ShiaOnlineLibrary.com, display the same clear trend: strong emphasis on the 3rd–6th centuries AH (815–1203 CE), with an extra peak in the 8th century AH (1300–1397 CE), a steady decline during the 9th–12th centuries AH (1397–1785 CE), a slow recovery during the 13th century AH (1785–1882 CE), and a sharp rise in the 14th century AH (1882–1979 CE).

    Note on graphs. Data points of each graphed line show frequencies for periods of time that end at that point. For example, on the graph below that shows distribution of data by 100 lunar years (titles in al-Ǧāmiʿ al-kabīr), the value for 300/912 CE is 280, which means that there are 280 titles written by authors who died during 200–300 AH / 815–912 CE. A “step-before” type of graph displays such data most appropriately, but it is not suitable for comparative graphs, since there is too much overlap among the lines which makes the entire graph unreadable. Data on the most recent authors (after 1400/1979 CE) is excluded from the graphs, since it tends to overshadow earlier periods.
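    The binning convention described in this note, together with the AH/CE pairs used throughout, can be sketched in Python. Everything here is an illustrative assumption rather than the code behind the graphs: the function names are hypothetical, each period is treated as including its end year, and ah_to_ce is only an approximation based on the ratio between the lunar and the solar year (about 0.970224) and the start of the hijri era in mid-622 CE.

```python
from collections import Counter


def ah_to_ce(ah_year):
    """Approximate CE year in which the given hijri year begins."""
    # Assumption: linear approximation; good enough for century boundaries.
    return int(621.57 + 0.970224 * ah_year)


def period_end(ah_year, step=100):
    """End point of the step-year period that contains ah_year."""
    # 201..300 -> 300, so a data point at 300 covers deaths during 200-300 AH.
    return ((ah_year + step - 1) // step) * step


def titles_per_period(death_years_ah, step=100):
    """Count titles by the end point of the period in which the author died."""
    counts = Counter(period_end(year, step) for year in death_years_ah)
    return dict(sorted(counts.items()))
```

    With these definitions ah_to_ce(300) gives 912 and ah_to_ce(200) gives 815, matching the “200–300 AH / 815–912 CE” example above.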

    al-Ǧāmiʿ al-kabīr (HDD) has the most complete chronological metadata on its authors.
    Shamela.ws (online). Almost half of its titles lack chronological metadata.
    ShiaOnlineLibrary.com (online). The collection has rather complete chronological metadata. Almost 1/3 of all titles are books by modern Šīʿīte scholars (excluded from the graph so that they do not overshadow earlier periods).
    Alwaraq.net (online) has the most incomplete metadata, but it still suggests the same trend.

    The developers of these collections were most interested in the early Islamic period (roughly the first half of the first Islamic millennium). According to the data of such sources as the Hadiyyaŧ al-ʿārifīn by Ismāʿīl Bāšā al-Baġdādī (d. 1338/1919 CE), a bibliographical collection that builds upon the famous Kašf al-ẓunūn of Ḥāǧī Ḫalīfaŧ (d. 1067/1656 CE), and Ḫizānaŧ al-turāṯ, a Saudi catalog of manuscripts (al-Riyāḍ: Šarikaŧ al-ʿArīs lil-Kumbiyūtir, 2007), the number of contributors to the Islamic written treasury is continuously growing at least up until the beginning of the 13th century AH.

    The “growth” of authors, according to the data from the Hadiyyaŧ al-ʿārifīn and the Ḫizānaŧ al-turāṯ.

    Ḫizānaŧ al-turāṯ is a Saudi catalog of manuscripts that was first published on a CD (al-Riyāḍ: Šarikaŧ al-ʿArīs lil-Kumbiyūtir, 2007); currently its full text is included into Shamela.ws. The catalog includes over 160,000 records, but unfortunately suffers from a number of problems, such as inconsistency of typing conventions, duplicate records, selective coverage of different manuscript collections (for example, only about 1,000 Arabic manuscripts from St.Petersburg, Russia are covered, while St.Petersburg academic institutions house at least 11,000 Arabic manuscripts).

    Even though existing digital collections often awe us with their volume, the comparative graphs below show that they cover only a fraction of the Arabic written tradition, even by comparison with an early 20th-century bibliography, which itself is hardly complete in its coverage. The graphs also clearly highlight the fact that the chronological coverage of these collections is skewed heavily in favor of the earlier period of Islamic history.

    Chronological distribution of book titles in the Hadiyyaŧ al-ʿārifīn, Shamela.ws, al-Ǧāmiʿ al-kabīr (HDD), and ShiaOnlineLibrary.com.

    A note on the Hadiyyaŧ al-ʿārifīn. The decline of both graphs after 1200/1785 CE indicates unavailability of bibliographical information to the author more than anything else. The geographical coverage of the collection starts shrinking roughly at the same period. It should be noted that most chronological datasets exhibit a similar trend. For example, the trend can be observed in al-Ḏahabī’s own Ḏayl to his Taʾrīḫ al-islām, where the number of biographies drops dramatically; one can equally see the same trend in Brill’s Index Islamicus and Harvard Open Metadata (on 12 million books). The only difference is that the lag gets shorter as we get closer to our time—for premodern Arabic sources this lag is 100 to 150 years; in modern datasets—10 to 20 years.

    Another way to evaluate chronological coverage is to explore the actual texts. Ideally, the number of discrete units of information (such as biographies and events) by period should show the distribution of chronological emphasis of a particular source. Furthermore, the summary of such data from all [available] titles written by a specific author should indicate this author’s interest in specific periods. (The interpretation of such “interest” is a different subject altogether. For example, the fact that the Hadiyyaŧ al-ʿārifīn has more information on the 11th and the 12th centuries AH (1591–1785 CE) may indicate either Ismāʿīl Bāšā al-Baġdādī’s interest in this particular period, or the availability of information for this period, or genuine growth in the number of people contributing to the Islamic written treasury.)

    Date Statements

    Almost none of the texts, however, are tagged in a manner that would allow such a detailed evaluation. Yet, it is possible to analyze date statements in each text and offer an evaluation of its chronological coverage based on the frequencies of references to different periods. The consistency of date statements in Arabic texts (essentially, a word for “year”, ʿām or sanaŧ, followed by either digits or spelled-out numbers) makes it possible to represent this pattern with a regular expression, a special text string for describing a search pattern (see Figure below). This regular expression can be worked into a script, with which one can check available texts. It should be noted, of course, that this approach is tuned to analyze hiǧrī dates, since other dating systems are used only infrequently.

    Words sanaŧ and ʿām in the histories of Islam. Overall, the word sanaŧ is used most frequently in date statements: of about 1,362,000 date statements from across 10,000 texts, only 2.9% start with the word ʿām (~40,000), while 97.1% begin with the word sanaŧ (~1,322,000). A closer look also reveals that the word ʿām is favored in texts written in the 20th century; with regard to premodern texts, authors from the western part of the Islamic world (al-Andalus and al-Maġrib) tend to use it more frequently than their eastern counterparts.

    Note: Adding fī, “in,” into the mix changes the picture: of about 1,670,000 statements, 79.2% start with sanaŧ (~1,322,000), 18.5% with fī (~308,000), and 2.4% with ʿām (~40,000). The problem is that even a quick look at the ngrams of fī-statements (the words that immediately follow each fī-statement) shows that more than half of these statements are quantitative phrases of various kinds (for example, fī arbaʿ mujalladāt). For this reason, fī-statements are excluded from the analysis.

    [Top] A regular expression for capturing year statements in premodern Arabic sources. You can copy it and test it on some text. [Bottom] The image demonstrates this regular expression highlighting year statements (bright green) in the Taʾrīḫ al-islām of al-Ḏahabī (d. 748/1347 CE). Program used: EditPad Pro.
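
    As a rough textual stand-in for the expression in the figure, the Python sketch below matches a word for “year” followed by digits or by spelled-out numbers, and adds the number words up into a year. The word list, the names YEAR_STATEMENT and find_year_statements, and the assumption of unvocalized Arabic script with consistent spelling are illustrative choices of this sketch, not the expression used in the experiment.

```python
import re

# Simplified, illustrative list of number words (forms used after sanaŧ).
# Real texts show many more spelling variants than are covered here.
NUMBER_WORDS = {
    "إحدى": 1, "اثنتين": 2, "ثلاث": 3, "أربع": 4, "خمس": 5,
    "ست": 6, "سبع": 7, "ثمان": 8, "تسع": 9, "عشرة": 10,
    "عشرين": 20, "ثلاثين": 30, "أربعين": 40, "خمسين": 50,
    "ستين": 60, "سبعين": 70, "ثمانين": 80, "تسعين": 90,
    "مائة": 100, "مائتين": 200, "ثلاثمائة": 300, "أربعمائة": 400,
    "خمسمائة": 500, "ستمائة": 600, "سبعمائة": 700, "ثمانمائة": 800,
    "تسعمائة": 900,
}

# Longest alternatives first, so that a longer word is tried before its prefix.
_WORD = "|".join(sorted(NUMBER_WORDS, key=len, reverse=True))

# A word for "year", then either digits or a run of number words
# optionally joined by the conjunction wa-.
YEAR_STATEMENT = re.compile(
    rf"\b(?:سنة|عام)\s+"
    rf"(?:(?P<digits>\d+)|(?P<words>(?:{_WORD})(?:\s+و?(?:{_WORD}))*))\b"
)


def find_year_statements(text):
    """Return the years named in date statements, in order of appearance."""
    years = []
    for match in YEAR_STATEMENT.finditer(text):
        if match.group("digits"):
            years.append(int(match.group("digits")))
        else:
            # Drop the conjunction wa- and add up units, tens, and hundreds.
            words = [w.lstrip("و") for w in match.group("words").split()]
            years.append(sum(NUMBER_WORDS[w] for w in words))
    return years
```

    Applied to a phrase such as “sanaŧ ḫams wa-sittīn wa-ḫamsimiʾaŧ” in Arabic script, the function adds 5, 60, and 500. Vocalized texts or texts with different hamza spellings would need normalization first.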

    Such an approach is not without its problems, of course, but it may serve well as an exploratory technique. The results of the experiment are intriguing in a number of ways, even though not entirely consistent. The most important outcome is the discovery that the collection of 10,000 texts contains only about 785 texts with more than 100 date statements per text (and since the included collections overlap, the number of unique titles is even smaller). Needless to say, working with 785 texts is significantly easier than working with 10,000 titles. Additionally, frequencies of date statements for each text offer an opportunity to focus one’s efforts on the texts that contain the most data suitable for time-series analysis.

    Chronological coverage. The graphs show the chronological coverage for the same text generated with two different approaches: the orange dotted line represents the ideal situation (data collected through the manual tagging of the entire source), while the blue solid line represents the only realistic situation (data extracted computationally). While the absolute results differ, the relative distribution is very similar and emphasizes the same periods. On the problem of the 1st century AH (622–718 CE) see below.

    The graph above shows two different representations of the chronological coverage of the Hadiyyaŧ al-ʿārifīn by Ismāʿīl Bāšā al-Baġdādī (d. 1338/1919 CE), a bibliographical collection that builds upon the famous Kašf al-ẓunūn of Ḥāǧī Ḫalīfaŧ (d. 1067/1656 CE). The blue line shows the frequencies of date statements by period (binned into 50-year periods), strongly suggesting more emphasis on the 11th and 12th centuries AH (1591–1785 CE). The orange dotted line shows the distribution of biobibliographical records on about 8,800 authors; this actual distribution of discrete information units in the source emphasizes the same period of the 11th and 12th centuries. The similarity in the patterns of distribution shows that reliance on computationally extracted date statements is a viable alternative.
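
    The binning into 50-year periods can be sketched in a few lines of Python. The function name bin_years and the convention of labelling each period by its last year (so that years 551–600 fall into the period labelled 600) are assumptions made for illustration.

```python
from collections import Counter


def bin_years(years, width=50):
    """Count date statements per period; each period is labelled by its last year."""
    counts = Counter(((year - 1) // width + 1) * width for year in years)
    return sorted(counts.items())
```

    For the years 530, 551, 565, 600, and 601 this gives one statement for the period ending in 550, three for the period ending in 600, and one for the period ending in 650.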

    The 1st Century Problem

    Unfortunately, many texts suffer from what can be characterized as “the 1st century problem”: authors often drop hundreds from date statements (authors from the second millennium also tend to drop thousands), which leads to a very high number of date statements referring, at face value, to the 1st century AH (622–718 CE). As a result, the 1st century often gets inflated, overshadowing other periods. The graph below illustrates this issue.

    Since authors often drop hundreds from their date statements, the 1st century AH gets overinflated. As the title suggests, al-Saḫāwī’s (d. 902/1496 CE) al-Ḍawʾ al-lāmiʿ li-ahl al-ḳarn al-tāsiʿ focuses on the 9th century AH (1397–1494 CE), but—as the graph above shows—the number of date statements referring to the 8th (1300–1397 CE) and 9th (1397–1494 CE) centuries is significantly smaller than of those referring to the 1st century (notice the gap in between!). It is clear that al-Saḫāwī is dropping hundreds from his date statements. The problem is that some of those statements may refer to the 8th century, while some others to the 9th, so moving them all to the 9th century is hardly a solution.

    The problem may be resolved through the sequential analysis of date statements in texts. Authors are not likely to drop hundreds from their statements without letting their readers know what century they are talking about. In other words, an incomplete date statement must be preceded by a complete one. Thus, one can check whether other date statements precede it; if there are, the incomplete date can be fit into the period of the preceding statement.

    The implemented algorithm grabs a 100-word chunk before a 1st-century date statement and checks if there are other date statements in that chunk. The procedure is repeated up to five times, that is, checking up to 500 words (an equivalent of 1 to 3 printed pages) before the date statement in question, until either the text limit is reached or a date statement is found. If a date statement is found, its century gets applied to the starting date statement that we treated as incomplete. In other words, if we start with “the year 65”, and we find “the year 530” preceding it, we change the first date into “the year 565” (1169 CE). If the preceding date is also from the 1st century, the starting date remains unchanged; the date also remains unchanged if no other date statements have been found. Additionally, the algorithm runs in two different ways: in the first case, it does not build on updated date statements (Line B); in the second, it does, extrapolating from corrected date statements (Line C). The graph below shows the results.
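
    The procedure can be sketched in Python as follows. This is a minimal reading of the description above, not the original script: it assumes that date statements have already been extracted as (word offset, year) pairs, that a year below 100 counts as incomplete, and that the nearest preceding statement within a chunk is the one consulted. The function name resolve_first_century and its parameters are hypothetical.

```python
def resolve_first_century(statements, chunk=100, max_chunks=5, extrapolate=False):
    """Assign a century to date statements that look like 1st-century dates.

    statements: list of (word_offset, year) pairs sorted by word_offset.
    extrapolate=False corresponds to the pass that ignores corrected dates (B),
    extrapolate=True to the pass that builds on corrected dates (C).
    """
    resolved = [year for _, year in statements]
    for i, (offset, year) in enumerate(statements):
        if year >= 100:
            continue  # complete date statement, nothing to do
        for k in range(max_chunks):
            upper = offset - k * chunk  # current chunk covers [lower, upper)
            lower = upper - chunk
            found = None
            # Walk backwards to the nearest earlier statement in this chunk.
            for j in range(i - 1, -1, -1):
                previous_offset = statements[j][0]
                if previous_offset < lower:
                    break
                if previous_offset < upper:
                    found = j
                    break
            if found is not None:
                reference = resolved[found] if extrapolate else statements[found][1]
                if reference >= 100:
                    resolved[i] = (reference // 100) * 100 + year
                break  # stop at the first statement found, usable or not
            if lower <= 0:
                break  # reached the beginning of the text
    return resolved
```

    With statements at word offsets 10, 60, and 120 naming the years 530, 65, and 70, the first mode yields 530, 565, 70 (the statement before “70” is itself incomplete), while the extrapolating mode yields 530, 565, 570.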

    The graph shows new results for al-Saḫāwī’s (d. 902/1496 CE) al-Ḍawʾ al-lāmiʿ li-ahl al-ḳarn al-tāsiʿ: A (solid blue line) shows unmodified date statements (as in the previous graph); B (dotted orange line) shows the results of the first run of the algorithm, in which over 2,800 statements were updated, but there are still many dates for the 1st century; C (dashed green line) shows the results of the second run of the algorithm, which builds on the updated dates: almost 12,000 date statements were redistributed, now clearly showing that the book is about the 9th century.
    Note: a6675 is the identifier of a particular version of the text—title #6675 from al-Maktabaŧ al-Šāmilaŧ; the same title from a different collection will have a different identifier.

    The question is, of course, how reliable such projections are. In order to check this we need to compare algorithmically produced results with manually disambiguated data. The graphs below show such comparisons for four different sources: A (orange dotted) shows the initial results of computational date statements collection; B (green dashed)—modified dates without extrapolation; C (red dashed)—modified results with extrapolation; and, finally, D (blue solid)—shows manually disambiguated 1st-century date statements.

    al-Wafayāt al-aʿyān of Ibn Ḫallikān (d. 681/1282 CE)

    Results for Ibn Ḫallikān’s al-Wafayāt al-aʿyān are very good: algorithmically modified dates are very close to the manually disambiguated ones. Results of Algorithm B (modified results without extrapolation) are slightly closer to the benchmark (Line D) than the results of Algorithm C. Yet both are somewhat “overfitting” 1st-century dates. The good news is that algorithmic Lines B and C lead to the same conclusion as the benchmark Line D: Ibn Ḫallikān covers the period of 450–650 AH / 1058–1252 CE most thoroughly.

    al-Kāmil fī-l-taʾrīḫ of Ibn Aṯīr (d. 630/1232 CE)

    Results for Ibn Aṯīr’s al-Kāmil fī-l-taʾrīḫ are less precise: both algorithms overfitted 1st-century dates, inflating other centuries, if compared to manually disambiguated data (D). The peaks of distribution—the shape of the curve—are much closer to the benchmark than the preprocessed results (A), but computational analysis suggests that Ibn Aṯīr focuses more on the later period, while (according to manually disambiguated data) his attention is spread more evenly.

    Ṭabaḳāt al-šāfiʿiyyaŧ of Ibn Ḳāḍī Šuhbaŧ (d. 851/1447 CE)

    Results for the Ṭabaḳāt al-šāfiʿiyyaŧ of Ibn Ḳāḍī Šuhbaŧ are not ideal, but still much better than the initial results. Extending the check range from 500 words to 1,000 gets the graph, line C in particular, much closer to the benchmark. The problem, however, is that for other sources the 1,000-word range does not generate better results.

    Some general observations

    We are clearly not getting a 100% match with the benchmark, but that is not to be expected anyway: none of the exploratory computational methods work that way. Our model does not take into account the stylistic differences among authors. While the bulk of date statements does fall into the proposed pattern, there are occasionally slight variations that are peculiar to particular authors. Some of these peculiarities may be helpful. For example, Ibn Ḫallikān often uses the phrases li-l-hiǧraŧ or min al-hiǧraŧ with the true 1st-century date statements (which is still 75–80%), and such markers can be worked into the algorithm; other authors (about half a dozen that I checked thoroughly) use such additional phrases only occasionally. Other peculiarities are too complicated and cannot be resolved with simple algorithms. For example, Ibn Ḳāḍī Šuhbaŧ occasionally “spells out” ones in his date statements to ensure that his readers get it right (sanaŧ sabʿ bi-taḳdīm al-sīn wa-ʿišrīn …, “the year seven, with sīn in the beginning…”), which, again, breaks the general pattern for date statements. The most complicated issue, however, is that even for a scholar it may occasionally be difficult to figure out what century a certain date refers to (for example, when a biographee was born close to the middle of one century and died close to the middle of the next one). Natural languages will always pose such difficulties, yet the results produced with the offered approach are quite suitable for the goal: even when we do not get the exact results, we are still getting close enough to the benchmark for a useful distant reading of a large corpus.

    The precision of results also varies because of differences in book structure. We get more precise projections for books organized alphabetically, where authors cannot afford to use too many incomplete dates (see graphs for the Hadiyyaŧ al-ʿārifīn and Wafayāt al-aʿyān above), and less precise ones for books organized chronologically. It would make sense to develop different subroutines for processing texts based on their organization. Having robust metadata on each text would help trigger analytical routines adjusted to various peculiarities, although the structure of a book can be inferred computationally (on this see below). Additionally, a more precise logic can be implemented if our texts are properly divided into logical units. Thus, in a book organized alphabetically, the analysis of dates would be limited to a single logical unit, while in a book organized chronologically the precision of analysis can be improved by looking into date statements in the neighboring units. At this point, results are provocatively suggestive, but in most cases some familiarity with a specific book will help make sense of its graphs.

    Complementary coverage of “continuations”

    Date statements may also offer other useful insights into Arabic historical sources. Comparing the chronological coverage of different texts may illustrate how texts relate to each other. The graphs below show a few examples of how certain texts overlap chronologically with their “continuations” (ḏayl, takmilaŧ, ṣilaŧ) and are complemented by them.

    Complementary coverage of “continuations”. [Top left] al-Ḏahabī’s Taḏkiraŧ al-ḥuffaẓ and its three ḏayls. [Top right] Ibn Abī Yaʿlá’s Ṭabaḳāt al-ḥanābilaŧ continued by Ibn Raǧab’s Ḏayl ʿalá Ṭabaḳāt al-ḥanābilaŧ. [Bottom left] Ḥāǧī Ḫalīfaŧ’s Kašf al-ẓunūn continued by Ismāʿīl Bāšā al-Baġdādī’s Īḍāḥ al-maknūn fī ḏayl ʿalá Kašf al-ẓunūn. [Bottom right] al-Ḫaṭīb’s Taʾrīḫ Baġdād continued by Ibn Naǧǧār’s Ḏayl (excerpted by Ibn al-Dimyāṭī in his al-Mustafād min Ḏayl Taʾrīḫ Baġdād).
    Complementary coverage of “continuations.” Taʾrīḫ mawlid al-ʿulamāʾ wa-wafayati-him of Ibn ʿAbd Allãh al-Rabaʿī (d. 397/1006 CE) is another interesting example, since we have its “continuation”, Ḏayl taʾrīḫ mawlid al-ʿulamāʾ wa-wafayati-him of ʿAbd al-ʿAzīz al-Kattānī (d. 466/1073 CE), and “the continuation of the continuation”, Ḏayl ḏayl taʾrīḫ mawlid al-ʿulamāʾ wa-wafayati-him of Hibaŧ Allãh al-Akfānī (d. 524/1130 CE). The graph vividly demonstrates how these collections complement each other chronologically.

    Date statements and the structure of books

    Patterns of date statements distribution across texts—in other words, if we graph dates in the order they occur in a text—can also tell us a lot about the structural organization of books. As the illustrations below show, alphabetical and chronological structures have distinct visual patterns. Such patterns can be helpful in assessing new corpora and identifying texts relevant for specific research purposes. Different routines can be developed for the identification and analysis of texts of other forms and genres.

    Note on graphs below: Each line represents a date statement, where the length of the line corresponds to the year that a date statement refers to. The left side of each graph is the beginning of the book; the right one—its end. Regression analysis—here visualized with the red line for linear regression, and the blue one for LOWESS regression—can be used for identifying the patterns of distribution without graphing. (1st-century dates were removed to make patterns more clear.)

    Distribution of dates across historical texts: Dates in the Taʾrīḫ Dimašḳ (top) are randomly distributed across the entire length of the text, which corresponds to its alphabetical organization; the same pattern can be seen in the al-Wāfī bi-l-wafayāt (bottom), which is also organized alphabetically.
    Distribution of dates across historical texts: Dates in the Taʾrīḫ al-islām, which covers the period of Islamic history up to 700/1300 CE, display a clear rising pattern, which reflects its chronological organization.
    Distribution of dates across historical texts: Dates in the Hadiyyaŧ al-ʿārifīn display a zig-zag pattern, which reflects its alphabetical organization, where biobibliographical records within each letter are organized chronologically (This last thing was quite a discovery—even though I have spent quite a lot of time working with this text, I did not realize that biographies within each letter are organized chronologically until I saw this graph).
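
    As a simplified stand-in for the regression lines mentioned in the note, one can compute the correlation between the position of a date statement in the text and the year it refers to: a value close to 1 points to a rising, chronological pattern, while a value close to 0 is what a random scatter across an alphabetically organized text would produce. The sketch below uses a plain Pearson correlation rather than LOWESS; the function names and the threshold of 0.8 are arbitrary illustrative choices, and a zig-zag pattern like that of the Hadiyyaŧ al-ʿārifīn would need a finer test.

```python
def position_year_correlation(years):
    """Pearson correlation between a statement's position in the text and its year."""
    n = len(years)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(years) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(years))
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    var_y = sum((y - mean_y) ** 2 for y in years)
    if var_y == 0:
        return 0.0  # all statements name the same year
    return cov / (var_x * var_y) ** 0.5


def guess_structure(years, threshold=0.8):
    """Very rough label; the threshold is an arbitrary illustrative value."""
    if position_year_correlation(years) >= threshold:
        return "chronological"
    return "not clearly chronological"
```

    The input is the list of years in the order in which the date statements occur in the text, ideally after 1st-century dates have been removed or resolved.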

    Concluding remarks

    One thing that must be voiced is that if we had a corpus properly prepared by scholars and for scholars, one that included robust metadata and texts tagged into logical units, the results of such an experiment would have been significantly more precise and reliable, not to mention that such a corpus would also allow us to run a number of other exploratory experiments. To put it differently, we, the scholars who study the premodern Islamic world and who actively use collections developed in Arab countries and Iran for non-academic purposes (and let’s be honest, most of us do), must invest time and effort into the development of a digital library that would allow all of us to engage in methodologically novel research. Such a library would also allow us to build on each other’s research more consistently, which would also help to forge a new collaborative culture that will be beneficial to the entire field.

    Appendix I: Exploring coverage of historical sources

    You can explore the chronological coverage of historical texts using Chronoplot (it may take a moment to load). Current data includes about 3,000 texts (including versions of the same text from different libraries). Keep in mind the following:

    1. Each text has a unique identifier: letter + number, where the former refers to a collection, and the latter to the number of a text in that collection.
    2. Each text has three variations of date statement distribution. (Consider comparing variations for the text with the same identifier.) Texts of the same title from different collections occasionally give different distributions (especially when electronic texts are based on different printed editions).
      • A: unmodified dates (“1st century problem”);
      • B: updated dates (“single pass”);
      • C: updated dates (“double pass”).
    3. Selector (right) can be used to select titles for graphing their chronological coverage. Choosing multiple titles makes it possible to compare their coverage.
    4. Filter (right top) can be used to find specific titles: type a part of an author’s name or a book’s title, and the list will be filtered to show only items that have your keywords.
    5. Linetype (right bottom) is a drop-down menu that offers several ways of graphing the results. The most appropriate linetype for displaying chronological coverage is “step-before,” since it shows the frequencies of date statements per 50-year periods most clearly. However, this works well only for single texts. For comparative purposes “monotone” seems to be a better option.

    Appendix II: Exploring coverage of historical periods

    The table below lists sources by frequencies of date statements. Like Chronoplot, this table also has three variations of each text (A, B, C). Since variations A, B, and C differ only in how dates are distributed across periods, the initial table shows only variation A. Selecting a particular century will show only texts (with variations) that have dates for those periods.

    Metadata on texts is not always complete. The missing information may be available online—where applicable, links to the online manifestations of texts are provided.

    By centuries:



    Learning classical Arabic is a long process. Most of us took great pleasure in advanced reading classes with our professors, but, often struggling with an overwhelming volume of new vocabulary, we also—at least occasionally—had a feeling that a traditional method is not necessarily the most effective one. While advanced students usually overcome this difficulty by their sheer passion for the subject, the introduction of excessive vocabulary creates a serious obstacle to less advanced yet capable students.

    Pervasive availability of electronic texts and computational methods of text analysis allows us to rethink how we teach difficult languages. We can identify the most frequent features within a corpus and focus our attention on them. For example, the 100 most frequent lexical items constitute about 56% of the entire vocabulary of over 34,000 Prophetic sayings (ḥadīṯ) from the Six [Sunnī] Collections (al-kutub al-sittaŧ, approximately 2.8 million words). Relying on such data, one can generate a frequency-based reader that will introduce students to the shortest texts with the most frequent vocabulary and grammatical structures. With a paced increase in difficulty of texts and incremental expansion of vocabulary, students are capable of digesting much larger volumes of text both in class and at home, and such an extended exposure enables students to internalize the authentic language more efficiently. For example, in the course of one semester, we managed to cover about 400 ḥadīṯs, while at the same time reviewing the grammar of classical Arabic and having regular discussions of thematic readings that helped students to understand the cultural importance of the Ḥadīṯ across almost 14 centuries of Islamic history.1
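
    Assuming that the 56% figure refers to the share of running words (tokens) covered by the most frequent word forms, such a coverage value can be computed with a short Python function. The name coverage and the plain whitespace tokenization implied by the input are assumptions of this sketch.

```python
from collections import Counter


def coverage(tokens, n):
    """Share of all token occurrences accounted for by the n most frequent word forms."""
    freq = Counter(tokens)
    total = sum(freq.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in freq.most_common(n)) / total
```

    For a corpus given as one long list of tokens, coverage(tokens, 100) returns the fraction of the corpus that a student who knows the 100 most frequent word forms would recognize.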

    While developed primarily with classical Arabic in mind, the approach is actually universal and can be used for any language. It works best with serialized texts, that is, a large corpus of relatively short texts of the same type (in the case of Arabic that would be ḥadīṯ collections, chronicles, biographical collections, poetic anthologies, contemporary newspapers, etc.). Considering that in terms of vocabulary various forms and genres may differ from each other quite significantly (Figure 1 shows that such difference may go up to 80%!), this method can be used to introduce students to the language of particular genres in the most efficient manner. Courses based on such readers can be a valuable addition to any language program and will be particularly welcomed by graduate students who often face the need to develop their reading skills as quickly and efficiently as possible.

    Figure 1. The matrix shows lexical overlap across the frequency lists (top 3,000 items) that represent large thematic specimens of the Arabic language. The specimens are arranged chronologically, starting with the earliest (top-right corner, 9th century) and ending with the latest (20th century). The most dramatic lexical difference is between al-Kutub al-Sittaŧ, the Six [Sunnī] Collections of ḥadīṯs, and al-Šarḳ al-awsaṭ, the modern newspaper: the frequency lists of these two sources (again, top 3,000 items) share only 20% of word forms (tokens). Even among the “classical” works the lexical distance is quite significant, with the percentage of shared vocabulary fluctuating mainly between 38% and 58% (for the interquartile range).

    Texts compared: al-Kutub al-Sittaŧ (2.8 mln. words), the 6 Sunnī collections of Ḥadīṯ (~9th century CE); Tafsīr al-Ṭabarī (or Ǧāmiʿ al-bayān, 3 mln. words), a commentary on the Qurʾān by al-Ṭabarī (d. 310/922 CE); Kitāb al-Aġānī (1.5 mln. words), a poetic anthology of Abū-l-Faraǧ al-Iṣbahānī (d. 356/967 CE); al-Futūḥāt al-Makkiyyaŧ (1.7 mln. words), an extensive Ṣūfī text of Ibn al-ʿArabī (d. 638/1240 CE); Fatāwá Ibn Taymiyyaŧ (2.9 mln. words), a collection of legal decisions and epistles of Ibn Taymiyyaŧ (d. 728/1327 CE); Taʾrīḫ al-Islām (3.2 mln. words), a biographical collection and chronicle of al-Ḏahabī (d. 748/1347 CE); Maǧallaŧ al-Risālaŧ (16 mln. words), an early 20th-century Egyptian literary journal; Tafsīr al-Mīzān (2.3 mln. words), a modern Šīʿī commentary on the Qurʾān by al-Sayyid al-Ṭabāṭabāʾī (d. 1981 CE); and al-Šarḳ al-Awsaṭ (2.5 mln. words), a modern Arabic newspaper (collected by Tariq Yousef from http://aawsat.com/).
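
    The overlap values in such a matrix can be approximated with the Python sketch below, which compares the top-n frequency lists of two corpora. It assumes that the reported percentage is the number of shared word forms divided by the list size n; the function names top_forms and shared_share are hypothetical, and the exact computation behind Figure 1 may differ.

```python
from collections import Counter


def top_forms(tokens, n):
    """Set of the n most frequent word forms in a list of tokens."""
    return {form for form, _ in Counter(tokens).most_common(n)}


def shared_share(tokens_a, tokens_b, n=3000):
    """Share of the top-n word forms that two corpora have in common."""
    return len(top_forms(tokens_a, n) & top_forms(tokens_b, n)) / n
```

    Running shared_share over every pair of corpora fills the matrix; a value of 0.2 corresponds to the 20% reported for the ḥadīṯ collections and the modern newspaper.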

    Description of the method

    The overall procedure is rather simple and runs as described below.

    Step I. Ḥadīṯ collections were downloaded from http://sunnah.com/. Then, the initial texts were reformatted and normalized.2 (There are multiple ways in which specimens of other genres can be obtained and then processed for a similar reader.)

    Step II. All vocabulary from the corpus was collected and converted into a frequency list. This list was then converted into a ranking list, where the most frequent item receives rank 1, the second receives 2, the third receives 3, and so on; items with the same frequency are assigned the same rank. It should be noted that vocabulary items have not been parsed with a morphological analyzer, so different forms of the same word are treated separately (i.e., ḳāla, ḳīla, ḳālat, fa-ḳāla, etc. have their own frequencies and are ranked separately). The main reason for not using the results of automatic morphological analysis is largely technical, since existing morphological analyzers are meant to work with modern standard Arabic and do not perform well on classical Arabic.3 At the same time, using frequencies of word forms (tokens) rather than dictionary forms (lexemes) has its advantages, since more frequent forms will be given more frequently in the reading materials (such as, for example, very frequent ḳāla [sing. masc.] vs. rather rare ḳālā [dual masc.]).4

    Step III. The arithmetic mean of ranking values was calculated for each ḥadīṯ. The resultant values then served as difficulty indices, where texts with the most frequent vocabulary have the lowest means, and vice versa. These indices were then used as sorting values that allowed rearranging all 34,000 ḥadīṯs by the difficulty of their vocabulary. The advantage of the mean here is that even a single low-frequency lexical item increases the difficulty index of a text, which is then pushed down the list. This approach turned up a couple of unforeseen positive effects. First, as the length of a text increases, so does the probability of encountering rarer lexical items; as a result, the “easiest” texts are also the shortest ones. This convenient outcome allows students to begin with the shortest texts and move gradually to the longer ones. The second effect is that the most frequent vocabulary also tends to appear in the most frequent grammatical and syntactic structures.
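
    Steps II and III can be sketched in Python as follows. The sketch assumes whitespace-separated, already normalized texts and uses dense ranking for ties (items with equal frequency share a rank, and the next frequency gets the next rank), which is one possible reading of the description; the function names are hypothetical.

```python
from collections import Counter


def rank_word_forms(texts):
    """Map each word form to its frequency rank (1 = most frequent, ties share a rank)."""
    freq = Counter(token for text in texts for token in text.split())
    ranks = {}
    rank = 0
    previous_count = None
    for form, count in freq.most_common():
        if count != previous_count:
            rank += 1
            previous_count = count
        ranks[form] = rank
    return ranks


def difficulty_index(text, ranks):
    """Arithmetic mean of the ranks of all tokens in a (non-empty) text."""
    tokens = text.split()
    return sum(ranks[token] for token in tokens) / len(tokens)


def sort_by_difficulty(texts):
    """Reorder texts from the most frequent vocabulary to the rarest."""
    ranks = rank_word_forms(texts)
    return sorted(texts, key=lambda text: difficulty_index(text, ranks))
```

    Because a single rare word raises the mean rank of a short text considerably, short texts made up of very frequent forms end up at the top of the sorted list, which matches the effect described above.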

    Step IV. The rearranged collection of ranked ḥadīṯs was not quite usable, since this method also groups together items that are almost the same. Here manual input was required to exclude ḥadīṯs that are too similar.

    Step V. At last, the selection of ḥadīṯs was converted into format and typeset into the reader in front of you. As you will see, quite a few ḥadīṯs in the beginning of the reader feature only isnāds, “the chains of transmitters”, and do not have matns, the actual texts of ḥadīṯs. I used these matn-less ḥadīṯs to introduce students to the concept of transmission of knowledge in Islamic culture, which most were not familiar with; next time around I will modify the reader to avoid having very similar texts next to each other, which can be done by the retagging of the selection of ḥadīṯs and regenerating the entire reader anew.

    In the classroom

    In my teaching, I used this reader in combination with ‘micropublications’, which provided each student with thorough practice of the foundational skills necessary for mastering the language: for each ḥadīṯ students provided full vocalization, morphological stemming, and a translation aligned with its Arabic original. Such ‘micropublications’ help monitor students’ progress and, later, can be used to automatically grade such assignments, thus freeing up time for in-class discussions. Last but not least, by producing these micropublications, students make a valuable contribution as they generate training data that can be used for various teaching and research purposes.

    Footnotes

    1. “Classical Arabic through the Words of the Prophet” (Tufts University, Winter/Spring 2015), with the following two additional readings: W. M. Thackston, An Introduction to Koranic and Classical Arabic: An Elementary Grammar of the Language (Bethesda, Md.: Ibex Publishers, 2000), Jonathan Brown, Hadith: Muhammad’s Legacy in the Medieval and Modern World (Oxford: Oneworld, 2009).

    2. On normalization, see: Nizar Y. Habash, Introduction to Arabic Natural Language Processing ([San Rafael, Calif.]: Morgan & Claypool Publishers, 2010), 21–23.

    3. For example, the Buckwalter Morphological Analyser, which has been tested with this corpus (using Perseus morphological services), returned no results for about 25% of tokens, single results for another 25%, and more than one for the remaining 50%. Needless to say, such results are hardly usable for our purposes.

    4. An ability to recognize rare forms is important, of course, but it can be practiced through grammatical and morphological exercises (examples can be found at the end of the reader).


    The OpenITI team, building on the foundational open-source OCR work of Leipzig University’s (LU) Alexander von Humboldt Chair for Digital Humanities, has achieved Optical Character Recognition (OCR) accuracy rates for classical Arabic-script texts in the high nineties. These numbers are based on our tests of seven different Arabic-script texts of varying quality and typefaces, totaling over 7,000 lines (~400 pages, 87,000 words). These accuracy rates not only represent a distinct improvement over the actual accuracy rates of the various proprietary OCR options for classical Arabic-script texts, but, equally important, they are produced using open-source OCR software called Kraken (developed by Benjamin Kiessling, LU), thus enabling us to make this Arabic-script OCR technology freely available to the broader Islamic, Persian, and Arabic Studies communities in the near future. Unlike more traditional OCR approaches, Kraken relies on a neural network, which mimics the way we learn, to recognize letters in the images of entire lines of text without trying first to segment lines into words and then words into letters. This segmentation step, a mainstream OCR approach that persistently fails on connected scripts, is thus completely removed from the process, making Kraken uniquely powerful for dealing with a diverse variety of ligatures in connected Arabic script. In the process we also generated over 7,000 lines of “gold standard” (double-checked) data that can be used by others for Arabic-script OCR training and testing purposes.

    Our working paper can be found on Academia.edu (By: Benjamin Kiessling, Matthew Thomas Miller, Maxim Romanov, Sarah Bowen Savant).

    Kraken ibn Ocropus. Based on a depiction of an octopus from a manuscript of Kitāb al-ḥašāʾiš fī hāyūlā al-ʿilāj al-ṭibbī (Leiden, UB : Or. 289); special thanks to Emily Selove for help with finding an octopus in the depths of the Islamic MS tradition.

    Biographical and bibliographical texts can offer valuable insight into the process of cultural production in the Islamic world. One of the most relevant texts is the Hadiyyaŧ al-ʿārifīn (“The Gift to the Knowledgeable”), a bio-bibliographical collection written by Ismāʿīl Bāšā al-Baġdādī (d. 1338/1919 CE). Although de facto the text is modern, it follows very closely in the footsteps of medieval texts of this kind and is effectively part of the tradition; additionally, this collection gives us the most extensive chronological coverage, as it spans the period from the beginning of Islam in the 7th century CE up to the end of the 19th century CE.

    From the very little that we know about him,14 Ismāʿīl Bāšā wrote two extensive bibliographical texts. The first one, Īḍāḥ al-maknūn fī Ḏayl ʿalá Kašf al-ẓunūn, is the continuation of the famous Kašf al-ẓunūn of Ḥāǧī Ḫalīfaŧ (d. 1067/1656 CE),15 and mirrors its structure: the main unit is the individual work, and all works are organized alphabetically. The second one is the Hadiyyaŧ al-ʿārifīn, which contains essentially the same information, but grouped into biographical records, where all works attributed to a given author are listed after a short biography.16 The Hadiyyaŧ al-ʿārifīn is organized alphabetically, and then chronologically within each letter.17

    Although one cannot possibly expect such a collection to be comprehensive and exhaustive, this is the largest bibliography of books written in the Islamic world that we have available. So we can still hope to get valuable insights into cultural production (the appearance of new works) in the Islamic world up until the beginning of the 20th century. For the sake of space, and given that the analysis of this collection deserves a separate study, I will focus on broad spatial and chronological patterns that can be discerned in the data.

    Insight 1: cultural production over time

    Figure 5. Chronological distribution of authors

    First of all, our algorithmic analysis allows us to get a better understanding of the overall coverage of the collection itself: it includes almost 8,800 authors and over 40,000 book titles, with most authors being attributed 1 to 4 titles (interquartile range). The overall chronological distribution of authors (Figure 5) displays a steady upward trend up until 1200/1785 CE, reflecting the general historical situation: as the Islamic world keeps expanding geographically and the Muslim population keeps growing, we find more individuals getting involved in the process of cultural production.

    Displaying the same trend for the period up to 1200/1785 CE, the graph of books (Figure 6) makes the prominent early period (200–450 AH / 815–1058 CE) more noticeable. Although this period is usually strongly associated with the translation movement from Greek into Arabic,18 it is probably even more important for the formation of Islam as a religious system: particularly for the development of the Ḥadīṯ canon19 and the crystallization of theological views.20 Spikes are also due to a few very prominent polymaths: al-Suyūṭī (d. 911/1505 CE)—585 works; Ibn ʿArabī (d. 638/1240 CE)—425 works; al-Kindī (d. 256/870 CE)—256; al-Madāʾinī (d. 225/840 CE)—223 works; al-Nābulusī (d. 1143/1730 CE)—204; Ibn al-Jawzī (d. 597/1200 CE)—201 works, and quite a few other prolific authors.

    Figure 6. Chronological distribution of books

    The decline of both graphs after 1200/1785 CE most likely indicates the unavailability of bibliographical information to our author. The geographical coverage of the collection also starts shrinking roughly at the same period. It should be noted that all chronological datasets tend to exhibit this trend. For example, the trend can be observed in al-Ḏahabī’s own continuation, Ḏayl, to his massive “The History of Islam” (Taʾrīḫ al-islām), where the number of biographies per period drops dramatically. One can equally see this in Brill’s bibliographical database Index Islamicus as well as in Harvard Open Metadata on 12 million books that Harvard libraries hold. The only difference is that the lag gets shorter as we get closer to our time—for premodern Arabic sources this lag is 100 to 150 years; in modern datasets—10 to 20 years.

    Figure 7. Regional Contributions.

    Splitting our data geographically—Figure 7—we can also discover which regions played the leading role in cultural production. What we discover from the results is that, as we suspected, the collection does not cover all the regions of the Islamic world, particularly regions that became part of the Islamic world in the later periods and in geographical terms remained peripheral to the core: Subsaharan Africa, the Indonesian Archipelago, the Volga region, and Eastern Europe. At the same time, all core regions—the historical heartlands—of the Islamic world are covered quite well.

    It should be pointed out that the bar chart here shows the presence of authors in those regions, as many of them traveled (sometimes extensively) and composed their books at different locations. In other words, our biographee—who lived in Nishapur, but died in Mecca—appears both in the column of Iran (Īrān) and that of Arabia (Jazīraŧ al-ʿarab). Such treatment of data is also justified because regions in their prime tend to attract people from less prosperous ones.

    Figure 8. Most prominent Islamic regions over time.

    We can get a better understanding of regional contributions by graphing regional data chronologically—Figure 8 shows the top five contributing regions: Anatolia (Rūm), Iraq (al-ʿIrāq), Iran (Īrān), Syria (al-Šām), and Egypt (Miṣr), which are home to the highest numbers of individuals engaged in cultural production across the Islamic world. The chronological distribution of authors in those regions (as well as in the regions that are not graphed here) displays a rather distinct pattern: cultural production is on the rise during periods of economic and political stability, usually marked by the early rule of strong dynasties: the ʿAbbāsids in Iraq; dynasties of the “Iranian intermezzo”, followed by the Tīmūrids and the Ṣafawids in Iran; the Mamlūks in Syria and Egypt;21 the Ottomans in Anatolia. It should be noted, however, that the increase in cultural production in these cases is not necessarily due to rulers’ patronage but rather to the stability and predictability of social and economic life that their rule brings about. Although many rulers did act as patrons of “fine literature,” most books in the Hadiyyaŧ al-ʿārifīn deal with religious subjects—Qurʾānic exegesis, “words of the Prophet” (Ḥadīṯ), Islamic law, etc.—and they were composed more in the framework of the development of local religious communities, whose florescence depended on the overall political and economic stability. In this regard, the example of Iraq might be quite telling: the early period of ʿAbbāsid rule is marked by a very significant rise, which comes to a halt when the ʿAbbāsids lose their sovereignty and become puppets, first of their generals, then of the Būyids, and then of the Saljūqs, regaining their power only briefly at the end of their rule, which is ended dramatically by the Mongol invasion. Needless to say, the real historical picture is always more complicated than the space of this article allows.

    Insight 2: Cultural Connections

    Our bio-bibliographical data also offers a significant amount of geographical information, with which one can model geographical networks of connections. A network of an individual can be represented by connecting all places mentioned in that individual’s biography—Figure 9 shows the geographical network from our sample biography, where possible paths are generated from the route network of that period using the shortest path (Dijkstra algorithm) and the optimal path (modified Dijkstra algorithm that avoids stretches with a small number of settlements along the way).
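
    The two kinds of paths can be illustrated with a minimal sketch. It assumes a toy route network: the place labels, distances, and settlement counts are made up for illustration and do not come from the gazetteer. The text does not describe how the modified algorithm avoids sparsely settled stretches, so the penalty rule used here (doubling the cost of a stretch with no settlements) is an assumption as well.

```python
import heapq

def dijkstra(graph, start, goal, cost):
    """Return (total_cost, path) for the cheapest path from start to goal.

    graph maps a node to a list of (neighbor, edge_attributes) pairs;
    cost turns edge_attributes into a non-negative number.
    """
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, attrs in graph.get(node, []):
            if neighbor not in settled:
                heapq.heappush(queue, (total + cost(attrs), neighbor, path + [neighbor]))
    return float("inf"), []

def add_road(graph, a, b, km, settlements):
    """Add an undirected stretch of road with its length and settlement count."""
    attrs = {"km": km, "settlements": settlements}
    graph.setdefault(a, []).append((b, attrs))
    graph.setdefault(b, []).append((a, attrs))

def length_only(attrs):
    # Plain shortest path: the cost of a stretch is its length.
    return attrs["km"]

def avoid_empty_stretches(attrs, min_settlements=1, penalty=2.0):
    # Assumed rule: stretches with too few settlements cost `penalty` times more.
    factor = penalty if attrs["settlements"] < min_settlements else 1.0
    return attrs["km"] * factor

# Hypothetical network: a direct but empty stretch A-B and a settled detour via C.
graph = {}
add_road(graph, "A", "B", km=300, settlements=0)
add_road(graph, "A", "C", km=200, settlements=4)
add_road(graph, "C", "B", km=200, settlements=5)

shortest = dijkstra(graph, "A", "B", length_only)
optimal = dijkstra(graph, "A", "B", avoid_empty_stretches)
```

    In this toy case the shortest path takes the direct stretch, while the penalized search prefers the longer detour through settled territory.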

    Figure 9. Geographical network of the biographee from the sample biography (using our al-Ṯurayyā Gazetteer, https://althurayya.github.io/).

    For our purposes, however, a somewhat simpler approach to modeling the network will work better. First of all, we want to move from the level of settlements to the level of regions: they become the nodes, which are connected with each other directly—as the crow flies—without using route networks.22 In the case of our sample biography, the network is thus simplified to a single arc between Iran and Arabia. One can then combine the networks of a particular group of individuals in order to see a broader pattern. Arguably, by combining individual networks from specific periods—with every shared node becoming bigger, and every shared edge thicker—one can get an idea of how the Islamic world was connected in that particular period, and more interestingly, what constituted its core: namely, the constellation of most prominent and inter-connected regions.
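
    The combination of individual networks can be sketched as follows. This is only an illustration under stated assumptions: the first record corresponds to the sample biography (Iran and Arabia), while the other two records are hypothetical, and the function name is invented here. Each region counts once per person, and every pair of regions in a biography is treated as one direct connection.

```python
from collections import Counter
from itertools import combinations

def aggregate_networks(biographies):
    """Combine per-person lists of regions into weighted nodes and edges."""
    node_weight = Counter()
    edge_weight = Counter()
    for regions in biographies:
        unique = sorted(set(regions))  # each region counts once per person
        node_weight.update(unique)
        edge_weight.update(combinations(unique, 2))  # direct arcs between regions
    return node_weight, edge_weight

biographies = [
    ["Iran", "Arabia"],        # the sample biographee: Nishapur, then Mecca
    ["Iran", "Iraq"],          # hypothetical record
    ["Iraq", "Iran", "Syria"], # hypothetical record
]

nodes, edges = aggregate_networks(biographies)
```

    Node weights would then drive the size of a region on the map and edge weights the thickness of an arc.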

    Figure 10. The Iraqi-Iranian core in the 12th century CE.

    Practically up until 1200 CE (Figure 10), Iraq and Iran remain the core of the Islamic world:23 they are strongly connected with each other—a very significant number of the men of letters (mostly, religious scholars who write predominantly in Arabic) come from Iran during this period. Spain (al-Andalus), which, based on our data, thrives during the 10th–13th centuries, forms more of its own core with North Africa (al-Maġrib). The West and the East are too far from each other to maintain strong connections.

    Figure 11. Massive migrations of the 13th century CE.

    During the 13th century CE (Figure 11), we find the strongest connections between the eastern and western regions of the Islamic world. Although one might expect this to indicate a certain tranquility that permitted travel, what we see is in fact the result of crises in both the East and the West of the Islamic world. In Spain, Muslims are losing ground and a significant number of scholars start moving east to North Africa, Egypt and Syria; Iran and Iraq are suffering from their own crises, most notably—“The Big Chill” of the 11th–early 12th centuries CE, which destroys the economic prosperity of the Iranian regions and pushes nomads from the Turco-Mongolian steppe further and further into the Iranian plateau.24 The Mongols usually take the blame for the destruction of the great cities of Iran and Iraq (most notably, Baġdād); however, judging by the data from biographical collections, by the time they show up and deliver the finishing blow, all the previously prominent urban centers have long been in decline. It is during this period that we find Iranians and Iraqis leaving their homes, relocating to Syria and Egypt, which in the two centuries to follow form a new core under Mamlūk rule (Figure 12).

    Figure 12. New Mamlūk core of the 14th and 15th centuries CE.

    The 16th century marks a significant reconfiguration of the Islamic world: most notably with the rise of the “gunpowder empires”—the Ottomans in Anatolia (Rūm) and their successful conquests of the former core—Mamlūk Syria and Egypt; the Ṣafawids in Iran, and the Mughals in India (not graphed here). Figure 13 displays this reconfiguration marked by the rise of the Ottoman Empire and the reorientation of Iran, when significant numbers of Iranian scholars begin moving to Anatolia, but even more so to India.25

    Figure 13. Reconfiguration of the 16th century CE.

    The last map—Figure 14—shows the split of the Islamic world into two distinct cores: that of the Ottoman Empire, which gains control over almost the entire Arab world, and the Indo-Iranian core. This split begins in the 17th century and remains equally distinct in our data up until the end of the 19th century.

    Figure 14. The Turco-Arabic and Indo-Iranian cores in the 18th century.

    * * *

    These graphs and maps show only a fraction of what can be done with the data extracted from a single biographical collection.26 The next logical step is to study data from all available biographical collections—this step, however, requires a significant level of formalization and infrastructural development.


    NB: This is an excerpt from: Romanov, Maxim. “Algorithmic Analysis of Medieval Arabic Biographical Collections.” Speculum 92, no. S1 (October 2, 2017), S226–46. doi:10.1086/693970. The full text is available in open access @ http://www.journals.uchicago.edu/doi/full/10.1086/693970


    Footnotes

    1. See, Witkam, J.J., “Ismāʿīl Pas̲h̲a Bag̲h̲dādli̊”, in EI2–Online. For the edition of this text, see: Ismāʿīl Bāšā al-Baġdādī, Hadīyaŧ al-ʿārifīn asmāʾ al-muʾallifīn wa-aṯār al-muṣannifīn, 6 vols. (Bayrūt: Dār al-kutub al-ʿilmīyaŧ, 1992).

    2. He is also known as Kātib Čelebi, see: Şaik Gökyay, Orhan, “Kātib Čelebi”, in EI2–Online.

    3. The majority of works listed in the Hadiyyaŧ al-ʿārifīn are in Arabic, “the Latin of the Islamic world”, although about 10% of books are written in Persian and Turkish (the language is either explicitly mentioned, or the title of a work includes a Persian or Turkic word—most commonly, nāmah, Pers./Turk. “book”); Persian and Turkish works are not excluded from the analysis.

    4. It is worth pointing out here that, when it comes to biographical material, alphabetical organization is secondary in Islamic culture; the primary form of organization would be chronological, divided into “generations” or “cohorts” (sing. ṭabaqaŧ)—authors of later generations would often take this information, edit, supplement and reorganize alphabetically. See, Franz Rosenthal, A History of Muslim Historiography (Leiden: E. J. Brill, 1952), passim: al-Saḫāwī’s al-Iʿlān bi-l-tawbīḫ, translated in Rosenthal’s book, is particularly rich in notes about who updated and reorganized whose work.

    5. Dimitri Gutas, Greek Thought, Arabic Culture: The Graeco-Arabic Translation Movement in Baghdad and Early ʻAbbāsid Society (2nd-4th/8th-10th Centuries) (London ; New York: Routledge, 1998).

    6. See, for example, “Phase 3: The age of ‘six books’” (c. 200–400/912–1009) in: Scott C. Lucas, Constructive Critics, Ḥadīth Literature, and the Articulation of Sunnī Islam: The Legacy of the Generation of Ibn Saʿd, Ibn Maʿīn, and Ibn Ḥanbal (Leiden ; Boston: Brill, 2004), 73–86.

    7. According to the Hadiyyaŧ al-ʿārifīn, about 90% of almost 500 “refutations” (Ar. radd) of different groups and specific beliefs were written during this period (peaking 250–450 AH / 864–1058 CE).

    8. The rule of the Fāṭimids in Egypt marked a shift in ideology—from Sunnism to Ismāʿīlī Shiʿism—which brought a rise in the number of Ismāʿīlī writings; however, these numbers are overshadowed by the decline in Sunnī writings—as well as in Sunnī communities in general—in Egypt. On Ismāʿīlī authors, see Ismail K. Poonawala and Teresa Joseph, Biobibliography of Ismāʿīlī Literature, Studies in Near Eastern Culture and Society. (Malibu, Calif.: Undena Publications, 1977), 467–69.

    9. The problem with route networks is that they change over time and it is very difficult to recreate route networks for all the periods covered in our collection; more importantly, however, route networks will foreground the most traveled sections of the network, rather than the density of connections among the regions.

    10. As I show elsewhere, on data from a significantly larger biographical collection, the core for this period is more complex, particularly since what we come to understand as “Iran” in that period is several major provinces, with almost each one of them being similar in size to Iraq. See, Maxim Romanov, “After the Classical World: The Social Geography of Islam (c. 600—1300 CE),” in ARS ISLAMICA: Festschrift in Honor of Stanislav Mikhailovich Prozorov, ed. Mikhail Piotrovsky Alikber Alikberov (Moscow: Russian Academy of Sciences (Institute of Oriental Studies) & “Vostochnaya Literatura”, 2016), 247–77.

    11. See, most notably, Richard W. Bulliet, Cotton, Climate, and Camels in Early Islamic Iran: A Moment in World History (New York: Columbia University Press, 2009).

    12. See, for example, Masashi Haneda, “Emigration of Iranian Elites to India During the 16-18th Centuries,” Cahiers d’Asie Centrale, no. 3 (October 1, 1997): 129–43, https://asiecentrale.revues.org/480.

    13. For more examples of such analysis of data from a different collection, see: Maxim Romanov, “Toward Abstract Models for Islamic History,” in The Digital Humanities and Islamic & Middle East Studies (Berlin, Boston: De Gruyter, 2016), 117–49, http://www.degruyter.com/view/books/9783110376517/9783110376517-007/9783110376517-007.xml.



    Defining digital humanities is tricky. Our scholarship has been intrinsically digital for quite a few decades already, as we rely more and more on electronic storage to save, word processors to write, bibliography managers to organize, databases to consult, digital libraries to search and read. Living in the digital world, however, does not make us all digital humanists—if these digital entities are taken away, we will have their analog prototypes to fall back on, and beyond a certain level of inconvenience, this will not affect the way most of us do our scholarship. The transition to digital humanities must begin somewhere at the point where our humanistic inquiry starts to rely on the machine as a matter of methodological exigency.

    In some ways digital humanities is also a “no man’s land” that, within every national context, is most successfully claimed by scholars of national histories, literatures, and languages—by virtue of their higher numbers and the accessibility of their subjects to national funding agencies and the wider public. In practical terms, one’s primary field of academic inquiry, with its specific research questions and available source base, determines the set of computational approaches and thus defines a specific instance of digital humanities. (Without a primary field of academic inquiry we would be talking about technicians rather than scholars.) For example, although methods for analysis of video and audio recordings will be of little practical value to a scholar of premodern Islamic history, there is a lot to be gained from methodological areas such as computer vision,1 social network analysis, geographical information systems (GIS), and, most importantly, text analysis.

    To build a case for text analysis methods, let’s consider the example of the Taʾrīḫ al-islām (“History of Islam”) by al-Ḏahabī (d. 1348). This is a book of great length and coverage, whose 50 volumes2 contain 3.6 million words—the size of War and Peace six times over—and trace the first 700 years of Islamic history through the description of historical events and some 30,000 biographies. Although a great number of modern scholars use this massive “obituary chronicle” as their major source, we hardly have a decent understanding of its inner organization. With a significant amount of Arabic historical texts available, we can employ the text-reuse identification method developed by David Smith of Northeastern University to build the equivalent of an x-ray image of this chronicle, which would shed new light on this text and raise a series of important historiographical questions.

    To begin with, we get a detailed perspective on the sources that al-Ḏahabī might have used: he mentions some forty of them, and in our x-ray we find traces of them (provided, of course, that we have a relevant text in our corpus, which in most cases we do) as well as of other sources that he might have used but, for whatever reason, failed to mention. Figure 1 shows how passages common to al-Bayhaqī’s (d. 1066) Dalāʾil al-nubuwwaŧ (“Indications of Prophecy”) also feature in al-Ḏahabī’s text: the length of a black line corresponds to the number of words in an identified text reuse instance; the dense black block in the beginning of the book indicates the density of text reuse and shows that all these common passages occur exactly where we would expect them to appear—in the part of al-Ḏahabī’s text that deals with the period of the Prophet’s life.

    Figure 1. Passages from al-Ḏahabī’s Taʾrīḫ al-islām traceable to al-Bayhaqī’s (d. 1066) Dalāʾil al-Nubuwwaŧ (111,436 words, 371 pages, 50% of instances 28–61 words). The graph shows the flow of the text of al-Ḏahabī’s work in one hundred-word chunks: the beginning of the book is on the left, the end of the book is on the right; the red lines indicate points where al-Ḏahabī moves on to the coverage of the next hijrī century. Black lines indicate instances of text reuse traceable to al-Bayhaqī’s text, and the length of a black line corresponds to the number of words in an identified text reuse instance. The dense black block in the beginning of the book indicates the density of text reuse—with most of it falling on the period up until 640 CE—which means that all these common passages occur exactly where we would expect them to appear: in the part of al-Ḏahabī’s text that deals with the period of the Prophet’s life.

    With the help of our x-ray, not only do we discover connections with practically all the sources that al-Ḏahabī mentions in his introduction, we are also able to gauge the extent to which he engaged with his sources. We find a very significant amount of common passages with Ibn ʿAsākir’s Taʾrīḫ Dimašq (“History of Damascus”)—an equivalent of over 800 pages (300 words per page, 246,000 words), with al-Bayhaqī’s Dalāʾil al-nubuwwaŧ (“Indications of Prophecy”)—some 370 pages, with Ibn al-Jawzī’s Kitāb al-muntaẓam (“The Book of Rightly-Ordered Things [about Histories of Kings]”)—some 280 pages, with al-Mizzī’s Tahḏīb al-Kamāl (“The Refinement of Perfection”)—some 270 pages, with al-Ḫaṭīb al-Baġdādī’s Taʾrīḫ Baġdād (“History of Baghdad”)—some 250 pages, and so on. (Keep in mind, however, that these numbers cannot be simply added up because there is a significant amount of reuse among these texts as well.) In all cases 50% of identified shared passages are 25 to 60 words long! Seeing major biographical collections and chronicles on this list is not surprising, but al-Bayhaqī’s Dalāʾil appears to stand out. Our text reuse data suggests that the Dalāʾil is the most heavily reused text—these 370 “pages” amount to almost 20% of its volume (the share of Taʾrīḫ Dimašq, on the other hand, is barely 2.4% of its volume). This, however, does not necessarily mean that all passages shared with al-Bayhaqī were taken by al-Ḏahabī directly from al-Bayhaqī’s Dalāʾil because there is always a possibility of a common source or a source between the Dalāʾil and the Taʾrīḫ al-islām. (In this particular case direct borrowing is quite likely, since al-Ḏahabī lists the Dalāʾil among his main sources.) What it does mean is that our distant reading suggested to us a very interesting connection which deserves further close examination using more traditional methods.

    Next, we may attempt to assess the cumulative level of text reuse in al-Ḏahabī’s Taʾrīḫ al-islām. Altogether, the currently identifiable amount of text reuse—here we count each instance of text reuse only once, even if it is traceable to multiple sources—amounts to at least 23% of al-Ḏahabī’s text (750,000 words, 2,500 pages, with 50% of quotations within 25 to 59 words). If we look at his text, century by century, we discover that for almost every century that he covered, about 20-22% of his text can be traced to passages from his sources, with the exception of the 1st and the 7th Islamic centuries, where the share of text reuse amounts to 47.8% and 8.4% respectively. These numbers confirm, first, that this text is a compilation, and second, that it is the latest material that is least derivative. While there is a tendency to dismiss such “discoveries” as nothing that scholars don’t already know, it is important to stress that they allow us to transform “intuitive knowledge” into knowledge backed by a significant amount of textual evidence, which we can then use as a reliable premise to advance our analysis further—something that otherwise would not be possible.
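
    Counting each instance of text reuse only once amounts to measuring the union of overlapping spans rather than their sum. A minimal sketch of that step, assuming each instance is recorded as a pair of word offsets in al-Ḏahabī’s text (the offsets and the text length below are hypothetical, and the function name is invented here):

```python
def covered_words(instances):
    """Count words covered by at least one reuse instance.

    instances is a list of (start, end) word offsets, with end exclusive.
    Overlapping instances, e.g. one passage traceable to two sources,
    are merged so that their shared words are counted only once.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(instances):
        if current_end is None or start > current_end:
            # A new, non-overlapping span begins: bank the previous one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap (or adjacency): extend the current span.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered

# Hypothetical instances: the first two overlap (same passage, two sources).
instances = [(0, 40), (20, 60), (100, 130)]
text_length = 1000  # hypothetical length of the text in words
share = covered_words(instances) / text_length
```

    Summing the three instances naively would give 110 words; merging the overlap gives 90, which is the figure that a cumulative share of reuse should be based on.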

    The importance of this seemingly trivial discovery is that, by quoting his sources so extensively, al-Ḏahabī effectively preserves their archaic language. For instance, when he writes about the 1st hijrī century, his narrative is dominated by quotations from texts written in the 3rd century; when about the 2nd, from the 3rd and the 4th; and when about the 3rd, from the 3rd, 4th, and the 5th, and so on. The discovery of these archeological layers of language indicates that al-Ḏahabī describes people and events with a language that is as close to contemporaneous as is feasibly possible in historiographical terms. (It is also likely that his own syntax and word choices are affected by the language of his sources.)

    Figure 2. Results of the rolling stylometry test. Three samples of 10,000 words were taken from the beginning (red), middle (green), and end (blue) of al-Ḏahabī’s Taʾrīḫ al-islām and used to test to what extent the “style” of these samples is similar to the rest of the book. The graph shows that the “early style” (red), which dominates the language of the 1st Islamic century, disappears completely by the end of the 3rd Islamic century, not reaching even the middle of the book. The style in the end of the book is completely different from that of the beginning of the book.

    A rolling stylometry test3 of the Taʾrīḫ al-islām further shows that al-Ḏahabī’s writing “style”—defined as a set of most frequent function words that form a writer’s “fingerprint”—changes completely by the end of this massive book: three samples of 10,000 words were taken from the beginning (red), middle (green), and end (blue) of the book and used to test to what extent the “style” of these samples is similar to the rest of the book. Figure 2 shows that the “early style” (red), which dominates the language of the 1st hijrī century, disappears completely by the end of the 3rd century, not reaching even the middle of the book. In stylometric terms this can be interpreted as if the beginning and the end of the book were written by two different people.
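
    The general idea of comparing a reference sample with successive windows of a text can be sketched as follows. This is a simplified illustration and not the procedure of the R package cited in the footnote: it assumes that “style” is reduced to the relative frequencies of the most frequent words and that similarity is measured with a plain Manhattan distance. The tokens are toy data and all function names are invented here.

```python
from collections import Counter

def profile(tokens, vocabulary):
    """Relative frequency of each vocabulary word in a list of tokens."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[word] / total for word in vocabulary]

def rolling_distances(tokens, reference, window, step, top_n):
    """Distance between a reference sample and each sliding window of the text.

    The vocabulary is the top_n most frequent words of the whole text;
    smaller distances mean a window is closer in "style" to the reference.
    """
    vocabulary = [word for word, _ in Counter(tokens).most_common(top_n)]
    ref_profile = profile(reference, vocabulary)
    distances = []
    for start in range(0, len(tokens) - window + 1, step):
        win_profile = profile(tokens[start:start + window], vocabulary)
        distances.append(sum(abs(a - b) for a, b in zip(ref_profile, win_profile)))
    return distances

# Toy text: an "early" half and a "late" half with different frequent words.
early = ["wa", "qāla", "wa", "ḥaddaṯanā"] * 25
late = ["ṯumma", "fī", "ṯumma", "sanaŧ"] * 25
tokens = early + late

distances = rolling_distances(tokens, reference=early, window=100, step=100, top_n=4)
```

    Plotting such distances for the beginning, middle, and end samples along the flow of the book would give a picture of the kind shown in Figure 2, although the real test relies on more elaborate measures.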

    This discovery about the language of al-Ḏahabī’s Taʾrīḫ al-islām has far reaching implications about the data that he collected. For example, although the text comes from the 14th century and inevitably suffers from 14th-century biases when it comes to the representation of the past, the description of people and events—i.e., at the linguistic level—is not as anachronistic as one would think, and, arguably, these properties of the language give us ground to use the data from this text for modeling historical processes.4

    I mentioned earlier that the overall volume of text reuse in al-Ḏahabī’s Taʾrīḫ al-islām is at least 23%. Our initial text-reuse experiment was constrained by the format of our texts—or, to refer to the article in this roundtable, by the lack of a proper scholarly corpus. This means that we had to compare texts that were mechanically chunked into slices of one hundred words, and with such comparison we could have missed up to 20% of reused text; this circumstance also did not allow us to perform a more informative distant reading. As our OpenITI corpus develops and texts are supplied with logical markup (i.e., every chapter, section, subsection of a book is explicitly tagged), we will be able to run more precise and robust experiments. Comparing logical units of texts—for example, a biography with another biography—would open more opportunities for understanding how our texts were composed. For example, such an analysis will allow us to identify computationally which biographies al-Ḏahabī included from any given source and which he omitted. Knowing this would allow us to assess—on the largest scale possible—not only his selection criteria, but also what he suppressed from selected biographies and how he modified them.5 Pushing the point further, this can be accomplished for all historical titles in the OpenITI corpus, which is likely to significantly change our understanding of the Islamic historiographical tradition.
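
    Mechanical chunking itself is simple, which is also why it is blind to the logical structure of a book. The sketch below slices a text into one-hundred-word chunks and counts the word n-grams that two chunks share. It is only a stand-in to show the principle: the actual text-reuse method mentioned above is not reproduced here, and the function names and the choice of five-word n-grams are assumptions.

```python
def chunk_words(text, size=100):
    """Slice a text into consecutive chunks of `size` words (the last may be shorter)."""
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words), size)]

def shingles(words, n=5):
    """Set of all consecutive n-word sequences in a list of words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_shingles(chunk_a, chunk_b, n=5):
    """Number of n-word sequences that two chunks have in common."""
    return len(shingles(chunk_a, n) & shingles(chunk_b, n))

# Toy text of 250 distinct "words": w0 w1 ... w249.
text = " ".join(f"w{i}" for i in range(250))
chunks = chunk_words(text)
```

    A biography that straddles a chunk boundary is split between two chunks, so part of its shared material can fall below the matching threshold; comparing logical units avoids this.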

    With all this said, the machine will never replace traditional training. No proper distant reading experiment can be designed without a deep understanding of the subject in question, which can only come from a fair share of close reading. The machine is just another tool in our methodological toolbox which allows us to do something that other methods don’t. The machine will never ask novel historical questions, but it will enable us to do so.


    NB: This essay is coming out in the next issue of IJMES as a part of the roundtable on Digital Humanities in Middle East Studies. Other contributions will include essays by Matthew Miller, Elias Muhanna, Sarah Savant, Sabine Schmidtke, and Columba Stewart.


    Footnotes

    1. We already have methods to make manuscripts searchable (although in a limited way) and soon we should be able to group manuscripts by similarities in handwriting as well as to identify manuscripts written by the same hand. See, for example, Mike Kestemont and Dominique Stutzmann, “Script Identification in Medieval Latin Manuscripts Using Convolutional Neural Networks,” Digital Humanities 2017: Book of Abstracts (Montreal, August 10, 2017), 283–85. 

    2. al-Ḏahabī, Taʾrīḫ al-islām, ed. ʿUmar ʿAbd al-Salām Tadmurī, 1st ed., 52 vols. (Bayrūt: Dār al-kitāb al-ʿarabī, 1990–99). 

    3. See, the website of the “Computational Stylistic Group” (accessed on October 6, 2017), https://sites.google.com/site/computationalstylistics/projects/testing-rolling-stylometry; see also, Maciej Eder, Jan Rybicki, and Mike Kestemont. “Stylometry with R: a package for computational text analysis.” R Journal, 8(1), 2016, 107-121. 

    4. See Maxim Romanov, “Toward Abstract Models for Islamic History,” in The Digital Humanities + Islamic Middle Eastern Studies, ed. Elias Muhanna (Berlin: De Gruyter, 2016), 117–49; and Romanov, “Algorithmic Analysis of Medieval Arabic Biographical Collections,” Speculum 92/S1 (October 2017), S1-21 (available in open access at http://www.journals.uchicago.edu/doi/full/10.1086/693970). 

    5. See Maxim Romanov, “Observations of a Medieval Quantitative Historian?,” Der Islam (forthcoming in 2018). 



    The Middle East Studies Association (MESA) celebrated its 50th anniversary last year. Although it is not as large as associations such as AAR and AHA, it is very dear to most of us who are engaged in the study of the Middle East. Those who attended the annual meeting in Boston must have seen an attempt to visualize the academic genealogies of scholars of the Middle East, which sounds like a very interesting idea, but will take forever to realize. At the moment the genealogy looks like unconnected snowflakes (here is Franz Rosenthal’s snowflake of students),1 but we all realize that real life is far more complicated and more interesting than that. (Perhaps MESA should add another field to academic profiles, where we’d need to provide the name of our adviser(s?!)—or, better, the entire dissertation committee—then we will be able to recreate a living genealogy of the field with all its complexities and subtleties.)

    While the genealogy project is still far from maturity, we do have some other interesting data through which we can get a glimpse into the academic community that grew around MESA: we can use programs of annual meetings–in combination with academic profiles of MESA members (and, perhaps later, abstracts of their papers)–to gauge whether and how we are connected with each other across the field, how we form collegial communities and maintain them, and how these communities take new shapes in reaction to internal developments in the academy and to external political and social factors. The program data is limited in many ways: it does not capture such connections as who attends panels or who engages in discussions with panelists, nor does it come anywhere close to revealing who we hang out with between panels in the most frequented areas of the conference. Additionally, only a limited number of programs is relatively accessible at the moment and our insight can go back in time only 8 years (2009–2017; all data used below is openly available at the official MESA Website).

    Below you can find explanations of how the network was generated as well as summaries for each major intellectual community that can be identified in the data. You can also browse the network and find yourself there.

    The Network of MESA Panels (2009-2017). The dynamic and searchable network is available here.

    On data and method

    The data has been collected from the official MESA website—only what is openly accessible has been included in the dataset (in other words, if you chose not to make your profile public, your data is not in the dataset). The data has been downloaded with wget and then reformatted with a series of scripts written in Python. The data was then converted into a network model where scholars are connected through all the panels and sessions they participated in together. The weight is based on the number of co-panelists. Scholars with multiple roles (such as an Organizer and a Presenter) were given more weight, to stress their active engagement. The network visualization has been built with Gephi, whose community detection algorithm was used to identify closely connected groups of scholars, or intellectual communities; the results were saved into an interactive network using the Sigma.js plugin. Reports were then algorithmically generated for all major intellectual communities (with the size of at least 1%) using R, R Markdown, and the knitr package. Some data cleaning—mainly to unify different spellings—was done with OpenRefine.
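
    The step from panel rosters to a weighted network can be sketched roughly as follows. This is not the original script: the data layout, the function name, and the exact weighting rule are assumptions made for illustration (every shared panel adds 1 to an edge, and every additional role held by either scholar adds a bonus), and the scholars are placeholders.

```python
from collections import Counter
from itertools import combinations

def build_network(panels, role_bonus=1):
    """Return edge weights between scholars who appeared on the same panel.

    panels is a list of panels; each panel maps a scholar to a list of roles.
    """
    edges = Counter()
    for panel in panels:
        for a, b in combinations(sorted(panel), 2):
            weight = 1  # one shared panel
            # Assumed rule: each extra role held by either scholar adds role_bonus.
            weight += role_bonus * (len(panel[a]) - 1 + len(panel[b]) - 1)
            edges[(a, b)] += weight
    return edges

# Hypothetical panels with placeholder scholars A, B, and C.
panels = [
    {"A": ["Organizer", "Presenter"], "B": ["Presenter"], "C": ["Discussant"]},
    {"A": ["Presenter"], "B": ["Chair"]},
]

edges = build_network(panels)
```

    An edge list of this kind can be written to a CSV file and imported into Gephi for layout and community detection.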

    What’s in community reports?

    As stated above, communities were identified with the community detection algorithm in Gephi. Reports summarize formal descriptions that MESA members provide in their open academic profiles at the MESA website (disciplines, subfields, geographical areas of specialization, working languages, and degree granting institutions). Some members preferred not to share their profiles online, so their data was not included. In general, the most salient characteristics of communities offer a decent idea of what a given community is about. For example, Group 12 appears to be mostly about politics in the modern Middle East; Group 15—mostly about premodern Islamic history and Islamic studies; Group 1—mostly about Ottoman studies, and so on.

    Interestingly, there are multiple communities with the same (or at least comparable) thematic and subject foci that differ by alma mater, which suggests that the formation of academic communities is tied to brick-and-mortar institutions. For example, Group 1 and Group 24 both appear to focus on Ottoman studies, but most of the scholars in the first group have degrees from American institutions, while the second is dominated by scholars with degrees from Turkish institutions.

    The networks of specific scholars often cross into multiple communities.

    The most salient features of each community are statistical outliers at the high end of the data distribution, i.e., items whose frequencies are higher than Q3 * 1.5 (items with frequency 1 were removed before determining the quartiles).
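    A minimal sketch of this salience rule, under stated assumptions: the reports were generated in R, so this Python version is only an illustration. The function name salient_items and the sample counts are hypothetical, and the quartile method (method="inclusive", which matches the default of R's quantile function) is a guess. The threshold follows the rule as stated here, Q3 * 1.5, rather than the more common Q3 + 1.5 * IQR.

```python
from statistics import quantiles

def salient_items(freqs, factor=1.5):
    """Return the items whose frequency exceeds Q3 * factor.

    Items with frequency 1 are dropped before the quartiles
    are computed, as described in the text.
    """
    kept = {item: n for item, n in freqs.items() if n > 1}
    if len(kept) < 2:
        # Too few data points to compute quartiles.
        return {}
    # quantiles(..., n=4) returns [Q1, Q2, Q3].
    q3 = quantiles(kept.values(), n=4, method="inclusive")[2]
    threshold = q3 * factor
    return {item: n for item, n in kept.items() if n > threshold}

# Hypothetical discipline counts for one community.
freqs = {
    "History": 40,
    "Political Science": 6,
    "Anthropology": 4,
    "Literature": 3,
    "Law": 2,
    "Economics": 1,
}
salient = salient_items(freqs)
```

    With these sample counts, Q3 of the retained values (2, 3, 4, 6, 40) is 6, the threshold is 9, and only History is reported as salient.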

    Reports are organized by the size of communities (from largest to smallest).

    Community reports

    Group 12 (397 members)

    The community has 397 members (7.73%). The most salient features of the community are as follows:

    • Disciplines: Political Science (149), History (54), International Relations/Affairs (27);
    • Subfields: Middle East/Near East Studies (113), Comparative (86), Democratization (74), Arab Studies (63), Political Economy (55), 19th-21st Centuries (52), Nationalism (44);
    • Geographical focus: All Middle East (100), Egypt (79), Arab States (59), Iran (45), Syria (43), Turkey (41), Gulf (37);
    • Languages of expertise: Arabic (214), French (155), English (122), Spanish (58), German (54), Turkish (48), Persian (48);
    • Members have degrees from: Georgetown U (36), Harvard U (30), U Oxford (26), UC, Los Angeles (22), U Oslo (20), U Michigan (18), Princeton U (18), American U, Cairo (16), London School Economics (16), Oxford U (16), Brown U (16), U Chicago (16), U Virginia (14), New York U (14), SOAS (12), Yale U (10), Boğaziçi U (10), U Pennsylvania (10), Stanford U (8), U Massachusetts, Amherst (8), Johns Hopkins U (8), Cairo U (8), UC, Berkeley (8), Cornell U (8), U Washington (8), Tel-Aviv U (8), American U, Beirut (6), MIT (6), U Cambridge (6), U Florida (6), U Durham (6), LSE (6), Northwestern U (6), U Edinburgh (6), Institut d’Etudes Politiques de Paris (6), Columbia U (6), Ohio State U (6), U Tübingen (6), Western Michigan U (6), McGill U (6), Ankara U (6), IEP d’Aix-en-Provence (6);
    • Most active members are: Lust, Ellen (103), Lawson, Fred H. (94), Jamal, Amaney A. (93), Stacher, Joshua (93), Brown, Nathan J. (89), Schwedler, Jillian M. (88), Heydemann, Steven (82), Albrecht, Holger (75), Anderson, Lisa (74), Patel, David Siddhartha (70), Lynch, Marc (69), Brownlee, Jason (63), Yadav, Stacey Philbrick (62), Beinin, Joel (60), Yom, Sean (59), Kamrava, Mehran (57), Pearlman, Wendy (54), Watenpaugh, Keith D. (54), Kurzman, Charles (54), Moghadam, Val (53), McDougall, James (53), Sallam, Hesham (52), Bellin, Eva (50), Bishara, Dina (50), Daoudy, Marwa (49), Utvik, Bjorn Olav (47), Altug, Seda (47), Tajali, Mona (44), Brooke, Steven T. (44), Herb, Michael (44), Vignal, Leila Marie Rebecca (42), Chalcraft, John T. (42), Schonmann, Noa (41), Nugent, Elizabeth R. (39), Hashemi, Nader (39), Benstead, Lindsay J. (39), Ciftci, Sabri (38), Haddad, Bassam (38), Fouladvand, Hengameh (38), Lucas, Russell (38), Valbjorn, Morten (37), Robson, Laura C. (37), White, Benjamin Thomas (36), Wuthrich, F. Michael (36), Ayoub, Samy (35), Gause III, F. Gregory (35), Karakoc, Ekrem (35), Bayat, Asef (34), Wedeen, Lisa (34), Blaydes, Lisa (34), Watenpaugh, Heghnar (32), Willis, Michael J. (32), Sowers, Jeannie (32), Semerdjian, Elyse (30), Volpi, Frederic (30), Moore, Pete W. (30), Gao, Eleanor (29), Goldberg, Ellis (28), Legrenzi, Matteo (28), Clarke, Killian (28), Rahal, Malika (27), Banko, Lauren (26), Vodopyanov, Anya (26), Reif, Megan E. (26).

    Group 15 (342 members)

    The community has 341 members (6.639%). The most salient features of the community are as follows:

    • Disciplines: History (145), Religious Studies/Theology (41);
    • Subfields: Islamic Studies (105), 7th-13th Centuries (102), Middle East/Near East Studies (81), Arabic (53), Medieval (48), Iranian Studies (44), Historiography (39), History of Religion (37), Gender/Women’s Studies (34), 19th-21st Centuries (33), Mediterranean Studies (32), 13th-18th Centuries (32), Comparative (26);
    • Geographical focus: All Middle East (99), Islamic World (74), Iran (66), Egypt (62), Syria (34), Iraq (30), Mediterranean Countries (29);
    • Languages of expertise: Arabic (209), French (165), German (109), Persian (104), English (74), Turkish (47), Spanish (47), Hebrew (43);
    • Members have degrees from: Princeton U (56), U Chicago (56), Harvard U (44), Columbia U (36), UC, Los Angeles (28), U Michigan (24), U Toronto (24), New York U (20), U Pennsylvania (16), Oxford U (16), American U, Beirut (16), American U, Cairo (16), Yale U (14), UC, Santa Barbara (10), U Oxford (10), Duke U (10), U Virginia (10), UC, Berkeley (10);
    • Most active members are: Hanley, Will (103), Walker, Paul E. (83), Romanov, Maxim (78), Borrut, Antoine (76), Ellis, Matthew Hal (73), Savant, Sarah Bowen (68), Donner, Fred M. (66), Khalek, Nancy (63), Humphreys, R. Stephen (53), Bulliet, Richard W. (52), Pinto, Karen C. (48), Shaindlinger, Noa (48), Nielson, Lisa (48), Pourshariati, Parvaneh (46), Hanssen, Jens-Peter (45), Judd, Steven C. (44), Hain, Kathryn (40), Hanaoka, Mimi (40), Jiwa, Shainool (38), Baker, Christine (38), Gordon, Matthew S. (38), Gaiser, Adam (38), Dabiri, Ghazzal (37), Keaney, Heather N. (36), Ghazal, Amal (34), Antrim, Zayde G. (34), Daftary, Farhad (33), Klasova, Pamela (31), Hoffman, Valerie J. (31), Weitz, Lev (31), Urban, Elizabeth (31), Vacca, Alison Marie (31), Hagler, Aaron (31), Reynolds, Dwight F. (30), Kholoussy, Hanan (30), Ulrich, Brian J. (30), Perry, Craig (29), Adem, Rodrigo (28), Ludvigsen, Barre (27), Bacharach, Jere L. (27), Bonner, Michael (27), Mourad, Suleiman A. (27), Haug, Robert (26), Noy, Avigail (26), La Porta, Sergio (26), Friedman, Rachel (25), Gomez-Rivas, Camilo (25), Catlos, Brian (24), Carlson, Thomas (24), Myrne, Pernilla (23), Monterescu, Daniel (23).

    Group 1 (339 members)

    The community has 338 members (6.581%). The most salient features of the community are as follows:

    • Disciplines: History (199), Religious Studies/Theology (20);
    • Subfields: Ottoman Studies (157), 13th-18th Centuries (93), Middle East/Near East Studies (80), Islamic Studies (61), 19th-21st Centuries (57), Turkish Studies (50), Islamic Law (40), Mediterranean Studies (34);
    • Geographical focus: Ottoman Empire (127), Turkey (74), All Middle East (64), Islamic World (53), Anatolia (42), Balkans (42);
    • Languages of expertise: Arabic (177), Turkish (163), French (156), German (105), Persian (98), English (91), Ottoman (75);
    • Members have degrees from: Harvard U (64), Princeton U (58), U Chicago (58), Boğaziçi U (34), UC, Berkeley (28), Columbia U (28), Ohio State U (22), Georgetown U (22), U Michigan (20), UC, Los Angeles (20), Bilkent U (16), U Toronto (12), Middle East Technical U (12), McGill U (12), Yale U (12), New York U (12), U Oxford (10), Sabanci U (10), Tel-Aviv U (10), Rice U (8), Stanford U (8), Swarthmore College (6), Hebrew U (6), Istanbul U (6), U Pennsylvania (6), Emory U (6), Indiana U (6), Oxford U (6), U Cambridge (6), Hebrew U Jerusalem (6), Boston U (6);
    • Most active members are: Schull, Kent F. (139), Smiley, William (88), Isom-Verhaaren, Christine (87), Curry, John (79), Mikhail, Alan (73), Gratien, Chris (71), Aksan, Virginia (68), Hathaway, Jane (66), Saracoglu, M. Safa (66), Singer, Amy (65), Low, Michael Christopher (65), Khoury, Dina Rizk (64), Darling, Linda T. (64), Genell, Aimee (54), Varlik, Nukhet (49), Barakat, Nora (45), Ayalon, Yaron (45), Cuno, Kenneth M. (44), Ginio, Eyal (44), Shefer-Mossensohn, Miri (43), Yildiz, Sara Nur (43), White, Joshua (42), Shafir, Nir (41), Karatas, Hasan (40), Philliou, Christine M. (40), Wilkins, Charles L. (39), Menguc, Murat (38), Pitts, Graham (38), Krstic, Tijana (38), Al-Tikriti, Nabil (38), Can, Lale (37), Melvin-Koushki, Matthew (37), Faroqhi, Suraiya (36), Weiss, Max (36), Griffith, Zoe (34), Pfeifer, Helen (34), Makdisi, Ussama (32), Baldwin, James E. (32), Ergene, Bogac (32), Emre, Side (31), Taylor, Malissa (31), Karakaya-Stump, Ayfer (31), Wittmann, Richard (31), Zarinebaf, Fariba (31), Babayan, Kathryn (31), Gardiner, Noah (30), Winter, Stefan (29), Artun, Tuna (29), Burak, Guy (29), Oztan, Ramazan Hakki (29), Stearns, Justin (29), Findley, Carter V. (29), Esmer, Tolga U. (29), Ashraf, Assef (29), Sen, Ahmet Tunc (28), Turkyilmaz, Zeynep (28), Ferguson, Heather (27).

    Group 8 (330 members)

    The community has 328 members (6.386%). The most salient features of the community are as follows:

    • Disciplines: History (156), Anthropology (22);
    • Subfields: Middle East/Near East Studies (99), 19th-21st Centuries (86), Colonialism (60), Nationalism (53), Ottoman Studies (45), Arab-Israeli Conflict (44), Arab Studies (44), Cultural Studies (36), Gender/Women’s Studies (36), Identity/Representation (35);
    • Geographical focus: All Middle East (72), Palestine (64), Egypt (63), Ottoman Empire (56), Syria (40), Israel (39), Turkey (36), Lebanon (35), Arab States (27);
    • Languages of expertise: Arabic (189), French (151), English (99), Turkish (67), Hebrew (59), German (56);
    • Members have degrees from: UC, Los Angeles (36), Harvard U (30), U Chicago (28), Georgetown U (26), Columbia U (26), New York U (22), American U, Cairo (22), Tel-Aviv U (18), UC, Berkeley (14), Boğaziçi U (14), U Pennsylvania (14), Princeton U (14), U Michigan (14), Stanford U (12), Oxford U (12), U Oxford (12), Cairo U (10), American U, Beirut (10), U Toronto (10), McGill U (8), Bilkent U (8), MIT (8), Yale U (6), U Manchester (6), London School Economics (6), Indiana U (6), U London (6), FU, Berlin (6);
    • Most active members are: Clancy-Smith, Julia (88), Gelvin, James L. (86), Pastor de Maria y Campos, Camila (81), Thompson, Elizabeth (78), Fahmy, Khaled (75), Halperin, Liora R. (71), Khater, Akram F. (69), Fleischmann, Ellen L. (65), Mazza, Roberto (61), Bailony, Reem (56), Falb Kalisman, Hilary (55), Barak, On (54), Aksakal, Mustafa (54), Tamari, Salim (53), Fahrenthold, Stacy (52), Mestyan, Adam (51), Wyrtzen, Jonathan (50), Robinson, Shira (49), Vitalis, Robert (49), Fortna, Benjamin Carr (48), Cohen, Julia (47), Sheehi, Stephen P. (46), Owen, Roger (46), Akin, Yigit (44), Alon, Yoav (43), Abou-Hodeib, Toufoul (42), Nassar, Issam (42), Tanielian, Melanie (42), Kozma, Liat (41), Eickelman, Dale F. (40), Robinson, Nova (39), Karamursel, Ceyda (39), Ryzova, Lucie (38), Moreau, Odile (38), Ismail, Shehab (38), Ayalon, Ami (38), Dallasheh, Leena (37), Gran, Peter (37), Hazkani, Shay (37), Campos, Michelle U. (36), Schwartz, Kathryn (36), Armanios, Febe (35), Provence, Michael (34), Lockman, Zachary (34), Minawi, Mostafa (34), Balsoy, Gulhan (33), Yanikdag, Yucel (33), Jacobson, Abigail (32), Greene, Annie (32), El-Husseiny, Momen (32), Degani, Arnon (31).

    Group 10 (211 members)

    The community has 210 members (4.089%). The most salient features of the community are as follows:

    • Disciplines: Anthropology (50), History (32), Political Science (16);
    • Subfields: Middle East/Near East Studies (52), Gender/Women’s Studies (46), Cultural Studies (30), 19th-21st Centuries (30), Identity/Representation (27), Nationalism (26), Urban Studies (23), Ethnography (23), Turkish Studies (21), Globalization (19), Arab Studies (18);
    • Geographical focus: All Middle East (55), Egypt (38), Turkey (28), Iran (26), Lebanon (17);
    • Languages of expertise: Arabic (99), French (78), English (60);
    • Members have degrees from: Boğaziçi U (40), Columbia U (18), Princeton U (16), UT, Austin (16), U Arizona (16), Stanford U (16), New York U (14), U Chicago (14), Harvard U (14), UC, Los Angeles (14), Georgetown U (14), UC, Berkeley (10), Emory U (8), U Michigan (8), Oxford U (8), Yale U (8), U Oslo (6), Middle East Technical U (6), London School Economics (6), American U, Beirut (6);
    • Most active members are: Stanton, Andrea L. (113), Joseph, Suad (82), Mills, Amy (78), Dougherty, Roberta L. (65), Mahdavi, Pardis (60), Hammad, Hanan H. (58), Katz, Kimberly B. (53), Foster, Angel M. (53), Gardner, Andrew (48), Motlagh, Amy (48), Chomiak, Laryssa (47), Inhorn, Marcia C. (45), Higgins, Annie C. (44), Shechter, Relli I. (44), Alemdaroglu, Ayca (44), Morrison, Heidi (44), Singerman, Diane (42), Salamandra, Christa (40), Wynn, Lisa L. (39), Ghannam, Farha (39), Boum, Aomar (38), Reynolds, Nancy Y. (37), Coslett, Daniel (36), Collins, Rodney WJ (34), Hashemi, Manata (34), Abul-Magd, Zeinab A. (34), Rahimieh, Nasrin (30), Sweis, Rania (30), Betteridge, Anne H. (30), Baron, Beth (27), Erami, Narges (26), Bozcali, Firat (26).

    Group 9 (206 members)

    The community has 206 members (4.011%). The most salient features of the community are as follows:

    • Disciplines: History (35), Anthropology (34);
    • Subfields: Middle East/Near East Studies (54), 19th-21st Centuries (34), Gender/Women’s Studies (32), Identity/Representation (32), Urban Studies (31), Colonialism (31), Maghreb Studies (28), Arab-Israeli Conflict (27), Development (25), Cultural Studies (25), Arab Studies (24), Nationalism (24), Diaspora/Refugee Studies (24), Comparative (22), Ethnography (22), Political Economy (21);
    • Geographical focus: All Middle East (47), Palestine (38), Lebanon (29), Egypt (29);
    • Languages of expertise: Arabic (115), French (89), English (69), Spanish (35), German (33);
    • Members have degrees from: Columbia U (30), New York U (20), Harvard U (14), Oxford U (14), UC, Los Angeles (14), UC, Berkeley (12), Georgetown U (10), Bilkent U (10), Yale U (8), American U, Beirut (8), U Pennsylvania (8), York U (8), London School Economics (8), Ghent U (6), Princeton U (6), SOAS (6), U Wisconsin, Madison (6);
    • Most active members are: Abu-Rish, Ziad M. (101), Hazbun, Waleed (83), Khalil, Osamah (70), Sbaiti, Nadya J. (67), Al-Hamarneh, Ala (63), Davis, Rochelle Anne (59), Kanna, Ahmed (58), Bsheer, Rosie (55), Atia, Mona (55), Bogaert, Koenraad (54), Parker, Christopher H. (52), Challand, Benoit (51), Rothenberg, Janell (50), Davis, Muriam Haleh (49), Zemni, Sami (48), Keshavarzian, Arang (47), Slyomovics, Susan (47), Yildiz, Murat C. (46), Sinno, Nadine (46), Cavatorta, Francesco (44), Volk, Lucia (44), Salime, Zakia (43), Bishara, Amahl (42), Hammond, Timur (41), Hermez, Sami (39), Adely, Fida (39), Meiton, Fredrik (39), Gunel, Gokce (39), Farah, May (38), Kraidy, Marwan M. (34), Khalili, Laleh (33), Farah, Randa R. (29), Altan-Olcay, Ozlem (29), Tawil Souri, Helga (29), Shirazi, Roozbeh (28), Bouziane, Malika (27), Gamblin, Sandrine (27).

    Group 11 (201 members)

    The community has 201 members (3.914%). The most salient features of the community are as follows:

    • Disciplines: Language (55);
    • Subfields: Arabic (76), Language Acquisition (45), Middle East/Near East Studies (41), Arab Studies (26), Education (25), Maghreb Studies (23), Sociolinguistics (23), Cultural Studies (22), Pedagogy (22), Identity/Representation (19);
    • Geographical focus: Egypt (52), All Middle East (48), Arab States (28), Morocco (27), Maghreb (20);
    • Languages of expertise: Arabic (127), French (81), English (71), Spanish (35);
    • Members have degrees from: American U, Cairo (32), UT, Austin (24), U Arizona (18), Brigham Young U (12), New York U (12), U Michigan (12), Georgetown U (10), U Chicago (10), Princeton U (8), Ohio State U (8), Harvard U (8), U Wisconsin, Madison (8), Ain Shams U (6), Brown U (6), U Pennsylvania (6), U Texas (6), Florida State U (6);
    • Most active members are: Idrissi Alami, Ahmed (60), Dana, Karam (40), Shiri, Sonia (39), Khannous, Touria (35), Aboel Seoud, Dalal (33), Amster, Ellen J. (33), Soliman, Iman Aziz (32), Al-Batal, Mahmoud (32), Al Khalil, Muhamed (30), Familiar, Laila (29), Brustad, Kristen (28), Marsans-Sakly, Silvia (28), Taha, Zeinab A. (27), Alhawary, Mohammad T. (27), Hirchi, Mohammed (26), Soulaimani, Dris (25), Esseesy, Mohssen (25), Essam, Rasha (24), Dardir, Ahmed (23), Basheer, Nesrine (23), Chakrani, Brahim (22), Eisele, John C. (22), El-Essawi, Raghda (22), Anishchenkova, Valerie (22), Terc, Mandy (22), Hassanein, Hanan (20), Toler, Michael A. (19), Angrist, Michele Penner (19), Yacout, Shahira (19), Smith, Sharon C (19), Stokes, Corinne (18), Chekayri, Abdellah (18), Glanville, Peter (18), Loomis, Summer (18).

    Group 24 (197 members)

    The community has 196 members (3.816%). The most salient features of the community are as follows:

    • Disciplines: History (59), Political Science (25);
    • Subfields: Ottoman Studies (49), Turkish Studies (47), Middle East/Near East Studies (46), Nationalism (40), Armenian Studies (39), 19th-21st Centuries (39), Kurdish Studies (38), Identity/Representation (34), Minorities (26), Gender/Women’s Studies (26), Diaspora/Refugee Studies (22), Cultural Studies (20);
    • Geographical focus: Turkey (79), Ottoman Empire (48), All Middle East (35);
    • Languages of expertise: Turkish (94), French (74), English (71), Arabic (64);
    • Members have degrees from: Boğaziçi U (30), UC, Los Angeles (20), Bilkent U (14), U Michigan (12), Columbia U (12), U Chicago (12), Middle East Technical U (12), Ankara U (12), Yale U (10), Princeton U (10), Harvard U (10), New York U (8), Sabanci U (8), Carleton U (8), Istanbul U (8), U Virginia (6), U Arizona (6), UC, Berkeley (6), U Toronto (6);
    • Most active members are: Gunter, Michael M. (82), Gocek, Fatma Muge (74), Der Mugrdechian, Barlow (72), Sinclair, Christian (67), Igsiz, Asli Z. (62), Cora, Yasar Tolga (56), Eccarius-Kelly, Vera (56), Klein, Janet (48), Derderian, Dzovinar (46), Ahmed, Mohammed M.A. (44), Olson, Robert W. (43), Der Matossian, Bedross (42), Ekmekcioglu, Lerna (42), Bertram, Carel (38), Hepkaner, Ilker (37), Koker, Ayse Neveser (36), Kurt, Umit (32), Entessar, Nader (31), Melkonian, Doris (28), Meyer, James Howard (27), Ulker, Erol (26).

    Group 31 (181 members)

    The community has 180 members (3.505%). The most salient features of the community are as follows:

    • Disciplines: History (78);
    • Subfields: Middle East/Near East Studies (42), 19th-21st Centuries (42), Gender/Women’s Studies (32), Colonialism (30), Ottoman Studies (27), Cultural Studies (23), World History (19), Nationalism (18), Identity/Representation (17), Comparative (17);
    • Geographical focus: All Middle East (47), Egypt (30), Turkey (22), Ottoman Empire (21), Palestine (16), Iran (15);
    • Languages of expertise: Arabic (85), French (57), English (53);
    • Members have degrees from: Georgetown U (22), New York U (18), U Arizona (12), UC, Los Angeles (12), Yale U (10), U Chicago (10), Harvard U (10), Boğaziçi U (10), Tel-Aviv U (10), Middle East Technical U (8), U Denver (8), McGill U (6), Bilkent U (6), Columbia U (6), Rutgers U (6), Ohio State U (6), UT, Austin (6), Princeton U (6);
    • Most active members are: Gordon, Joel (59), Akturk, Ahmet Serdar (57), Minkin, Shana E. (47), Khazeni, Arash (47), Haiduc-Dale, Noah (43), Dolbee, Samuel (43), Parnell, Matthew (42), Kazemi, Ranin (40), Sharkey, Heather J. (39), Ates, Sabri (36), Goffman, Laura (35), Scalenghe, Sara (33), Yousef, Hoda (32), Lattouf, Mirna (32), Hallward, Maia Carter (31), Kaler, Helena (29), Bawalsa, Nadim (29), Arabaci, Elcin (27), Pehlivan, Zozan (26), Kuehn, Thomas (26), Pollard, Lisa (26), Deguilhem, Randi C. (25), Fares, Nicole (24), Whidden, James (24), Orkaby, Asher (24).

    Group 4 (177 members)

    The community has 177 members (3.446%). The most salient features of the community are as follows:

    • Disciplines: History (44);
    • Subfields: 19th-21st Centuries (48), Middle East/Near East Studies (44), Gender/Women’s Studies (40), Colonialism (36), Islamic Studies (33), Arab Studies (24), Identity/Representation (24), Maghreb Studies (22), Cultural Studies (22), Nationalism (20), Transnationalism (20);
    • Geographical focus: All Middle East (27), Egypt (25), Palestine (24), Islamic World (23), Maghreb (22), Europe (18);
    • Languages of expertise: Arabic (114), French (88), English (44), Spanish (33);
    • Members have degrees from: Columbia U (32), Harvard U (30), Yale U (18), U Cambridge (14), Georgetown U (14), U Chicago (14), New York U (14), Oxford U (12), U Oxford (10), U Michigan (10), McGill U (8), Princeton U (8), UC, Berkeley (8), George Washington U (6), American U, Beirut (6), U Toronto (6), Ohio State U (6);
    • Most active members are: Kalmbach, Hilary (100), Esmeir, Samera (57), Starrett, Gregory (52), Jacob, Wilson Chacko (51), Farquhar, Michael (44), Dalsheim, Joyce (40), Arsan, Andrew K. (37), Hammer, Juliane (35), Hartman, Michelle (34), Tucker, Judith E. (33), Hassan, Mona F (33), Ben-Yehoyada, Naor (32), Rock-Singer, Aaron (32), Gaul, Anny (31), Shryock, Andrew J. (31), Schneider, Suzanne (30), Ernst, Carl W. (28), Bamyeh, Mohammed A. (27), Green, Nile (26), Abi-Mershed, Osama (26), Armijo, Jacqueline (25), Skalli, Loubna Hanna (25), Ferguson, Susanna (25), Scott Deuchar, Hannah (23), Lybarger, Loren (23), Driessen, Michael (21), Cornwell, Graham (21).

    Group 2 (173 members)

    The community has 173 members (3.368%). The most salient features of the community are as follows:

    • Disciplines: Literature (84), History (24);
    • Subfields: Arabic (65), Comparative (46), Arab Studies (40), Middle East/Near East Studies (39), 19th-21st Centuries (35), Gender/Women’s Studies (29), Cultural Studies (28), Colonialism (21), Identity/Representation (19), Islamic Studies (19), Translation (18), Maghreb Studies (18);
    • Geographical focus: All Middle East (45), Egypt (39);
    • Languages of expertise: Arabic (115), French (90), English (51);
    • Members have degrees from: Columbia U (32), American U, Beirut (22), Indiana U (22), UT, Austin (20), New York U (20), U Pennsylvania (18), U Chicago (14), American U, Cairo (14), Harvard U (14), Yale U (12), Princeton U (12), UC, Berkeley (10);
    • Most active members are: Al-Musawi, Muhsin J. (131), El-Ariss, Tarek (115), Stetkevych, Suzanne P. (69), Al-Samman, Hanadi (67), Salama, Mohammad (61), Nalbantian, Tsolin (58), Al-Ghadeer, Moneera (57), Halabi, Zeina G. (48), Head, Gretchen A. (45), Giordani, Angela (41), Taleghani, R. Shareah (39), Cooke, Miriam (39), El Guabli, Brahim (37), Al-Saleh, Asaad (32), Golley, Nawar Al-Hassan (32), Holt, Elizabeth (32), Hermes, Nizar F. (31), Ramadan, Yasmine (31), Saba, Elias (30), Paul, Drew (30), Sellman, Johanna (30), Powers, David S. (29).

    Group 6 (167 members)

    The community has 167 members (3.252%). The most salient features of the community are as follows:

    • Disciplines: History (28), Political Science (23);
    • Subfields: Middle East/Near East Studies (51), Islamic Studies (29), Arab Studies (24), Arabic (19), Islamic Law (18);
    • Geographical focus: All Middle East (46), Yemen (30), Egypt (25), Islamic World (25), Arab States (18), Syria (18);
    • Languages of expertise: Arabic (110), French (73), English (64), German (42);
    • Members have degrees from: U Pennsylvania (16), U Chicago (14), American U, Cairo (10), UC, Berkeley (10), New York U (10), U Notre Dame (8), U Vienna (8), Harvard U (8), Charles U in Prague (6), Yale U (6), Cairo U (6), Sapienza - U Rome (6), U Bonn (6), Stanford U (6);
    • Most active members are: Varisco, Daniel Martin (97), Carapico, Sheila (61), Schmitz, Charles P. (57), Hollenberg, David B. (49), Correa, Dale J. (45), Kaufman, Asher (42), Mahdi, Waleed (34), Hudson, Michael C. (33), Dahlgren, Susanne (30), Mahoney, Daniel (30), Steinbeiser, Stephen (29), Casey, James (28), Sika, Nadine (28), Regourd, Anne (27), Mohamed, Eid (27), Sisler, Vit (27), Adra, Najwa (26), Walbridge, John (25), Um, Nancy Ajung (25), Guenther, Sebastian (25), Atassi, Ahmad Nazir (25), Alsultany, Evelyn (24), Michalak, Laurence O. (24), Jarmakani, Amira (23), Schultz, Warren C. (23), Hennessey, Katherine (22), Cimino, Matthieu (22), Zerhouni, Saloua (22), Meier, Daniel (21), Silzell, Sharon (21), Baumann, Hannes (21).

    Group 5 (154 members)

    The community has 154 members (2.998%). The most salient features of the community are as follows:

    • Disciplines: Literature (31), Political Science (21);
    • Subfields: Middle East/Near East Studies (43), Arabic (39), Turkish Studies (35), Gender/Women’s Studies (28), Cultural Studies (27), Nationalism (24), Comparative (20), Identity/Representation (19), 19th-21st Centuries (16), Arab Studies (16), Islamic Studies (15);
    • Geographical focus: All Middle East (50), Turkey (36), Egypt (27), Iran (25), Europe (18);
    • Languages of expertise: Arabic (82), French (60);
    • Members have degrees from: Georgetown U (18), Harvard U (16), Boğaziçi U (12), Bilkent U (12), U Arizona (12), Indiana U (12), UC, Berkeley (10), American U, Cairo (10), U Pennsylvania (10), U Chicago (10), Columbia U (10), New York U (10), UC, Los Angeles (10), Emory U (8), Katholieke Universiteit Leuven (6), UT, Austin (6), Yale U (6), U Michigan (6), U Utah (6);
    • Most active members are: Colla, Elliott (59), Ramadan, Dina A. (58), Zencirci, Gizem (50), Isik, Damla (45), Kocamaner, Hikmet (43), Jorgensen, Cory (36), White, Jenny B. (35), Ali, Samer M. (34), Thompson, Thomas (34), Atanassova, Gergana (30), Shively, Kim (28), Smith, Sarah-Neel (28), Carney, Josh (26), Drumsta, Emily (25), Trentman, Emma (25), Cinar, Alev (24), Harb, Lara (23), Tanyeri-Erdemir, Tugba (23).

    Group 20 (151 members)

    The community has 151 members (2.94%). The most salient features of the community are as follows:

    • Disciplines: History (39), Political Science (22);
    • Subfields: Middle East/Near East Studies (50), Islamic Studies (41), Iranian Studies (33), 19th-21st Centuries (24), Comparative (21), Arab Studies (18);
    • Geographical focus: All Middle East (49), Iran (45), Lebanon (25), Islamic World (23), Iraq (20), Arab States (19);
    • Languages of expertise: Arabic (95), French (64), Persian (57), English (43), German (34);
    • Members have degrees from: U Chicago (16), Harvard U (16), American U, Beirut (16), McGill U (12), Yale U (12), Princeton U (10), Columbia U (10), U Michigan (8), U Oxford (8), UT, Austin (8), U Utah (6), American U, Cairo (6), Tehran U (6), Oxford U (6), Tel-Aviv U (6), UC, Los Angeles (6), U Wisconsin, Madison (6), Brown U (6), UC, Berkeley (6);
    • Most active members are: Riggs, Robert J. (87), Sluglett, Peter (80), Kuenkler, Mirjam (71), Heern, Zackery (70), El-Husseini, Rola (66), Cole, Juan (54), Leichtman, Mara (53), Browers, Michaelle L. (47), Baroudi, Sami Emile (43), Hayek, Ghenwa (43), Yazdani, Mina (41), Asatryan, Mushegh (40), Mottahedeh, Roy (38), Abidor, Pascal (38), Haider, Najam (38), Sayed, Linda (34), Barnwell, Kristi N. (32), Anthony, Sean (31), Rahimi, Babak (29), Shehadi, Nadim (29), Lob, Eric (26), El-Karanshawy, Samer (26), Arslan, Ceylan Ceyhun (26).

    Group 27 (149 members)

    The community has 149 members (2.901%). The most salient features of the community are as follows:

    • Disciplines: History (39), Literature (38);
    • Subfields: Persian (38), Iranian Studies (35), 13th-18th Centuries (29), 19th-21st Centuries (20), Gender/Women’s Studies (20), Middle East/Near East Studies (20), Arabic (17), Cultural Studies (17), Ottoman Studies (16), Islamic Studies (15), Comparative (15);
    • Geographical focus: Iran (59), All Middle East (29), Central Asia (24);
    • Languages of expertise: Arabic (72), Persian (63), French (53);
    • Members have degrees from: U Chicago (28), UT, Austin (20), Harvard U (14), New York U (12), The U Chicago (10), Oxford U (10), Yale U (10), UC, Los Angeles (10), U Michigan (8), Columbia U (8), Princeton U (8), U Oxford (6), U Arizona (6), Eotvos Lorand U (6), UC, Irvine (6), UC, Berkeley (6), Brown U (6);
    • Most active members are: Lewis, Franklin D. (47), Losensky, Paul E. (46), Atwood, Blake (44), Jabbari, Alexander (39), Kuru, Selim (35), Cross, Cameron (35), Miller, Matthew Thomas (35), Litvin, Margaret (34), Moosavi, Amir (32), Khorrami, Mohammad Mehdi (32), Hershenzon, Daniel (31), Galarreta-Aima, Diana (30), Ghanoonparvar, Mohammad R. (29), Khakpour, Arta (28), Kia, Mana (27), Green Mercado, Marya Teresa (27), Scoville, Spencer (26), Alavi, Samad J. (26), Karamustafa, Ahmet T. (26), Brookshaw, Dominic (26).

    Group 22 (142 members)

    The community has 141 members (2.745%). The most salient features of the community are as follows:

    • Disciplines: Political Science (45);
    • Subfields: Middle East/Near East Studies (37), Maghreb Studies (31), Comparative (26), Gender/Women’s Studies (24), Cultural Studies (22), Democratization (19), Islamic Studies (16), Political Economy (16);
    • Geographical focus: Maghreb (35), All Middle East (32), Turkey (19), Egypt (17), Algeria (16);
    • Languages of expertise: Arabic (75), French (64), English (41);
    • Members have degrees from: Georgetown U (12), New York U (12), Harvard U (8), Yale U (8), Oxford U (8), UT, Austin (8), Florida State U (6), Stanford U (6), McGill U (6), U Chicago (6), U Oxford (6), Bilkent U (6), SOAS (6), Syracuse U (6);
    • Most active members are: Mundy, Jacob A. (79), Lawrence, William A. (66), Parks, Robert P. (56), Zoubir, Yahia (55), Entelis, John P. (53), Turam, Berna (50), Layachi, Azzedine (50), Roberts, Hugh (40), Cutler, Brock (38), Gray, Doris H. (36), Youssef, Maro (35), Heper, Metin (32), Marks, Monica L. (29), Seferdjeli, Ryme (27), Segalla, Spencer (25), Romanet Perroux, Jean-Louis (25), Engelcke, Dörthe (25), Buehler, Matt (25), Maddy-Weitzman, Bruce (25), Gorman, Brandon (24), Naylor, Phillip (24), Sezgin, Yüksel (22), Joubin, Rebecca (21), Deubel, Tara (21), Gershovich, Moshe (19), Zvan Elliott, Katja (18), Guran, Gozde (18).

    Group 3 (140 members)

    The community has 140 members (2.726%). The most salient features of the community are as follows:

    • Disciplines: History (36), Political Science (27);
    • Subfields: Turkish Studies (59), Middle East/Near East Studies (36), Ottoman Studies (32), 19th-21st Centuries (25), Comparative (22), Gender/Women’s Studies (21), Nationalism (18), Democratization (16), Modernization (15), State Formation (15);
    • Geographical focus: Turkey (69), Ottoman Empire (31), All Middle East (28), Europe (18);
    • Languages of expertise: Turkish (71), Arabic (51), French (50), English (44);
    • Members have degrees from: Boğaziçi U (42), U Washington (18), U Chicago (16), U Michigan (8), Uludag U (8), Columbia U (8), UC, Los Angeles (8), Georgetown U (6), U Virginia (6), U Pennsylvania (6), Duke U (6), Princeton U (6), ME Tech U (6), Istanbul Technical U (6), Istanbul U (6), U Tokyo (6), New York U (6), U Utah (6);
    • Most active members are: Aslan, Senem (60), Kasaba, Resat (58), Jackson, Maureen (50), Tezcur, Gunes Murat (49), Shissler, A. Holly (46), Mecham, Quinn (42), Libal, Kathryn (42), Parslow, Joakim (40), Woodall, G. Carole (39), Gokariksel, Banu (39), Ryan, James (34), Hart, Kimberly (32), Belge, Ceren (32), Kirecci, M. Akif (31), Kezer, Zeynep (26), Secor, Anna (26), Watts, Nicole (25), Sarfati, Yusuf (25), Bakkalbasioglu, Esra (22), Karaman, Emine Rezzan (21), Ringer, Monica (21), Kirdis, Esen (21), Snyder, Alison B. (21).

    Group 16 (133 members)

    The community has 133 members (2.59%). The most salient features of the community are as follows:

    • Disciplines: Literature (23), History (20);
    • Subfields: Gender/Women’s Studies (57), Middle East/Near East Studies (33), Cultural Studies (23), Colonialism (20), 19th-21st Centuries (19), Nationalism (17), Comparative (17), Arab Studies (17), Identity/Representation (15), Maghreb Studies (14);
    • Geographical focus: All Middle East (34), Egypt (24);
    • Languages of expertise: Arabic (73), French (62), English (34);
    • Members have degrees from: Cairo U (16), Georgetown U (10), Boğaziçi U (10), Indiana U (10), UC, Los Angeles (10), U Toronto (6), Harvard U (6), American U, Cairo (6), U Amsterdam (6), Columbia U (6), Arizona State U (6), U Exeter (6), Aristotle U Thessaloniki (6);
    • Most active members are: Al-Ali, Nadje Sadig (87), Nassif, Maggie (59), Kallander, Amy (57), Fay, Mary Ann (55), Abdelmonem, Angie (54), Rizzo, Helen M. (53), Elsadda, Hoda (51), Langohr, Vickie (51), Hasso, Frances S. (46), Pratt, Nicola (45), Tadros, Mariz (42), Gana, Nouri (39), Richter-Devroe, Sophie (39), Yilmaz, Secil (37), Khoury, Nicole (36), Hale, Sondra (35), Mamelouk, Douja (33), Belnap, R. Kirk (33), Beard, Michael (32), Benson-Sokmen, Susan (31), Galan, Susana (30), Jamal, Manal A. (30).

    Group 13 (127 members)

    The community has 127 members (2.473%). The most salient features of the community are as follows:

    • Disciplines: History (35), Anthropology (14);
    • Subfields: Middle East/Near East Studies (26), 19th-21st Centuries (20), Gulf Studies (18), Islamic Studies (14), Colonialism (14);
    • Geographical focus: All Middle East (21), Gulf (18), Egypt (16), Palestine (13), Turkey (12), Islamic World (11);
    • Languages of expertise: Arabic (64), French (46), English (30);
    • Members have degrees from: Columbia U (16), Harvard U (14), UC, Los Angeles (14), Georgetown U (10), Yale U (8), Princeton U (8), UT, Austin (6), U Michigan (6);
    • Most active members are: VanDenBerg, Jeffrey A. (145), Greeley, June-Ann (105), Jones, Toby C. (75), Hafez, Sherine M. (74), Mako, Shamiran (71), Fuccaro, Nelida (62), Bishara, Fahad A. (59), Al-Nakib, Farah (58), Nakissa, Aria (57), Kim, Somy (56), Stamatopoulou-Robbins, Sophia (54), El-Kazaz, Sarah (48), Hightower, Victoria (48), Shereen Sakr, Laila (44), Cavdar, Gamze (43), Limbert, Mandana E. (43), El Hayek, Chantal (42), Barnes, Jessica E. (41), Elhaies, Karim (39), Farmer, Tessa (38), Anderson, Jedidiah (38), Dailami, Ahmed (37).

    Group 17 (124 members)

    The community has 124 members (2.414%). The most salient features of the community are as follows:

    • Disciplines: History (41);
    • Subfields: Middle East/Near East Studies (34), Turkish Studies (31), Ottoman Studies (30), 19th-21st Centuries (23), Islamic Studies (23), Nationalism (17), Iranian Studies (16), Central Asian Studies (14);
    • Geographical focus: Turkey (38), Ottoman Empire (29), All Middle East (27), Iran (20), Central Asia (15);
    • Languages of expertise: Arabic (49), Turkish (45), French (43), English (36);
    • Members have degrees from: U Chicago (18), Harvard U (18), Columbia U (12), Marmara U (12), Boğaziçi U (10), Middle East Technical U (10), U Utah (10), Bilkent U (8), Indiana U (8), Sabanci U (6), Georgetown U (6), American U, Beirut (6), New York U (6), U Arizona (6);
    • Most active members are: Yilmaz, Hale (77), Deal, Roger A. (54), Evered, Emine Ö. (53), Atamaz, Serpil (38), Beben, Daniel (33), Gross, Jo-Ann (33), Riaz, Sanaa (32), Childress, Faith J. (31), DeWeese, Devin A. (27), Kudsieh, Suha (23), Goffman, Carolyn (23), Andani, Khalil (23), Tasdelen, Esra (23), Ziad, Waleed (22), Mitchell, Jeanene (22), Adak, Sevgi (20), Metinsoy, Murat (20), Matin-Asgari, Afshin (19), Asmi, Rehenuma (18), Cook Jr., Weston F. (18), Tsacoyianis, Beverly (18), Turan, Omer (18), Hajiani, Shiraz (18).

    Group 7 (123 members)

    The community has 123 members (2.395%). The most salient features of the community are as follows:

    • Disciplines: Political Science (23);
    • Subfields: Middle East/Near East Studies (41), Gulf Studies (31), Gender/Women’s Studies (25), Development (20), Political Economy (17);
    • Geographical focus: All Middle East (33), Gulf (27), Arabian Peninsula (20), Egypt (17), Arab States (14), UAE (13);
    • Languages of expertise: Arabic (66), English (47), French (46);
    • Members have degrees from: U Chicago (22), Georgetown U (10), Princeton U (10), U Michigan (8), Oxford U (8), U Cambridge (8), Bilkent U (6), Yale U (6), Boğaziçi U (6), UC, Berkeley (6), Johns Hopkins U (6), New York U (6), Columbia U (6);
    • Most active members are: Mitchell, Jocelyn Sage (86), Pursley, Sara (66), Foley, Sean (58), Vora, Neha (54), Okruhlik, Gwenn (51), Lori, Noora (44), MacLean, Matthew (36), Derderian, Elizabeth (34), Lowi, Miriam R. (34), Brouwer, Imco (32), Koch, Natalie (32), Coates Ulrichsen, Kristian (31), Tetreault, Mary Ann Reed (31), Aghdasifar, Tahereh (31), Cahill, Richard (30), Jones, Calvert (30), Willis, John M. (29), Babar, Zahra (29), Diwan, Kristin Smith (27).

    Group 23 (113 members)

    The community has 113 members (2.2%). The most salient features of the community are as follows:

    • Disciplines: Literature (20), History (18), Political Science (15);
    • Subfields: Turkish Studies (41), Ottoman Studies (25), Middle East/Near East Studies (24), Turkish (21), Gender/Women’s Studies (21), Cultural Studies (16), 19th-21st Centuries (15), Comparative (14), Nationalism (13), Language Acquisition (12), Islamic Studies (12);
    • Geographical focus: Turkey (50), All Middle East (30), Ottoman Empire (25), Europe (15);
    • Languages of expertise: Turkish (57), Arabic (41);
    • Members have degrees from: Boğaziçi U (34), Indiana U (14), U Washington (12), U Michigan (10), UT, Austin (10), UC, Los Angeles (8), U Texas (8), Ohio State U (8), U Pennsylvania (6), Harvard U (6), Texas Tech U (6);
    • Most active members are: Micallef, Roberta (91), Gilson, Erika H. (48), Andrews, Walter G. (47), Karahan, Burcu (44), Seviner, Zeynep (41), Havlioglu, Didem (40), Okur, Jeannette E. (34), Wishnitzer, Avner (34), Hafez, Melis (34), Onder, Sylvia W. (32), VanderLippe, John M. (32), Batur, Pinar (31), Kafadar, Cemal (30), Aguirre Mandujano, Oscar (28), Carter, Sandra G. (28), Balci, Ercan (25), Toensing, Chris (24), Eissenstat, Howard (23).

    Group 14 (109 members)

    The community has 109 members (2.122%). The most salient features of the community are as follows:

    • Disciplines: Anthropology (20);
    • Subfields: Middle East/Near East Studies (24), Gender/Women’s Studies (21), Colonialism (17), Maghreb Studies (15), Political Economy (13), 19th-21st Centuries (13), Cinema/Film (13), Cultural Studies (13), Arab-Israeli Conflict (12);
    • Geographical focus: All Middle East (28), Palestine (17), Maghreb (14), Lebanon (12), Arab States (11);
    • Languages of expertise: Arabic (54), French (44), English (20), Spanish (14);
    • Members have degrees from: U Arizona (12), New York U (10), Yale U (10), U Michigan (8), UC, Los Angeles (8), U Toronto (8), Georgetown U (8), U Chicago (6), UC, Berkeley (6), U Washington (6), Boston U (6);
    • Most active members are: Yaqub, Nadia G. (79), Cammett, Melani C. (74), Clark, Janine A. (72), Amar, Paul (55), Bishop, Elizabeth (48), Amireh, Amal (39), Gasper, Michael (32), Baun, Dylan (32), Quawas, Rula (31), Schulhofer-Wohl, Jonah (28), Hoffman, Katherine E. (24), Sayej, Caroleen (23), Goodman, Jane E. (22), Rahman, Najat (21).

    Group 30 (109 members)

    The community has 109 members (2.122%). The most salient features of the community are as follows:

    • Disciplines: Anthropology (28);
    • Subfields: Middle East/Near East Studies (34), Islamic Studies (33), 19th-21st Centuries (16), Cultural Studies (16), Diaspora/Refugee Studies (14), Transnationalism (13), Ethnography (12), Islamic Thought (11);
    • Geographical focus: Egypt (23), All Middle East (20), Islamic World (14), Europe (13);
    • Languages of expertise: Arabic (60), French (46), English (35);
    • Members have degrees from: U Michigan (8), Columbia U (8), American U, Beirut (8), New York U (6), U London (6), Georgetown U (6), London School Economics (6), Sciences Po Paris (6), U Toronto (6), U Oxford (6), Harvard U (6);
    • Most active members are: Voll, John O. (67), Cesari, Jocelyne (48), Chatty, Dawn (46), Moll, Yasmin (44), Winegar, Jessica (41), Armbrust, Walter (36), Totah, Faedah (35), Naguib, Nefissa (35), Abenante, Paola (35), Vicini, Fabio (34), Schielke, Joska Samuli (30), Gabiam, Nell (29), Bergh, Sylvia (27), McLarney, Ellen (25), Ahmad, Attiya (24), Miller, W. Flagg (23), Herrera, Linda (23).

    Group 19 (105 members)

    The community has 105 members (2.044%). The most salient features of the community are as follows:

    • Disciplines: History (37);
    • Subfields: Iranian Studies (27), Middle East/Near East Studies (25), Cultural Studies (24), 19th-21st Centuries (22), Gender/Women’s Studies (20), Maghreb Studies (19), Colonialism (16), Nationalism (14), Islamic Studies (12), Cinema/Film (12);
    • Geographical focus: Iran (31), All Middle East (30), Morocco (18), Maghreb (16);
    • Languages of expertise: Arabic (61), French (52);
    • Members have degrees from: U Toronto (12), UC, Los Angeles (12), Tel-Aviv U (10), Ben Gurion U the Negev (10), U Texas (8), U Chicago (8), Boğaziçi U (8), Harvard U (8), Cornell U (6);
    • Most active members are: Sternfeld, Lior (70), Miller, Susan Gilson (60), Heckman, Alma (53), Swedenburg, Ted (47), Terem, Etty (44), Mottahedeh, Negar (39), Stenner, David (38), Calderwood, Eric (38), Marglin, Jessica M. (36), Lawrence, Adria (36), Gottreich, Emily R. (35), Tavakoli-Targhi, Mohamad (33), Schreier, Joshua (33), Amini, Soheyl (31), Ariel, Ari (30), Schroeter, Daniel J. (29), Kapchan, Deborah A. (29), Shemer, Yaron (29), Meftahi, Ida (27).

    Group 28 (100 members)

    The community has 100 members (1.947%). The most salient features of the community are as follows:

    • Disciplines: History (18);
    • Subfields: Middle East/Near East Studies (22), 19th-21st Centuries (21), Cultural Studies (19), Gender/Women’s Studies (18), Political Economy (17);
    • Geographical focus: All Middle East (24), Turkey (15), Europe (14);
    • Languages of expertise: Arabic (53), French (43), English (38);
    • Members have degrees from: McGill U (8), Lund U (8), UT, Austin (6), U Michigan (6), U Exeter (6);
    • Most active members are: Olmsted, Jennifer (54), Ruiz, Mario M. (49), Kechriotis, Vangelis (47), Stockdale, Nancy L. (36), Yousif, Bassam (33), Sayre, Edward A. (32), Powell, Eve Troutt (31), Ferguson, Michael (29), Hooglund, Eric (28), Bahramitash, Roksana (27), Pfeifer, Karen (26), Esfahani, Hadi Salehi (26), Starr, Deborah (24), Arjmand, Reza (22), Ackfeldt, Anders (21), Sorek, Tamir (21), Janson, Torsten (21).

    Group 32 (98 members)

    The community has 98 members (1.908%). The most salient features of the community are as follows:

    • Disciplines: Literature (16);
    • Subfields: Gender/Women’s Studies (22), Middle East/Near East Studies (21), Cultural Studies (16), Ethnic American Studies (15), Islamic Studies (15);
    • Geographical focus: All Middle East (21), North America (13);
    • Languages of expertise: Arabic (52), French (29), English (23);
    • Members have degrees from: U Michigan (16), Georgetown U (8), Leiden U (6), U Chicago (6), North Carolina State U (6), UC, Davis (4), Northwestern U (4), Wayne State U (4), Syracuse U (4), U Bonn (4), U Oxford (4), Harvard U (4);
    • Most active members are: Stephan, Rita (84), Vinson, Pauline Homsi (65), Cainkar, Louise A. (50), Kayyali, Randa (49), Saylor, Elizabeth (44), Ryad, Umar (37), Khalil, Mohammad H. (36), Jung, Dietrich (31), Hassan, Salah D. (30), Marzouki, Nadia (27), Rezai, Hamid (27), Hatem, Mervat (25), Afsaruddin, Asma (25).

    Group 21 (87 members)

    The community has 87 members (1.694%). The most salient features of the community are as follows:

    • Disciplines: History (19);
    • Subfields: Iranian Studies (37), Middle East/Near East Studies (25), Turkish Studies (17), Foreign Relations (13), Cultural Studies (13);
    • Geographical focus: Iran (38), Turkey (22), All Middle East (22), Europe (10), Islamic World (9), Central Asia (9);
    • Languages of expertise: Persian (39);
    • Members have degrees from: Bilkent U (16), UT, Austin (12), UC, Los Angeles (10), U Copenhagen (10), Middle East Technical U (8), Johns Hopkins U (6), U Virginia (6), Istanbul U (6);
    • Most active members are: Ehsani, Kaveh (76), Harris, Kevan (53), Schayegh, Cyrus (49), Evered, Kyle T. (49), Moruzzi, Norma Claire (42), Baghoolizadeh, Beeta (40), Helicke, James (40), Goode, James F. (40), Atalan-Helicke, Nurcan (40), Homayounvash, Mohammad (31), Kuzmanovic, Daniella (30), Mesbahi, Mohiaddin (26), Atabaki, Touraj (26), Oladi, Samaneh (26), Koyagi, Mikiya (24).

    Group 25 (72 members)

    The community has 72 members (1.402%). The most salient features of the community are as follows:

    • Disciplines: Political Science (20);
    • Subfields: Middle East/Near East Studies (20), Arab-Israeli Conflict (18), Comparative (13), 19th-21st Centuries (12);
    • Geographical focus: Israel (15), Palestine (14), All Middle East (14);
    • Languages of expertise: Arabic (38);
    • Members have degrees from: Sabanci U (12), Tel-Aviv U (10), UC, Los Angeles (10), Georgetown U (8), Yale U (8), Columbia U (6), London School Economics (6), George Washington U (6), Boğaziçi U (6);
    • Most active members are: Freedman, Robert O. (75), Zisser, Eyal (47), Peleg, Ilan (40), Rabi, Uzi (35), Daadaoui, Mohamed (34), Nikpour, Golnar (33), Teitelbaum, Joshua (30), Ginat, Rami (28), Daoud, Suheir Abu Oksa (24), Schorn, Timothy (22).

    Group 26 (68 members)

    The community has 68 members (1.324%). The most salient features of the community are as follows:

    • Disciplines: History (17), Anthropology (9);
    • Subfields: Middle East/Near East Studies (18), Assyrian Studies (12), 19th-21st Centuries (8), Minorities (8);
    • Geographical focus: Iraq (23), Syria (11), Turkey (10), Iran (10), All Middle East (10);
    • Languages of expertise: Arabic (29), French (17);
    • Members have degrees from: Harvard U (8), U Toronto (6), U Chicago (6), Tel-Aviv U (6), U Washington (6);
    • Most active members are: Bashkin, Orit (85), Dawood, Fadi (65), Benjamen, Alda (51), Sassoon, Joseph (49), Donabed, Sargon (46), Bet-Shlimon, Arbella (37), Al-Jeloo, Nicholas (36), Guarasci, Bridget (32), Shields, Sarah D. (32), Saleh, Zainab (31).

    Group 29 (62 members)

    The community has 62 members (1.207%). The most salient features of the community are as follows:

    • Disciplines: History (15);
    • Subfields: Middle East/Near East Studies (16), Colonialism (16), 19th-21st Centuries (14), Arab-Israeli Conflict (13), Nationalism (11);
    • Geographical focus: Palestine (19), Egypt (13), All Middle East (10), Israel (8);
    • Languages of expertise: Arabic (38), French (25);
    • Members have degrees from: New York U (22), Georgetown U (10), Harvard U (8), Oxford U (8), Princeton U (6), UC, Berkeley (6);
    • Most active members are: Seikaly, Sherene (107), Williams, Elizabeth (73), Hajjar, Lisa (63), Jakes, Aaron G. (60), Ragab, Ahmed (51), Derr, Jennifer (47), Brand, Laurie (47), Bali, Asli (37).

    Group 18 (57 members)

    The community has 57 members (1.11%). The most salient features of the community are as follows:

    • Disciplines: none listed;
    • Subfields: Middle East/Near East Studies (17), Arab Studies (13), 19th-21st Centuries (11), Cultural Studies (11), Identity/Representation (11), Iranian Studies (8), Arab-Israeli Conflict (8);
    • Geographical focus: Palestine (14), Lebanon (11), All Middle East (10);
    • Languages of expertise: Arabic (37), French (28);
    • Members have degrees from: Indiana U (12), U Michigan (6), U Chicago (6), New York U (6), U Wisconsin, Madison (6);
    • Most active members are: Shabout, Nada M. (46), Scheid, Kirsten (40), Al-Bahloly, Saleem (34), Lenssen, Anneka (26), Strohm, Kiven (25), Marks, Laura (21).
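
    Each group summary reports a member count and a percentage share. As a rough cross-check, the sketch below searches for a network size that reproduces those shares when rounded to three decimals. The function name, the search range, and the choice of sample groups are assumptions made for illustration; the resulting total is inferred from the listed figures only and is not a number stated in this section.

```python
# (member count, reported share in percent), copied from a few group summaries above
SAMPLE_GROUPS = [
    (140, 2.726),  # Group 3
    (133, 2.59),   # Group 16
    (105, 2.044),  # Group 19
    (57, 1.11),    # Group 18
]


def consistent_totals(groups, low, high):
    """Return every total n in [low, high] for which each group's share,
    computed as 100 * members / n and rounded to 3 decimals, matches the
    reported share."""
    matches = []
    for n in range(low, high + 1):
        if all(abs(round(100 * members / n, 3) - share) < 1e-9
               for members, share in groups):
            matches.append(n)
    return matches


# Hypothetical search range; widen it if nothing matches.
totals = consistent_totals(SAMPLE_GROUPS, 5000, 5300)
print(totals)
```

    If exactly one total comes back, the reported percentages are mutually consistent with a single network size, and any group whose count and share disagree with that total can be flagged for review.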