Wednesday, January 30, 2013

Brief Guide To The Development Of The Arabic Script

This post is intended as a very brief guide to the development of the modern Arabic script and derived scripts (Persian/Urdu/Sindhi/Balochi/Afghani/Turkic and their friends). The history of the development of the Arabic script proper is to an extent a history of Quranic orthography, i.e. the way the Quran is written out in the Arabic script. I have tried to steer clear of any historical and hagiographical controversies and have presented only the bare minimum of information needed to get a clear grasp of the journey of the Arabic script from its embryonic stage to maturity. I hope this will serve as an introduction to my next post, which will be on the different styles of Arabic calligraphy.

Super Short History of Arabic's Ancestors
Arabic is a Semitic language and all Semitic scripts (not languages) are based on the Proto-Canaanite script. It is agreed that Proto-Canaanite is a child of Egyptian Hieroglyphs with a hint of Akkadian DNA (completely different from Hieroglyphs). The pictographic Egyptian Hieroglyphs became Proto-Canaanite acrophones, symbols which represent only the first sound of the word depicted by the same symbol in Egyptian Hieroglyphs rather than the whole word itself, e.g. the hut Hieroglyph depicted the whole word "beyt" (house) but stood for only the "b" sound of "beyt" in the acrophonic Proto-Canaanite script.

In the Semitic script family tree, over time, the Proto-Canaanite script mothered the Phoenician script, which begot the Aramaic script, which birthed the Nabataean script, which bore the Arabic script. It is very important to note that the Nabataean script used only 22 consonants and early Arabic had to make do with these 22 symbols for its own repertoire of 28 consonants. This 22 vs 28 difference will be significant later. Another feature of almost all of these Semitic scripts was that they did not depict vowels (long or short), something of a family trait because the Egyptian Hieroglyphs omit vowels too; this also should be kept in mind for later on.
This would also be a good place to mention that the Greek, Roman and many Indian scripts (and their spawn) are also derivatives of the Proto-Canaanite script. So Arabic and Devanagari are distant cousins and most of the things mentioned so far apply to the Indic/Greek/Roman scripts as well.


Earliest Arabic Script
Now let's come to the Arabic script itself. As of now, the first definitive example of what can be called Arabic script is a rock engraving, an epitaph of a certain Mrs Raqush, found in Mada'in Saleh (Saudi Arabia) and dated to about 267AD. Some scholars believe this script to be something in between Nabataean and Arabic, while others unequivocally classify it as Arabic script. This inscription does have some words in the Thamudic script as well. However, do not assume that this is the oldest inscription in the Arabic language. Many much older Arabic language samples have been found, albeit written or engraved in non-Arabic scripts, such as pre-Islamic Arabic poetry written in the Nabataean, Aramaic, Thamudic and Epigraphic South Arabian scripts.

The famous Raqush inscription to the left and modern Arabic copy/interpretation to the right. Can you make out any of the words in the original? Without the aid of the modern copy I can barely recognize the odd ع here and the odd ل there and one ح and a في somewhere, and that's all conjecture too. The second word in the second line is Raqoosh. Of course I have forgotten most of my Arabic anyway. But this must have been perfectly legible to the people it was meant for.

For a partial list of Nabataean and the very early Arabic script samples check out

Revelation Era
Jumping forward three and a half centuries after the demise of Mrs Raqush, we come to the era of the Quranic revelations (610-632AD). By this time the Arabic script had been modified a lot more from the semi-Nabataean form and had come much closer to its final form. The initial Meccan utterances in the Quraishi dialect of Arabic by Prophet Muhammad were short and were quickly committed to memory by the small but fast expanding group of Muslims. However, by the time the Prophet emigrated to Medina the revealed verses had become much longer and the Muslim community much larger. Now secretaries started recording these longer Arabic verses on whatever medium was ready at hand at the moment of revelation, be it animal hide, parchment, rocks, leaves or bones. These written records were created purely as memory aids and not as written scripture. According to Kees Versteegh, this shift from a purely oral Meccan record of the divine words to a partially written Medinan record is attested in the Quran itself, through the shift from the word Quran (recite this), used for the sacred revelations in the earlier verses, to the word Kitaab (book), used for them in the later verses. However, the key thing to remember here is that despite the growing importance of a written record, for the Quranic verses as well as for pre- and post-Islamic Arabic poetry these written records were still secondary to the primary method of preserving something, which was to memorize it (except in the case of commercial transactions and war treaties). This tradition of oral recitation and transmission is quite well entrenched in most Semitic cultures and religions, so much so that to this day those who commit the Quran to memory are bestowed the title of "Haafiz", which means the preserver/protector, one who preserves the sacred text in his/her heart.
Hence, even until a few years after the death of Prophet Muhammad (632AD), the written records were considered secondary, as there were thousands of "reciters" who knew the Quran by heart and had learnt it from the Prophet's own mouth.

Supposed letter dictated by Prophet Muhammad to a scribe and then dispatched to the Byzantine Emperor Heraclius, circa 630AD. Regardless of the authenticity, if this text is indeed from around 630-650AD it clearly shows the strong departure of the Arabic script from its Nabataean sandbox-days as depicted in the Raqush engraving. I can easily make out quite a few letters (and words!) of this text.

Although the scripts used in the 5th-8th Centuries were very different from the one given below, this table gives an idea of what the shapes of the different Arabic letters were at this time, which sounds they represented and how common shapes were used for very different sounds. One can see that there are no dots, diacritical marks, above any of the letters.

As noted in the table given above:
The sounds b/t/th were represented by the same symbol.
The sounds j/H/kh were represented by the same symbol.
The sounds d/dh were represented by the same symbol.
The sounds s/sh were represented by the same symbol.
The sounds ṣ/ḍ were represented by the same symbol.
The sounds ṭ/ẓ were represented by the same symbol.
The sounds r/z were represented by the same symbol.
The sounds `/gh were represented by the same symbol.
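
The shared shapes listed above can be sketched in a few lines of Python. This is only an illustration: the grouping comes from the list, while the function name skeleton and the three sample words (جبل mountain, حبل rope, خبل dementia) are my own assumptions and are not taken from any manuscript. Each letter is replaced by the first member of its group, so words that differ only in their dots collapse to one and the same undotted form.

```python
# Letters that shared one undotted shape, following the list above.
SHARED_SHAPES = [
    ("ب", "ت", "ث"),  # b / t / th
    ("ج", "ح", "خ"),  # j / H / kh
    ("د", "ذ"),       # d / dh
    ("س", "ش"),       # s / sh
    ("ص", "ض"),       # emphatic s / emphatic d
    ("ط", "ظ"),       # emphatic t / emphatic z
    ("ر", "ز"),       # r / z
    ("ع", "غ"),       # ` / gh
]

# Map every letter to the first letter of its group (an arbitrary stand-in
# for the common undotted shape).
REPRESENTATIVE = {letter: group[0] for group in SHARED_SHAPES for letter in group}


def skeleton(word: str) -> str:
    """Return the word as it would look if the dots were ignored."""
    return "".join(REPRESENTATIVE.get(ch, ch) for ch in word)


# Hypothetical sample words that differ only in their first letter's dots.
samples = {"جبل": "mountain", "حبل": "rope", "خبل": "dementia"}
for word, meaning in samples.items():
    print(meaning, "->", skeleton(word))

# All three collapse to a single skeleton.
print(len({skeleton(word) for word in samples}))
```

Without dots a reader has to pick between the candidates from context alone, which is exactly the difficulty that the diacritical points were later used to remove.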

Early Caliphate Era - The Rashidun
In less than 15 years after the death of the Prophet, certain developments compelled his successors, the Caliphs, to make changes in the written Quran and the Arabic script. First, many of the reciters died in battles against the apostates, the Romans and the Persians. A famous, oft-quoted example is that of the half a thousand reciters who died at the Battle of Yamama in 632AD; an event which so perturbed the pious Umar that he convinced the first Caliph, Abu Bakr, to overcome "the loss of much of the Quran" by having it compiled into a book. Second, the increasing number of non-Arab converts to Islam, who were new to the Arabic language and sounds, often recited the Quranic verses incorrectly. Finally, many of the Muslims started to disagree with each other on the pronunciation and meaning of some words, as the Prophet had clearly declared that there were seven different, perfectly equal readings of the Quran, based upon the different urban and Bedouin dialects of Arabic in his time. When Uthman became the third Caliph (644-656AD) he decided to bring an end to the worry of "forgetting the Quran" and also to the conflicts caused by the variant readings by undertaking a codification of the Quran. He collected all the written sheets of the Quran from the Prophet's time, had them collated into one definitive edition and then returned the sheets to Hafsa, the widow of the Prophet from whom he had taken them in the first place. For some reason there are no extant samples of the original written records of the Quranic revelations made by the secretaries of Prophet Muhammad. This "final" version was sent to every province of the geometrically expanding Islamic empire as the authorized Quran and all non-compliant written variants were destroyed by state officials. Some variants were concealed but ultimately lost to the hands of man or of time.

Umayyad and Very Early Abbasid Caliphate Eras
However, two characteristics which the Arabic script had inherited from Nabataean, and which had not caused any problems before, soon returned to haunt the Arabic script's efficacy in a vast and diverse empire. Quranic orthography still employed 22 symbols to depict its 28 consonant sounds and it still did not depict vowels in writing. This undermined the codification and unification which the Caliph Uthman had hoped to achieve with his authoritative final version of the Quran.

1. The first characteristic, using 22 symbols to depict 28 sounds, caused a problem in identifying the correct letters. The example below illustrates the problem. Without diacritical points to identify which of the phonemes (sounds) is being referred to, only reference to context or external guidance can help shed some light on the word which the author of the text intended.

The problems caused by misreading of Bs for Ts and Rs for Zs and so forth had reached an inflection point and something had to be done to correct the situation. Some accounts would have us believe that under the aegis of the Umayyad governor Hajjaj ibn Yusuf (d.714AD), the diacritical points were innovated and adopted for use in the Quranic texts in order to remove the ambiguity in reading. However, the actual historical evidence shows that these accounts are largely apocryphal. As of now, the oldest usage of such diacritical points has been found on a papyrus called Perf No. 558, a bilingual (Greek and Arabic) advance tax receipt which dates itself to 643AD, a decade and a half before Hajjaj ibn Yusuf was even born. The Arabic text in this tax receipt has some letters dotted and others undotted, and the dots appear to have been used in a very matter-of-fact way. Although Perf No. 558 has not been studied extensively, it is clear that by about 20 years after the Hejira of the Prophet, if not earlier, non-religious Arabic texts occasionally employed diacritical points to eliminate faulty reading of the text.

 Perf No 558, the oldest Arabic text which clearly shows the use of diacritical points, dated to 643AD. Source


Based on the evidence of Perf No. 558, it can be stated that the Arabic script did have diacritical points available as a tool for the proper understanding of a text. The Arabic letters with the diacritical points to differentiate them from each other would have looked almost exactly like the ones used today, as shown in the table given below. The dots help the reader, as shown in the mountain/dementia/rope example above, to read the intended word. However, mere availability is not the same as active usage and we know for a fact that the Arabic Qurans did not employ the diacritical points, perhaps largely to avoid any inadvertent desecration of the base text. The arrangement in the table below was made by Arabic grammarians on the basis of similarity in the shapes of the letters. More on arrangements later.

 The final Arabic alphabet. Compare with the first table above which gives the same number of sounds but with fewer letters

2. The second characteristic of the Arabic script, not marking vowels, also caused confusion, especially between verb forms, which often have the same shape and letters but different short vowels, and sometimes between plurals and verb forms. The example below illustrates the latter confusion.

This vowel problem was initially overcome by the pioneer grammarian Abul Aswad Ad-Duali (d.688AD) at the behest of the Umayyad Caliph Abdul Malik (d.705AD), who was also instrumental in switching the administrative language of the entire Arabian empire from a patchwork of Greek, Aramaic and Pahlavi over to Arabic after he caught a Greek scribe urinating into the ink well used to write out the official records, for lack of water to prepare the ink. The solution proposed by the grammarian Abul Aswad seems to have been partially inspired by similar solutions in other Semitic script traditions: place dots around each letter to indicate the short vowel sound for that letter. Abul Aswad is also credited with inventing the symbols for the Hamza, the Khafeef and the Shadda. Before this, the Khafeef (absence of any vowel) and the Shadda (doubling of a consonant) were not depicted at all, hence they too had to be inferred from the context of the base text. This system of Abul Aswad was further refined by the 8th Century grammarian and author of the first Arabic dictionary, Al Khalil ibn Ahmed Faraaheedi (d.791AD), who replaced the dots with smaller versions of the corresponding long vowel letters. This has been illustrated in the table below.

Vowel Name and Sound
Abul Aswad Ad-Duali's (d.688AD) Vowel Markers
Al Khalil ibn Ahmed's (d. 791AD) Vowel Markers
Fatha - Short "a" - "Ma"
Dhamma - Short "u" - "Mu"
Kasra - Short "i" - "Mi"
Tanwin - Short Nunation - "Dan, Dun, Din"
 ڍ , د.. , ڌ
 دٍ , دٌ,داً
Shadda - doubling of a consonant sound
Symbol unknown to me
Khafeef aka Sukoon - absence of any vowel sound
Symbol unknown to me
ْ , ۡ
Hamza - glottal stop
ء , ؤ , ئ , أ , إ

Middle Abbasid Caliphate Era onwards
These two changes were not accepted immediately by the religious members of the Muslim community, largely out of fear of introducing innovations into the received text of the Quran. Similar fears also dissuaded the Jews from accepting any diacritical points or matres lectionis to identify letters or vowels over the base text of the standardized Hebrew Bible until well over a thousand years after the composition of the last book of the Hebrew Bible. It took close to 250-300 years after the revelation of the Quran for the vowel markers and diacritical points to become a commonplace feature in Qurans. Interestingly, the Jewish initiative, called the Masoretic text of the Hebrew Bible, and the Muslim initiative for making these changes in the Quran were finalised within 50-75 years of each other, making them almost contemporaneous events. Further, both of these sacred texts with the diacritical marks and vowel points are now the standard texts for their respective religions (though only unmarked, base text Hebrew Bibles are used for liturgy).

 Very early Kufic Arabic Quranic calligraphy from Yemen.
Shows only the base text.
No vowel markers - No diacritical points to distinguish between phonemes

 Kufic Arabic calligraphy, Surah Hujurat, 9th Century. Text on the obverse side is visible due to the inadequate thickness of the parchment.
Shows base text. Coloured vowel markers added later in Abul Aswad's style over base text in an effort to standardize the sounds.
No diacritical points to distinguish between phonemes.


 Later Kufic Arabic Quranic calligraphy, perhaps 10th Century.
Shows base text, vowel markers in Abul Aswad's style and diacritical points to distinguish between phonemes.

 Naskh Arabic Quranic calligraphy, Surah Fatiha, representative of all Qurans post-11th Century.
Shows base text, vowel markers in the newer Al Khalil style and diacritical points all made as part of the writing at the same time.

However, by the turn of the 10th-11th Century Qurans employed both diacritical points and vowel markers, and from then on they have been mandatorily written with both. By this time non-Quranic Arabic texts also used the diacritical points as standard usage, though they did not use vowel markers. In non-sacred and non-school texts, the ancient Semitic habit of not marking vowels has managed to keep its hold till today. As a result only the diacritical points are marked in the majority of Arabic texts and the short vowels are left unmarked, to be guessed by reference to context. Even religious commentaries on the Quran and Hadith do not carry the short vowel markers. All other scripts based on Arabic, such as Persian/Urdu/Turkic, have also continued with this same tradition, though in certain rare cases vowels are marked to clear up ambiguity. The following examples will make the partial usage of vowel marks clearer.

A textbook for teaching Arabic from the 1950s, which employs vowel markers in every word to remove ambiguity.

 Modern printed version of the first page of the celebrated Introduction or "Muqaddimah" of Ibn Khaldun's 14th century Arabic treatise on history, politics, economics and sociology. Again, barely discernible use of vowel markers.

 Modern printed version of the first page of the Persian translation of Mevlana Rumi's "Fi Hi Ma Fi Hi" (In It Is What Is In It). Notice that although most of the sentences do not have vowel marks, some sentences do. These sentences are verses from the Quran, which must always be written with vowel marks.

First page of the famous turn of 20th Century Urdu novel, Umrao Jaan Ada, bereft of vowel markers but for the short vowel u in the name Umrao
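
The effect of leaving short vowels unmarked can be sketched in Python. This is a rough illustration under my own assumptions: the two sample words, kataba (he wrote) and kutub (books), are a textbook pair chosen by me, and the helper name strip_marks is hypothetical. The sketch relies on the fact that Unicode classifies the Arabic short vowel signs as nonspacing marks (category "Mn"), so dropping that category leaves only the base letters.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop all nonspacing marks (vowel signs, shadda, sukoon) from text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# kaf + fatha, ta + fatha, ba + fatha: "kataba", he wrote
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"
# kaf + damma, ta + damma, ba: "kutub", books
kutub = "\u0643\u064F\u062A\u064F\u0628"

print(kataba, "->", strip_marks(kataba))
print(kutub, "->", strip_marks(kutub))

# Once the marks are gone the verb and the plural noun are written identically.
print(strip_marks(kataba) == strip_marks(kutub))
```

A fully vowelled text such as a Quran or a school primer keeps the marks, while everyday texts look like the stripped output and leave the choice between the two readings to context.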

The Arabic Alphabet

The modern standard Arabic alphabet arranged according to similarity in the shapes of the letters

 Note that many letters have slightly different stand-alone, initial, medial and final forms. This feature is common to many Semitic scripts and seems to be an ancient feature of these scripts.

Gematrical Values and Other Arrangement of the Alphabet
Along with the shapes and sounds of the Arabic letters, the numerical values of these letters are also fundamental. Many Semitic and non-Semitic alphabets assign numerical values to their letters. In earlier times, the letters were often used as numbers based on their numerical values in lieu of any special number symbols, until Hindu numerals (as they were to the Arabs) were adopted by the Abbasid Caliphs in the early 9th Century and then later adopted by most of the West as Arab numerals. Hence the letter أ was used for the value 1, the letter ب for the value 2, ج for 3, د for 4 and so on; these first four letters, A-B-J-D, which correspond to the values 1-2-3-4, were together called the abjad and gave rise to the term abjad for the Arabic alphabet. Other Semitic scripts also follow this same numerical system. The numerical values of letters are used for various purposes such as religious symbolism, magic and divination, astrology and the occult, and even for seeking divine patterns. The table below shows the numerical values and arrangement of the letters.

 The arrangement of Arabic letters into numbers is called Taarikh تاريخ or Chronogram and the most famous chronogram in Arabic is undoubtedly the number 786, which is derived as follows:
بسم الله - bism illah - 2+60+40   +   1+30+30+5 = 168
الرّحمن - a(l) rrahman - 1+30+200+8+40+50 = 329
الرّحيم - a(l) rraheem - 1+30+200+8+10+40 = 289

168+329+289 = 786

The title of one of my favourite books, Bagh - o - Bahar is also a chronogram which gives the value 1217AH corresponding to 1802AD, the year in which the book was written. Chronograms have been used for thousands of years and can sneak up on you quite suddenly, which is why they are so much fun!
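
The two chronograms above can be checked with a short Python sketch. The letter order and values follow the abjad arrangement described earlier; the function name abjad_value is hypothetical, and the spelling used for Bagh-o-Bahar (with the Arabic letter ه) is my assumption about how the title is counted. Spaces and any character outside the 28 letters are simply counted as zero.

```python
# The 28 letters in abjad (numerical) order.
ABJAD_LETTERS = [
    "ا", "ب", "ج", "د", "ه", "و", "ز", "ح", "ط",   # 1 to 9
    "ي", "ك", "ل", "م", "ن", "س", "ع", "ف", "ص",   # 10 to 90
    "ق", "ر", "ش", "ت", "ث", "خ", "ذ", "ض", "ظ",   # 100 to 900
    "غ",                                           # 1000
]

# Units, tens, hundreds, then one thousand.
ABJAD_NUMBERS = (
    list(range(1, 10))
    + list(range(10, 100, 10))
    + list(range(100, 1000, 100))
    + [1000]
)

VALUE = dict(zip(ABJAD_LETTERS, ABJAD_NUMBERS))


def abjad_value(text: str) -> int:
    """Sum the abjad values of the letters in text, ignoring anything else."""
    return sum(VALUE.get(ch, 0) for ch in text)


print(abjad_value("بسم الله"))                 # 168
print(abjad_value("الرحمن"))                   # 329
print(abjad_value("الرحيم"))                   # 289
print(abjad_value("بسم الله الرحمن الرحيم"))   # 786
print(abjad_value("باغ و بهار"))               # 1217
```

The phrases are written here without the shadda, since the doubling mark carries no numerical value of its own in this sketch.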

We have already noted two arrangements of the Arabic alphabet: the one arranged according to similarity in the shapes of the letters and the other based on the numerical values of the letters. A third arrangement of the Arabic alphabet was created by the grammarian Al Khalil ibn Ahmed Faraaheedi (d.791AD), whom we have already encountered as the one who perfected the vowel marker system. Al Khalil Faraaheedi wrote the first dictionary of Arabic, Kitaab al Ayn, in which he arranged the letters neither according to their shapes nor according to their values but according to where their sound originates in the mouth. His dictionary starts with Ayn ع as the first letter because it is voiced from the lowest point in the throat and, moving upwards and outwards in the mouth, ends with Meem م as the last letter because it is voiced from the tip of the lips. Because the first letter in this dictionary is Ayn, it is called Kitaab al Ayn.

These three arrangements of the Arabic alphabet are not exhaustive.

Persian, Urdu, Turkic, Malay and Allied Scripts
Persian, Urdu, Turkic and Malay use the Arabic script for their own sounds by mapping the Arabic letters onto similar sounds from their own languages. However, in the case of each of these languages the sound space is much larger than the one catered to by the 28 Arabic letters, and this has necessitated the innovation of new shapes from within the 28-letter repertoire in each of these languages.
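
As a small illustration of how new shapes are built from within the existing repertoire, here is a Python sketch for the four letters that Persian adds. The pairing of each new letter with a base letter reflects the visible shape it reuses; the variable names and the choice of Persian as the example are my own.

```python
import unicodedata

# The 28 letters of the Arabic alphabet.
ARABIC_28 = set("ابتثجحخدذرزسشصضطظعغفقكلمنهوي")

# Persian additions, each paired with the Arabic letter whose shape it reuses.
PERSIAN_ADDITIONS = {
    "پ": "ب",  # p, built on the shape of b
    "چ": "ج",  # ch, built on the shape of j
    "ژ": "ز",  # zh, built on the shape of z
    "گ": "ك",  # g, built on the shape of k
}

for new_letter, base_letter in PERSIAN_ADDITIONS.items():
    print(
        new_letter,
        unicodedata.name(new_letter),
        "from",
        base_letter,
        unicodedata.name(base_letter),
    )

# None of the additions belongs to the original 28, but every base letter does.
print(all(new not in ARABIC_28 for new in PERSIAN_ADDITIONS))
print(all(base in ARABIC_28 for base in PERSIAN_ADDITIONS.values()))
```

Urdu, Sindhi, Ottoman Turkish and Jawi extend the repertoire in the same spirit, usually by adding or rearranging dots and small marks on an existing shape.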

Additional letters can be spotted in the alphabet tables given below:
 Persian / Dari alphabet

Urdu alphabet

Ottoman Turkish alphabet

Sindhi alphabet

Jawi / Malay alphabet

 I hope this post has helped you to understand the basics of the development of the Arabic script and will point you in the right direction.

 To Know More About:

Arabic script and learning to write Arabic
Wikipedia's page on the history of the Arabic script
Guidedways has a decent introductory course to learn the script
Youtube video tutorial

Numerology/Gematrical values
One of my favourite Urdu luminaries, Prof Frances Pritchett, talks about Chronograms
Wikipedia on Arabic gematrical values
Wikipedia on Hebrew gematrical values

Writing Systems, Egyptian Hieroglyphs, Proto-Canaanite
Writing Systems, A Linguistic Approach by Henry Rogers
Writing Systems of the World by Florian Coulmas

The phenomenal work by Kees Versteegh on the development of the Arabic language
Alan Jones talks about the significance of papyrus Perf No 558
Wikipedia on the history of the Quran, including its compilation

