Zer, zabar, pesh: The case for teaching diacritics to improve Urdu reading
It is 2pm, and a seven-year-old boy sits in an after-school Urdu remedial class, his fingers tracing the unfamiliar curves of the words before him. The sentence on the page seems simple enough: “Yeh Bano aur Mani kay nana hain.” He takes a breath and tries:
“Yeh Banoo aur Mani…”
The teacher gently corrects him. “Yeh Bano aur Mani…”
He pauses, puzzled. Why did the Urdu letter wow in Bano sound like the “o” in go? He gathers his confidence and moves to the next line:
“Nana bazaar say aro laye hain.”
This time, he applies the same rule, reading aroo as aro, just like Bano. The teacher shakes her head. “It’s aroo.”
Confusion deepens in his eyes. Why did the same letter wow sound like an “o” in Bano but an “oo” (as in moon) in aroo? His teacher hesitates, offering a response that countless Urdu learners have heard before:
“That’s just how it is.”
The invisible barrier in Urdu literacy
This scene unfolds in classrooms across Pakistan every day. The struggle isn’t about intelligence or effort — it’s about a missing piece in how the Urdu language is taught. The confusion faced by children, like the one in the anecdote above, often arises from the lack of explicit and systematic phonics instruction.
Two distinct vowel sounds — for example, the “o” in go and the “oo” in moon — may remain indistinguishable to a young learner without the right approach to reading instruction. From a research lens, this aligns with findings from the Science of Reading, a body of research that emphasises the importance of phonemic awareness — the ability to hear, identify, and manipulate the individual sounds (phonemes) within spoken words — and phonics as foundational for reading skills. Just as evidence-based methods are now informing and transforming early literacy in English, so too should they guide Urdu reading instruction.
While the core factors influencing reading skill development are generally similar across alphabetic languages, their impact can vary significantly depending on how closely the letters in a language correspond to their sounds. A language’s orthography is the set of rules and conventions for how it is written, including spelling, punctuation, capitalisation, and the use of symbols. In languages with transparent orthographies, like Finnish or Spanish, the connection between letters and sounds is direct and predictable.
Urdu, by contrast, presents a deeper orthographic challenge: letters and sounds do not consistently align, making decoding more cognitively demanding for early readers. Leading reading researchers such as Linnea Ehri and David Kilpatrick have noted that deep orthographies place an added burden on working memory and phonological processing, requiring instruction that goes beyond rote memorisation or guessing strategies.
Yet, Urdu instruction continues to overlook the foundational principles that make reading more accessible in complex writing systems.
What makes Urdu harder to read than it should be
Urdu is a complex and rich language, with 44 consonants, eight long oral vowels, seven long nasal vowels, three short vowels, and numerous diphthongs. While consonant sounds are usually taught through consistent sound-symbol cues and familiar examples — a practice that supports phonetic decoding — vowels are often left by the wayside, with far less systematic instruction.
In most Urdu curricula and published textbooks, only three long vowel sounds (the “aa” in baat, the “o” in dost, and the “ee” in cheez) are introduced in the early stages. Few teaching materials or classroom practices go on to introduce the remaining 15 or so vowel sounds, and where they do, it is often too late: by then, many students have already adapted to reading Urdu through contextual guessing rather than phonetic decoding, a habit that weakens their long-term literacy skills.
This makes the need for explicit phonics instruction in Urdu even more critical, particularly in teaching the vowel sounds marked by diacritics (signs indicating a difference in pronunciation). Urdu’s script includes diacritics, known as zabar ( َ), zer ( ِ), and pesh ( ُ), to represent short vowels. However, in most printed texts and early reading materials, these vowel markers are omitted. Without systematic instruction in recognising and applying diacritics, children are left to guess at the correct pronunciation.
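To make this loss of information concrete, here is a minimal Python sketch. It assumes the three short-vowel marks are encoded as the Unicode combining characters U+064E, U+0650, and U+064F. The two-letter skeleton and the helper name `strip_short_vowels` are hypothetical, chosen only for illustration.

```python
import unicodedata

# Assumed encoding: the short-vowel marks as Unicode combining characters.
SHORT_VOWELS = {
    "zabar": "\u064E",  # ARABIC FATHA
    "zer": "\u0650",    # ARABIC KASRA
    "pesh": "\u064F",   # ARABIC DAMMA
}

def strip_short_vowels(text: str) -> str:
    """Drop zabar, zer, and pesh, mimicking unvowelised print."""
    marks = set(SHORT_VOWELS.values())
    return "".join(ch for ch in text if ch not in marks)

# Hypothetical example: the skeleton beh + noon, with each mark on the first letter.
skeleton = "\u0628\u0646"
vowelised = {name: "\u0628" + mark + "\u0646" for name, mark in SHORT_VOWELS.items()}

for name, word in vowelised.items():
    print(name, word, [unicodedata.name(ch) for ch in word])

# All three vowelised spellings collapse to the same unvowelised string.
collapsed = {strip_short_vowels(word) for word in vowelised.values()}
print(collapsed == {skeleton})
```

Once the marks are removed, the three spellings are identical on the page, so the reader has to supply the vowel from context or memory.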
According to the Annual Status of Education Report (ASER), over 55 per cent of Grade 5 students in rural areas of Pakistan cannot read a Grade 2 level Urdu text. Furthermore, over 40 per cent of Grade 3 students cannot even identify individual letters. These figures point to a systemic issue in language teaching, rooted not in student ability but in instructional design.
Taking the earlier anecdote into consideration, the letter “و” in Urdu can represent multiple vowel and consonant sounds. In the word Bano (بانو), it sounds like the “o” in go. In aroo (آرو), it shifts to the “oo” in too. Yet when it appears at the beginning of a word like vehshat (وحشت), it takes on the consonant sound — the “w” in wall.
For young readers trying to apply letter-sound patterns, these inconsistencies can be frustratingly difficult to navigate. This example is just one among many. Without a strong phonics foundation, students are left to guess at and memorise words, hindering their confidence and fluency. A structured approach to phonics, tailored to Urdu’s unique characteristics, can empower children to decode with greater ease and understanding.
Comparison with Arabic and why Urdu needs its own approach
Interestingly, Arabic was originally written without diacritics to continue the oral traditions of interpreting words through context. However, as Islam spread across the Arabian Peninsula and the need for preserving the exact pronunciation of the Quran surfaced, the diacritics were brought into religious and formal texts.
Since Urdu did not have the same religious impetus, it continued omitting the diacritics for most general texts. Some educators argue that omitting diacritics encourages contextual reading or reflects natural adult reading habits. This assumption fails to account for the cognitive demands placed on early readers who lack sufficient phonemic mastery to rely on context. For beginning readers, especially in early years or in multilingual environments, these omissions become cognitive roadblocks rather than bridges.
Arabic, whose morphological structure is strongly based on root words, can afford to skip diacritics because readers can use morphology to guess the correct pronunciation. Arabic readers can often infer meaning from familiar roots even without vowels; Urdu readers, as linguistic experts have noted, do not have this advantage.
Urdu’s reliance on phonetic differences for meaning means that even small changes in sound can drastically alter a word’s interpretation. The words ملی (mili, met) and ملی (milli, national) are hard to tell apart without the diacritic tashdeed, which signals gemination of the consonant ‘l’ and so distinguishes the two meanings. Phonics instruction that highlights these differences is essential for building accuracy and comprehension.
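The mili and milli contrast can be sketched in a few lines of Python. This is an illustration under stated assumptions: the spellings are assumed to be meem, laam, and yeh (U+06CC), with the tashdeed encoded as U+0651, and `geminated_letters` is a hypothetical helper name.

```python
TASHDEED = "\u0651"  # ARABIC SHADDA, the gemination mark

# Assumed spellings: meem + laam + yeh (U+06CC), without and with the mark.
mili = "\u0645\u0644\u06CC"
milli = "\u0645\u0644" + TASHDEED + "\u06CC"

def geminated_letters(word: str) -> list[str]:
    """Return the character immediately before each tashdeed.

    Assumes no vowel mark sits between the letter and its tashdeed.
    """
    return [word[i - 1] for i, ch in enumerate(word) if ch == TASHDEED and i > 0]

print(geminated_letters(mili))
print(geminated_letters(milli))

# Without the tashdeed, the two words are the same string.
print(milli.replace(TASHDEED, "") == mili)
```

The only difference between the two strings is a single combining mark, which is exactly what unvowelised print leaves out.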
Beyond the absence of diacritic instruction, Urdu readers face additional challenges that compound reading difficulties. These challenges are rooted in the script itself, its positional letter forms, and the limitations of digital text rendering. Unlike Latin scripts, many Urdu letters change their shapes depending on their position within a word (beginning, middle, or end).
Some letters are also easy to confuse with one another: ب (be), ت (te), and ث (se) share a base shape and differ only in their dots. Without systematic exposure to these variations, children struggle to recognise words accurately. Additionally, Urdu’s traditional Nastaliq script is ornate and highly cursive. Script analysis points out that while this adds to its aesthetic appeal, it complicates decoding for young readers. Letters flow into one another, creating intricate, variable shapes. The lack of standardised digital fonts further exacerbates this challenge, with broken or incomplete text making learning even harder.
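The positional letter forms mentioned above can be inspected with Python’s standard `unicodedata` module. This is a rough sketch, not a description of how Urdu text is stored: modern text keeps only the base letter and leaves shaping to the font, while Unicode retains separate compatibility code points for the isolated, final, initial, and medial forms. The helper name `positional_forms` is hypothetical.

```python
import unicodedata

def positional_forms(letter: str) -> dict[str, str]:
    """Map each positional form name to its compatibility character.

    Scans the Arabic Presentation Forms-B block (U+FE70 to U+FEFF) for
    characters whose compatibility decomposition is the given base letter.
    """
    target = f"{ord(letter):04X}"
    forms = {}
    for code_point in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(code_point)).split()
        # Single-letter entries look like "<initial> 0628".
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(code_point)
    return forms

for base in ("\u0628", "\u062A", "\u062B"):  # beh, teh, theh
    print(unicodedata.name(base))
    for position, form in positional_forms(base).items():
        print("  ", position, form, f"U+{ord(form):04X}")
```

Each of the three letters has four distinct shapes, so a child who has learned one form per letter has seen only part of what appears in running text.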
It is widely recognised that many digital platforms offer poor Urdu text rendering. Unlike Arabic and Persian, which benefit from extensive digital font support, Urdu often faces incomplete ligatures and missing diacritics, making reading on screens even more difficult. Addressing these challenges through thoughtful script design, technological improvements, and classroom adaptations can complement phonics instruction. A comprehensive approach to Urdu literacy would ensure that young readers have the support they need to develop confidence and proficiency.
A way forward — teaching for transparency
Although the problems are many, solutions are within reach. A slight shift in teaching practices and a more supportive educational ecosystem can do much to improve the teaching of Urdu. Textbook revisions, child-friendly fonts, and engaging children’s literature for all age groups are essential.
Most urgent is the need to embed phonics and diacritic-based reading strategies into teacher preparation programs and national curricula. Pre-service and in-service teacher training modules must include structured literacy approaches tailored to Urdu’s orthography. Without teacher capacity, reforms at the textbook level may not translate into classroom practice.
Urdu’s classification as a deep orthography is primarily due to its frequent use of ‘unvowelised’ text. However, preliminary research in this area suggests that when vowel sounds are systematically taught through phonics, Urdu’s reading challenges can be significantly reduced. By using diacritics consistently during early instruction, Urdu effectively becomes a shallow orthography — one where the connection between letters and sounds is more transparent.
This shift makes Urdu more accessible not only for young learners but also for individuals with dyslexia, non-native speakers, and those learning Urdu as a second language. Just as phonics has proven effective in teaching English and other alphabetic languages, the systematic teaching of diacritics can bridge the gap in Urdu literacy.
It is 2pm again, but this time, the boy is eagerly flipping through his favourite Urdu storybook. The curves that once puzzled him now feel like old friends. With each sentence he reads, his confidence grows, because this time he knows why ‘بانو’ sounds like Bano and ‘آرو’ like aroo. This transformation is what robust phonics-based Urdu instruction can achieve.
Every child deserves that moment of joy in reading, and with thoughtful reforms in Urdu instruction, countless others will experience it too. For this to happen, collaboration is needed across textbook boards, teacher training institutes, software developers, and education ministries. A national phonics framework for Urdu — especially one that incorporates diacritics — can serve as a powerful equaliser in literacy instruction, particularly for underserved and multilingual learners.
Header image by Mahrukh Mansoor/ Dawn File