PESHAWAR: Hanifur Rahman, a noted research scholar, is collecting datasets to make Pashto a language of artificial intelligence, as the software requires more than 100 hours of recorded clips by the end of the current year.

He said that the Pashto Decentralised Autonomous Organisation was set up to build an autonomous, transparent and inclusive community for the adoption of Pashto as a perfect tool for computers and AI.

He said that the motivation behind the project was that around 60 million people across the globe speak Pashto, a language of rich linguistic and cultural diversity, yet little Pashto digital content is available online.

Prof Farkhanda Liaquat, the director of the Pashto Academy, University of Peshawar, termed the project a revolutionary step towards digitisation of the Pashto language corpus. She stated that the project would open a new window onto integrated knowledge and research regarding Pashto and Pashtuns at large.

Academy director terms the project a revolutionary step

Mr Rahman, when contacted by this scribe, said that unfortunately Pashto was not among the 100 languages being used as an effective tool to access information, but the initiative would soon establish Pashto as a language of computers and AI. He said that efforts were being made to build a Pashto digital assistant like Apple's Siri or Hey Google, as the app would enable the Pashto-speaking community worldwide to communicate with their mobile phones and computers via voice commands as well as written text.

He said that the main objective of the initiative was to enable Pashto-speaking people across the globe to have easy access to any kind of information in Pashto.

Mr Rahman stated that the purpose of the organisation was to create a worldwide community of Pashto linguists, journalists, teachers, translators, performers, artists, literati and hi-tech specialists to contribute to the scholastic cause.

“For the software to work for all Pashtuns, including the young, women and men, speaking as many accents of Pashto as possible, we need sample voices to bring Pashto to one of the most significant AI dataset projects. Pashtun users may go to the link https://commonvoice.mozilla.org/ps and record at least 10 Pashto sentences in her/his own voice each day for a week or as long as one can,” he said.

Mr Rahman said that a Pashto Automatic Speech Recognition System had already been developed by his team. However, he regretted that, due to the lack of a usable voice dataset, it made a few errors and could only be improved with more data. “Once an open-source ChatGPT-like model named ‘Aya’ is released, one could talk/chat in Pashto with that large AI model,” he explained.

He said that he wanted the translation of the Common Voice portal into Pashto to be completed, in order to raise Pashto from a low-resource to a web-rich language and to devise a unified approach towards a Pashto language corpus.

Published in Dawn, May 20th, 2024
