| Electronics | |
| Recognition of Cursive Pashto Optical Digits and Characters with Trio Deep Learning Neural Network Models | |
| Mohammad Arshad [1], Abdullah Khan [1], Nazri Mohd. Nawi [2], Muhammad Zubair Rehman [2] | |
| [1] Faculty of Management and Computer Sciences, Institute of Computer Sciences and Information Technology, The University of Agriculture, Peshawar 25120, Pakistan; [2] Soft Computing and Data Mining Centre (SMC), Faculty of Computer Science & Information Technology, University Tun Hussein Onn Malaysia (UTHM), Batu Pahat 86400, Malaysia | |
| Keywords: OCR; Pashto digits; convolutional neural networks; Deep CNN; LeNet | |
| DOI : 10.3390/electronics10202508 | |
| Source: DOAJ | |
【 Abstract 】
Pashto is one of the world's most ancient languages and is spoken in Pakistan and Afghanistan. OCR applications exist for languages such as Urdu, English, Chinese, and Japanese, but very little work has been conducted on Pashto in this respect. Handwritten characters and digits are especially difficult for OCR applications to recognize, because handwriting is influenced by the writer’s hand dynamics. Before this study, no publicly available dataset of handwritten Pashto digits existed, and consequently no work had addressed the combined recognition of Pashto handwritten digits and characters. To fill this gap, a dataset of 60,000 images of Pashto handwritten digits was created. A trio of deep learning convolutional neural network models (CNN, LeNet, and Deep CNN) was trained and tested on both the Pashto handwritten character and digit datasets. In the simulations, the Deep CNN achieved 99.42 percent accuracy for Pashto handwritten digits, 99.17 percent for handwritten characters, and 70.65 percent for combined digits and characters. The LeNet and CNN models achieved slightly lower accuracies on the same three tasks, in the same order: 98.82, 99.15, and 69.82 percent for LeNet, and 98.30, 98.74, and 66.53 percent for CNN. Based on these results, the Deep CNN is the best of the three models in terms of accuracy and loss.
【 License 】
Unknown