A Review on Devanagari OCR for Handwritten Text
DOI:
https://doi.org/10.13052/jgeu0975-1416.1419
Keywords:
Optical character recognition (OCR), Devanagari script, text detection, text segmentation, script identification, handwritten text recognition
Abstract
With the rapid growth of document digitization and multilingual content, robust Optical Character Recognition (OCR) systems have become increasingly important. While substantial progress has been achieved for Latin scripts, accurate text detection and recognition in complex Indic scripts, particularly Devanagari, remain challenging due to script-specific structural characteristics, handwriting variations, and limited benchmark datasets. This review paper presents a comprehensive and structured analysis of existing research on text detection, script identification, and handwritten numeral, character, and word recognition for the Devanagari script. The paper systematically categorizes and compares conventional image processing methods, machine learning techniques, and modern deep learning-based approaches, highlighting their strengths and limitations. In addition, key challenges related to segmentation, multilingual scenarios, degraded documents, and resource constraints are critically discussed. By identifying open research gaps and outlining potential future research directions, this work aims to serve as a valuable reference and roadmap for researchers and practitioners working on Devanagari OCR and multilingual document analysis.


