Construction of Alphabetic Character Recognition Systems: A Review

Authors

  • Hamsa D. Majeed, Department of Information Technology, College of Science and Technology, University of Human Development, Kurdistan Region, Iraq
  • Goran Saman Nariman, Department of Information Technology, College of Science and Technology, University of Human Development, Kurdistan Region, Iraq

DOI:

https://doi.org/10.21928/uhdjst.v7n1y2023.pp32-42

Keywords:

Optical Character Recognition, Script Identification, Document Analysis, Character Recognition, Multi-Script Documents

Abstract

Character recognition (CR) systems have attracted the interest of a large number of authors, and a great deal of research has been proposed, developed, and published using different algorithms and techniques, driven by the demand for higher recognition accuracy and more reliable systems. This work provides a guideline for CR system construction that gives authors a clear view of how to build their systems. All the required phases and steps are listed and clarified in sections and subsections, with detailed graphs and tables, alongside the techniques and algorithms that might be used, developed, or merged to create a high-performance recognition system. The guideline could also be useful to readers interested in this field, helping them extract information from such papers easily and efficiently and identify the main structure of each system as well as the differences between systems. In addition, this work recommends that researchers in this field include a specified categorical table in their papers that shows the structural layout of the proposed system, so that readers can easily find the information they are interested in.


Published

2023-02-18

How to Cite

Majeed, H. D., & Nariman, G. S. (2023). Construction of Alphabetic Character Recognition Systems: A Review. UHD Journal of Science and Technology, 7(1), 32–42. https://doi.org/10.21928/uhdjst.v7n1y2023.pp32-42
