Offline Writer Recognition for Kurdish Handwritten Text Document Based on Proposed Codebook

Authors

  • Twana Latif Mohammed, Department of Information Technology, Technical College of Informatics, Sulaimani Polytechnic University, Sulaymaniyah, Kurdistan Region, Iraq
  • Ahmed Abdullah Ahmed, Department of Software Engineering, Faculty of Engineering and Computer Science, Qaiwan International University (QIU)/Raparin, Sulaymaniyah, Kurdistan Region, Iraq

DOI:

https://doi.org/10.21928/uhdjst.v5n2y2021.pp21-27

Keywords:

Writer Identification, Feature Extraction, Text Independent, Codebooks, Feature Combination

Abstract

Handwritten text recognition remains an active research topic in document analysis and recognition, with applications in handwriting forensics, paleography, document examination, and handwriting recognition. This study presents an automatic method of writer recognition that uses digitized images of unconstrained texts. Although prior literature has proposed various methods for the same purpose, their performance, particularly their accuracy, has not been promising, leaving considerable room for improvement. The method uses codebook-based writer characterization, in which each writing sample is represented by a group of features computed from a primary and a secondary codebook. Each writing is then represented by the probability of occurrence of the codebook patterns, and the resulting probability distribution characterizes the writer. Writer identification compares two writings by computing the distance between their respective probability distributions. Experiments evaluated the identification rate of the implemented method on two standard datasets, KRDOH and IAM; the former is the most recent and largest Kurdish handwritten dataset, with 1076 writers, and the latter contains 650 writers. The results were promising, with an identification rate of 94.3%, and the proposed method outperformed state-of-the-art methods by 2–3%.
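
The pipeline outlined in the abstract (assign features to codebook patterns, estimate the probability of occurrence of each pattern, then compare two writings by the distance between their distributions) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the feature type, the codebook construction, or the distance measure, so the nearest-pattern assignment by Euclidean distance, the chi-square distance, and all function names below are assumptions made purely for illustration. A single codebook is shown; the same steps could be repeated for a primary and a secondary codebook.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def nearest_codeword(feature: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook pattern closest to the feature.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(feature, codebook[i])),
    )


def occurrence_distribution(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> List[float]:
    """Estimate the probability of occurrence of each codebook pattern."""
    counts = [0] * len(codebook)
    for feature in features:
        counts[nearest_codeword(feature, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


def chi_square_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Chi-square distance between two distributions (an assumed choice)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def identify_writer(
    query: Sequence[float], references: Dict[str, List[float]]
) -> str:
    """Return the reference writer whose distribution is closest to the query."""
    return min(references, key=lambda w: chi_square_distance(query, references[w]))


# Toy data (hypothetical): 2-D features and a three-pattern codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
references = {
    "writer_a": occurrence_distribution(
        [(0.1, 0.0), (0.0, 0.2), (0.9, 1.0)], codebook
    ),
    "writer_b": occurrence_distribution(
        [(1.0, 0.9), (0.8, 1.1), (0.1, 0.9)], codebook
    ),
}
query = occurrence_distribution([(0.0, 0.1), (0.2, 0.0), (1.0, 1.0)], codebook)
best_match = identify_writer(query, references)
```

In this toy setup the query sample shares its pattern distribution with `writer_a`, so the nearest-distribution rule selects that writer. With real data, the features would be extracted from the handwriting images and the codebook would be built beforehand.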


Published

2021-03-31
