Kurdish Sorani Dialect Morphology Generation Using a Concatenative Strategy

Authors

  • Kardo O. Aziz, Department of Computer, College of Science, Charmo University, Iraq
  • Ramyar A. Teimoor, Department of Computer, College of Science, University of Sulaimani, Sulaimanyah, Kurdistan Region, Iraq, https://orcid.org/0000-0002-7016-1833
  • Tofiq A. Tofiq, Department of Computer, College of Science, University of Sulaimani, Sulaimanyah, Kurdistan Region, Iraq
  • Dilman S. Abdulla, Department of Computer, College of Science, University of Halabja, Halabja, Kurdistan Region, Iraq

DOI:

https://doi.org/10.21928/uhdjst.v8n1y2024.pp13-19

Keywords:

Low-resource-language, Kurdish-Sorani, NLP, Word-Generation, Morphology

Abstract

In natural language processing, morphological generation is the production of correctly inflected word forms from a predetermined set of morphological rules. Generating morphology is difficult, however, in languages with intricate morphological systems, such as the Sorani dialect of Kurdish. This research proposes a novel technique for morphological generation in Kurdish Sorani based on concatenative morphology. The proposed approach aims to overcome the drawbacks of current approaches and to improve the precision and effectiveness of morphological generation in Kurdish Sorani. It generates forms for all possible subject and object pronouns, in both positive and negative forms, across the various verb tenses of Kurdish. The study presents a detailed examination of Kurdish Sorani morphology and points out the difficulties in generating the correct verb forms. The authors propose a concatenative morphological generation system that comprises a morphological analyzer and a morphological generator.
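
The concatenative idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' system: the function name `generate_present`, the morpheme tables, and the Latin-transliterated prefixes and suffixes are simplified assumptions chosen for the example, and they cover only present-tense subject marking in positive and negative forms.

```python
from itertools import product

# Hypothetical, simplified morpheme tables in Latin transliteration.
# They are assumptions for illustration, not the paper's rule set.
POLARITY_PREFIX = {"positive": "de", "negative": "na"}
SUBJECT_SUFFIX = {
    "1sg": "im",
    "2sg": "ît",
    "3sg": "êt",
    "1pl": "în",
    "2pl": "in",
    "3pl": "in",
}


def generate_present(stem):
    """Build prefix + stem + suffix for every polarity/person pair."""
    forms = {}
    for (polarity, prefix), (person, suffix) in product(
        POLARITY_PREFIX.items(), SUBJECT_SUFFIX.items()
    ):
        # Concatenation is the whole generation step in this sketch.
        forms[(polarity, person)] = prefix + stem + suffix
    return forms


if __name__ == "__main__":
    # "ç" is used here as an assumed present stem meaning "go".
    for (polarity, person), form in generate_present("ç").items():
        print(polarity, person, form)
```

A full generator would add further slots (object markers, tense and aspect morphemes) and rules for how adjacent morphemes interact, none of which is modeled here.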

Published

2024-01-10

How to Cite

Aziz, K. O., Abdulrahman Teimoor, R., Ahmed Tofiq, T., & Salih Abdulla, D. (2024). Kurdish Sorani Dialect Morphology Generation Using a Concatenative Strategy. UHD Journal of Science and Technology, 8(1), 13–19. https://doi.org/10.21928/uhdjst.v8n1y2024.pp13-19

Section

Articles