Hybrid Arabic-English Machine Translation to Solve Reordering and Ambiguity Problems


  • Khalid Shaker Alubaidi, Department of Computer Science, Computer College, University of Anbar, Iraq




Machine translation, Arabic-English machine translation, Hybrid Machine Translation


A central problem in Arabic-to-English rule-based machine translation is that the rule-based lexical analyzer leaves some ambiguity unresolved; a statistical approach is therefore used to resolve it. Rule-Based Machine Translation (RBMT) relies on linguistic rules relating the two languages, which are generally written by hand, whereas Statistical Machine Translation (SMT) relies on word occurrence statistics drawn from parallel corpora. In this paper, the two approaches are combined into an Arabic-English Hybrid Machine Translation (HMT) system that takes advantage of both kinds of information. First, the Arabic text is passed to the RBMT component, which solves the reordering problem. The RBMT output is then post-edited by the SMT component, which resolves the ambiguity and generates the final English translation. SMT can do this because, during training, it uses the RBMT output (English) as the source side and the real translation (English) as the target side. The results show that the translation quality of the HMT system is better than that of the SMT system.
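The two-stage pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the transliterated lexicon, the two reordering rules, the four training pairs, and the names `rbmt_translate`, `PostEditor` and `hybrid_translate` are all hypothetical. The statistical step is a toy word-substitution model conditioned on neighbouring words, standing in for a real SMT system trained on a parallel corpus of RBMT output and real translations.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: transliterated Arabic token -> (POS tag, default English gloss).
# The glosses "message" and "tall" are deliberately the wrong sense in some contexts.
LEXICON = {
    "kataba": ("V", ["wrote"]),
    "qaraa": ("V", ["read"]),
    "alwaladu": ("N", ["the", "boy"]),
    "albintu": ("N", ["the", "girl"]),
    "alrisalata": ("N", ["the", "message"]),
    "altawilata": ("ADJ", ["tall"]),
}


def rbmt_translate(arabic_tokens):
    """Toy rule-based stage: dictionary lookup plus two reordering rules."""
    tagged = [LEXICON[tok] for tok in arabic_tokens]
    # Rule 1: a sentence-initial verb moves after the subject noun (VSO -> SVO).
    if len(tagged) >= 2 and tagged[0][0] == "V" and tagged[1][0] == "N":
        tagged[0], tagged[1] = tagged[1], tagged[0]
    out, i = [], 0
    while i < len(tagged):
        pos, gloss = tagged[i]
        # Rule 2: an adjective that follows its noun moves in front of the head noun.
        if pos == "N" and i + 1 < len(tagged) and tagged[i + 1][0] == "ADJ":
            out += gloss[:-1] + tagged[i + 1][1] + gloss[-1:]
            i += 2
        else:
            out += gloss
            i += 1
    return out


class PostEditor:
    """Toy statistical stage: learns word substitutions from (RBMT output, real translation)."""

    def __init__(self):
        self.uni = defaultdict(Counter)    # word -> target counts
        self.left = defaultdict(Counter)   # (previous word, word) -> target counts
        self.right = defaultdict(Counter)  # (word, next word) -> target counts

    def train(self, pairs):
        for src, ref in pairs:
            # Simplifying assumption: both sides align word by word, in order.
            assert len(src) == len(ref)
            padded = ["<s>"] + src + ["</s>"]
            for i, target in enumerate(ref, start=1):
                word = padded[i]
                self.uni[word][target] += 1
                self.left[(padded[i - 1], word)][target] += 1
                self.right[(word, padded[i + 1])][target] += 1

    def edit(self, src):
        padded = ["<s>"] + src + ["</s>"]
        out = []
        for i in range(1, len(padded) - 1):
            word = padded[i]
            if word not in self.uni:  # unseen word: leave it unchanged
                out.append(word)
                continue
            l = self.left.get((padded[i - 1], word), Counter())
            r = self.right.get((word, padded[i + 1]), Counter())
            # Prefer the target best supported by both neighbours, then by overall frequency.
            best = max(self.uni[word], key=lambda t: (l[t] + r[t], self.uni[word][t]))
            out.append(best)
        return out


# Hypothetical training data: RBMT output (English) paired with the real translation (English).
TRAIN = [
    ("the boy wrote the tall message", "the boy wrote the long letter"),
    ("the girl read the tall message", "the girl read the long letter"),
    ("the tall boy read the message", "the tall boy read the message"),
    ("the tall girl wrote the message", "the tall girl wrote the message"),
]

editor = PostEditor()
editor.train([(s.split(), r.split()) for s, r in TRAIN])


def hybrid_translate(arabic_sentence):
    rough = rbmt_translate(arabic_sentence.split())  # stage 1: reordering
    return " ".join(editor.edit(rough))              # stage 2: disambiguation


print(hybrid_translate("qaraa alwaladu alrisalata altawilata"))
# prints: the boy read the long letter
```

In this sketch the rule-based stage fixes the word order but picks "tall message", and the post-editor rewrites it to "long letter" because that substitution is what it saw between RBMT output and real translations in the same context. The same post-editor leaves "the tall girl" untouched, which is the context-dependent behaviour the hybrid design relies on.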

