Combination of Arabic Preprocessing Schemes for Statistical Machine Translation

Author	Sadat, F.; Habash, N.
Format	Text, Article
Conference	International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL 2006), July 17-21, 2006, Sydney, Australia
Abstract	Statistical machine translation is quite robust when it comes to the choice of input representation. It only requires consistency between training and testing. As a result, there is a wide range of possible preprocessing choices for data used in statistical machine translation. This is even more so for morphologically rich languages such as Arabic. In this paper, we study the effect of different word-level preprocessing schemes for Arabic on the quality of phrase-based statistical machine translation. We also present and evaluate different methods for combining preprocessing schemes resulting in improved translation quality.
Publication date	2006
In	Proceedings of the International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL 2006).
Language	English
NRC number	NRCC 48757
NPARC number	8913505
Record identifier	21a83ebf-dbc5-49f6-9613-e92b3ecd276a
Record created	2009-04-22
Record modified	2020-10-09