Manageable Phrase-based Statistical Machine Translation Models

By National Research Council of Canada

Download	View accepted manuscript: Manageable Phrase-based Statistical Machine Translation Models (PDF, 555 KiB)
DOI	https://doi.org/10.1007/978-3-540-75175-5_55
Author	Badr, Ghada¹; Joanis, Eric¹; Larkin, Samuel¹; Kuhn, Roland¹
Affiliation	¹ National Research Council of Canada. NRC Institute for Information Technology
Format	Text, Article
Conference	5th International Conference on Computer Recognition Systems (CORES 07), Wroclaw, Poland, October 22-25, 2007
Abstract	Statistical Machine Translation (SMT) is an evolving field where many techniques in Syntactic Pattern Recognition (SPR) are needed and applied. A typical phrase-based SMT system for translating from a source language S to a target language T contains one or more n-gram language models (LMs) and one or more phrase translation models (TMs). These LMs and TMs have a large memory footprint (up to several gigabytes). This paper describes novel techniques for filtering these models that ensure only relevant patterns in the LMs and TMs are loaded during translation. In experiments on a large Chinese-English task, these techniques yielded significant reductions in the amount of information loaded during translation: up to 58% reduction for LMs, and up to 75% for TMs.
Publication date	2007
In	Computer Recognition Systems 2 (Advances in Intelligent and Soft Computing, vol. 45) (2007): 437–444.
Language	English
Peer reviewed	Yes
NRC number	NRCC 49891
NPARC number	9183591
Record identifier	f2a4386f-564f-44d4-9c01-c437390b8bb3
Record created	2009-06-30
Record modified	2020-05-10

Date modified: 2024-07-06