Download | View accepted manuscript: Segment Choice Models: Feature-Rich Models for Global Distortion in Statistical Machine Translation (PDF, 261 KiB) |
---|
Author | Kuhn, Roland; Yuen, D.; Simard, Michel; Paul, P.; Foster, George; Joanis, Eric; Johnson, John Howard |
---|
Format | Text, Article |
---|
Conference | Human Language Technology Conference: North American Chapter of the Association for Computational Linguistics Annual Meeting (HLT/NAACL 2006), June 5, 2006, New York City, New York, USA |
---|
Abstract | This paper presents a new approach to distortion (phrase reordering) in phrase-based machine translation (MT). Distortion is modeled as a sequence of choices during translation. The approach yields trainable, probabilistic distortion models that are global: they assign a probability to each possible phrase reordering. These “segment choice” models (SCMs) can be trained on “segment-aligned” sentence pairs; they can be applied during decoding or rescoring. The approach yields a metric called “distortion perplexity” (“disperp”) for comparing SCMs offline on test data, analogous to perplexity for language models. A decision-tree-based SCM is tested on Chinese-to-English translation, and outperforms a baseline distortion penalty approach at the 99% confidence level. |
---|
Publication date | 2006 |
---|
Language | English |
---|
NRC number | NRCC 48752 |
---|
NPARC number | 5763206 |
---|
Record identifier | d91d8d1e-2ad7-4710-9e57-1b902811e2a1 |
---|
Record created | 2009-03-29 |
---|
Record modified | 2020-10-09 |
---|