| Author | Carpuat, Marine; Daume III, Hal; Henry, Katie; Irvine, Ann; Jagarlamudi, Jagadeesh; Rudinger, Rachel |
|---|
| Affiliation | National Research Council Canada. Information and Communication Technologies |
|---|
| Format | Text, Article |
|---|
| Conference | 51st Annual Meeting of the Association for Computational Linguistics, August 4-9 2013, Sofia, Bulgaria |
|---|
| Abstract | Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SenseSpotting, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a gold standard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80%, when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains. |
|---|
| Publication date | 2013 |
|---|
| Publisher | Association for Computational Linguistics |
|---|
| Language | English |
|---|
| Peer reviewed | Yes |
|---|
| NPARC number | 23000603 |
|---|
| Record identifier | 559b8e7b-80bf-4aec-a2a0-33ddb4572af4 |
|---|
| Record created | 2016-08-04 |
|---|
| Record modified | 2020-04-22 |
|---|