Author | Carpuat, Marine¹; Daume III, Hal; Henry, Katie; Irvine, Ann; Jagarlamudi, Jagadeesh; Rudinger, Rachel |
---|
Affiliation | National Research Council Canada. Information and Communication Technologies |
---|
Format | Text, Article |
---|
Conference | 51st Annual Meeting of the Association for Computational Linguistics, August 4-9, 2013, Sofia, Bulgaria |
---|
Abstract | Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SenseSpotting, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a gold standard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80% when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains. |
---|
Publication date | 2013 |
---|
Publisher | Association for Computational Linguistics |
---|
Language | English |
---|
Peer-reviewed publication | Yes |
---|
NPARC number | 23000603 |
---|
Record identifier | 559b8e7b-80bf-4aec-a2a0-33ddb4572af4 |
---|
Record created | 2016-08-04 |
---|
Record modified | 2020-04-22 |
---|