The unreasonable effectiveness of word representations for Twitter named entity recognition

DOI	https://doi.org/10.3115/v1/N15-1075
Author	Cherry, Colin¹; Guo, Hongyu¹
Affiliation	National Research Council Canada. Information and Communication Technologies
Format	Text, Article
Conference	2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, May 31-June 5, 2015, Denver, Colorado, USA
Abstract	Named entity recognition (NER) systems trained on newswire perform very badly when tested on Twitter. Signals that were reliable in copy-edited text disappear almost entirely in Twitter’s informal chatter, requiring the construction of specialized models. Using well-understood techniques, we set out to improve Twitter NER performance when given a small set of annotated training tweets. To leverage unlabeled tweets, we build Brown clusters and word vectors, enabling generalizations across distributionally similar words. To leverage annotated newswire data, we employ an importance weighting scheme. Taken all together, we establish a new state of the art on two common test sets. Though it is well known that word representations are useful for NER, supporting experiments have thus far focused on newswire data. We emphasize the effectiveness of representations on Twitter NER, and demonstrate that their inclusion can improve performance by up to 20 F1.
Publication date	2015-05-31
Publisher	Association for Computational Linguistics
In	Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, N15-1075 : 735–745. http://aclweb.org/anthology/N/N15/.
Language	English
Peer-reviewed publication	Yes
NPARC number	23000026
Record identifier	e5c6c417-b6ac-46e9-bbb5-9b51f3ed233b
Record created	2016-05-30
Record modified	2020-04-22
