Download | - View the accepted manuscript: The NRC System for Discriminating Similar Languages (PDF, 533 KiB)
|
---|
Author | Goutte, Cyril¹; Léger, Serge¹; Carpuat, Marine¹ |
---|
Author affiliation | - National Research Council Canada. Information and Communication Technologies
|
---|
Format | Text, Article |
---|
Conference | First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects, August 23-29, 2014, Dublin, Ireland |
---|
Abstract | We describe the system built by the National Research Council Canada for the "Discriminating between similar languages" (DSL) shared task. Our system uses various statistical classifiers and makes predictions based on a two-stage process: we first predict the language group, then discriminate between languages or variants within the group. Language groups are predicted using a generative classifier with 99.99% accuracy on the five target groups. Within each group (except English), we use a voting combination of discriminative classifiers trained on a variety of feature spaces, achieving an average accuracy of 95.71%, with per-group accuracy between 90.95% and 100% depending on the group. This approach turns out to reach the best performance among all systems submitted to the open and closed tasks. |
---|
Publication date | 2014-08-23 |
---|
Language | English |
---|
NPARC number | 21275282 |
---|
Record identifier | bd4a662e-ed67-47ef-8165-abde04de494c |
---|
Record created | 2015-05-28 |
---|
Record modified | 2020-06-04 |
---|
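
The two-stage process summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the `GROUP_OF` mapping, the feature functions, and the four toy sentences are all hypothetical. A multinomial Naive Bayes model stands in for the generative group classifier, and a simple multiclass perceptron stands in for the discriminative classifiers, since the record does not say which models were used.

```python
from collections import Counter, defaultdict
import math


def char_ngrams(text, n):
    """Return the list of character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical feature spaces for the stage-2 voters (assumption).
FEATURE_SPACES = [
    lambda t: t.lower().split(),   # word unigrams
    lambda t: char_ngrams(t, 2),   # character bigrams
    lambda t: char_ngrams(t, 3),   # character trigrams
]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (generative, stage 1)."""

    def __init__(self, featurize):
        self.featurize = featurize

    def fit(self, texts, labels):
        self.priors = Counter(labels)
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(self.featurize(text))
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def predict(self, text):
        feats = self.featurize(text)
        total = sum(self.priors.values())

        def log_prob(label):
            c = self.counts[label]
            denom = sum(c.values()) + len(self.vocab) + 1  # +1 for unseen features
            prior = math.log(self.priors[label] / total)
            return prior + sum(math.log((c[f] + 1) / denom) for f in feats)

        return max(sorted(self.priors), key=log_prob)


class Perceptron:
    """Multiclass perceptron (a stand-in discriminative stage-2 voter)."""

    def __init__(self, featurize, epochs=10):
        self.featurize = featurize
        self.epochs = epochs

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.w = {label: Counter() for label in self.labels}
        data = [(Counter(self.featurize(t)), y) for t, y in zip(texts, labels)]
        for _ in range(self.epochs):
            for feats, gold in data:
                guess = self._best(feats)
                if guess != gold:  # move weights toward gold, away from guess
                    for f, v in feats.items():
                        self.w[gold][f] += v
                        self.w[guess][f] -= v
        return self

    def _best(self, feats):
        score = lambda lab: sum(self.w[lab][f] * v for f, v in feats.items())
        return max(self.labels, key=score)

    def predict(self, text):
        return self._best(Counter(self.featurize(text)))


class TwoStageIdentifier:
    """Stage 1 picks the language group; stage 2 votes within that group."""

    def __init__(self, group_of):
        self.group_of = group_of  # maps language label -> group label

    def fit(self, texts, langs):
        groups = [self.group_of[lang] for lang in langs]
        self.stage1 = NaiveBayes(lambda t: char_ngrams(t, 2)).fit(texts, groups)
        self.stage2 = {}
        for g in set(groups):
            pairs = [(t, l) for t, l in zip(texts, langs) if self.group_of[l] == g]
            ts, ls = zip(*pairs)
            self.stage2[g] = [Perceptron(fz).fit(ts, ls) for fz in FEATURE_SPACES]
        return self

    def predict(self, text):
        group = self.stage1.predict(text)
        votes = Counter(clf.predict(text) for clf in self.stage2[group])
        return votes.most_common(1)[0][0]  # plurality vote


# Toy data invented for illustration only.
GROUP_OF = {"pt-BR": "pt", "pt-PT": "pt", "es-ES": "es", "es-AR": "es"}
TEXTS = ["o ônibus chegou", "o autocarro chegou",
         "el ordenador nuevo", "la computadora nueva"]
LANGS = ["pt-BR", "pt-PT", "es-ES", "es-AR"]

model = TwoStageIdentifier(GROUP_OF).fit(TEXTS, LANGS)
```

Calling `model.predict("o autocarro chegou")` first routes the sentence to the `pt` group and then lets the three voters choose between `pt-BR` and `pt-PT`. Training a separate set of voters per group keeps each stage-2 classifier focused on the small set of cues that separate close variants.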