Download | - View final version: Transfer learning improves French cross-domain dialect identification: NRC @ VarDial 2022 (PDF, 319 KiB)
|
---|
Author | Bernier-Colborne, Gabriel (1); Leger, Serge (1); Goutte, Cyril (1) |
---|
Affiliation | - National Research Council Canada. Digital Technologies
|
---|
Format | Text, Article |
---|
Conference | Ninth Workshop on NLP for Similar Languages, Varieties and Dialects, October 2022, Gyeongju, Republic of Korea |
---|
Abstract | We describe the systems developed by the National Research Council Canada for the French Cross-Domain Dialect Identification shared task at the 2022 VarDial evaluation campaign. We evaluated two different approaches to this task: SVM and probabilistic classifiers exploiting n-grams as features, and trained from scratch on the data provided; and a pre-trained French language model, CamemBERT, that we fine-tuned on the dialect identification task. The latter method turned out to improve the macro-F1 score on the test set from 0.344 to 0.430 (25% increase), which indicates that transfer learning can be helpful for dialect identification. |
---|
Publication date | 2022-10-06 |
---|
Publisher | Association for Computational Linguistics |
---|
Licence | |
---|
In | |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
Record identifier | 7d0c4e22-ed47-4519-a0d3-0f1c1b25b516 |
---|
Record created | 2022-10-19 |
---|
Record modified | 2022-10-21 |
---|