The Indigenous Languages Technology project at NRC Canada: an empowerment-oriented approach to developing language software

From National Research Council Canada

Download	View final version: The Indigenous Languages Technology project at NRC Canada: an empowerment-oriented approach to developing language software (PDF, 720 KiB)
DOI	Resolve DOI: https://doi.org/10.18653/v1/2020.coling-main.516
Author	Kuhn, Roland¹; Davis, Fineen¹; Désilets, Alain¹; Joanis, Eric¹; Kazantseva, Anna¹; Knowles, Rebecca¹; Littell, Patrick¹; Lothian, Delaney¹; Pine, Aidan¹; Running Wolf, Caroline¹; Santos, Eddie¹; Stewart, Darlene¹; Boulianne, Gilles; Gupta, Vishwa; Maracle, Owennatékha Brian; Martin, Akwiratékha’; Cox, Christopher; Junker, Marie-Odile; Sammons, Olivia; Torkornoo, Delasie; Thanyehténhas Brinklow, Nathan; Child, Sara; Farley, Benoît; Huggins-Daines, David; Rosenblum, Daisy; Souter, Heather
Affiliation	National Research Council of Canada. Digital Technologies
Format	Text, Article
Conference	Proceedings of the 28th International Conference on Computational Linguistics, Dec. 8-13, 2020, Barcelona, Spain (Online)
Abstract	This paper surveys the first, three-year phase of a project at the National Research Council of Canada that is developing software to assist Indigenous communities in Canada in preserving their languages and extending their use. The project aimed to work within the empowerment paradigm, where collaboration with communities and fulfillment of their goals is central. Since many of the technologies we developed were in response to community needs, the project ended up as a collection of diverse subprojects, including the creation of a sophisticated framework for building verb conjugators for highly inflectional polysynthetic languages (such as Kanyen’kéha, in the Iroquoian language family), release of what is probably the largest available corpus of sentences in a polysynthetic language (Inuktut) aligned with English sentences and experiments with machine translation (MT) systems trained on this corpus, free online services based on automatic speech recognition (ASR) for easing the transcription bottleneck for recordings of speech in Indigenous languages (and other languages), software for implementing text prediction and read-along audiobooks for Indigenous languages, and several other subprojects.
Publication date	2020-12-13
Date created	2021-02-26
Publisher	International Committee on Computational Linguistics
Place	Stroudsburg, PA, USA
Licence	Creative Commons Attribution 4.0 International (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
In	COLING 2020: Proceedings of the 28th International Conference on Computational Linguistics: 5866–5878. https://doi.org/10.18653/v1/2020.coling-main.
Language	English
Peer reviewed	Yes
Record identifier	b0a9c380-4669-4475-b191-74d13433c11a
Record created	2021-02-26
Record modified	2021-02-27

Date modified: 2024-09-29