Download | - View accepted manuscript: Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too (PDF, 569 KiB)
|
---|
Author | Germann, Ulrich1; Joanis, Eric1; Larkin, Samuel1 |
---|
Affiliation | - National Research Council of Canada. NRC Institute for Information Technology
|
---|
Format | Text, Article |
---|
Conference | Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009), Boulder, CO, USA, June 05, 2009 |
---|
Abstract | We present Tightly Packed Tries (TPTs), a compact implementation of read-only, compressed trie structures with fast on-demand paging and short load times. We demonstrate the benefits of TPTs for storing n-gram back-off language models and phrase tables for statistical machine translation. Encoded as TPTs, these databases require less space than flat text file representations of the same data compressed with the gzip utility. At the same time, they can be mapped into memory quickly and be searched directly in time linear in the length of the key, without the need to decompress the entire file. The overhead for local decompression during search is marginal. |
---|
Publication date | 2009-06-05 |
---|
In | |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NRC number | NRCC 52533 |
---|
NPARC number | 16435915 |
---|
Record identifier | 9eb37696-ddab-4265-9f10-e5ff2f83779a |
---|
Record created | 2010-11-24 |
---|
Record modified | 2020-04-16 |
---|