| Download | View the final version: OCR evaluation tools for the 21st century (PDF, 238 KiB) |
|---|
| Author | Santos, Eddie Antonio |
|---|
| Affiliation | National Research Council Canada. Digital Technologies |
|---|
| Format | Text, Article |
|---|
| Conference | 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages, February 26-27, 2019, Honolulu, Hawaii |
|---|
| Abstract | We introduce ocreval, a port of the ISRI OCR Evaluation Tools, now with Unicode support. We describe how we upgraded the ISRI OCR Evaluation Tools to support modern text processing tasks. ocreval supports producing character-level and word-level accuracy reports, supporting all characters representable in the UTF-8 character encoding scheme. In addition, we have implemented the Unicode default word boundary specification in order to support word-level accuracy reports for a broad range of writing systems. We argue that character-level and word-level accuracy reports produce confusion matrices that are useful for tasks beyond OCR evaluation, including tasks supporting the study and computational modeling of endangered languages. |
|---|
| Publication date | 2019-02 |
|---|
| Publisher | Association for Computational Linguistics |
|---|
| Language | English |
|---|
| Peer-reviewed publication | Yes |
|---|
| Record identifier | 9ed97177-7f1d-4955-b2c3-b11bd4416187 |
|---|
| Record created | 2022-07-29 |
|---|
| Record modified | 2022-07-29 |
|---|