A comparison of data-driven automatic syllabification methods

By National Research Council of Canada

DOI	https://doi.org/10.1007/978-3-642-03784-9_17
Author	Adsett, Connie R.¹; Marchand, Yannick¹
Affiliation	National Research Council of Canada. NRC Institute for Biodiagnostics
Format	Text, Article
Subject	natural language processing; machine learning; automatic syllabification
Abstract	Although automatic syllabification is an important component in several natural language tasks, little has been done to compare the results of data-driven methods on a wide range of languages. This article compares the results of five data-driven syllabification algorithms (Hidden Markov Support Vector Machines, IB1, Liang’s algorithm, the Look Up Procedure, and Syllabification by Analogy) on nine European languages in order to determine which algorithm performs best overall. Findings show that all algorithms achieve a mean word accuracy across all lexicons of over 90%. However, Syllabification by Analogy performs better than the other algorithms tested, with a mean word accuracy of 96.84% (standard deviation of 2.93), whereas Liang’s algorithm, the standard for hyphenation (used in TeX), produces the second-best results with a mean of 95.67% (standard deviation of 5.70).
Publication date	2009
Publisher	Springer
In	String Processing and Information Retrieval: 174–181.
Series	Lecture Notes in Computer Science, no. 5721.
Language	English
Peer reviewed	Yes
NPARC number	23004406
Record identifier	97d91e26-f374-4422-b4ff-5f0afe441a54
Record created	2018-10-31
Record modified	2020-04-16
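
The abstract reports a mean word accuracy and a standard deviation across lexicons for each algorithm. The following is a minimal sketch of how such figures could be computed, not the authors' evaluation code. It assumes that a word counts as correct only when its predicted syllabification matches the reference exactly, and that the spread is a population standard deviation; the record states neither detail. The function names, the hyphen-delimited strings, and all numbers are invented for illustration.

```python
from statistics import mean, pstdev


def word_accuracy(predicted, gold):
    """Return the percentage of words whose predicted syllabification
    matches the reference syllabification exactly (assumed definition)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty lexicon")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


def summarize(per_lexicon_accuracy):
    """Return (mean, standard deviation) of word accuracy over lexicons.

    Assumption: population standard deviation (pstdev); the record does
    not say which estimator the article uses.
    """
    values = list(per_lexicon_accuracy.values())
    return mean(values), pstdev(values)


# Invented toy data, not taken from the article.
toy_gold = ["syl-la-ble", "a-nal-o-gy", "hy-phen", "lex-i-con"]
toy_pred = ["syl-la-ble", "a-na-lo-gy", "hy-phen", "lex-i-con"]
toy_score = word_accuracy(toy_pred, toy_gold)  # 3 of 4 words match exactly

# Hypothetical per-lexicon accuracies for one algorithm.
toy_summary = summarize({"lexicon_a": 90.0, "lexicon_b": 100.0})
```

Scoring whole words rather than individual syllable boundaries is the stricter choice: a single misplaced boundary makes the entire word count as wrong.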

Date modified: 2024-07-23