Download | - View accepted manuscript: Cost-sensitive self-training (PDF, 284 KiB)
|
---|
DOI | Resolve DOI: https://doi.org/10.1007/978-3-642-30353-1_7 |
---|
Author | Guo, Yuanyuan; Zhang, Harry; Spencer, Bruce¹ |
---|
Affiliation | - National Research Council Canada. Information and Communication Technologies
|
---|
Format | Text, Book Chapter |
---|
Conference | 25th Canadian Conference on Artificial Intelligence (Canadian AI 2012), May 28-30, 2012, Toronto, Ontario, Canada |
---|
Subject | self-training; cost-sensitive; Naive Bayes |
---|
Abstract | In some real-world applications, it is time-consuming or expensive to collect much labeled data, while unlabeled data is easier to obtain. Many semi-supervised learning methods have been proposed to deal with this problem by utilizing the unlabeled data. On the other hand, on some datasets, misclassifying different classes causes different costs, which challenges the common assumption in classification that classes have the same misclassification cost. For example, misclassifying a fraud as a legitimate transaction could be more serious than misclassifying a legitimate transaction as fraudulent. In this paper, we propose a cost-sensitive self-training method (CS-ST) to improve the performance of Naive Bayes when labeled instances are scarce and different misclassification errors are associated with different costs. CS-ST incorporates the misclassification costs into the learning process of self-training, and approximately estimates the misclassification error to help select unlabeled instances. Experiments on 13 UCI datasets and three text datasets show that, in terms of the total misclassification cost and the number of correctly classified instances with higher costs, CS-ST has better performance than the self-training method and the base classifier learned from the original labeled data only. |
---|
Publication date | 2012-05-30 |
---|
Publisher | Springer Berlin Heidelberg |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NPARC number | 20255955 |
---|
Record identifier | 837610f2-a9d8-400f-9edf-f93bce3f74d5 |
---|
Record created | 2012-07-07 |
---|
Record modified | 2020-03-03 |
---|