Theoretical Analyses of Cross-Validation Error and Voting in Instance-Based Learning

From National Research Council Canada

Author	Turney, Peter
Affiliation	National Research Council of Canada. NRC Institute for Information Technology
Format	Text, Article
Subject	cross validation; curve fitting; AIC; bias; variance
Abstract	This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbors (the k most similar examples) to the testing example in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilizing. The theory suggests ways to minimize cross-validation error, by ensuring that voting is stable and does not adversely affect accuracy.
Publication date	1994
In	Journal of Experimental and Theoretical Artificial Intelligence (JETAI) 6 (1994).
Language	English
NRC number	NRCC 35073
NPARC number	8914390
Record identifier	82cd15e5-9759-403c-b307-f0ea24165970
Record created	2009-04-22
Record modified	2020-04-27
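
The voting step described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the Hamming distance over symbolic attributes, the tie-breaking rule, the function names, and the toy data are all assumptions chosen for illustration. The example shows the case where voting helps, with a single mislabeled training example overruled by its neighbors; the abstract notes that in other circumstances voting can be destabilizing.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the attributes on which two symbolic examples disagree (assumed metric)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_vote(training, query, k):
    """Predict a label by majority vote among the k nearest training examples.

    training: list of (attributes, label) pairs, attributes being a tuple of symbols.
    Distance ties keep training order (sorted() is stable); vote ties go to the
    label of the nearest neighbor among the tied labels (an assumed rule).
    """
    neighbors = sorted(training, key=lambda ex: hamming_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    for _, label in neighbors:
        if votes[label] == top:
            return label


# Hypothetical toy data; the first example carries a noisy (wrong) label.
training = [
    (("red", "round", "small"), "banana"),
    (("red", "round", "large"), "apple"),
    (("red", "oval", "small"), "apple"),
    (("yellow", "long", "small"), "banana"),
    (("yellow", "long", "large"), "banana"),
]
query = ("red", "round", "small")

prediction_k1 = knn_vote(training, query, 1)  # follows the single nearest (noisy) example
prediction_k3 = knn_vote(training, query, 3)  # the noisy example is outvoted
print(prediction_k1, prediction_k3)
```

With k = 1 the prediction copies the mislabeled nearest example, while with k = 3 the two correctly labeled neighbors outvote it.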

Date modified: 2024-07-08