National Research Council of Canada. NRC Institute for Information Technology
Seventh SIAM International Conference on Data Mining (SDM 2007), April 28, 2007, Minneapolis, MN
We describe NRC's submission to the Anomaly Detection/Text Mining competition organised at the Text Mining Workshop 2007. The submission relies on a straightforward implementation of the probabilistic categoriser described in [Gaussier et al., 2002]. The categoriser is adapted to handle multiple labelling, and a piecewise-linear confidence estimation layer is added to estimate the confidence of each labelling decision. This technique achieves a score of 1.689 on the test data.
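To illustrate what a piecewise-linear confidence layer might look like, the following is a minimal sketch that maps a raw categoriser score to a confidence value by linear interpolation between a few knots. It is not the submission's actual implementation: the function name `piecewise_linear`, the assumed score range, and the knot values in `KNOTS` are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

def piecewise_linear(knots, x):
    """Interpolate linearly between sorted (x, y) knots; clamp outside the range."""
    xs = [k[0] for k in knots]
    ys = [k[1] for k in knots]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # index such that xs[i-1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Hypothetical knots (score, confidence); in practice these would be
# fitted on held-out data, which the abstract does not detail.
KNOTS = [(0.0, 0.0), (0.5, 0.25), (1.0, 1.0)]

confidence = piecewise_linear(KNOTS, 0.75)
```

With these assumed knots, scores below 0.5 are mapped to low confidence and confidence rises more steeply above 0.5; scores outside the knot range are clamped to the end values.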