Download | View accepted manuscript: A probabilistic model for fast and confident categorisation of textual documents (PDF, 325 KiB) |
---|
Author | Goutte, Cyril |
---|
Affiliation | National Research Council of Canada. NRC Institute for Information Technology |
---|
Format | Text, Book Chapter |
---|
Abstract | We describe the National Research Council's (NRC) entry in the Anomaly Detection/Text Mining competition organized at the Text Mining Workshop 2007. This entry relies on a straightforward implementation of a probabilistic categorizer described earlier [GGPC02]. This categorizer is adapted to handle multiple labeling and a piecewise-linear confidence estimation layer is added to provide an estimate of the labeling confidence. This technique achieves a score of 1.689 on the test data. This model has potentially useful features and extensions such as the use of a category-specific decision layer or the extraction of descriptive category keywords from the probabilistic profile. |
---|
Publication date | 2008 |
---|
Publisher | Springer |
---|
Place | Oxford |
---|
Language | English |
---|
NRC number | NRCC 49829 |
---|
NPARC number | 5764844 |
---|
Record identifier | 05e3038a-f734-4b14-bcc4-d90f41df31e8 |
---|
Record created | 2009-03-29 |
---|
Record modified | 2024-02-05 |
---|