Download | - View accepted manuscript: Text categorization for an online tendering system (PDF, 573 KiB)
|
---|
Author | Wang, Y.; Zhang, H.; Spencer, B.1; Yan, Y.1 |
---|
Affiliation | - National Research Council of Canada. NRC Institute for Information Technology
|
---|
Format | Text, Article |
---|
Conference | Business Agents and Semantic Web Workshop (BASeWEB 04), May 2004, London, Canada |
---|
Subject | text categorization; machine learning; Rocchio method; TF-IDF; WIDF; weighted inverse document frequency; naive Bayes classifier; ranking categorization |
---|
Abstract | This paper investigates the application of text categorization (TC) in a setting exhibiting a large number of target categories with relatively few training cases, applied to a real-life online tendering system. This is an experiment paper showing our experiences in dealing with a real-life application using the conventional machine learning approaches for TC, namely, the Rocchio method, TF-IDF (term frequency-inverse document frequency), WIDF (weighted inverse document frequency), and naive Bayes. In order to make the categorization results acceptable for industrial use, we made use of the hierarchical structure of the target categories and investigated the semi-automated ranking categorization. |
---|
Publication date | 2004-05-01 |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NPARC number | 21260516 |
---|
Record identifier | a3b8b396-a184-43b9-b229-d47c4e95ed5c |
---|
Record created | 2013-03-05 |
---|
Record modified | 2020-06-04 |
---|