Download | View accepted manuscript: Keyphrase Extraction: Enhancing Lists (PDF, 232 KiB) |
---|
Author | Barrière, Caroline; Jarmasz, Mario |
---|
Format | Text, Article |
---|
Conference | Computational Linguistics in the North-East (CLINE'2004), August 30, 2004, Montréal, Québec, Canada |
---|
Subject | keyphrase extraction; clustering; semantic similarity; corpus linguistics; keyphrase evaluation |
---|
Abstract | This paper proposes some modest improvements to Extractor, a state-of-the-art keyphrase extraction system, by using a terabyte-sized corpus to estimate the informativeness and semantic similarity of keyphrases. We present two techniques to improve the organization of keyphrase lists and to remove outliers from them. The first is a simple ordering of keyphrases according to their occurrences in the corpus; the second is clustering according to semantic similarity. Evaluation issues are discussed. We present a novel technique for comparing extracted keyphrases to a gold standard, which relies on semantic similarity rather than on string matching or an evaluation involving human judges. |
---|
Publication date | 2004 |
---|
Language | English |
---|
NRC number | NRCC 48079 |
---|
NPARC number | 5765134 |
---|
Record identifier | bbdcb1d3-d36b-4f4f-9f56-2a613f0f4310 |
---|
Record created | 2009-03-29 |
---|
Record modified | 2021-01-05 |
---|