Semantic distance measures with distributional profiles of coarse-grained concepts

From National Research Council Canada

DOI	https://doi.org/10.1007/978-3-642-22613-7_4
Author	Hirst, Graeme; Mohammad, Saif¹
Affiliation	National Research Council of Canada. NRC Institute for Information Technology
Format	Text, Book Chapter
Abstract	Although semantic distance measures are applied to words in textual tasks such as building lexical chains, semantic distance is really a property of concepts, not words. After discussing the limitations of measures based solely on lexical resources such as WordNet or solely on distributional data from text corpora, we present a hybrid measure of semantic distance based on distributional profiles of concepts that we infer from corpora. We use only a very coarse-grained inventory of concepts - each category of a published thesaurus is taken as a single concept - and yet we obtain results on basic semantic-distance tasks that are better than those of methods that use only distributional data and are generally as good as those that use fine-grained WordNet-based measures. Because the measure is based on naturally occurring text, it is able to find word pairs that stand in non-classical relationships not found in WordNet. It can be applied cross-lingually, using a thesaurus in one language to measure semantic distance between words in another. In addition, we show the use of the method in determining the degree of antonymy of word pairs.
Publication date	2012-01-01
In	Modeling, Learning, and Processing of Text Technological Data Structures (1 January 2012): 61–79.
Series	Studies in Computational Intelligence, no. 370.
Language	English
Peer reviewed	Yes
NPARC number	21271466
Record identifier	e9cbbcab-c192-4cab-8ec7-f869c70b7fca
Record created	2014-03-24
Record modified	2020-03-03

Date modified: 2024-07-08