Abstract | This research presents a comparison of two sources of information for building a specialized ontology: the WWW, a large repository of uncategorized texts, and BioMed, a small specialized corpus in the medical domain. The methodology explored is the use of knowledge patterns, which are explicit markers in text that point to semantic or conceptual relations. Although the method is of interest for discovering new information to enrich the UMLS (a biomedical metathesaurus), we measure its success by attempting to “rediscover” information already present in the UMLS Metathesaurus. Precision and recall are measured in several instance-retrieval experiments for four semantic relations important in the UMLS Metathesaurus: two of a general nature (is-a, synonymy) and two domain-specific ones (preventing, inducing). Results show that although the WWW is a noisy repository, its exploration has potential and does allow the discovery of valuable specialized knowledge. |
---|
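
To make the evaluation idea concrete, the following is a minimal sketch of pattern-based instance retrieval for the is-a relation, scored by precision and recall against a reference set. The two regular expressions, the sample sentences, and the `gold` pairs are illustrative assumptions, not the patterns or data used in this work; in the actual experiments the reference pairs would come from the UMLS Metathesaurus.

```python
import re

# Hypothetical knowledge patterns for the is-a relation (assumed for
# illustration; the abstract does not list the actual patterns).
IS_A_PATTERNS = [
    re.compile(r"\b(?P<hypo>\w+) is a (?:kind|type|form) of (?P<hyper>\w+)", re.IGNORECASE),
    re.compile(r"\b(?P<hyper>\w+) such as (?P<hypo>\w+)", re.IGNORECASE),
]


def extract_is_a(sentences):
    """Return the set of (hyponym, hypernym) pairs matched by any pattern."""
    found = set()
    for sentence in sentences:
        for pattern in IS_A_PATTERNS:
            for match in pattern.finditer(sentence):
                found.add((match.group("hypo").lower(), match.group("hyper").lower()))
    return found


def precision_recall(retrieved, gold):
    """Precision = correct / retrieved; recall = correct / gold."""
    correct = retrieved & gold
    precision = len(correct) / len(retrieved) if retrieved else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Toy input text (made up); the third sentence stands in for noisy text.
sentences = [
    "Aspirin is a type of analgesic.",
    "Treatment included an analgesic such as ibuprofen.",
    "Coffee is a kind of miracle.",
]

# Toy reference pairs (made up), playing the role of known relation instances.
gold = {
    ("aspirin", "analgesic"),
    ("ibuprofen", "analgesic"),
    ("morphine", "analgesic"),
}

retrieved = extract_is_a(sentences)
precision, recall = precision_recall(retrieved, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy run, the noisy third sentence yields a spurious pair that lowers precision, while the reference pair never mentioned in the text lowers recall, which mirrors the trade-off between a noisy, large source and a small, specialized one.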