Semantic Similarity for Detecting Recognition Errors in Automatic Speech Transcripts

From National Research Council Canada

Download	Accepted manuscript (PDF, 299 KiB)
Author	Inkpen, D.; Désilets, Alain
Format	Text, Article
Conference	Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), October 6-8, 2005, Vancouver, British Columbia, Canada
Abstract	Browsing through large volumes of spoken audio is known to be a challenging task for end users. One way to alleviate this problem is to allow users to gist a spoken audio document by glancing over a transcript generated through Automatic Speech Recognition. Unfortunately, such transcripts typically contain many recognition errors which are highly distracting and make gisting more difficult. In this paper we present an approach that detects recognition errors by identifying words which are semantic outliers with respect to other words in the transcript. We describe several variants of this approach. We investigate a wide range of evaluation measures and we show that we can significantly reduce the number of errors in content words, with the trade-off of losing some good content words.
Publication date	2005
In	Conference on Empirical Methods in Natural Language Processing (EMNLP 2005) [Proceedings].
Language	English
NRC number	NRCC 48278
NPARC number	5765538
Record identifier	9a85aa3d-1412-49f9-a03e-af0ea3c260f9
Record created	2009-03-29
Record modified	2020-10-09
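
The abstract describes flagging words that are semantic outliers with respect to the other words in a transcript. The sketch below illustrates that general idea only; it is not the authors' method. The toy vectors, the use of cosine similarity, the averaging rule, the 0.5 threshold, and the names `TOY_VECTORS`, `cosine`, and `semantic_outliers` are all assumptions made for illustration, since the record does not say which similarity measure or decision rule the paper uses.

```python
import math

# Hypothetical toy word vectors (an assumption for this sketch). A real
# system would use a corpus-based or lexicon-based similarity measure.
TOY_VECTORS = {
    "bank": (0.9, 0.1, 0.0),
    "loan": (0.8, 0.2, 0.1),
    "interest": (0.85, 0.15, 0.05),
    "giraffe": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_outliers(words, vectors, threshold=0.5):
    """Return words whose average similarity to the other words is below threshold.

    The threshold is an arbitrary illustrative value, not one taken from the paper.
    """
    flagged = []
    for i, word in enumerate(words):
        others = [w for j, w in enumerate(words) if j != i]
        if not others:
            continue  # a single word has no context to compare against
        avg = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        if avg < threshold:
            flagged.append(word)
    return flagged


transcript = ["bank", "loan", "interest", "giraffe"]
print(semantic_outliers(transcript, TOY_VECTORS))
```

In this toy transcript, "giraffe" is far from the finance-related words and is the only word flagged. Raising the threshold would flag more words, which mirrors the trade-off the abstract mentions: fewer errors in content words at the cost of losing some good ones.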

Date modified: 2024-07-27