Technical Report Details
Approaches to Reduce the Effects of OOV Queries on Indexed
Logan, Beth ; Moreno, Pedro ; Van Thong, JM
HP Development Company
Keywords: spoken document retrieval; speech indexing; out-of-vocabulary words; OOV words
RP-ID: HPL-2003-46
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

We present several novel approaches to the out-of-vocabulary (OOV) query problem for spoken audio: indexing based on syllable-like units called particles, and query expansion according to acoustic confusability for a word index. We also examine linear and OOV-based combination of indexing schemes. We experiment on 75 hours of broadcast news, comparing our approaches to a word index, a phoneme index, and a phoneme index queried with phoneme sequences. Our results show that our approaches are superior to both a word index and a phoneme index for OOV words, and perform comparably to the phoneme sequence scheme. The particle system performs worse than the acoustic query expansion scheme. The best system uses word queries for in-vocabulary words and a linear combination of the phoneme sequence scheme and acoustic query expansion for OOV words. This system improved the average precision from 0.35 for a word index to 0.40.

Notes: Portions of this work were based on papers published at the Human Language Technology Conference, 24-27 March 2002, San Diego, CA, and at the International Conference on Spoken Language Processing, September 2002, Denver, Colorado. 17 pages.
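
The combination described in the abstract (word queries for in-vocabulary words, a linear combination of two schemes for OOV words) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the report: the function names, the representation of each index as a dictionary mapping a query word to per-document scores, and the weight of 0.5.

```python
from typing import Dict, Set

Scores = Dict[str, float]          # document id -> retrieval score
Index = Dict[str, Scores]          # query word -> per-document scores


def combine_scores(a: Scores, b: Scores, weight: float = 0.5) -> Scores:
    """Linearly combine two score tables; a missing document scores 0.0."""
    docs = set(a) | set(b)
    return {d: weight * a.get(d, 0.0) + (1.0 - weight) * b.get(d, 0.0)
            for d in docs}


def score_query_word(word: str,
                     vocabulary: Set[str],
                     word_index: Index,
                     phoneme_seq_index: Index,
                     expansion_index: Index,
                     weight: float = 0.5) -> Scores:
    """OOV-based routing: use the word index for in-vocabulary words,
    otherwise combine the two OOV-capable schemes linearly.

    The three indexes are hypothetical stand-ins for the word index,
    the phoneme sequence scheme, and acoustic query expansion.
    """
    if word in vocabulary:
        return dict(word_index.get(word, {}))
    return combine_scores(phoneme_seq_index.get(word, {}),
                          expansion_index.get(word, {}),
                          weight)
```

In this sketch the weight is a free parameter; in practice it would presumably be tuned on held-out queries, and the abstract does not state which value was used.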
