Dissertation Details
Voice query-by-example for resource-limited languages using an ergodic hidden Markov model of speech
Speech recognition;Hidden Markov model
Ali, Asif; Clements, Mark A.; Lee, Chin-Hui; Anderson, David V.; Copeland, John; Lerch, Alexander
University: Georgia Institute of Technology
Department: Electrical and Computer Engineering
Keywords: Speech recognition; Hidden Markov model
Others: https://smartech.gatech.edu/bitstream/1853/50363/1/ALI-DISSERTATION-2013.pdf
United States | English
Source: SMARTech Repository
【 Abstract 】

An ergodic hidden Markov model (EHMM) can be useful in extracting the underlying structure embedded in connected speech without the need for a time-aligned transcribed corpus. In this research, we present a query-by-example (QbE) spoken term detection system based on an ergodic hidden Markov model of speech. Because the training is unsupervised, an EHMM-based representation of speech is not invariant to speaker-dependent variations. Consequently, a single phoneme may be mapped to a number of EHMM states. The effects of speaker-dependent and context-induced variation in speech on its EHMM-based representation have been studied and used to devise schemes that minimize these variations. Speaker invariance can be introduced into the system by identifying states with similar perceptual characteristics. To this end, two unsupervised clustering schemes have been proposed to identify perceptually similar states in an EHMM. A search framework, consisting of a graphical keyword modeling scheme and a modified Viterbi algorithm, has also been implemented. The EHMM-based QbE system has been compared with the state of the art and has been demonstrated to achieve higher precision than systems based on static clustering schemes.
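
As a rough illustration of the ideas in the abstract, the sketch below decodes a short observation sequence with a small fully connected (ergodic) HMM and then collapses states that are assumed to sound alike into shared labels. It is not the dissertation's method: it uses the standard Viterbi algorithm rather than the modified one, and the three-state model, its probabilities, and the `cluster_of` mapping are all hypothetical values chosen by hand, where the dissertation derives its clusters with unsupervised schemes.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Standard log-domain Viterbi decoding for a discrete-observation HMM."""
    n_states = len(start_p)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # delta[s]: best log-probability of any state path ending in state s
    delta = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n_states):
            # Ergodic topology: any state may be followed by any other state
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            ptr.append(best)
            delta.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        backptr.append(ptr)

    # Trace the best path backwards from the most likely final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy model: states 0 and 1 stand for two speaker-dependent
# variants of one sound (both mostly emit symbol 0); state 2 is another sound.
start_p = [0.4, 0.4, 0.2]
trans_p = [[0.6, 0.1, 0.3],
           [0.1, 0.6, 0.3],
           [0.2, 0.2, 0.6]]
emit_p = [[0.9, 0.1],
          [0.8, 0.2],
          [0.1, 0.9]]

# Hypothetical cluster assignment merging the two similar states
cluster_of = {0: "A", 1: "A", 2: "B"}

obs = [0, 0, 1, 1, 0]
path = viterbi(obs, start_p, trans_p, emit_p)
labels = [cluster_of[s] for s in path]
print(path, labels)
```

Matching a query against an utterance on the cluster labels rather than on the raw state indices is one simple way to see why grouping similar states could reduce sensitivity to speaker variation: two speakers who land in states 0 and 1 respectively would still produce the same label sequence.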

【 Preview 】
Attachment list
Files Size Format View
Voice query-by-example for resource-limited languages using an ergodic hidden Markov model of speech 1380KB PDF download
  Document metrics
  Downloads: 34    Views: 56