Dissertation details
A theory of (almost) zero resource speech recognition
Keywords: Speech recognition; Unsupervised learning; PAC-Bayesian theory; Language modeling; Acoustic event detection; Anomaly detection
Bharadwaj, Sujeeth Subramanya
Others: https://www.ideals.illinois.edu/bitstream/handle/2142/78343/BHARADWAJ-DISSERTATION-2015.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Automatic speech recognition has matured into a commercially successful technology, enabling voice-based interfaces for smartphones, smart TVs, and many other consumer devices. The overwhelming popularity, however, is still limited to languages such as English, Japanese, and German, where vast amounts of labeled training data are available. For most other languages, it is prohibitively expensive to 1) collect and transcribe the speech data required to learn good acoustic models; and 2) acquire adequate text to estimate meaningful language models. A theory of unsupervised and semi-supervised techniques for speech recognition is therefore essential. This thesis focuses on HMM-based sequence clustering and examines acoustic modeling, language modeling, and applications beyond the components of an ASR, such as anomaly detection, from the vantage point of PAC-Bayesian theory. The first part of this thesis extends standard PAC-Bayesian bounds to address the sequential nature of speech and language signals. A novel algorithm, based on sparsifying the cluster assignment probabilities with a Rényi entropy prior, is shown to provably minimize the generalization error of any probabilistic model (e.g., HMMs). The second part examines application-specific loss functions such as cluster purity and perplexity. Empirical results on a variety of tasks -- acoustic event detection, class-based language modeling, and unsupervised sequence anomaly detection -- confirm the practicality of the theory and algorithms developed in this thesis.
