ETRI Journal
Filtering of Filter-Bank Energies for Robust Speech Recognition
Keywords: Robust Feature Extraction; Speech Recognition
DOI: 10.4218/etrij.04.0203.0033
Abstract
We propose a novel feature processing technique that provides a cepstral liftering effect in the log-spectral domain. Cepstral liftering aims to equalize the variance of the cepstral coefficients for distance-based speech recognizers and, as a result, provides robustness to additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect on recognition performance. We derive a filtering method in the log-spectral domain that corresponds to cepstral liftering. The proposed method performs high-pass filtering based on the decorrelation of filter-bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% relative to the conventional feature.
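The idea of high-pass filtering filter-bank energies along the frequency axis can be illustrated with a minimal sketch. The abstract does not give the filter that the authors derive, so the first-order difference used here, the function name `highpass_across_bands`, and the coefficient `alpha` are all assumptions made for illustration, not the proposed method itself.

```python
def highpass_across_bands(log_energies, alpha=1.0):
    """Filter one frame of log filter-bank energies across bands.

    Computes y[k] = x[k] - alpha * x[k-1], a generic first-order
    high-pass filter (an assumption; not the filter from the paper).
    The first band is repeated as the left edge so that the output
    has the same number of bands as the input.
    """
    filtered = []
    previous = log_energies[0]  # left-edge padding: reuse the first band
    for value in log_energies:
        filtered.append(value - alpha * previous)
        previous = value
    return filtered


# Hypothetical frame with four bands of log filter-bank energies.
frame = [2.0, 3.0, 5.0, 4.0]
print(highpass_across_bands(frame))  # [0.0, 1.0, 2.0, -1.0]

# Adding the same offset to every band leaves the output unchanged
# when alpha = 1.0, because the filter only sees band-to-band differences.
shifted = [value + 10.0 for value in frame]
print(highpass_across_bands(shifted))  # [0.0, 1.0, 2.0, -1.0]
```

With `alpha = 1.0` the filter keeps only the differences between adjacent bands, so a level shift common to all bands is suppressed. Smaller values of `alpha` would retain part of that slowly varying component.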