ETRI Journal
Adaptive Channel Normalization Based on Infomax Algorithm for Robust Speech Recognition
Keywords: information-maximization method; blind decorrelation; RASTA-like filtering; adaptive channel normalization; robust speech recognition
DOI: 10.4218/etrij.07.0506.0031
【 Abstract 】
This paper proposes a new data-driven high-pass method that suppresses slowly varying noise components. Conventional high-pass approaches are based on the idea of decorrelating the feature vector sequence and aim to adapt to various conditions. The proposed method performs temporal local decorrelation using information-maximization theory. It is applied on an utterance-by-utterance basis, which provides an adaptive channel normalization filter for each condition. The performance of the proposed method is evaluated in isolated-word recognition experiments with channel distortion. Experimental results show that the proposed method yields outstanding improvement in channel-distorted speech recognition.
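To illustrate why high-pass filtering of feature trajectories helps with channel distortion, here is a minimal sketch of the conventional fixed-filter idea that the abstract contrasts with. It is not the proposed infomax-based method. It relies on the fact that a fixed linear channel appears as an approximately constant additive offset in the log-spectral domain. The function name `highpass`, the pole value 0.97, and the toy trajectory are assumptions made for illustration only.

```python
import math


def highpass(trajectory, pole=0.97):
    """First-order DC-blocking filter: y[t] = x[t] - x[t-1] + pole * y[t-1].

    The filter state is initialised with the first sample, so a constant
    offset in the input produces no start-up transient.
    """
    if not trajectory:
        return []
    out = [0.0]
    prev_x = trajectory[0]
    prev_y = 0.0
    for x in trajectory[1:]:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Toy log filter-bank energy trajectory for one band (hypothetical data).
clean = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(200)]

# A fixed channel is modelled here as a constant offset in the log domain.
channel_offset = 4.0
distorted = [x + channel_offset for x in clean]

# Largest difference between the filtered clean and distorted trajectories.
max_diff = max(
    abs(a - b) for a, b in zip(highpass(clean), highpass(distorted))
)
print(max_diff)  # close to zero: the constant offset is filtered out
```

A fixed pole gives the same filter for every utterance. The method described in the abstract instead derives the filter from each utterance, which this sketch does not attempt.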