Odelowo, Babafemi; Anderson, David V.; Electrical and Computer Engineering; Moore II, Elliot; Bhatti, Pamela; Lanterman, Aaron; Vidakovic, Brani
Neural networks are powerful machine learning models that have, in the last few years, been applied to several audio and speech signal processing problems, including speech enhancement. Although neural network-based speech enhancement approaches have outperformed traditional model-based approaches, several questions remain unanswered, such as the most suitable network architectures, input features, and training targets, and the best practices for obtaining optimal results. This dissertation studies two approaches to the development of a neural network-based speech enhancement system. First, we investigate the use of the extreme learning machine, an algorithm that allows feed-forward networks to be trained quickly and provides good generalization, for speech enhancement. We then propose modifications to the extreme learning machine to increase its prediction accuracy on multivariate datasets, and demonstrate the improved performance of these algorithms on several real-world datasets and in the enhancement of noisy speech. Next, with a view to improving performance at low signal-to-noise ratio (SNR), we develop a noise prediction and time-domain subtraction framework for speech enhancement. We extend the noise prediction framework by investigating different training targets and noise-aware training methods, and show, using objective performance metrics, that the proposed framework compares favorably with conventional speech prediction approaches in enhancing speech quality and intelligibility in both seen and unseen noise conditions.
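For readers unfamiliar with the extreme learning machine mentioned in the abstract, the following is a minimal sketch of the basic algorithm only, not the modified variants proposed in the dissertation. It assumes a single hidden layer with random, untrained input weights and output weights obtained by ridge-regularized least squares. The function names (`elm_fit`, `elm_predict`), the `tanh` activation, the hidden-layer size, the regularization constant, and the toy regression data are all illustrative assumptions; in a speech enhancement setting, `X` would hold noisy-speech features and `T` the training targets.

```python
import numpy as np

def elm_fit(X, T, n_hidden=64, reg=1e-3, seed=0):
    """Fit a basic ELM: random hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer activations, shape (n_samples, n_hidden)
    # Output weights: solve (H^T H + reg * I) beta = H^T T (ridge regression).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, params):
    """Apply a fitted ELM to new inputs."""
    W, b, beta = params
    return np.tanh(X @ W + b) @ beta

# Toy multivariate regression (hypothetical data, for illustration only).
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
T = np.hstack([np.sin(3 * X), np.cos(2 * X)])
params = elm_fit(X, T)
pred = elm_predict(X, params)
mse = float(np.mean((pred - T) ** 2))
```

Because only the output weights are fitted, training reduces to one linear solve, which is why the method trains quickly compared with gradient-based approaches.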
Development of a neural network-based speech enhancement system