The main goal of this research is to perform source separation on single-channel mixed signals so that we obtain a clean representation of each source. Specifically, we are concerned with separating a speaker's speech from background noise, which we treat as a second source. We therefore deal with single-channel mixtures of speech with stationary, semi-stationary, and non-stationary noise types, and we define this task as speech denoising. Our goal is to build a system that takes a noisy speech signal as input and outputs the clean speech with as little distortion and as few artifacts as possible. The model requires no prior information about the speaker or the background noise. The separation runs in real time, since the input signal can be fed to the model frame by frame. Such a model can be used in speech recognition systems to improve recognition accuracy in noisy environments. Two methods were mainly adopted for this purpose: nonnegative matrix factorization (NMF) and neural networks. Experiments were conducted to compare the performance of these two methods for speech denoising. For each method, we compared the case where prior information about both the speaker and the noise was available with the case where only a general speech dictionary was available. Additional experiments compared different architectures and parameters within each approach.
Speech denoising using nonnegative matrix factorization and neural networks