Journal Article Details
Applied Sciences, Volume 12
Noise Modeling to Build Training Sets for Robust Speech Enhancement
Zhou Wu [1], Xinxin Kong [1], Wenxi Zhang [1], Yongbiao Wang [1], Hongxin Zhang [2], Yahui Wang [2]
[1] Key Laboratory of Computational Optical Imaging Technology, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100081, China
[2] School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
Keywords: Noise Modeling; training set; Generative Adversarial Network; speech enhancement
DOI: 10.3390/app12041905
Source: DOAJ
【 Abstract 】

DNN-based speech enhancement (SE) models suffer significant performance degradation on real recordings because of the mismatch between the synthetic datasets used for training and real test sets. To solve this problem, we propose a new Generative Adversarial Network framework for Noise Modeling (NM-GAN) that creates realistic paired training sets by imitating the real noise distribution. The generator of the proposed framework combines a novel 7-layer U-Net with two bidirectional long short-term memory (LSTM) layers to construct complex noise. Through adversarial and alternate training, NM-GAN achieves sufficient recall (diversity) and precision (noise quality) in its samples to simulate real noise effectively, and the generated noise is then used to compose realistic paired training sets. Extensive experiments with various qualitative and quantitative evaluation metrics verify the effectiveness of the generated noise samples and of the training sets built from them.
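
The abstract does not say how the noise samples are combined with clean speech to form the paired training sets. A minimal sketch, assuming plain additive mixing at a chosen signal-to-noise ratio, is given below. The names `mix_at_snr` and `measured_snr_db` are hypothetical, and the random Gaussian sequence is only a placeholder for noise that would come from a trained generator such as NM-GAN.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return a (noisy, clean) training pair, assuming additive mixing.

    The noise is scaled so that the clean-to-noise power ratio of the
    mixture equals snr_db. This is an illustrative assumption, not the
    procedure described in the paper.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as clean")
    noise = noise[:len(clean)]
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # Solve 10*log10(clean_power / (scale**2 * noise_power)) == snr_db for scale.
    scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    noisy = [x + scale * n for x, n in zip(clean, noise)]
    return noisy, list(clean)


def measured_snr_db(noisy, clean):
    """Recover the SNR of a pair from the residual noisy - clean."""
    residual = [y - x for y, x in zip(noisy, clean)]
    signal_energy = sum(x * x for x in clean)
    noise_energy = sum(r * r for r in residual)
    return 10 * math.log10(signal_energy / noise_energy)


# Placeholder signals: one second of a 440 Hz tone at 16 kHz stands in for
# clean speech, and Gaussian samples stand in for generated noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy, target = mix_at_snr(clean, noise, snr_db=5.0)
```

In a sketch like this, each `(noisy, target)` pair would serve as one input and label for an SE model, and drawing `snr_db` from a range rather than fixing it would vary the difficulty of the pairs.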

【 License 】

Unknown   
