Journal Article Details
EURASIP Journal on Audio, Speech, and Music Processing, Volume: 2022
Improved capsule routing for weakly labeled sound event detection
Wenwu Wang1  Haitao Li2  Shuguo Yang2 
[1] Center for Vision Speech and Signal Processing, Department of Electrical and Electronic Engineering, Faculty of Engineering and Physical Sciences, University of Surrey;
[2] College of Mathematics and Physics, Qingdao University of Science and Technology;
Keywords: Polyphonic sound event detection; Capsule network; Weakly labeled; Dynamic routing
DOI: 10.1186/s13636-022-00239-6
Source: DOAJ
【 Abstract 】

Polyphonic sound event detection aims to detect the types of sound events that occur in a given audio clip, together with their onset and offset times, where multiple sound events may occur simultaneously. Deep learning–based methods such as convolutional neural networks (CNNs) have achieved state-of-the-art results in polyphonic sound event detection. However, two open challenges remain: overlap between events and a tendency to overfit. To address these two problems, we propose a capsule network-based method for polyphonic sound event detection. With so-called dynamic routing, capsule networks can handle overlapping objects and offer generalization ability that reduces overfitting. However, dynamic routing also greatly slows down training. To speed up training, we propose a weakly labeled polyphonic sound event detection model based on improved capsule routing. The proposed method is evaluated on task 4 of the DCASE 2017 challenge and compared with several baselines, demonstrating competitive results in terms of F-score and computational efficiency.

【 License 】

Unknown
