Journal Article Details
EURASIP Journal on Audio, Speech, and Music Processing
Attention mechanism combined with residual recurrent neural network for sound event detection and localization
Empirical Research
Lei Zhang [1], Yulan Han [2], Yuanyuan Zhang [2], Chaofeng Lan [2], Chao Sun [2], Lirong Fu [3], Meng Zhang [4]
[1] Beidahuang Industry Group General Hospital, 150088, Harbin, People’s Republic of China
[2] Department of School of Measurement and Communication Engineering, Harbin University of Science and Technology, 150080, Harbin, People’s Republic of China
[3] Mechanical and Electrical Engineering College, Hainan University, 570228, Haikou, People’s Republic of China
[4] School of Electronics and Communication Engineering, Guangzhou University, 510006, Guangzhou, People’s Republic of China
Keywords: Sound event; Detection and localization; Convolutional cyclic neural network; Multi-scale feature fusion; Space channel squeeze excitation module
DOI: 10.1186/s13636-022-00263-6
Received 2022-03-19, accepted 2022-11-16, published 2022
Source: Springer
【 Abstract 】

In sound event detection and localization (SEDL) in complex environments, the acoustic signals of different events are usually superimposed nonlinearly, which degrades both detection and localization. To address this, this paper builds on the Residual spatial and channel Squeeze-Excitation (Res-scSE) module and combines it with a Multi-scale Convolutional Recurrent Neural Network (M-CRNN) to propose the Res-scSE-CRNN model. First, because a convolution kernel of a single size extracts time-frequency features insufficiently, multi-scale feature fusion is performed using the feature hierarchy of the convolutional neural network to improve detection accuracy. Second, because the localization accuracy for overlapping audio events is low, the ordinary convolution module is replaced with Res-scSE, which adds a residual structure to strengthen feature extraction and an attention mechanism to enhance the channel and spatial relationships in the network; this improves the network's ability to extract directional features and thus to localize overlapping audio events. Experiments are carried out on the open DCASE2019 dataset, and evaluation metrics are used to compare the proposed model with the baseline model on audio event detection and localization. The results show that, compared with the M-CRNN model, the Res-scSE-CRNN model reduces the detection error rate by 4%, increases the F1-score by 3.4%, reduces the localization error by 22.8°, and increases the frame recall by 3%.
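
The abstract reports a detection error rate, an F1-score, and a frame recall but does not define them. As a rough illustration only, the sketch below assumes the segment-based definitions commonly used in DCASE-style evaluation: error rate = (substitutions + deletions + insertions) / number of reference events, F1 = 2TP / (2TP + FP + FN), and frame recall = fraction of frames in which the estimated number of active events equals the reference number. The function names `sed_scores` and `frame_recall`, the set-per-frame input format, and the toy labels are hypothetical and not taken from the paper.

```python
def sed_scores(reference, estimate):
    """Return (error_rate, f1) for two equally long lists of label sets.

    Each list holds one set of active event labels per segment.
    Assumed definitions (not confirmed by the abstract): per segment,
    substitutions = min(FP, FN), deletions = max(0, FN - FP),
    insertions = max(0, FP - FN).
    """
    tp = fp = fn = subs = dels = ins = n_ref = 0
    for ref, est in zip(reference, estimate):
        seg_tp = len(ref & est)   # events present in both
        seg_fp = len(est - ref)   # events predicted but not in the reference
        seg_fn = len(ref - est)   # events in the reference but missed
        tp += seg_tp
        fp += seg_fp
        fn += seg_fn
        subs += min(seg_fp, seg_fn)
        dels += max(0, seg_fn - seg_fp)
        ins += max(0, seg_fp - seg_fn)
        n_ref += len(ref)
    error_rate = (subs + dels + ins) / n_ref if n_ref else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return error_rate, f1


def frame_recall(reference, estimate):
    """Fraction of frames whose estimated event count matches the reference."""
    hits = sum(len(ref) == len(est) for ref, est in zip(reference, estimate))
    return hits / len(reference)


# Toy example with four segments (labels are made up for illustration).
reference = [{"speech"}, {"speech", "car"}, set(), {"dog"}]
estimate = [{"speech"}, {"speech"}, {"car"}, {"cat"}]

print(sed_scores(reference, estimate))    # (0.75, 0.5)
print(frame_recall(reference, estimate))  # 0.5
```

In the toy example, one deletion, one insertion, and one substitution against four reference events give an error rate of 0.75. Lower error rate and higher F1 and frame recall indicate better performance, which matches the direction of the improvements the abstract reports.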

【 License 】

CC BY   
© The Author(s) 2022
