EURASIP Journal on Audio, Speech, and Music Processing
Learning-based robust speaker counting and separation with the aid of spatial coherence
Empirical Research
Yicheng Hsu1  Mingsian R. Bai2
[1] Department of Power Mechanical Engineering, National Tsing Hua University, Hsinchu, Taiwan; Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
Keywords: Multichannel blind source separation; Speaker counting and separation; Spatial coherence; Neural network
DOI: 10.1186/s13636-023-00298-3
Received 2023-03-14, accepted 2023-08-07, published 2023
Source: Springer
【 Abstract 】
A three-stage approach is proposed for speaker counting and speech separation in noisy and reverberant environments. In the spatial feature extraction stage, a spatial coherence matrix (SCM) is computed from whitened relative transfer functions (wRTFs) across time frames. The global activity function of each speaker is estimated from a simplex constructed from the eigenvectors of the SCM, while the local coherence functions are computed from the coherence between the wRTFs of a time-frequency bin and the global activity function-weighted RTF of the target speaker. In the speaker counting stage, the eigenvalues of the SCM and the maximum similarity of the interframe global activity distributions between two speakers serve as input features to the speaker counting network (SCnet). In the speaker separation stage, a global and local activity-driven network (GLADnet) extracts each independent speaker signal, which is particularly useful for highly overlapping speech. Experimental results obtained from real meeting recordings show that the proposed system achieves better speaker counting and speaker separation performance than previously published methods, without prior knowledge of the array configuration.
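The spatial feature extraction step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact wRTF estimator or SCM normalization, so the sketch assumes an instantaneous RTF estimate (ratio of each channel to a reference channel), assumes whitening means keeping only the phase, and assumes the SCM entry for a pair of frames is the real part of the normalized inner product of their wRTF vectors. The function names `whitened_rtf`, `spatial_coherence_matrix`, and `scm_eigenvalues` are hypothetical.

```python
import numpy as np


def whitened_rtf(stft, ref=0, eps=1e-12):
    """Per-frame whitened RTF features (assumed definition, see lead-in).

    stft: complex array of shape (mics, freqs, frames).
    Returns a complex array of shape (frames, (mics - 1) * freqs).
    """
    n_mics, _, n_frames = stft.shape
    others = [m for m in range(n_mics) if m != ref]
    # Instantaneous RTF estimate: each non-reference channel over the reference.
    rtf = stft[others] * np.conj(stft[ref]) / (np.abs(stft[ref]) ** 2 + eps)
    # Whitening (assumption): discard magnitude, keep phase only.
    w = rtf / (np.abs(rtf) + eps)
    # Stack microphones and frequencies into one feature vector per frame.
    return w.reshape(-1, n_frames).T


def spatial_coherence_matrix(w):
    """Frame-by-frame coherence: real part of the normalized inner product."""
    dim = w.shape[1]
    return np.real(w @ w.conj().T) / dim


def scm_eigenvalues(scm):
    """Eigenvalues of the symmetric SCM, sorted in descending order."""
    return np.linalg.eigvalsh(scm)[::-1]
```

Under these assumptions, frames dominated by the same speaker yield coherence values close to 1 and frames dominated by different speakers yield lower values, so the number of dominant eigenvalues of the SCM gives a rough cue to the number of active speakers. In the paper that cue is passed to SCnet rather than thresholded directly.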
【 License 】
CC BY
© Springer Nature Switzerland AG 2023