CAAI Transactions on Intelligence Technology
Enhancing direct-path relative transfer function using deep neural network for robust sound source localization
Article
Bing Yang [1]; Runwei Ding [1]; Yutong Ban [3]; Xiaofei Li [2]; Hong Liu [1]
[1] Key Laboratory of Machine Perception, Shenzhen Graduate School, Peking University; Westlake University & Westlake Institute for Advanced Study; SAIIL, Massachusetts General Hospital; CSAIL, Massachusetts Institute of Technology
Keywords: acoustic generators; acoustic signal processing; direction-of-arrival estimation; reverberation; time-frequency analysis; transfer functions; frequency-domain analysis; microphone arrays; deep learning (artificial intelligence)
DOI: 10.1049/cit2.12024
Subject classification: Mathematics (general)
Source: Wiley
【 Abstract 】
This article proposes a deep neural network (DNN)-based direct-path relative transfer function (DP-RTF) enhancement method for robust direction of arrival (DOA) estimation in noisy and reverberant environments. The DP-RTF is the ratio between the direct-path acoustic transfer functions of the two microphone channels. First, the complex-valued DP-RTF is decomposed in the time-frequency domain into the inter-channel intensity difference and sinusoidal functions of the inter-channel phase difference. Then, the decomposed DP-RTF features from a series of temporal context frames are used to train a DNN model, which maps DP-RTF features contaminated by noise and reverberation to clean ones and also provides a time-frequency (TF) weight indicating the reliability of the mapping. Finally, the DOA of a sound source is estimated by integrating the weighted matching between the enhanced DP-RTF features and the DP-RTF templates. Experimental results on simulated data show the superiority of the proposed DP-RTF enhancement network for estimating the DOA of a sound source in environments with various levels of noise and reverberation.
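The feature decomposition and the weighted template matching described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `decompose_rtf` and `estimate_doa`, the dB scaling of the intensity difference, and the use of a negative weighted squared distance as the matching score are all assumptions made for illustration. The DNN that produces the enhanced features and the TF weights is not modeled here; its outputs are simply taken as inputs.

```python
import numpy as np

def decompose_rtf(rtf, eps=1e-12):
    """Split a complex RTF array into real-valued features.

    Returns an array with a trailing axis of size 3:
    [intensity difference in dB, sin(phase difference), cos(phase difference)].
    The dB scaling is an assumption of this sketch.
    """
    rtf = np.asarray(rtf, dtype=complex)
    ild = 20.0 * np.log10(np.abs(rtf) + eps)  # inter-channel intensity difference
    ipd = np.angle(rtf)                       # inter-channel phase difference
    return np.stack([ild, np.sin(ipd), np.cos(ipd)], axis=-1)

def estimate_doa(features, weights, templates, candidate_doas):
    """Return the candidate DOA whose template best matches the features.

    features:       (T, F, 3) enhanced features over T frames and F frequencies
    weights:        (T, F) reliability weights, larger means more reliable
    templates:      (D, F, 3) template features, one per candidate direction
    candidate_doas: (D,) candidate directions, for example in degrees
    The score (negative weighted squared distance) is a hypothetical choice.
    """
    # Squared distance between every TF bin and every template: shape (D, T, F).
    dist = np.sum((features[None] - templates[:, None]) ** 2, axis=-1)
    # Integrate the weighted match over all frames and frequencies: shape (D,).
    score = -np.sum(weights[None] * dist, axis=(1, 2))
    return candidate_doas[int(np.argmax(score))]
```

Encoding the phase difference through its sine and cosine keeps the features continuous across the phase wrap at ±π, which a raw angle would not be.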
【 License 】
CC BY|CC BY-ND|CC BY-NC|CC BY-NC-ND