Signal Processing: An International Journal
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Using Spectro-Temporal Domain
Farbod Razzazi, Azim Fard, Sara Valipour
Keywords: Voice activity detector; Spectro-temporal domain; Gaussian modeling; Auditory model
Subject classification: Physics (general)
Source: Computer Science Journals
【 Abstract 】
This paper proposes a voice activity detector (VAD) based on Gaussian modeling of noise in the spectro-temporal space. The spectro-temporal space is obtained from a model of auditory cortical processing. This auditory model, which provides a multi-dimensional representation of the sound, has two stages: the first models the inner ear, and the second models central auditory processing in the cortex. At the start of the proposed VAD process, the noise in this representation is modeled by a single 3-D Gaussian cluster. The average noise behavior is then obtained at various points of the spectro-temporal space for each frame. To separate speech from noise, the difference between the average noise behavior and the signal amplitude in the spectro-temporal domain is measured for each frame and used as the classification criterion. The method is tested on the NOISEX-92 database under several noise types: white, exhibition, street, office, and train noise. The results are compared with those of an auditory-model method and a multi-feature method. In low signal-to-noise ratio (SNR) conditions, the proposed method performs better than the methods it is compared with.
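The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each frame has already been converted to a flat list of spectro-temporal magnitudes (the auditory model itself is not shown), that the first few frames contain only noise, and that an independent per-point Gaussian is an acceptable stand-in for the 3-D Gaussian cluster. The names `fit_noise_model`, `frame_score`, `detect_speech`, and the parameters `n_noise_frames` and `threshold` are hypothetical.

```python
from statistics import fmean, pstdev


def fit_noise_model(noise_frames, eps=1e-8):
    """Estimate a per-point Gaussian (mean, std) from noise-only frames.

    Assumption: each frame is a flat list of spectro-temporal magnitudes.
    """
    points = list(zip(*noise_frames))  # transpose: one tuple per feature point
    means = [fmean(p) for p in points]
    stds = [pstdev(p) + eps for p in points]  # eps avoids division by zero
    return means, stds


def frame_score(frame, means, stds):
    """Average normalized distance between a frame and the mean noise behavior."""
    return fmean([abs(x - m) / s for x, m, s in zip(frame, means, stds)])


def detect_speech(frames, n_noise_frames=5, threshold=3.0):
    """Label each frame True (speech) or False (noise).

    Assumption: the first n_noise_frames frames are noise only, and the
    threshold value is an illustrative choice, not one taken from the paper.
    """
    means, stds = fit_noise_model(frames[:n_noise_frames])
    return [frame_score(f, means, stds) > threshold for f in frames]
```

In this sketch a frame is flagged as speech when its average deviation from the noise mean, measured in noise standard deviations, exceeds the threshold. In practice the threshold would need tuning for each noise type and SNR.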
【 License 】
Unknown