Journal article

【Abstract】

Non-audible murmur (NAM) is unvoiced speech captured through body tissue by special acoustic sensors (NAM microphones) attached behind the talker's ear. Although NAM has different frequency characteristics from normal speech, automatic speech recognition (ASR) can be performed on it using conventional methods. With a NAM microphone, body transmission and the loss of lip radiation act as a low-pass filter; as a result, higher-frequency components are attenuated in the NAM signal. The decrease in NAM recognition performance is attributed to this spectral reduction. To address the loss of lip radiation, visual information extracted from the talker's facial movements is fused with NAM speech. Experimental results showed a relative improvement of 39% when fused NAM speech and facial information were used, compared with NAM speech alone. The results also showed that the improvement in recognition rate depends on the place of articulation.

【License】

Unknown

IEICE Electronics Express
Exploiting visual information for NAM recognition

Denis Beautemps¹ Gérard Bailly¹ Helene Loevenbruck¹ Panikos Heracleous¹ Viet-Anh Tran¹
[1] GIPSA-lab, Speech and Cognition Department CNRS UMR 5216 / Stendhal University / UJF / INPG
Keywords: NAM; speech recognition; facial movements; fusion
DOI : 10.1587/elex.6.77
Subject classification: Electronic, optical, and magnetic materials
Source: Denshi Jouhou Tsuushin Gakkai