Journal article

【Abstract】

This paper focuses on combining audio and visual signals for Polish speech recognition when the audio speech signal is heavily disturbed. Audio-visual speech recognition was based on coupled hidden Markov models (CHMM). The described methods were developed for single isolated commands; nevertheless, their effectiveness indicates that they should work similarly in continuous audio-visual speech recognition. Visual speech analysis is difficult and computationally demanding, mostly because of the very large amount of data that must be processed. Therefore, audio-visual speech recognition is used only when the audio speech signal is exposed to a considerable level of distortion. The paper proposes the authors' own methods of lip edge detection and visual characteristics extraction. Moreover, a method of fusing speech characteristics for the audio-video signal is proposed and tested. A significant increase in recognition effectiveness and processing speed was noted during tests for properly selected CHMM parameters, an adequate codebook size, and an appropriate fusion of audio-visual characteristics. The experimental results were very promising and close to those achieved by leading scientists in the field of audio-visual speech recognition.

【License】

Unknown

【Preview】

Attachments
Files	Size	Format	View
RO201902182626791ZK.pdf	1138KB	PDF	download

Bulletin of the Polish Academy of Sciences. Technical Sciences
Characteristics of the use of coupled hidden Markov models for audio-visual Polish speech recognition

M. Kubanek¹, J. Bobulski¹, L. Adrjanowicz¹
[1] Institute of Computer and Information Sciences, Czestochowa University of Technology, 73 Dąbrowskiego St., 42-200 Częstochowa, Poland
Keywords: coupled hidden Markov models; audio-visual speech recognition; lip reading
DOI: 10.2478/v10175-012-0041-6
Subject classification: Engineering and Technology (General)
Source: Polska Akademia Nauk * Centrum Upowszechniania Nauki / Polish Academy of Sciences, Center for the Advancement of Science
