Journal article

【Abstract】

During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post-hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
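The causal inference step described in the abstract can be illustrated with a minimal sketch of a generic Bayesian observer. This is not the authors' fitted model. It assumes that physical asynchronies follow a narrow Gaussian when voice and face share a cause and a broad Gaussian when they do not, that the observer's measurement adds Gaussian sensory noise (standing in for cue reliability), and that the observer reports "synchronous" when the posterior probability of a common cause exceeds 0.5. All function names, parameter names, and numeric values below are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_common_cause(
    measured_asynchrony_ms,
    p_common=0.5,            # assumed prior probability of a common cause
    sd_common=100.0,         # assumed spread of asynchronies for one talker (ms)
    sd_separate=400.0,       # assumed spread of asynchronies for separate causes (ms)
    sensory_noise_sd=50.0,   # assumed measurement noise; larger means less reliable cues
):
    """Posterior probability that voice and face share a cause (Bayes' rule)."""
    # Measured asynchrony = physical asynchrony + independent noise, so variances add.
    sd_c1 = math.hypot(sd_common, sensory_noise_sd)
    sd_c2 = math.hypot(sd_separate, sensory_noise_sd)
    like_c1 = gaussian_pdf(measured_asynchrony_ms, 0.0, sd_c1)
    like_c2 = gaussian_pdf(measured_asynchrony_ms, 0.0, sd_c2)
    evidence = p_common * like_c1 + (1.0 - p_common) * like_c2
    return p_common * like_c1 / evidence


def judge_synchronous(measured_asynchrony_ms, **params):
    """Report 'synchronous' when a common cause is more probable than not."""
    return posterior_common_cause(measured_asynchrony_ms, **params) > 0.5


if __name__ == "__main__":
    for asynchrony in (0.0, 100.0, 300.0, 600.0):
        p = posterior_common_cause(asynchrony)
        print(f"{asynchrony:6.0f} ms  P(common cause) = {p:.3f}")
```

In this sketch the posterior falls as the measured asynchrony grows, so the predicted proportion of "synchronous" reports follows from the assumed stimulus and observer parameters rather than from a curve fitted after the fact.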

【License】

CC BY

Frontiers in Psychology
Causal inference of asynchronous audiovisual speech

John F. Magnotti¹
Keywords: causal inference; synchrony judgments; speech perception; multisensory integration; Bayesian observer
DOI: 10.3389/fpsyg.2013.00798
Subject classification: Psychology (general)
Source: Frontiers