Paralinguistic events are useful indicators of a speaker's affective state. In children's speech, these cues help form social bonds with caregivers, and they have also proven useful for the early detection of developmental disorders such as autism spectrum disorder (ASD). Prior work on children's speech has relied on small numbers of subjects that lack sufficient diversity in the types of vocalizations produced, and the features necessary to characterize the production of paralinguistic events are not fully understood. Given the lack of an off-the-shelf solution for detecting instances of laughter and crying in children's speech, this thesis investigates and develops signal processing algorithms to extract acoustic features and applies machine learning algorithms to various corpora. Results obtained with baseline spectral and prosodic features indicate that a combination of spectral, prosodic, and dysphonation-related features is needed to detect laughter and whining in toddlers' speech across different age groups and recording environments. Long-term features were found to capture the periodic properties of laughter in adults' and children's speech and detected instances of laughter with high accuracy. Finally, the thesis examines the use of multi-modal information, combining acoustic features with computer vision-based smile-related features, to detect instances of laughter and to reduce false positives in adults' and children's speech. Fusing the two feature sets improved accuracy and recall over either modality alone.
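The idea that long-term features capture the periodic structure of laughter (repeated "ha-ha" bouts at a roughly regular syllable rate) can be illustrated with a minimal sketch. The feature below is an assumption for illustration, not the thesis's exact method: it frames the signal, computes an energy envelope, and measures the strength of the envelope's autocorrelation peak away from lag zero, which is high for amplitude-modulated (laughter-like) signals and low for unmodulated ones.

```python
import numpy as np

def long_term_periodicity(signal, sr, frame_len=0.025, hop=0.010):
    """Periodicity score of the frame-energy envelope (illustrative feature)."""
    n, h = int(frame_len * sr), int(hop * sr)
    # short-time frame energies form the long-term amplitude envelope
    energies = np.array([np.sum(signal[i:i + n] ** 2)
                         for i in range(0, len(signal) - n, h)])
    env = energies - energies.mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    ac = ac / (ac[0] + 1e-12)  # normalize so lag-0 autocorrelation is 1
    # strongest autocorrelation peak away from lag 0 = periodicity score
    return ac[3:len(ac) // 2].max()

np.random.seed(0)
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
# laughter-like: noise amplitude-modulated at ~5 Hz (a typical syllable rate)
laugh = np.random.randn(sr) * (0.5 + 0.5 * np.sin(2 * np.pi * 5 * t))
# unmodulated noise as a stand-in for non-laughter speech
other = np.random.randn(sr)
laugh_score = long_term_periodicity(laugh, sr)
other_score = long_term_periodicity(other, sr)
print(laugh_score > other_score)
```

In a real detector such a score would be one entry in a feature vector, alongside spectral and prosodic features, fed to a classifier; here it simply shows why envelope periodicity separates laughter-like signals from aperiodic ones.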
Paralinguistic event detection in children's speech