Journal article

【Abstract】

This paper investigates the importance of temporal cues in the perception of speaker individuality and vocal emotion. Speaker and vocal-emotion recognition experiments were carried out using the noise-vocoded speech (NVS) analysis/synthesis method. The temporal resolution of NVS was controlled by varying the upper limit of the modulation frequency (0, 0.5, 1, 2, 4, 8, 16, 32, and 64 Hz). In addition, the role of temporal cues under different spectral resolution conditions was investigated by varying the number of channels (4, 8, and 16). The results demonstrated that temporal resolution contributes to the recognition of both speaker and vocal emotion. Temporal cues were therefore found to be important for the perception of not only linguistic information but also speaker individuality and vocal emotion. On the other hand, speaker recognition performance was less sensitive to spectral resolution, at least for the limited set of stimuli used in the present study. For vocal-emotion recognition, spectral resolution was shown to be important for recognizing neutral, joy, and cold anger, but not sadness or hot anger. The results suggest that the modulation frequency band important for the perception of nonlinguistic information is higher than that for linguistic information.
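
The two stimulus parameters described in the abstract (number of channels and upper limit of modulation frequency) can be illustrated with a minimal noise-vocoder sketch. This is not the authors' implementation: the abstract does not specify the filterbank, the envelope extractor, or the frequency range, so the log-spaced band edges between 100 and 7000 Hz, the FFT brick-wall filters, and the full-wave rectification below are all assumptions chosen for brevity. The function names `band_edges`, `fft_filter`, and `noise_vocode` are hypothetical.

```python
import numpy as np

def band_edges(n_channels, f_lo=100.0, f_hi=7000.0):
    """Log-spaced channel edges in Hz (assumed spacing and range)."""
    return np.geomspace(f_lo, f_hi, n_channels + 1)

def fft_filter(x, fs, lo, hi):
    """Zero-phase brick-wall filter keeping components with lo <= f <= hi."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def noise_vocode(x, fs, n_channels=8, mod_cutoff_hz=16.0, seed=0):
    """Replace the fine structure in each band with band-limited noise.

    n_channels sets the spectral resolution; mod_cutoff_hz is the upper
    limit of modulation frequency kept in each temporal envelope.
    A cutoff of 0 Hz keeps only the mean (a constant envelope).
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    edges = band_edges(n_channels)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = fft_filter(x, fs, lo, hi)
        # Envelope: full-wave rectification, then low-pass at the cutoff
        # (assumption; a Hilbert envelope is a common alternative).
        env = fft_filter(np.abs(band), fs, 0.0, mod_cutoff_hz)
        env = np.maximum(env, 0.0)  # clip small negative ripple
        carrier = fft_filter(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * carrier
    return out

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    # Test signal: 1 kHz tone with 4 Hz amplitude modulation.
    x = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
    for cutoff in (0, 0.5, 1, 2, 4, 8, 16, 32, 64):
        y = noise_vocode(x, fs, n_channels=4, mod_cutoff_hz=cutoff)
        print(cutoff, y.shape)
```

Under these assumptions, sweeping `mod_cutoff_hz` over the nine values listed in the abstract and `n_channels` over 4, 8, and 16 would produce one stimulus per condition; a fixed `seed` keeps the noise carriers reproducible across runs.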

【License】

Unknown

Acoustical Science and Technology
Contributions of temporal cue on the perception of speaker individuality and vocal emotion for noise-vocoded speech

Zhi Zhu¹, Ryota Miyauchi¹, Masashi Unoki¹, Yukiko Araki²
[1] Japan Advanced Institute of Science and Technology; [2] Kanazawa University
Keywords: Temporal cue; Speaker individuality; Vocal emotion; Noise-vocoded speech; Speech perception
DOI : 10.1250/ast.39.234
Subject classification: Acoustics and ultrasonics
Source: Acoustical Society of Japan