Journal article

【Abstract】

This paper proposes a method for constructing text-to-speech (TTS) systems for languages with unknown pronunciations. One goal of speech synthesis research is to establish a framework that can be used to construct TTS systems for any written language. Constructing a TTS system for a new language generally requires language-specific knowledge, which is difficult to acquire for each new language, so the construction entails huge costs. To address this problem, we investigate a framework for automatically constructing a TTS system from a target-language database consisting only of speech data and the corresponding Unicode texts. In the proposed method, pseudo phonetic information for the target language, whose pronunciation is unknown, is obtained with a speech recognizer for a resource-rich proxy language. A grapheme-to-phoneme converter and a statistical parametric speech synthesizer are then constructed from the obtained pseudo phonetic information. The proposed method was applied to Japanese and evaluated using objective and subjective measures. Additionally, we attempted to construct TTS systems for nine Indian languages using the proposed method, and these systems were evaluated in the Blizzard Challenge 2014 and 2015.
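
The grapheme-to-phoneme step described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' converter, which this record does not describe. It assumes that the proxy-language recognizer has already produced one pseudo-phoneme label per written character, and it simply keeps the most frequent label for each character. The function names `train_g2p` and `apply_g2p`, the toy characters, and the labels are all hypothetical.

```python
from collections import Counter, defaultdict


def train_g2p(pairs):
    """Learn a per-character lookup table from (text, labels) pairs.

    Assumption for illustration only: every usable pair has exactly one
    pseudo-phoneme label per character. Pairs that violate this are
    skipped; a real system would need a proper alignment step instead.
    """
    counts = defaultdict(Counter)
    for text, labels in pairs:
        if len(text) != len(labels):
            continue
        for grapheme, label in zip(text, labels):
            counts[grapheme][label] += 1
    # Keep the most frequent pseudo-phoneme label for each grapheme.
    return {g: c.most_common(1)[0][0] for g, c in counts.items()}


def apply_g2p(table, text, unknown="<unk>"):
    """Convert text to pseudo-phoneme labels, marking unseen characters."""
    return [table.get(ch, unknown) for ch in text]


# Hypothetical recognizer output: target-language text paired with labels
# drawn from the proxy language's phoneme set. One label is noisy ("P")
# and one pair has mismatched lengths.
toy_pairs = [
    ("aba", ["AH", "B", "AH"]),
    ("bab", ["B", "AH", "P"]),
    ("ab", ["AH", "B"]),
    ("ba", ["B"]),
]

table = train_g2p(toy_pairs)
print(apply_g2p(table, "abba"))
print(apply_g2p(table, "abc"))
```

Majority voting over many utterances is one simple way to smooth over individual recognition errors, which is why the single noisy label in the toy data does not end up in the table.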

【License】

Unknown

Acoustical Science and Technology
Constructing text-to-speech systems for languages with unknown pronunciations

Yoshihiko Nankaku¹ Kei Sawada¹ Kei Hashimoto¹ Keiichiro Oura¹ Keiichi Tokuda¹
[1] Department of Scientific and Engineering Simulation, Nagoya Institute of Technology
Keywords: Text-to-speech system; Statistical parametric speech synthesis; Unknown pronunciation language; Low-resource language; Language-independent method
DOI : 10.1250/ast.39.119
Subject classification: Acoustics and ultrasonics
Source: Acoustical Society of Japan