Journal of Computer Science
Analysis of Decision Trees in Context Clustering of Hidden Markov Model Based Thai Speech Synthesis | Science Publications
Suphattharachai Chomphan
Keywords: Thai speech synthesis; tree-based context clustering; HMM-based speech synthesis; Hidden Markov Model (HMM); Multi-Space probability Distribution (MSD); Minimum Description Length (MDL); synthesis framework
DOI: 10.3844/jcssp.2011.359.365
Subject classification: Computer Science (General)
Source: Science Publications
[Abstract]
Problem statement: In Thai speech synthesis with a Hidden Markov Model (HMM) based synthesis system, tonal speech quality is degraded by tone distortion. Because tone contributes to the intelligibility of the synthesized speech, this problem must be treated appropriately to preserve the tone characteristics of each syllable unit. The tone questions and other phonetic questions used in the tree-based context clustering process therefore need to be established accordingly.
Approach: This study analyzes the questions used in the tree-based context clustering process of an HMM-based speech synthesis system for the Thai language. In the system, spectrum, pitch (F0) and state duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using a decision-tree based context clustering technique. The contextual factors that affect spectrum, pitch and duration, i.e., part of speech, position and number of phones in a syllable, position and number of syllables in a word, position and number of words in a sentence, phone type and tone type, are taken into account in constructing the questions of the decision tree. In total, thirteen sets of questions are analyzed and compared.
Results: In the experiment, we analyzed the decision trees by counting the number of node questions that come from each of the thirteen sets and by calculating a dominance score for each question, defined as the reciprocal of the distance from the root node to the question node. The phonetic-type set has the highest count and dominance score, followed by the part-of-speech set and the tone-type set in second and third place, respectively.
Conclusion: Counting the questions at the nodes and calculating the dominance scores allows the question sets to be prioritized. The results support further development of Thai speech synthesis with an efficient context clustering process in an HMM-based speech synthesis system.
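The counting and dominance-score analysis described in the Results can be illustrated with a minimal Python sketch. It is not the author's implementation. The `Node` class, the `analyze` function, the toy tree and the question-set names are all hypothetical. Two further assumptions are made: the root node is given a distance of 1 (so that its reciprocal is defined), and the per-question scores are summed within each question set to obtain a set-level score. The abstract does not specify either detail.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Internal node of a binary decision tree; a missing child (None) is a leaf."""
    question_set: str                # question set that this node's question belongs to
    yes: Optional["Node"] = None     # subtree followed when the answer is "yes"
    no: Optional["Node"] = None      # subtree followed when the answer is "no"


def analyze(root, root_distance=1):
    """Return (counts, scores) per question set.

    counts[s] is the number of nodes whose question comes from set s.
    scores[s] is the sum over those nodes of 1 / distance, where the root
    has distance `root_distance` (assumed to be 1 here) and each level
    below the root adds 1.
    """
    counts = defaultdict(int)
    scores = defaultdict(float)
    stack = [(root, root_distance)]
    while stack:
        node, distance = stack.pop()
        if node is None:             # leaf: no question to count
            continue
        counts[node.question_set] += 1
        scores[node.question_set] += 1.0 / distance
        stack.append((node.yes, distance + 1))
        stack.append((node.no, distance + 1))
    return dict(counts), dict(scores)


# Hypothetical toy tree with four question nodes.
toy_tree = Node(
    "phone_type",
    yes=Node("tone_type", yes=Node("phone_type")),
    no=Node("part_of_speech"),
)

counts, scores = analyze(toy_tree)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: count={counts[name]}, dominance={scores[name]:.3f}")
```

Sorting the question sets by either quantity gives a priority ordering of the kind the Conclusion refers to. A question that appears near the root contributes more to the score than one that appears deep in the tree.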
[License]
Unknown