Journal article

【Abstract】

This paper describes the design of a morpheme-based language model for the Tamil language. It aims to alleviate the main problems encountered in processing Tamil, such as the enormous vocabulary growth caused by the large number of different forms derived from a single word. The vocabulary size is reduced by decomposing words into stems and endings and storing these sub-word units (morphemes) for training the language model. The modified morpheme-based language model was applied to avoid ambiguities in the recognized Tamil words. Perplexity, out-of-vocabulary (OOV) rate, and word error rate (WER) were measured to assess the efficiency of the model for a Tamil speech recognition system. The results were compared with traditional word-based statistical bigram and trigram language models. The results indicate that the modified morpheme-based trigram model with Katz back-off smoothing improved the performance of the Tamil speech recognition system compared to the word-based N-gram language models. Keywords: language model, morphemes, perplexity, out-of-vocabulary rate, word error rate. Received November 29, 2005; accepted April 18, 2006.

【License】

Unknown


International Arab Journal of Information Technology (IAJIT)
Morpheme Based Language Model for Tamil Speech Recognition System

Maryam Madani¹ Shadpour Mallakpour²
[1] Department of Chemistry, Isfahan University of Technology, Isfahan 84156-83111, I. R. Iran; Nanotechnology and Advanced Materials Institute, Isfahan University of Technology, Isfahan 84156-83111, I. R. Iran
Keywords: language model; morphemes; perplexity; out-of-vocabulary rate; word error rate
Subject classification: Computer Science (General)
Source: Zarqa University

