Journal article

【Abstract】

This article evaluates the performance of the Extreme Learning Machine (ELM) and the Gaussian Mixture Model (GMM) for text-independent multilingual speaker identification on recorded and synthesized speech. The type and number of filters in the filter bank, the number of samples in each frame of the speech signal, and the fusion of model scores play a vital role in speaker identification accuracy and are analyzed in this article. The ELM uses a single-hidden-layer feedforward neural network for multilingual speaker identification. The individual Gaussian components of the GMM represent speaker-dependent spectral shapes that are effective for modeling speaker identity. Both modeling techniques use Linear Predictive Residual Cepstral Coefficient (LPRCC), Mel Frequency Cepstral Coefficient (MFCC), Modified Mel Frequency Cepstral Coefficient (MMFCC), and Bark Frequency Cepstral Coefficient (BFCC) features to represent the speaker-specific attributes of speech signals. Experimental results show that the GMM outperforms the ELM, reaching a speaker identification accuracy of 97.5% with a frame size of 256 samples, a frame shift of half the frame size, and a filter bank of 40 filters.
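
The framing configuration reported in the abstract (256-sample frames shifted by half a frame) can be illustrated with a minimal sketch. The function name `frame_signal` and the choice to drop a trailing partial frame are assumptions made for illustration; the article's own front end is not described in this record.

```python
def frame_signal(samples, frame_size=256, frame_shift=None):
    """Split a sequence of samples into overlapping frames.

    frame_shift defaults to half the frame size, matching the
    configuration named in the abstract. Trailing samples that do
    not fill a whole frame are dropped (an assumption, not taken
    from the article).
    """
    if frame_shift is None:
        frame_shift = frame_size // 2
    if frame_size <= 0 or frame_shift <= 0:
        raise ValueError("frame_size and frame_shift must be positive")
    last_start = len(samples) - frame_size
    # range() is empty when the signal is shorter than one frame.
    return [list(samples[start:start + frame_size])
            for start in range(0, last_start + 1, frame_shift)]


# Hypothetical input: 1024 dummy samples standing in for a speech signal.
signal = list(range(1024))
frames = frame_signal(signal)
print(len(frames), len(frames[0]))
```

With a half-frame shift, consecutive frames share 128 samples, so a signal of N samples yields (N - 256) // 128 + 1 full frames when N is at least 256.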

【License】

Unknown

Journal of Computer Science
A FRAMEWORK FOR MULTILINGUAL TEXT-INDEPENDENT SPEAKER IDENTIFICATION SYSTEM

Sundaradhas Selva Nidhyananthan¹ Ramapackiam Shantha Selva Kumari¹
Keywords: GMM; ELM; MFCC; Filter Bank; Multilingual Speaker Identification
DOI : 10.3844/jcssp.2014.178.189
Subject classification: Computer Science (General)
Source: Science Publications