Journal article

【Abstract】

This project develops a system for searching Tagalog text within recordings of spoken Tagalog. It uses a pattern-matching technique to return the most likely time at which the searched word occurs. Audio is recorded in a controlled environment as 16 kHz, 16-bit mono and divided into frames of 256 samples. An end-point detection algorithm classifies the signal in each sampled audio file as voiced or unvoiced. A windowed Fast Fourier Transform (FFT) then converts each voiced signal into the frequency domain. The result for each voiced signal is a graph of the amplitudes of its frequency components, which describes the sound heard in that signal. Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV) maintains a database of such graphs, called a codebook, that identifies the different types of sounds the human voice can make. A sound is identified by matching it to its closest entry in the codebook, measured by Euclidean distance. Experimental results show an average recognition rate of 30%.
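
The pipeline described in the abstract (256-sample frames, end-point detection, windowed FFT, nearest codebook entry by Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the window function or the end-point detection rule, so the Hamming window and the mean-energy threshold below are assumptions, as are all function names, the two-entry codebook, and the synthetic test tones.

```python
import cmath
import math

FRAME_SIZE = 256      # samples per frame, as stated in the abstract
SAMPLE_RATE = 16000   # 16 kHz, as stated in the abstract


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frames(samples, size=FRAME_SIZE):
    """Split samples into non-overlapping frames, dropping a short tail."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def is_voiced(frame, energy_threshold):
    """Assumed end-point rule: mean squared amplitude above a threshold."""
    return sum(s * s for s in frame) / len(frame) > energy_threshold


def magnitude_spectrum(frame):
    """Hamming-windowed (assumed) FFT magnitudes for bins 0..n/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(c) for c in fft(windowed)[:n // 2 + 1]]


def nearest_entry(spectrum, codebook):
    """Label of the codebook spectrum with the smallest Euclidean distance."""
    return min(codebook, key=lambda label: math.dist(spectrum, codebook[label]))


def tone(freq, n=FRAME_SIZE):
    """Hypothetical stand-in for speech: one frame of a pure sine tone."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


# Toy codebook with two reference spectra (illustrative, not real phonemes).
codebook = {
    "low": magnitude_spectrum(tone(500)),
    "high": magnitude_spectrum(tone(3000)),
}

# One silent frame followed by two slightly detuned tones.
signal = [0.0] * FRAME_SIZE + tone(520) + tone(2900)

labels = [nearest_entry(magnitude_spectrum(f), codebook)
          for f in frames(signal)
          if is_voiced(f, energy_threshold=0.01)]
print(labels)
```

In this sketch the silent frame is rejected by the energy test, and each remaining frame is labelled with whichever reference spectrum lies closest. A real codebook would hold many entries derived from recorded speech, and the label sequence would still have to be mapped to words and time offsets to support the search the abstract describes.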

【License】

Unknown


Philippine Information Technology Journal
Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV)

Maravillas, Elmer A.¹ Montenegro, Chuchi S.¹
Keywords: Speech Recognition; Digital Signal Processing
DOI : 10.3860/pitj.v3i1.2715
Subject classification: Computer Science (General)
Source: Philippine Society of Information Technology Educators


