Journal Article Details
Philippine Information Technology Journal
Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV)
Maravillas, Elmer A.; Montenegro, Chuchi S.
Keywords: Speech Recognition; Digital Signal Processing
DOI: 10.3860/pitj.v3i1.2715
Subject classification: Computer Science (General)
Source: Philippine Society of Information Technology Educators
【 Abstract 】

This project develops a system for searching Tagalog text in recordings of spoken Tagalog. It uses a pattern-matching technique to return the most likely time at which the searched word occurs. Audio is recorded in a controlled environment in 16 kHz, 16-bit mono format and windowed at 256 samples per frame. An end-point detection algorithm classifies each part of the sampled audio as voiced or unvoiced. A windowed Fast Fourier Transform (FFT) then converts each voiced segment into the frequency domain. The result for each voiced segment is a graph of the amplitudes of its frequency components, which describes the sound heard in that segment. Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV) maintains a database of such graphs, called a codebook, that identifies the different types of sounds the human voice can make. A sound is identified by matching it to its closest codebook entry using Euclidean distance. Experiments yield an average recognition rate of 30%.
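
The pipeline described in the abstract (256-sample frames, voiced/unvoiced classification, windowed FFT, nearest codebook entry by Euclidean distance) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the Hamming window, the energy threshold used for end-point detection, the non-overlapping frames, the synthetic test tones, and every function name are assumptions, since the abstract does not specify them.

```python
import cmath
import math

FRAME_SIZE = 256      # samples per frame, as stated in the abstract
SAMPLE_RATE = 16000   # 16 kHz mono, as stated in the abstract


def hamming(n):
    """Hamming window of length n (assumed; the abstract names no window)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frames(samples, size=FRAME_SIZE):
    """Split the signal into non-overlapping frames, dropping a short tail."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def is_voiced(frame, energy_threshold=0.01):
    """Hypothetical end-point test: mean squared amplitude above a threshold."""
    return sum(s * s for s in frame) / len(frame) > energy_threshold


def magnitude_spectrum(frame):
    """Window the frame, apply the FFT, keep magnitudes of bins 0..N/2."""
    window = hamming(len(frame))
    spectrum = fft([s * w for s, w in zip(frame, window)])
    return [abs(c) for c in spectrum[:len(frame) // 2 + 1]]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_entry(spectrum, codebook):
    """Return the label of the codebook entry closest in Euclidean distance."""
    return min(codebook, key=lambda label: euclidean(spectrum, codebook[label]))


def tone(freq_hz, n=FRAME_SIZE, amplitude=0.5):
    """Synthetic sine tone standing in for recorded speech."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


# Toy codebook built from two synthetic tones (illustrative labels only).
codebook = {
    "low": magnitude_spectrum(tone(500)),
    "high": magnitude_spectrum(tone(3000)),
}

# One low-ish tone, one silent frame, one high-ish tone.
signal = tone(520) + [0.0] * FRAME_SIZE + tone(2950)

# Pair each voiced frame's start time (seconds) with its closest codebook label.
labels = [
    (i * FRAME_SIZE / SAMPLE_RATE, nearest_entry(magnitude_spectrum(f), codebook))
    for i, f in enumerate(frames(signal))
    if is_voiced(f)
]
```

Here `labels` holds one `(start_time, label)` pair per voiced frame, and the silent middle frame is skipped by the energy test. A word search like the one the abstract describes would then look for a target sequence of labels and report the matching start times; how PSRTLV builds its codebook and scores sequences is not stated in the abstract.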

【 License 】

Unknown   
