Philippine Information Technology Journal
Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV)
Maravillas, Elmer A.; Montenegro, Chuchi S.
Keywords: Speech Recognition; Digital Signal Processing
DOI: 10.3860/pitj.v3i1.2715
Subject classification: Computer Science (General)
Source: Philippine Society of Information Technology Educators
【 Abstract 】
This project develops a system for searching Tagalog text within recordings of spoken Tagalog. It uses a pattern-matching technique to return the most likely time at which a searched word occurs. Audio files are recorded in a controlled environment, sampled at 16 kHz in 16-bit mono format, and windowed at 256 samples per frame. An end-point detection algorithm classifies the sampled audio as voiced or unvoiced signal. A windowed Fast Fourier Transform (FFT) then converts each voiced signal into the frequency domain. The result for each voiced signal is a graph of the amplitudes of its frequency components, which describes the sound heard in that signal. Probabilistic Speech Recognition for Tagalog Lecture Video (PSRTLV) maintains a database of such graphs (called a codebook) that identifies the different types of sounds the human voice can make. A sound is identified by matching it to its closest codebook entry using Euclidean distance. Experiments yielded an average recognition rate of 30%.
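The pipeline described above (256-sample frames, windowed FFT, voiced/unvoiced classification, nearest codebook entry by Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' implementation. The Hamming window, the fixed energy threshold in `is_voiced`, the non-overlapping framing, and the two-entry toy codebook built from synthetic tones are all assumptions made for illustration; the abstract does not specify them.

```python
import cmath
import math

FRAME_SIZE = 256      # samples per frame, as stated in the abstract
SAMPLE_RATE = 16000   # 16 kHz, as stated in the abstract


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frames(signal, size=FRAME_SIZE):
    """Split a signal into non-overlapping frames, dropping any remainder."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def magnitude_spectrum(frame):
    """Magnitude spectrum of a Hamming-windowed frame (assumed window type)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(c) for c in fft(windowed)[:n // 2]]


def is_voiced(frame, threshold=0.01):
    """Assumed end-point rule: mean energy above a fixed threshold."""
    return sum(s * s for s in frame) / len(frame) > threshold


def nearest_entry(spectrum, codebook):
    """Label of the codebook entry with the smallest Euclidean distance."""
    return min(codebook, key=lambda label: math.dist(spectrum, codebook[label]))


def tone(freq, n=FRAME_SIZE):
    """Hypothetical test signal: a pure sine tone standing in for speech."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


# Toy codebook: one reference spectrum per "sound" (illustrative only).
codebook = {
    "low": magnitude_spectrum(tone(500)),
    "high": magnitude_spectrum(tone(3000)),
}

# Classify each voiced frame of a query signal against the codebook.
query = tone(520, n=2 * FRAME_SIZE)
labels = [nearest_entry(magnitude_spectrum(f), codebook)
          for f in frames(query) if is_voiced(f)]
print(labels)
```

In a real system each codebook entry would be derived from recorded speech rather than synthetic tones, and the frame index of each match would be mapped back to a time offset in the lecture video.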
【 License 】
Unknown