学位论文详细信息
Audio-video based handwritten mathematical content recognition
Pattern recognition;Digital signal processing;Classifier combination
Vemulapalli, Smita ; Electrical and Computer Engineering
University:Georgia Institute of Technology
Department:Electrical and Computer Engineering
关键词: Pattern recognition;    Digital signal processing;    Classifier combination;   
Others  :  https://smartech.gatech.edu/bitstream/1853/45958/1/vemulapalli_smita_201212_phd.pdf
美国|英语
来源: SMARTech Repository
PDF
【 摘 要 】

Recognizing handwritten mathematical content is a challenging problem, and more so when such content appears in classroom videos. However, given the fact that in such videos the handwritten text and the accompanying audio refer to the same content, a combination of video and audio based recognizer has the potential to significantly improve the content recognition accuracy. This dissertation, using a combination of video and audio based recognizers, focuses on improving the recognition accuracy associated with handwritten mathematical content in such videos. Our approach makes use of a video recognizer as the primary recognizer and a multi-stage assembly, developed as part of this research, is used to facilitate effective combination with an audio recognizer. Specifically, we address the following challenges related to audio-video based handwritten mathematical content recognition: (1) Video Preprocessing - generates a timestamped sequence of segmented characters from the classroom video in the face of occlusions and shadows caused by the instructor, (2) Ambiguity Detection - determines the subset of input characters that may have been incorrectly recognized by the video based recognizer and forwards this subset for disambiguation, (3) A/V Synchronization - establishes correspondence between the handwritten character and the spoken content, (4) A/V Combination - combines the synchronized outputs from the video and audio based recognizers and generates the final recognized character, and (5) Grammar Assisted A/V Based Mathematical Content Recognition - utilizes a base mathematical speech grammar for both character and structure disambiguation. Experiments conducted using videos recorded in a classroom-like environment demonstrate the significant improvements in recognition accuracy that can be achieved using our techniques.

【 预 览 】
附件列表
Files Size Format View
Audio-video based handwritten mathematical content recognition 3394KB PDF download
  文献评价指标  
  下载次数:17次 浏览次数:4次