Speech production errors characteristic of dysarthria are chiefly responsible for the low accuracy of automatic speech recognition (ASR) when used by people diagnosed with the condition. The results of the small number of existing speech recognition studies, mostly conducted by assistive technology researchers, bear this out. In the engineering community, substantial research has been conducted on algorithms that adapt models of speech acoustics trained on one dataset for use with another. These algorithms are mostly mathematically motivated.

A person with dysarthria produces speech in a reduced acoustic working space, so typical measures of speech acoustics take values in ranges very different from those characterizing unimpaired speech. It is therefore unlikely that models trained on unimpaired speech can adjust to this mismatch when one of the above-mentioned adaptation algorithms is applied to them. Creating acoustic models trained exclusively on pathological speech is also difficult: members of this population find it tiring to sustain physical activities, including speech production, for long periods. While this makes speaker adaptation an approach worth pursuing, almost no research has so far been conducted on acoustic model adaptation methods for recognition of dysarthric speech.

This dissertation presents a study of acoustic model adaptation for recognition of dysarthric speech. First, it investigates the efficacy of a popular adaptation algorithm for dysarthric speech recognition. It then proposes an additional step in the adaptation process that separately models 'normal' and pathology-induced variations in speech characteristics, drawing on a view of the acoustics of motor speech disorders recently proposed in the clinical research community. Results show that explicitly addressing the population mismatch helps to increase recognition accuracy.
Acoustic model adaptation for recognition of dysarthric speech