The objective of this dissertation is to develop diarization algorithms for LENA data and to study their application to computing language behavior statistics for individuals with autism. The LENA device is one of the most commonly used devices for collecting audio data in autism and language development studies. The LENA child and adult detector algorithms were evaluated on two datasets: i) an older-children dataset consisting of children already diagnosed with autism spectrum disorder and ii) an infants dataset consisting of infants at risk for autism. I-vector based diarization algorithms were developed for the two datasets to handle two scenarios: a) some labeled data is available for every speaker in the audio recording and b) no labeled data is available for the audio recording to be diarized. Further, the i-vector based diarization methods were applied to compute objective measures of assessment. These measures were analyzed to show that they can reveal some aspects of autism severity. Also, a method was developed to extract a 5-minute window of high child vocalization from a 16-hour day-long recording; this window was then used to compute canonical babble statistics using human annotation.
Audio diarization for LENA data and its application to computing language behavior statistics for individuals with autism