Chandra, Dhruba ; Dr. Paul Franzon, Committee Chair ; Dr. R. Rodman, Committee Member ; Dr. E. Rotenberg, Committee Member ; Dr. W. Rhett Davis, Committee Member
As computing moves towards ubiquitous computing, propelled by advances in embedded mobile processors and battery technology, speech recognition is becoming an essential I/O mechanism for embedded devices. Speech recognition is also used in command and control and in automated customer service. Real-time speech recognition is both computation and memory intensive, and achieving real-time performance overwhelms even a high-end multi-gigahertz processor. An embedded mobile device cannot support real-time large vocabulary speech recognition, because its processor is less aggressive owing to a tighter power budget. Past hardware solutions to speech recognition have mainly concentrated on building specialized hardware or ASIC accelerators to run software speech applications faster, but have largely ignored design for large vocabulary and power reduction.

In this work, we propose a hardware-software co-design for real-time large vocabulary speech recognition. Our design consists of custom ASIC blocks, RAM memories, and a low-power processor. The processor maintains high-level control over the blocks and runs the parts of the speech recognition application that are not computation and memory intensive. The custom ASIC blocks compute the Gaussian probability and perform word search in the dictionary. The RAMs store the intermediate values and states. The design can handle large vocabulary speech recognition in real time on a mobile embedded device. Our word search uses an innovative layout of dictionary words in memory, which reduces bandwidth by a factor of 11 compared to a software implementation and by a factor of 4 compared to other ASIC implementations. One unit of our proposed design performs 4x and 20x better than previously proposed specialized hardware for software speech applications in computing the Gaussian probability and in word search, respectively.
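The abstract does not give the form of the Gaussian probability computation. As a minimal sketch only, assuming the diagonal-covariance Gaussian mixture scoring that is common in HMM-based recognizers (an assumption, not a detail taken from this work), the per-frame computation could look like the following. The function names `log_gaussian_diag` and `log_mixture` are hypothetical and chosen for illustration.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x.

    Assumption: diagonal covariance, so each dimension contributes one
    independent term to the sum.
    """
    acc = 0.0
    for xi, mi, vi in zip(x, mean, var):
        acc += math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi
    return -0.5 * acc

def log_mixture(x, weights, means, variances):
    """Log-likelihood of x under a Gaussian mixture, using log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)  # subtract the largest term for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

Each mixture evaluation touches every mean and variance once per feature frame, which illustrates why this step is both computation and memory intensive for large acoustic models.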