BioMedical Engineering OnLine
A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels
Alireza Zourmand [4], Seyed Mostafa Mirhassani [4], Hua-Nong Ting [4], Shaik Ismail Bux [3], Kwan Hoong Ng [3], Mehmet Bilgen [2], Mohd Amin Jalaludin [1]
[1] Department of Otorhinolaryngology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
[2] Radiology Department, Faculty of Medicine, Erciyes University, 38039 Kayseri, Turkey
[3] University Malaya Research Imaging Center and Department of Biomedical Imaging, University of Malaya, Kuala Lumpur, Malaysia
[4] Biomedical Engineering Department, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia
Keywords: Formant frequencies; Acoustic parameters; Active contour; Malay vowel sounds; Articulators' movements; Vocal tract shape
Others: 1097944; DOI: 10.1186/1475-925X-13-103
Received 2013-09-20; accepted 2014-07-14; published 2014
【 Abstract 】
The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract and obtain dynamic articulatory parameters during speech production. To resolve the image blurring caused by tongue movement during scanning, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite the partial blurring of the MRI images. Consequently, the articulatory parameters are effectively measured as tongue movement is observed, and the specific shape and position of the tongue are determined for all six uttered Malay vowels.
Speech rehabilitation procedures require a visually perceivable prototype of speech articulation. To investigate the validity of the measured articulatory parameters against the acoustic theory of speech production, an acoustic analysis of the vowels uttered by the subjects was performed. When the acoustic and articulatory parameters of the uttered speech were examined together, a correlation between formant frequencies and articulatory parameters was observed. The experiments showed a positive correlation between the constriction location of the tongue body and the first formant frequency, as well as a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production.
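The kind of correlation reported above can be illustrated with a minimal sketch. It computes a correlation coefficient between an articulatory measurement and a formant frequency. The abstract does not state which coefficient was used, so Pearson's r is an assumption here, and all names and numbers below are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values for six vowels (NOT measurements from the study).
body_constriction_location_mm = [10.0, 14.0, 18.0, 22.0, 26.0, 30.0]
tip_constriction_location_mm = [5.0, 8.0, 11.0, 14.0, 17.0, 20.0]
f1_hz = [300.0, 380.0, 450.0, 540.0, 610.0, 700.0]
f2_hz = [2300.0, 2000.0, 1700.0, 1400.0, 1100.0, 800.0]

r_body_f1 = pearson_r(body_constriction_location_mm, f1_hz)
r_tip_f2 = pearson_r(tip_constriction_location_mm, f2_hz)
print(f"r(tongue body location, F1) = {r_body_f1:+.3f}")
print(f"r(tongue tip location, F2)  = {r_tip_f2:+.3f}")
```

The sign of r gives the direction of the relationship: a value near +1 means the formant rises as the constriction location increases, and a value near -1 means it falls.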
【 License 】
2014 Zourmand et al.; licensee BioMed Central Ltd.