Vocal Tract Visualization

The goal of vocal tract visualization is to allow us to view the vocal tract during unimpeded speech production. Knowledge about the vocal tract is useful for clinical diagnosis of speech disorders, for providing more accurate data for acoustic modeling of speech production, and for studying the mechanics of the vocal tract.


Tongue Surface Reconstruction from Ultrasound Data

Ultrasound data collection is well suited to the vocal tract for several reasons. Ultrasound is non-invasive. We can collect tongue surface scans with a transducer held under the chin. With a compressible standoff, the tongue and jaw are free to make normal speech movements. Ultrasound is safe. It uses high-frequency (~5 MHz) sound, which poses no danger to subjects in repeated trials. Ultrasound is fast. Ultrasound slices can be collected at video rates (30 fps).


Ultrasound Data

Ultrasound coronal (side to side) scan of the tongue surface during speech.

Using a developmental 3D ultrasound transducer, we collect up to 60 coronal slices, oriented radially in space and spaced 1 degree apart. The 3D transducer has a standard 128-crystal array (capable of collecting a single 2D slice) mounted on a motorized pivot, allowing it to rotate in the third dimension. A full sweep takes about 10 seconds.

The upper white line in the above image is caused by the density change at the tissue/air interface at the surface of the tongue. Note the deep groove in the middle of the tongue (this is during the production of the sound "e").


Image Processing

The tongue surface is detected in each of the 2D coronal ultrasound slices using a dynamic programming contour tracking algorithm [1].
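To illustrate the general idea of dynamic programming contour tracking, here is a minimal sketch. It is not the algorithm of [1]; it is a generic formulation that assumes the tongue surface appears as a bright ridge crossing every image column, and it picks one row per column so that total brightness is maximized while large row jumps between neighboring columns are penalized. The function name track_contour and the parameters max_jump and smooth_weight are hypothetical, chosen for this example only.

```python
def track_contour(image, max_jump=1, smooth_weight=0.5):
    """Return one row index per column, tracing the brightest smooth path.

    image is a list of rows, each a list of pixel intensities.
    max_jump and smooth_weight are illustrative assumptions, not values from [1].
    """
    n_rows, n_cols = len(image), len(image[0])

    # score[r]: best cumulative score of any path ending at row r in the current column
    score = [float(image[r][0]) for r in range(n_rows)]
    back = []  # back[c - 1][r]: best predecessor row (in column c - 1) for row r in column c

    for c in range(1, n_cols):
        new_score, pointers = [], []
        for r in range(n_rows):
            lo = max(0, r - max_jump)
            hi = min(n_rows - 1, r + max_jump)
            # Choose the predecessor that maximizes score minus the jump penalty.
            best_prev = max(
                range(lo, hi + 1),
                key=lambda p: score[p] - smooth_weight * abs(r - p),
            )
            step_cost = smooth_weight * abs(r - best_prev)
            new_score.append(score[best_prev] - step_cost + image[r][c])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final row to recover the whole contour.
    r = max(range(n_rows), key=lambda i: score[i])
    path = [r]
    for pointers in reversed(back):
        r = pointers[r]
        path.append(r)
    path.reverse()
    return path
```

Running this once per slice would give one 2D contour per slice. The smoothness weight trades off following the brightest pixels against keeping the contour continuous.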


Surface Reconstruction

Reconstruction of the surface is simplified by the automated collection procedure. The rotation axis of the transducer is known, as is the rotation between slices. The 2D point sets for the tongue surface in each slice are returned to their relative 3D coordinates, and an interpolating B-spline surface is fitted to the points.
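The step of returning 2D contour points to 3D coordinates can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes the pivot axis is the lateral (side to side) axis, that the depth coordinate in each slice is measured from that axis, and that the sweep is centered on zero degrees. The names slice_angles, slice_point_to_3d, and contours_to_point_cloud are hypothetical. The B-spline surface fit itself is not shown.

```python
import math


def slice_angles(n_slices=60, step_deg=1.0):
    """Slice angles in degrees, assumed to be centered on zero."""
    half_span = (n_slices - 1) * step_deg / 2.0
    return [i * step_deg - half_span for i in range(n_slices)]


def slice_point_to_3d(u, v, angle_deg):
    """Map an in-slice point to 3D.

    u: lateral position along the pivot axis (assumed to be the x axis)
    v: distance from the pivot axis within the slice plane
    angle_deg: rotation of the slice plane about the pivot axis
    """
    theta = math.radians(angle_deg)
    return (u, v * math.sin(theta), v * math.cos(theta))


def contours_to_point_cloud(contours, step_deg=1.0):
    """Place each slice's 2D contour (a list of (u, v) pairs) in 3D.

    Returns one list of (x, y, z) points per slice, in slice order.
    """
    angles = slice_angles(len(contours), step_deg)
    return [
        [slice_point_to_3d(u, v, angle) for (u, v) in contour]
        for contour, angle in zip(contours, angles)
    ]
```

Because the output keeps the points grouped by slice, it forms a grid-like arrangement (slice index by position along the contour) that an interpolating surface fit could take as input.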


Visualization

Reconstruction of the tongue during production of an "e".

The reconstructed tongue surface can be visualized to provide an intuitive grasp of the tongue shape. Measurements can also be taken from it for more quantitative, statistical analysis.


Research Concerns

Our current method detects the tongue surface separately in each 2D ultrasound slice. I am currently seeking a method that uses neighboring slice data in a 3D surface detection process. Other future work includes reconstructions from multiple-pass 2D collections of continuous speech.


References

[1] Michael Unser and Maureen Stone. Automated detection of the tongue surface in sequences of ultrasound images. J. Acoust. Soc. Am., (5), May 1992: 3001-3007.


lundberg@speech.umaryland.edu
