Milestones in text-to-speech conversion (Klatt 1987)
Part A: Development
Part B: Segmental synthesis by rule
Part C: Sentence prosody
Part D: Full text-to-speech conversion
This group contains the audio demonstrations accompanying the famous article:
Dennis H. Klatt (1987) Review of text-to-speech conversion for English, Journal of the Acoustical Society of America, 82 (3), pp. 737-793.
The original introduction to these demonstrations is as follows: "The enclosed 33 1/3 rpm recording contains illustrations of some of the milestones in the development of systems for text-to-speech conversion. (...) The assistance of H. David Maxey, Michael Hecker, John Holmes, Patrick Nye, Joe Olive, and James Flanagan is gratefully acknowledged. My thanks also go to Kenneth Stevens, who served as narrator."
Processing
The original flexible vinyl record was played on a Dual CS 5000 phonograph and copied onto digital audio tape (DAT) at 44.1 kHz. (This resulted in some echoes and pre-echoes: the signal from adjacent parts of the groove can sometimes be heard.) The DAT recordings were copied to AIFF audio files on an SGI Indy workstation, with direct digital input from the DAT recorder. During copying, the audio files were downsampled to 11.025 kHz (16 bits, mono). This was sufficient given the sound quality of the original record. Each item of the record corresponds to a single audio file.
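The downsampling step can be illustrated with a minimal sketch. Going from 44.1 kHz to 11.025 kHz is a reduction by an integer factor of 4, so frequencies above the new Nyquist limit (about 5.5 kHz) have to be filtered out before samples are discarded. The text does not say which method the workstation software used; the windowed-sinc filter below is simply one standard approach, and the names `lowpass_taps` and `decimate`, as well as the 63-tap filter length, are assumptions chosen for illustration.

```python
import math

SOURCE_RATE_HZ = 44100  # DAT copy rate stated in the text
TARGET_RATE_HZ = 11025  # rate of the final audio files stated in the text


def lowpass_taps(factor, num_taps=63):
    """Hamming-windowed sinc low-pass filter (hypothetical helper).

    The cutoff sits at the Nyquist frequency of the reduced rate,
    i.e. 0.5 / factor cycles per input sample.
    """
    cutoff = 0.5 / factor
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at 0 Hz


def decimate(samples, factor, num_taps=63):
    """Low-pass filter the signal, then keep every `factor`-th sample.

    Samples beyond either end of the input are treated as zero.
    """
    taps = lowpass_taps(factor, num_taps)
    half = num_taps // 2
    out = []
    for center in range(0, len(samples), factor):
        acc = 0.0
        for k, tap in enumerate(taps):
            i = center + k - half
            if 0 <= i < len(samples):
                acc += tap * samples[i]
        out.append(acc)
    return out


# Integer reduction factor implied by the two rates in the text.
factor = SOURCE_RATE_HZ // TARGET_RATE_HZ
```

Calling `decimate(samples, factor)` on a list of sample values captured at 44.1 kHz returns roughly a quarter as many values, representing the same signal at 11.025 kHz with content above about 5.5 kHz suppressed.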