First, a warning: it isn't very good, because the plosive sounds had to be dropped to make it fit in under 1K. For the right phrase, however, it works very well.
The sad thing is that including the extra sound would only have pushed the size up to 1189 characters. But this is an entry in the latest JS1K competition, where staying under 1K characters is essential.
The speech synthesizer is based on a tiny formant synthesizer implemented in C++ by Stepanov Andrey. A formant synthesizer works by feeding different input waveforms into a set of filters that shape the sound produced. This models the way the vocal cords provide an input to the vocal cavities, which act as an acoustic filter.
In this case, the input waveform is simply a sawtooth or noise, and most of the work is done by setting the parameters of the filters.
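To make the source-plus-filter idea concrete, here is a minimal sketch of a formant synthesizer in plain JavaScript. It is not the code of the JS1K entry or of the C++ original. The function names, the two-pole resonator design, the parallel arrangement of the filters, and the formant values are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch: names, filter design and parameter values are
// illustrative assumptions, not taken from the JS1K entry or the C++ original.
const SAMPLE_RATE = 44100;

// Source: a sawtooth at `pitch` Hz for voiced sounds, white noise otherwise.
function makeSource(voiced, pitch) {
  let phase = 0;
  return () => {
    if (!voiced) return Math.random() * 2 - 1;
    phase = (phase + pitch / SAMPLE_RATE) % 1;
    return phase * 2 - 1;
  };
}

// Two-pole resonator centred on `freq` Hz with bandwidth `bw` Hz.
// The coefficients are scaled so that the gain at 0 Hz is 1.
function makeResonator(freq, bw) {
  const r = Math.exp(-Math.PI * bw / SAMPLE_RATE);
  const c = -r * r;
  const b = 2 * r * Math.cos(2 * Math.PI * freq / SAMPLE_RATE);
  const a = 1 - b - c;
  let y1 = 0, y2 = 0;
  return (x) => {
    const y = a * x + b * y1 + c * y2;
    y2 = y1;
    y1 = y;
    return y;
  };
}

// Render one sound: feed the source through every formant filter
// in parallel and average the results.
function renderSound({ voiced, pitch, formants, seconds }) {
  const source = makeSource(voiced, pitch);
  const filters = formants.map(([freq, bw]) => makeResonator(freq, bw));
  const out = new Float32Array(Math.round(seconds * SAMPLE_RATE));
  for (let i = 0; i < out.length; i++) {
    const x = source();
    let sum = 0;
    for (const filter of filters) sum += filter(x);
    out[i] = sum / filters.length;
  }
  return out;
}

// A rough open-vowel sound; the formant figures are approximate.
const ah = renderSound({
  voiced: true, pitch: 110, seconds: 0.25,
  formants: [[700, 80], [1200, 90], [2600, 120]],
});
```

In this sketch, switching `voiced` to `false` swaps the sawtooth for noise, which is the sort of source you would use for hiss-like sounds. Changing the `formants` list is what turns one vowel into another. The resulting samples could be copied into an audio buffer for playback.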
The cut-down synthesizer supports the following sounds/phonemes: