Demonstration of Silence Detection and Vowel/Consonant Discrimination in Video Sequences

This demonstration page visualises the results from this paper, which was presented in December 2000 in Canberra.

Nine video recordings were prepared for this presentation, featuring two male speakers and one female speaker. You can choose a recording from the following table. Each table entry gives the size of the video recording (the recording is streamed, so you will not have to wait for the download to complete).

Speaker   Normal speech rate   Slow speech   Whispering
Male 1    (2.35MB)             (3.08MB)      (1.57MB)
Male 2    (3.82MB)             (3.00MB)      (1.90MB)
Female    (3.68MB)             (3.02MB)      (1.MB)

The selected recording opens in a new window containing an applet that plays the video sequence and simultaneously displays the outputs of the artificial neural network, which was trained to distinguish between three classes: silence, vowel and consonant.

In the graph, correct network responses are drawn in green, incorrect ones in red, and inconclusive ones in yellow.
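The colour coding can be illustrated with a small Python sketch. This is not the applet's code. The function name `colour_for_frame`, the ordering of the three outputs, and above all the rule for calling a response inconclusive (strongest output below a threshold of 0.5) are assumptions made for illustration; the page does not state how inconclusive responses were determined.

```python
from typing import Sequence

# Assumed ordering of the three network outputs (hypothetical).
CLASSES = ("silence", "vowel", "consonant")


def colour_for_frame(outputs: Sequence[float], true_label: str,
                     threshold: float = 0.5) -> str:
    """Return the plot colour for one frame: 'green', 'red' or 'yellow'."""
    if len(outputs) != len(CLASSES):
        raise ValueError("expected one output per class")
    if true_label not in CLASSES:
        raise ValueError(f"unknown label: {true_label!r}")

    # Index of the strongest network output.
    best = max(range(len(CLASSES)), key=lambda i: outputs[i])

    # Assumption: a response is inconclusive when even the strongest
    # output stays below the threshold.
    if outputs[best] < threshold:
        return "yellow"

    # Otherwise compare the winning class with the reference label.
    return "green" if CLASSES[best] == true_label else "red"
```

For example, under these assumptions a frame labelled as a vowel with outputs `[0.1, 0.8, 0.2]` would be drawn in green, while outputs `[0.3, 0.4, 0.35]` would be drawn in yellow regardless of the label.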

Created: 15.03.2002