The voicing feature for stop consonants: recognition experiments with continuously spoken alphabets
01 October 2003
We consider the possibility of incorporating phonetic features into a statistically based speech recognizer. We develop a two-pass recognition strategy in which a first pass based on hidden Markov models (HMMs) is followed by a second pass that performs an alternative analysis using class-specific features. For the voiced/voiceless distinction on stops in an alphabet recognition task, we show that a perceptually and linguistically motivated acoustic feature exists: the voice onset time (VOT). We perform acoustic-phonetic analyses demonstrating that this feature separates the two classes better than the traditional spectral features do. Further, the VOT can be extracted automatically from the speech signal. We describe several algorithms for this extraction that can be incorporated into our two-pass recognition strategy to reduce error rates by as much as 53% relative to a baseline HMM recognition system. (C) 2002 Published by Elsevier B.V.
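As a rough illustration of the two-pass idea, the following minimal Python sketch re-decides the voicing of stop-initial letters from a VOT measurement after a first pass has proposed a letter. It is not the authors' implementation. The names `Segment` and `second_pass`, the letter pairs B/P and D/T, and the 30 ms decision boundary are all assumptions made for the example, and the VOT values are assumed to come from some extractor that is not shown.

```python
from dataclasses import dataclass

# Assumed confusable letter pairs that differ only in the voicing of the initial stop.
VOICED_TO_VOICELESS = {"B": "P", "D": "T"}
VOICELESS_TO_VOICED = {v: k for k, v in VOICED_TO_VOICELESS.items()}

# Illustrative decision boundary in milliseconds (an assumption, not a value from the paper).
VOT_THRESHOLD_MS = 30.0


@dataclass
class Segment:
    letter: str    # first-pass hypothesis (e.g. from an HMM recognizer)
    vot_ms: float  # voice onset time for this segment, assumed already extracted


def second_pass(segments, threshold_ms=VOT_THRESHOLD_MS):
    """Revise voiced/voiceless stop letters using VOT; leave other letters unchanged."""
    revised = []
    for seg in segments:
        letter = seg.letter
        if letter in VOICED_TO_VOICELESS and seg.vot_ms >= threshold_ms:
            # Long VOT suggests a voiceless stop, so flip B -> P or D -> T.
            letter = VOICED_TO_VOICELESS[letter]
        elif letter in VOICELESS_TO_VOICED and seg.vot_ms < threshold_ms:
            # Short VOT suggests a voiced stop, so flip P -> B or T -> D.
            letter = VOICELESS_TO_VOICED[letter]
        revised.append(letter)
    return revised


if __name__ == "__main__":
    first_pass = [Segment("B", 60.0), Segment("A", 0.0), Segment("T", 10.0)]
    print(second_pass(first_pass))
```

A hard threshold is the simplest possible second-pass rule; a statistical classifier over VOT would fit the same slot in this sketch.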