Integrated bias removal techniques for robust speech recognition
01 July 1999
In this paper, we present a family of maximum likelihood (ML) techniques that aim to reduce the acoustic mismatch between the training and testing conditions of hidden Markov model (HMM)-based automatic speech recognition (ASR) systems. Our study is conducted in two phases. In the first phase, we evaluate two classes of robustness techniques: those that represent the acoustic mismatch for the entire utterance as a single additive bias, and those that represent the mismatch as a non-stationary bias. In the second phase, we propose a codebook-based stochastic matching (CBSM) approach for bias removal at both the feature level and the model level. CBSM associates each bias with an ensemble of HMM mixture components that share similar acoustic characteristics. It is integrated with hierarchical signal bias removal and further extended to account for n-best candidates. Experimental results on connected digits recorded over a cellular network show that incorporating bias removal reduces the word and string error rates by about 12% and 16%, respectively, when using a global bias, and by about 36% and 31%, respectively, when using a non-stationary bias. © 1999 Academic Press.
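To make the idea of a single additive bias per utterance concrete, the following is a minimal sketch and not the paper's implementation. It assumes feature vectors in which the mismatch is additive, a small vector-quantization codebook standing in for the acoustic model, and nearest-codeword alignment by squared Euclidean distance. The function names `nearest_codeword`, `estimate_global_bias`, and `remove_bias`, and the `iterations` parameter, are hypothetical and chosen only for illustration.

```python
def nearest_codeword(frame, codebook):
    """Return the codeword closest to `frame` (squared Euclidean distance)."""
    return min(codebook, key=lambda c: sum((x - y) ** 2 for x, y in zip(frame, c)))


def estimate_global_bias(frames, codebook, iterations=3):
    """Iteratively estimate one additive bias vector for a whole utterance.

    Assumption: each observed frame is a clean frame plus the same bias.
    Each pass aligns the bias-compensated frames to the codebook, then
    re-estimates the bias as the mean residual between the observed
    frames and their aligned codewords.
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        residual_sum = [0.0] * dim
        for frame in frames:
            # Align using the current bias estimate.
            compensated = [x - b for x, b in zip(frame, bias)]
            codeword = nearest_codeword(compensated, codebook)
            # Accumulate the residual against the uncompensated frame.
            for d in range(dim):
                residual_sum[d] += frame[d] - codeword[d]
        bias = [s / len(frames) for s in residual_sum]
    return bias


def remove_bias(frames, bias):
    """Subtract the estimated bias from every frame."""
    return [[x - b for x, b in zip(frame, bias)] for frame in frames]
```

A non-stationary variant would, under the same assumptions, keep several bias vectors (for example, one per group of codewords) instead of a single vector for the whole utterance; the sketch above covers only the global case.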