Evaluation of Front-End Features and Noise Compensation Methods for Robust Mandarin Speech Recognition
01 January 2001
This paper describes speaker-independent speech recognition experiments on acoustic front-end processing, using a telephone database recorded in various dialect regions of China. Three different features based on human voice production, perception and auditory systems have been evaluated for Mandarin speech recognition. Experimental comparisons showed that auditory-filtered cepstral coefficients outperform the other types of features. When speech recognizers are deployed in telephone services, they often encounter variable acoustic mismatches that significantly degrade their performance. Three different channel equalization techniques have been explored in this study to reduce this mismatch and thereby improve recognition accuracy. We present results with various noise compensation methods based on hierarchical cepstral mean subtraction and signal bias removal.