Modulation features for speech recognition

01 January 2002

Automatic speech recognition (ASR) systems can benefit from acoustic front-end features that account for nonlinear and time-varying phenomena in speech production. In this paper, we develop robust methods for extracting novel modulation-type acoustic features from speech signals, based on time-varying models for speech analysis. We then combine the new features with the standard linear ones (mel-frequency cepstrum) into an augmented acoustic feature set and demonstrate its efficacy through significant improvements in HMM-based word recognition on the TIMIT database.
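
To make the idea of a modulation-type feature concrete, the sketch below estimates the instantaneous amplitude and frequency of a single AM-FM component. The abstract does not say which demodulation algorithm is used, so this is an assumption: it applies the discrete Teager-Kaiser energy operator with the DESA-2 energy separation algorithm, one common choice for this task. The function names `teager_kaiser` and `desa2` are hypothetical, and in practice the input would be one band-pass filtered speech band rather than raw speech.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Returned list covers the interior samples n = 1 .. len(x) - 2.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x):
    """Estimate (amplitudes, frequencies) of one AM-FM component with DESA-2.

    Frequencies are in radians per sample and are only valid below pi/2.
    Samples where the energy operator is non-positive are skipped
    (an illustrative choice, not part of the algorithm itself).
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1], for n = 1 .. len(x) - 2.
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_kaiser(x)  # index i corresponds to time n = i + 1
    psi_y = teager_kaiser(y)  # index k corresponds to time n = k + 2

    amps, freqs = [], []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # same time index as psi_y[k]
        if px <= 0.0 or py <= 0.0:
            continue
        # Clamp to the domain of acos to guard against rounding error.
        arg = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freqs.append(0.5 * math.acos(arg))
        amps.append(2.0 * px / math.sqrt(py))
    return amps, freqs


def to_hertz(omega, sample_rate):
    """Convert a frequency in radians per sample to Hz."""
    return omega * sample_rate / (2.0 * math.pi)
```

Frame-level statistics of these estimates, such as the mean instantaneous amplitude or frequency per band, could then be appended to each frame's mel-frequency cepstral vector to form an augmented feature vector. Which statistics the paper actually uses is not stated in the abstract.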