Data-driven Pronunciation Modeling for Non-Native Speakers using Association Strength Between Phones
01 January 2000
In this paper we present an approach to modeling pronunciation variation, particularly for non-native speakers, by modifying the lexicon. This lets us model several speakers simultaneously, i.e., use the same lexicon and the same acoustic models for all speakers. Our approach is data-driven, i.e., it relies solely on the reference lexicon, the recognizer's acoustic models, and the acoustic data. We propose a new alignment procedure that uses an estimated measure of the relation between the phones in the reference transcription and those in the alternative transcription of the new speaker data. This measure, which we call association strength, identifies significant correspondences between the phones in the two transcriptions. Rules are extracted from the alignment and used to derive pronunciation variants. A rule pruning algorithm then selects the most beneficial rules, which are used to modify the lexicon. Preliminary experiments using the new alignment algorithm on the Wall Street Journal non-native speaker database show a decrease in word error rate (WER) from 29.2% to 28.6%. We believe that a novel rule pruning method sketched in this draft will further decrease the WER.
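To make the idea of aligning two phone transcriptions with a data-estimated association measure concrete, the following is a minimal sketch and not the procedure used in the paper. The abstract does not define association strength, so pointwise mutual information over utterance-level co-occurrence is swapped in here as a stand-in. The function names `association_strength` and `align`, the `gap` and `unseen` penalties, and the toy phone strings are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def association_strength(pairs):
    """Estimate a phone-to-phone association table.

    pairs: list of (reference_phones, alternative_phones), one per utterance.
    ASSUMPTION: association is approximated by pointwise mutual information
    (PMI) of utterance-level co-occurrence; the paper's measure may differ.
    """
    n = len(pairs)
    ref_count, alt_count, joint = Counter(), Counter(), Counter()
    for ref, alt in pairs:
        ref_set, alt_set = set(ref), set(alt)
        ref_count.update(ref_set)
        alt_count.update(alt_set)
        joint.update((r, a) for r in ref_set for a in alt_set)
    return {
        (r, a): math.log(c * n / (ref_count[r] * alt_count[a]))
        for (r, a), c in joint.items()
    }


def align(ref, alt, assoc, gap=-1.0, unseen=-2.0):
    """Global dynamic-programming alignment that maximizes summed association.

    Returns a list of (ref_phone, alt_phone) pairs, where None marks a
    deletion (no alternative phone) or an insertion (no reference phone).
    The gap and unseen penalties are hypothetical values.
    """
    def sub(i, j):
        return assoc.get((ref[i - 1], alt[j - 1]), unseen)

    rows, cols = len(ref) + 1, len(alt) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, cols):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # substitution or match
                score[i - 1][j] + gap,            # deletion
                score[i][j - 1] + gap,            # insertion
            )

    # Trace back from the bottom-right cell to recover the alignment.
    out, i, j = [], len(ref), len(alt)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out.append((ref[i - 1], alt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out.append((ref[i - 1], None))
            i -= 1
        else:
            out.append((None, alt[j - 1]))
            j -= 1
    out.reverse()
    return out


# Toy data (invented): a speaker who realizes "th" as "s".
toy_pairs = [
    (["th", "ih", "ng"], ["s", "ih", "ng"]),
    (["th", "ae", "t"], ["s", "ae", "t"]),
    (["k", "ae", "t"], ["k", "ae", "t"]),
]
toy_assoc = association_strength(toy_pairs)
toy_alignment = align(["th", "ih", "ng"], ["s", "ih", "ng"], toy_assoc)
# Candidate substitution rules are the aligned pairs whose phones differ.
toy_rules = Counter((r, a) for r, a in toy_alignment if r != a)
print(toy_alignment, toy_rules)
```

In this sketch the aligned pairs whose phones differ serve as candidate rewrite rules; how such rules are scored and pruned before modifying the lexicon is left open here.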