Speaker Independent Phonetic Transcription of Fluent Speech for Large Vocabulary Speech Recognition

Speaker-independent phonetic transcription of fluent speech is performed using an ergodic continuously variable duration hidden Markov model (CVDHMM) to represent the acoustic, phonetic, and phonotactic structure of speech. An important property of the model is that each of its fifty-one states is uniquely identified with a single phonetic unit. Thus, for any spoken utterance, the state sequence of maximum likelihood, found by a dynamic programming (DP) procedure, directly yields a phonetic transcription.
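The idea of reading a transcription off the most likely state sequence can be illustrated with a minimal sketch. It assumes a conventional HMM decoded with the standard Viterbi recursion in the log domain; it does not model the explicit, continuously variable state durations of the CVDHMM, and the function names (`viterbi`, `transcribe`) and argument layout are hypothetical, chosen only for illustration.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return the maximum-likelihood state sequence (standard Viterbi).

    log_init[j]     : log P(state j at the first frame)
    log_trans[i][j] : log P(state j at frame t+1 | state i at frame t)
    log_emit[t][j]  : log P(observation at frame t | state j)

    Assumption: plain HMM; no explicit duration densities as in a CVDHMM.
    """
    n_states = len(log_init)
    n_frames = len(log_emit)

    # Best log score of any path ending in each state at the current frame.
    score = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backptr = []  # backptr[t-1][j] = best predecessor of state j at frame t

    for t in range(1, n_frames):
        new_score, ptrs = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_trans[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j]
                             + log_emit[t][j])
        score = new_score
        backptr.append(ptrs)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for ptrs in reversed(backptr):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


def transcribe(path, phones):
    """Collapse each run of identical states into one phonetic symbol.

    Relies on the one-state-per-phonetic-unit property described above.
    """
    out = []
    for k, s in enumerate(path):
        if k == 0 or s != path[k - 1]:
            out.append(phones[s])
    return out
```

With fifty-one states, `phones` would hold fifty-one labels and `log_trans` would be a full 51 x 51 matrix, since an ergodic model allows a transition between every pair of states. Collapsing runs in `transcribe` stands in for duration handling here; a duration-aware decoder would instead score each state's segment length directly.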

