HMM Clustering for Connected Word Recognition

01 January 1989


It has been shown that when a sufficiently large number of training tokens is available for a given speech recognition unit (e.g., a word, phone, or syllable), it is generally worthwhile to cluster the training tokens into two or more clusters and then create either templates or statistical models from each individual cluster.

Such techniques have been used successfully for isolated and connected word recognition for a number of years. Almost all clustering techniques to date have been based on conventional template-based methods, whereby a training set is broken into clusters on the basis of time-aligned distance scores between each pair of tokens in the training set.
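
As a rough illustration of the template-based approach described above, the following sketch computes a time-aligned distance for every pair of tokens and then groups the tokens from the resulting distance matrix. It rests on several assumptions that do not come from the text: dynamic time warping (DTW) with a Euclidean frame distance stands in for the time alignment, a simple k-medoids loop stands in for the clustering procedure, and all function names (`dtw_distance`, `pairwise_distances`, `k_medoids`) are hypothetical. Each token is assumed to be a list of equal-dimension feature frames.

```python
import math


def dtw_distance(a, b):
    """Time-aligned distance between two tokens (lists of feature frames).

    Assumption: plain DTW with unit step weights, normalized by the
    combined length of the two tokens.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m] / (n + m)


def pairwise_distances(tokens):
    """Symmetric matrix of time-aligned distances between all token pairs."""
    k = len(tokens)
    d = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d[i][j] = d[j][i] = dtw_distance(tokens[i], tokens[j])
    return d


def assign(d, medoids):
    """Label each token with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
        for i in range(len(d))
    ]


def k_medoids(d, n_clusters, n_iter=20):
    """Cluster tokens from a distance matrix (illustrative k-medoids)."""
    k = len(d)
    # Deterministic start: the most central token, then farthest-point picks.
    medoids = [min(range(k), key=lambda i: sum(d[i]))]
    while len(medoids) < n_clusters:
        medoids.append(
            max(range(k), key=lambda i: min(d[i][m] for m in medoids))
        )
    for _ in range(n_iter):
        labels = assign(d, medoids)
        new_medoids = []
        for c in range(n_clusters):
            members = [i for i in range(k) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member closest in total to the other members.
            new_medoids.append(
                min(members, key=lambda i: sum(d[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(d, medoids), medoids
```

Calling `k_medoids(pairwise_distances(tokens), 2)` returns one cluster label per token together with the indices of the medoid tokens. In a template-based recognizer, each medoid (or an average of its cluster members) could then serve as a reference template for that cluster.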