Note on Some Factors Affecting Performance of Dynamic Time Warping Algorithms for Isolated Word Recognition

01 March 1982

The technique of using dynamic time warping (DTW) to align (in time) a test and reference pattern for isolated word recognition has been shown to be effective in a wide variety of recognition systems (Refs. 1-6). Although "optimal" DTW algorithms have been investigated extensively, there remains uncertainty as to how best to specify the factors of the DTW implementation to achieve the highest recognition accuracy. In this paper, we consider two of these factors, namely, the endpoint constraint at the end of the warp and the local path constraints. Rabiner et al. (Ref. 4) and Myers et al. (Ref. 6) previously investigated the effects on word recognition accuracy of both loosening endpoint constraints and using different local path constraints. However, the work in Ref. 4 on DTW algorithms with relaxed endpoint constraints considered only two specific variations, namely, the unconstrained endpoint case (the UE2-1 algorithm) and the local minimum case (the UELM algorithm). Neither of these algorithms relaxed only the endpoint constraint at the end of the word. This constraint is very important, since replications of isolated words generally have the most variability at the end of the word. Similarly, although various local path constraints were considered in Ref. 6, the use of subword units in the reference patterns again raises the question of whether increased flexibility in the choice of warping path leads to improvements in recognition scores. We show that by loosening the endpoint constraint at the end of the utterance, a small but consistent increase in recognition accuracy is obtained.
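
The idea of relaxing only the ending endpoint can be illustrated with a minimal sketch. It is not the algorithm evaluated in this paper. The function name `dtw_distance`, the parameter `end_slack`, the scalar features, the absolute-difference local distance, and the local path constraint (each test frame advances the reference index by 0, 1, or 2) are all assumptions chosen for illustration. The path is pinned to the first frame of both patterns, but may terminate on any of the last `end_slack + 1` reference frames.

```python
import math


def dtw_distance(test, ref, end_slack=0):
    """Normalized DTW distance with a relaxed ending endpoint (illustrative).

    test, ref : sequences of scalar features (hypothetical stand-ins for
                per-frame feature vectors).
    end_slack : number of trailing reference frames the path may skip.
                end_slack=0 forces the path to end on the last reference frame.
    """
    n, m = len(test), len(ref)
    if n == 0 or m == 0:
        raise ValueError("both patterns must be non-empty")
    if end_slack < 0:
        raise ValueError("end_slack must be non-negative")

    inf = math.inf
    # prev[j] = best accumulated distance of a path ending at (i - 1, j).
    prev = [inf] * m
    prev[0] = abs(test[0] - ref[0])  # beginning endpoint is fixed at (0, 0)

    for i in range(1, n):
        cur = [inf] * m
        for j in range(m):
            # Assumed local constraint: predecessor is (i-1, j), (i-1, j-1)
            # or (i-1, j-2), so every path has exactly n steps.
            best = min(prev[k] for k in range(max(0, j - 2), j + 1))
            if best < inf:
                cur[j] = best + abs(test[i] - ref[j])
        prev = cur

    # Relaxed ending endpoint: accept the best of the last end_slack + 1
    # reference frames instead of only the final one.
    lo = max(0, m - 1 - end_slack)
    return min(prev[lo:]) / n  # all paths have n steps, so scores compare


if __name__ == "__main__":
    test = [1, 2, 3, 4]
    ref = [1, 2, 3, 4, 9]  # reference with a spurious trailing frame
    print(dtw_distance(test, ref, end_slack=0))
    print(dtw_distance(test, ref, end_slack=1))
```

Because every admissible path in this sketch has one step per test frame, dividing by the test length keeps distances comparable across different ending points. A different local path constraint would generally need a different normalization.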