Phase Vocoder
01 November 1966
Analysis-synthesis methods for speech transmission aim at efficient encoding of voice signals. A customary approach is to represent separately the important features of vocal excitation and tract transmission.[1] The well-known channel vocoder of Dudley [2] derives signals that fall into this dichotomy. The tract transmission is described by values of the short-time amplitude spectrum measured at discrete frequencies, and the excitation is described in terms of the fundamental frequency of the voice and the voiced-unvoiced character of the signal. Efforts to solve the long-standing problem of good-quality synthesis from such representations have centered on adequate analysis and specification of the excitation data. One advance in surmounting the difficulties connected with pitch and voiced-unvoiced extraction is the voice-excited vocoder (VEV).[3] This device relies on transmission of an unprocessed subband of the original speech to carry the excitation information. The spectral envelope information is transmitted, as in the channel vocoder, by a number of slowly varying signals. Through accurate preservation of excitation details, a transmission of improved quality and modest bandsaving is achieved. The present paper proposes another technique for encoding speech to achieve comparable bandsaving and acceptable voice quality. In addition, the technique provides a convenient means for compression and expansion of the time dimension. The method specifies the speech signal in terms of its short-time amplitude and phase spectra.
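To make the notion of short-time amplitude and phase spectra concrete, the following is a minimal discrete-time sketch in Python. It is not the implementation described in the paper. It assumes a sampled signal, a Hann analysis window, and a direct DFT of each windowed frame. The function name short_time_spectra and the values of frame_len and hop are hypothetical and chosen only for illustration.

```python
import cmath
import math


def short_time_spectra(signal, frame_len=64, hop=16):
    """Return (amplitudes, phases): one list per frame, one value per DFT bin.

    Hypothetical helper for illustration; frame_len and hop are assumed values.
    """
    # Hann window to taper each frame (an assumed choice of analysis window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    amplitudes, phases = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        amp_row, phase_row = [], []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at bin k.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            amp_row.append(abs(coeff))          # short-time amplitude
            phase_row.append(cmath.phase(coeff))  # short-time phase, radians
        amplitudes.append(amp_row)
        phases.append(phase_row)
    return amplitudes, phases


# Usage: a steady cosine centred on bin 8 of a 64-sample frame.
tone = [math.cos(2 * math.pi * 8 * n / 64) for n in range(256)]
amps, phs = short_time_spectra(tone)
peak_bin = max(range(len(amps[0])), key=lambda k: amps[0][k])
```

For the test tone, the amplitude of each frame peaks at the bin matching the tone frequency, while the phase at that bin records how the tone is aligned within the frame. A channel-vocoder-style description would keep only the amplitude values; retaining the phase as well is the distinction drawn in the text.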