Structural Maximum a Posteriori Linear Regression for Unsupervised Speaker Adaptation

01 January 2000

In this paper, a new approach for model adaptation, extended maximum a posteriori linear regression (EMAPLR), is described. EMAPLR is an extension of maximum a posteriori linear regression (MAPLR) for transformation-based model adaptation. The proposed approach has a closed-form solution under elliptically symmetric matrix variate prior distributions, and it is effective in our large vocabulary speech recognition experiments. In standard MLLR or MAPLR, the transformation matrix $W$ is estimated first, and the model parameters are then adapted according to $\mu_{\text{new}} = W\mu$, where $\mu = [1, \mu_1, \mu_2, \ldots, \mu_N]$ is the extended mean vector. This is an indirect estimation of the transformed model parameters, and numerical errors in the estimation of $W$ carry over to the final results. Moreover, there are $N \times (N + 1)$ free parameters in $W$. Even with a certain amount of data, $W$ at lower-level tree nodes is often ill-conditioned and may not be solvable when data are sparse.
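
As a rough illustration of the standard MLLR/MAPLR mean update described above, the following sketch applies a given transformation matrix W to an extended mean vector and counts the free parameters in W. It does not estimate W and is not the EMAPLR method itself. The function names `extend_mean` and `adapt_mean`, the dimension N = 3, and all numeric values are assumptions chosen only for this example.

```python
import numpy as np


def extend_mean(mu):
    """Return the extended mean vector [1, mu_1, ..., mu_N] (hypothetical helper)."""
    mu = np.asarray(mu, dtype=float)
    return np.concatenate(([1.0], mu))


def adapt_mean(W, mu):
    """Apply mu_new = W @ [1, mu_1, ..., mu_N] for an N x (N + 1) matrix W."""
    W = np.asarray(W, dtype=float)
    xi = extend_mean(mu)
    n = xi.size - 1
    if W.shape != (n, n + 1):
        raise ValueError(f"W must have shape ({n}, {n + 1}), got {W.shape}")
    return W @ xi


# Illustrative values only (not taken from the paper).
N = 3
mu = np.array([0.5, -1.0, 2.0])

# W = [0 | I]: zero bias column and identity rotation leave the mean unchanged.
W_identity = np.hstack([np.zeros((N, 1)), np.eye(N)])

# W = [b | I]: the first column acts as a bias that shifts the mean by b.
b = np.array([0.1, 0.2, 0.3])
W_shift = np.hstack([b[:, None], np.eye(N)])

mu_same = adapt_mean(W_identity, mu)
mu_shifted = adapt_mean(W_shift, mu)

# W has N * (N + 1) entries, all of which must be estimated from adaptation data.
n_free = W_shift.size
```

With N = 3 the matrix already has 12 entries to estimate. The count grows quadratically with the feature dimension, which is consistent with the abstract's remark that W becomes hard to estimate reliably at tree nodes with little data.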