A Random Matrix-Theoretic Approach to Handling Singular Covariance Estimates

01 September 2011


In many practical situations we would like to estimate the covariance matrix of a set of variables from an insufficient amount of data. More specifically, if we have a set of N independent identically distributed measurements of an M-dimensional random vector with N < M, the maximum-likelihood estimate for the covariance matrix is the sample covariance matrix, but here this estimate is singular and therefore fundamentally bad. We present a radically new approach to deal with this situation. Let K be the classical sample covariance matrix. Fix a parameter L with 1 ≤ L ≤ N and consider an ensemble of LxM random unitary matrices, {Phi}, having Haar probability measure (isotropically random). Pre- and post-multiply K by Phi and by the conjugate-transpose of Phi, respectively, to produce an LxL reduced-dimension covariance estimate. From this we define two new estimates. The first, cov_L(K), is obtained by a) projecting the reduced covariance estimate out (to MxM) through pre- and post-multiplication by the conjugate-transpose of Phi and by Phi, respectively, and b) taking the expectation over the unitary ensemble. The second, invcov_L(K), is obtained by a) inverting the reduced covariance estimate, b) projecting the inverse out (to MxM) through pre- and post-multiplication by the conjugate-transpose of Phi and by Phi, respectively, and c) taking the expectation over the unitary ensemble. The estimate cov is equivalent to diagonal loading. The estimate invcov retains the original eigenvectors, transforms the zero eigenvalues into equal positive values, and modifies the non-zero eigenvalues nonuniformly. We have a closed-form analytical expression for invcov in terms of the eigenvector/eigenvalue decomposition of the sample covariance. We motivate the use of invcov through applications to linear estimation, supervised learning, and high-resolution spectral estimation. We also compare the performance of the estimator invcov with that of diagonal loading.
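The construction above can be illustrated numerically. The following is a minimal sketch, not the closed-form expression mentioned in the abstract: it replaces the expectation over the unitary ensemble with a plain Monte Carlo average. The function names (haar_phi, ensemble_estimates), the number of draws, the zero-mean complex Gaussian test data, and the particular values of M, N, and L are all assumptions made for illustration. Phi is drawn by QR-factorizing a complex Gaussian matrix and correcting the column phases, a standard way to obtain an isotropically random matrix with orthonormal rows.

```python
import numpy as np

def haar_phi(L, M, rng):
    """Draw an L x M matrix with orthonormal rows, isotropically random."""
    G = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))
    Q, R = np.linalg.qr(G)            # Q is M x L with orthonormal columns
    d = np.diagonal(R)
    Q = Q * (d / np.abs(d))           # phase correction so the draw is Haar
    return Q.conj().T                 # L x M, with Phi @ Phi^H = I_L

def ensemble_estimates(K, L, n_draws=2000, seed=0):
    """Monte Carlo approximations of cov_L(K) and invcov_L(K)."""
    rng = np.random.default_rng(seed)
    M = K.shape[0]
    cov_acc = np.zeros((M, M), dtype=complex)
    inv_acc = np.zeros((M, M), dtype=complex)
    for _ in range(n_draws):
        Phi = haar_phi(L, M, rng)
        K_red = Phi @ K @ Phi.conj().T                 # L x L reduced estimate
        cov_acc += Phi.conj().T @ K_red @ Phi          # project out to M x M
        # Phi^H K_red^{-1} Phi, using a linear solve instead of an explicit inverse
        inv_acc += Phi.conj().T @ np.linalg.solve(K_red, Phi)
    return cov_acc / n_draws, inv_acc / n_draws

# Illustrative (assumed) sizes: fewer measurements than dimensions, N < M.
M, N, L = 8, 4, 3
rng = np.random.default_rng(42)
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
K = X @ X.conj().T / N                # sample covariance (zero mean assumed), rank N

cov_hat, inv_hat = ensemble_estimates(K, L)
print("rank of K:", np.linalg.matrix_rank(K))
print("eigenvalues of invcov estimate:", np.linalg.eigvalsh((inv_hat + inv_hat.conj().T) / 2))
```

With enough draws, the printed eigenvalues of the invcov estimate should all be positive even though K itself is singular, and those associated with the null space of K should be roughly equal, in line with the properties stated above. The Monte Carlo average only approximates the expectation, so agreement is up to sampling noise.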