A Pure L1-norm Principal Component Analysis

JP Brooks; JH Dulá; EL Boone

doi:10.1016/j.csda.2012.11.007

Comput Stat Data Anal. 2013 May 1;61:83-98. doi: 10.1016/j.csda.2012.11.007.

Authors

JP Brooks¹, JH Dulá, EL Boone

Affiliation

¹ Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, VA 23284.

Abstract

The L₁ norm has been applied in numerous variations of principal component analysis (PCA). L₁-norm PCA is an attractive alternative to traditional L₂-based PCA because it can impart robustness in the presence of outliers and is indicated for models where standard Gaussian assumptions about the noise may not apply. Of all the previously proposed PCA schemes that recast PCA as an optimization problem involving the L₁ norm, none provides globally optimal solutions in polynomial time. This paper proposes an L₁-norm PCA procedure based on the efficient calculation of the optimal solution of the L₁-norm best-fit hyperplane problem. We present a procedure called L₁-PCA*, based on the application of this idea, that fits data to subspaces of successively smaller dimension. The procedure is implemented and tested on a diverse problem suite. Our tests show that L₁-PCA* is the indicated procedure in the presence of unbalanced outlier contamination.

Keywords: L1 regression; linear programming; principal component analysis.
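
To make the link between the keywords and the best-fit hyperplane idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that an L₁-norm best-fit hyperplane through the origin can be obtained by running one L₁ regression per coordinate, each posed as a linear program, and keeping the fit with the smallest total absolute error. The abstract does not spell out this construction, so treat it as an assumption. The function names `l1_regression` and `l1_best_fit_hyperplane` are hypothetical, the data are assumed to be already centered, and SciPy's `linprog` is used only as a convenient LP solver.

```python
import numpy as np
from scipy.optimize import linprog


def l1_regression(X, y):
    """Least-absolute-deviations fit of y on X (no intercept), as an LP.

    Variables are [b, e_plus, e_minus] with X b + e_plus - e_minus = y and
    e_plus, e_minus >= 0; the objective is sum(e_plus + e_minus).
    """
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), np.ones(2 * n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p], res.fun


def l1_best_fit_hyperplane(X):
    """Assumed scheme: regress each coordinate on the others, keep the best.

    Returns (total_abs_error, dependent_index, normal), where the fitted
    hyperplane through the origin is {x : normal @ x = 0}.
    """
    n, m = X.shape
    best = None
    for j in range(m):
        others = np.delete(np.arange(m), j)
        beta, err = l1_regression(X[:, others], X[:, j])
        if best is None or err < best[0]:
            normal = np.zeros(m)
            normal[others] = beta  # x_j = beta @ x_others
            normal[j] = -1.0       # so beta @ x_others - x_j = 0
            best = (err, j, normal)
    return best
```

Under this assumption, a procedure in the spirit of L₁-PCA* would project the data onto the fitted hyperplane and repeat in one dimension lower; that step is omitted here because the abstract does not describe how the projection is done.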
