Use of pseudo-sample extraction and the projection technique to estimate the chemical rank of three-way data arrays

Anal Bioanal Chem. 2006 Apr;384(7-8):1493-500. doi: 10.1007/s00216-006-0307-7. Epub 2006 Mar 17.

Abstract

Determining the rank of a trilinear data array is a first step in subsequent trilinear component decomposition. Compared with estimating the rank of bilinear data, it is more difficult to decide the number of significant components needed to fit a trilinear decomposition exactly. General methods of rank estimation use the information contained in the singular values but ignore the information in the eigenvectors. In this paper, a rank-estimation method designed specifically for trilinear data arrays is proposed. It uses the idea of direct trilinear decomposition (DTLD) to compress the data cube into two pseudo-sample matrices, which are then decomposed by singular value decomposition. Two eigenvectors, combined with the projection technique, are used to estimate the rank of trilinear data arrays. Simulated trilinear data arrays with homoscedastic and heteroscedastic noise, different noise levels, and high collinearity, as well as real three-way data arrays, were used to illustrate the feasibility of the proposed method. Compared with other factor-determining methods, such as the factor indication function (IND), residual percentage variance (RPV), and the two-mode subspace comparison approach (TMSC), the new method gave more reliable answers under the different conditions applied.
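
The general idea of compressing a data cube into two pseudo-sample matrices and comparing their singular vectors can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm. The abstract does not give the exact projection criterion, so the sketch substitutes a principal-angle comparison of the leading left singular subspaces. The function names, the noise level, and the 0.9 threshold are hypothetical choices made for this example only.

```python
import numpy as np


def simulate_trilinear(shape=(30, 25, 10), rank=3, noise=1e-4, seed=0):
    """Build a noisy trilinear array X[i,j,k] = sum_f A[i,f] B[j,f] C[k,f] (assumed test data)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in shape)
    X = np.einsum("if,jf,kf->ijk", A, B, C)
    return X + noise * rng.standard_normal(shape)


def pseudo_samples(X):
    """Compress the third mode into two pseudo-sample matrices, in the spirit of DTLD."""
    I, J, K = X.shape
    X3 = X.reshape(I * J, K).T                      # mode-3 unfolding, K x (I*J)
    W = np.linalg.svd(X3, full_matrices=False)[0]   # left singular vectors of the unfolding
    P1 = np.einsum("ijk,k->ij", X, W[:, 0])         # slices weighted by 1st vector
    P2 = np.einsum("ijk,k->ij", X, W[:, 1])         # slices weighted by 2nd vector
    return P1, P2


def estimate_rank(X, max_rank=8, threshold=0.9):
    """Largest f for which the leading f-dimensional column spaces of P1 and P2 agree.

    Assumption: agreement is measured by the smallest principal-angle cosine,
    a stand-in for the projection criterion that the abstract does not specify.
    """
    P1, P2 = pseudo_samples(X)
    U1 = np.linalg.svd(P1, full_matrices=False)[0]
    U2 = np.linalg.svd(P2, full_matrices=False)[0]
    estimate = 0
    for f in range(1, max_rank + 1):
        # Singular values of U1_f^T U2_f are the cosines of the principal angles.
        cosines = np.linalg.svd(U1[:, :f].T @ U2[:, :f], compute_uv=False)
        if cosines.min() > threshold:
            estimate = f
    return estimate


X = simulate_trilinear(rank=3)
print("estimated rank:", estimate_rank(X))
```

The reasoning behind the sketch is that both pseudo-sample matrices share the column space spanned by the true component profiles, so their leading singular vectors agree up to the true rank, whereas the noise-dominated vectors beyond it point in unrelated directions. With heteroscedastic noise or highly collinear profiles, a fixed threshold of this kind would probably need tuning.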