An empirical comparison of information-theoretic criteria in estimating the number of independent components of fMRI data

PLoS One. 2011;6(12):e29274. doi: 10.1371/journal.pone.0029274. Epub 2011 Dec 27.

Abstract

Background: Independent Component Analysis (ICA) has been widely applied to the analysis of fMRI data. Accurate estimation of the number of independent components of fMRI data is critical to reduce over/under fitting. Although various methods based on Information Theoretic Criteria (ITC) have been used to estimate the intrinsic dimension of fMRI data, the relative performance of different ITC in the context of the ICA model hasn't been fully investigated, especially considering the properties of fMRI data. The present study explores and evaluates the performance of various ITC for the fMRI data with varied white noise levels, colored noise levels, temporal data sizes and spatial smoothness degrees.

Methodology: Both simulated data and real fMRI data with varied Gaussian white noise levels, first-order auto-regressive (AR(1)) noise levels, temporal data sizes and spatial smoothness degrees were carried out to deeply explore and evaluate the performance of different traditional ITC.

Principal findings: Results indicate that the performance of ITCs depends on the noise level, temporal data size and spatial smoothness of fMRI data. 1) High white noise levels may lead to underestimation of all criteria and MDL/BIC has the severest underestimation at the higher Gaussian white noise level. 2) Colored noise may result in overestimation that can be intensified by the increase of AR(1) coefficient rather than the SD of AR(1) noise and MDL/BIC shows the least overestimation. 3) Larger temporal data size will be better for estimation for the model of white noise but tends to cause severer overestimation for the model of AR(1) noise. 4) Spatial smoothing will result in overestimation in both noise models.

Conclusions: 1) None of ITC is perfect for all fMRI data due to its complicated noise structure. 2) If there is only white noise in data, AIC is preferred when the noise level is high and otherwise, Laplace approximation is a better choice. 3) When colored noise exists in data, MDL/BIC outperforms the other criteria.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Empirical Research
  • Humans
  • Likelihood Functions
  • Magnetic Resonance Imaging*
  • Models, Theoretical
  • Principal Component Analysis