PCAtest: testing the statistical significance of Principal Component Analysis in R
- PMID: 35194531
- PMCID: PMC8858582
- DOI: 10.7717/peerj.12967
Abstract
Principal Component Analysis (PCA) is one of the most broadly used statistical methods for the ordination and dimensionality-reduction of multivariate datasets across many scientific disciplines. Trivial PCs can be estimated from datasets without any correlational structure among the original variables, and traditional criteria for selecting non-trivial PC axes are difficult to implement, partially subjective, or based on ad hoc thresholds. PCAtest is an R package that implements permutation-based statistical tests to evaluate the overall significance of a PCA, the significance of each PC axis, and the contribution of each observed variable to the significant axes. Based on simulation and empirical results, I encourage R users to routinely apply PCAtest to test the significance of their PCA before proceeding with the direct interpretation of PC axes and/or the use of PC scores in subsequent evolutionary and ecological analyses.
Keywords: PCAtest; Permutation; Principal component analysis; R function; Statistical significance.
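The permutation idea described in the abstract can be illustrated with a minimal sketch. This is not the PCAtest implementation or its API. It is a stdlib-only Python illustration, and the function names (`pearson`, `structure_statistic`, `permutation_p_value`), the choice of statistic, and the simulated data are all assumptions made for this example. The statistic is the sum of squared pairwise correlations among variables. For a correlation-based PCA with eigenvalues λ_i, twice this sum equals Σ(λ_i − 1)², so it is zero when no correlational structure exists and grows as the leading PCs absorb more variance. Shuffling each variable independently destroys the correlations and yields a null distribution for the statistic.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5  # undefined for constant columns


def structure_statistic(columns):
    """Sum of squared pairwise correlations (illustrative statistic, an assumption)."""
    return sum(pearson(a, b) ** 2 for a, b in combinations(columns, 2))


def permutation_p_value(columns, n_perm=199, seed=0):
    """Return (observed statistic, permutation p-value) for overall structure."""
    rng = random.Random(seed)
    observed = structure_statistic(columns)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle each variable independently: correlations are destroyed,
        # but every variable keeps its own marginal distribution.
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if structure_statistic(shuffled) >= observed:
            exceed += 1
    # Count the observed value as one permutation so the p-value is never zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Simulated data (hypothetical): four variables driven by one latent factor,
# and four variables that are mutually independent.
data_rng = random.Random(1)
latent = [data_rng.gauss(0, 1) for _ in range(60)]
correlated = [[z + data_rng.gauss(0, 0.5) for z in latent] for _ in range(4)]
independent = [[data_rng.gauss(0, 1) for _ in range(60)] for _ in range(4)]

stat_corr, p_corr = permutation_p_value(correlated)
stat_ind, p_ind = permutation_p_value(independent)
print("correlated: statistic=%.3f p=%.3f" % (stat_corr, p_corr))
print("independent: statistic=%.3f p=%.3f" % (stat_ind, p_ind))
```

With the correlated variables, the observed statistic should lie far in the upper tail of the permutation distribution. With the independent variables it should look like a typical permuted value, which is the situation in which PC axes would be trivial. Testing individual axes or variable contributions, as the abstract describes for PCAtest, would need further statistics that this sketch does not attempt.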
© 2022 Camargo.
Conflict of interest statement
The author declared that they have no competing interests.
