Geometric anomaly detection in data
- PMID: 32747569
- PMCID: PMC7443892
- DOI: 10.1073/pnas.2001741117
Abstract
The quest for low-dimensional models which approximate high-dimensional data is pervasive across the physical, natural, and social sciences. The dominant paradigm underlying most standard modeling techniques assumes that the data are concentrated near a single unknown manifold of relatively small intrinsic dimension. Here, we present a systematic framework for detecting interfaces and related anomalies in data which may fail to satisfy the manifold hypothesis. By computing the local topology of small regions around each data point, we are able to partition a given dataset into disjoint classes, each of which can be individually approximated by a single manifold. Since these manifolds may have different intrinsic dimensions, local topology discovers singular regions in data even when none of the points have been sampled precisely from the singularities. We showcase this method by identifying the intersection of two surfaces in the 24-dimensional space of cyclo-octane conformations and by locating all of the self-intersections of a Henneberg minimal surface immersed in 3-dimensional space. Due to the local nature of the topological computations, the algorithmic burden of performing such data stratification is readily distributable across several processors.
Keywords: persistent cohomology; singularities; stratification inference.
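The local-topology idea in the abstract can be illustrated with a much simpler stand-in than persistent cohomology. The sketch below is not the authors' pipeline: it only counts connected components (degree-0 topology) of the data inside a small annulus around each point, for two line segments that cross at the origin. The function name `annulus_components` and the parameters `r_in`, `r_out`, and `link` are hypothetical choices made for this toy example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def annulus_components(points, center, r_in, r_out, link):
    """Count connected components of the samples in an annulus around `center`.

    Hypothetical helper: samples closer than `link` are treated as connected.
    """
    dist = np.linalg.norm(points - center, axis=1)
    shell = points[(dist >= r_in) & (dist <= r_out)]
    n = len(shell)
    if n == 0:
        return 0
    # Build a neighborhood graph on the annulus samples.
    pairs = np.array(list(cKDTree(shell).query_pairs(link)), dtype=int).reshape(-1, 2)
    graph = csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n))
    n_comp, _ = connected_components(graph, directed=False)
    return n_comp


# Toy data: the x-axis and y-axis segments on [-1, 1].
# An even sample count means the crossing point (0, 0) is never sampled.
t = np.linspace(-1.0, 1.0, 200)
zeros = np.zeros_like(t)
points = np.vstack([np.column_stack([t, zeros]), np.column_stack([zeros, t])])

# Assumed scales for this toy example only.
counts = np.array([annulus_components(points, p, 0.1, 0.2, 0.05) for p in points])

# A regular point on a curve sees two annulus components;
# more than two suggests a nearby crossing.
singular = points[counts > 2]
```

On this toy dataset, samples far from the crossing see two annulus components, while the samples closest to the origin see four, even though the origin itself is not in the data. Each point's count depends only on its own neighborhood, so the loop over points could be split across worker processes without coordination.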
Copyright © 2020 the Author(s). Published by PNAS.
Conflict of interest statement
The authors declare no competing interest.
