Machine learning based classification of cells into chronological stages using single-cell transcriptomics

Sci Rep. 2018 Nov 21;8(1):17156. doi: 10.1038/s41598-018-35218-5.


Age-associated deterioration of cellular physiology leads to pathological conditions. The ability to detect premature aging could provide a window for preventive therapies against age-related diseases. However, current techniques for determining cellular age rely on a small set of histological markers and lack predictive power. Here, we implement GERAS (GEnetic Reference for Age of Single-cell), a machine learning-based framework capable of assigning individual cells to chronological stages based on their transcriptomes. GERAS displays greater than 90% accuracy in classifying the chronological stage of zebrafish and human pancreatic cells. The framework is robust against biological and technical noise, as evaluated by its performance on independent samplings of single cells. Additionally, GERAS determines the impact of differences in calorie intake and BMI on the aging of zebrafish and human pancreatic cells, respectively. We further harness the classification ability of GERAS to identify molecular factors that are potentially associated with the aging of beta-cells. We show that one of these factors, junba, is necessary to maintain the proliferative state of juvenile beta-cells. Our results showcase the applicability of a machine learning framework to classify the chronological stage of heterogeneous cell populations, while enabling detection of candidate genes associated with aging.
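
The core idea of the abstract, assigning a single cell to a chronological stage from its transcriptome, can be illustrated with a minimal sketch. This is not the GERAS implementation: the abstract does not specify the model, so a simple nearest-centroid classifier is swapped in here purely for illustration. The four-gene count vectors, the stage labels "adolescent" and "adult", and all function names are hypothetical.

```python
import math
from collections import defaultdict


def log_normalize(counts, scale=1e4):
    """Scale one cell's counts to a fixed library size, then log-transform."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]


def fit_centroids(profiles, stages):
    """Average the normalized profiles of each stage into one centroid."""
    grouped = defaultdict(list)
    for profile, stage in zip(profiles, stages):
        grouped[stage].append(log_normalize(profile))
    return {
        stage: [sum(column) / len(column) for column in zip(*cells)]
        for stage, cells in grouped.items()
    }


def predict_stage(centroids, profile):
    """Assign a cell to the stage whose centroid is closest (Euclidean)."""
    x = log_normalize(profile)
    return min(centroids, key=lambda stage: math.dist(x, centroids[stage]))


def accuracy(centroids, profiles, stages):
    """Fraction of held-out cells assigned to their true stage."""
    hits = sum(predict_stage(centroids, p) == s for p, s in zip(profiles, stages))
    return hits / len(stages)


# Hypothetical toy data: raw counts for four made-up genes per cell.
train_profiles = [
    [90, 10, 5, 5], [80, 15, 5, 10],      # juvenile
    [30, 60, 20, 10], [25, 70, 15, 10],   # adolescent (assumed label)
    [5, 10, 80, 40], [10, 5, 90, 30],     # adult (assumed label)
]
train_stages = ["juvenile", "juvenile", "adolescent",
                "adolescent", "adult", "adult"]

centroids = fit_centroids(train_profiles, train_stages)

# Held-out cells from an independent (here: invented) sampling.
test_profiles = [[85, 12, 6, 7], [8, 8, 85, 35]]
test_stages = ["juvenile", "adult"]

predicted = [predict_stage(centroids, p) for p in test_profiles]
held_out_accuracy = accuracy(centroids, test_profiles, test_stages)
```

Evaluating on cells that were not used for fitting mirrors, in a very reduced form, the abstract's point about testing on independent samplings; a real analysis would use thousands of genes and cells and a model chosen and validated for that data.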

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Age Factors
  • Animals
  • Cytological Techniques / methods*
  • Gene Expression Profiling*
  • Humans
  • Insulin-Secreting Cells / classification*
  • Machine Learning*
  • Single-Cell Analysis / methods*
  • Zebrafish