Predicting cell lineages using autoencoders and optimal transport

PLoS Comput Biol. 2020 Apr 28;16(4):e1007828. doi: 10.1371/journal.pcbi.1007828. eCollection 2020 Apr.


Lineage tracing involves the identification of all ancestors and descendants of a given cell, and is an important tool for studying biological processes such as development and disease progression. However, in many settings, controlled time-course experiments are not feasible, for example when working with tissue samples from patients. Here we present ImageAEOT, a computational pipeline based on autoencoders and optimal transport for predicting the lineages of cells using time-labeled datasets from different stages of a cellular process. Given a single-cell image from one of the stages, ImageAEOT generates an artificial lineage of this cell based on the population characteristics of the other stages. These lineages can be used to connect subpopulations of cells through the different stages and identify image-based features and biomarkers underlying the biological process. To validate our method, we apply ImageAEOT to a benchmark task based on nuclear and chromatin images during the activation of fibroblasts by tumor cells in engineered 3D tissues. We further validate ImageAEOT on chromatin images of various breast cancer cell lines and human tissue samples, thereby linking alterations in chromatin condensation patterns to different stages of tumor progression. Our results demonstrate the promise of computational methods based on autoencoding and optimal transport principles for lineage tracing in settings where existing experimental strategies cannot be used.
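
The optimal transport step described above can be illustrated with a minimal sketch. It assumes each cell image has already been encoded into a low-dimensional latent vector by an autoencoder (not shown), and it uses entropy-regularized optimal transport solved by Sinkhorn iterations as a stand-in; the abstract does not state which solver or cost ImageAEOT actually uses. All names here (`sinkhorn_plan`, `predict_next_stage`, `reg`, `n_iters`) are hypothetical and chosen for illustration only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_plan(source, target, reg=1.0, n_iters=200):
    """Entropy-regularized transport plan between two point clouds.

    source, target: lists of latent vectors (assumed autoencoder codes)
    from two consecutive stages. Both populations get uniform weights.
    Returns an n x m matrix whose entry (i, j) is the mass moved from
    source cell i to target cell j.
    """
    n, m = len(source), len(target)
    # Gibbs kernel built from the squared-distance cost.
    K = [[math.exp(-sq_dist(s, t) / reg) for t in target] for s in source]
    a = [1.0 / n] * n  # uniform weights on the source stage
    b = [1.0 / m] * m  # uniform weights on the target stage
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def predict_next_stage(plan_row, target):
    """Barycentric projection of one source cell onto the target stage.

    Returns the plan-weighted mean of the target latent vectors, i.e. a
    predicted latent position for this cell at the next stage.
    """
    total = sum(plan_row)
    dim = len(target[0])
    return [
        sum(w * t[d] for w, t in zip(plan_row, target)) / total
        for d in range(dim)
    ]


# Toy latent codes for two stages (made-up numbers, not real data).
stage0 = [[0.0, 0.0], [5.0, 5.0]]
stage1 = [[0.5, 0.2], [5.2, 4.9]]

plan = sinkhorn_plan(stage0, stage1)
predicted = predict_next_stage(plan[0], stage1)
```

In a pipeline of the kind the abstract describes, the predicted latent vector would then be passed through the decoder half of the autoencoder to produce an image, and repeating the step across stages would yield one artificial lineage. That decoding step is omitted here because it depends on a trained model.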

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Breast Neoplasms
  • Cell Differentiation / physiology
  • Cell Line, Tumor
  • Cell Lineage*
  • Cell Nucleus / physiology
  • Chromatin / physiology
  • Coculture Techniques
  • Computational Biology / methods*
  • Female
  • Humans
  • Image Processing, Computer-Assisted
  • Reproducibility of Results
  • Single-Cell Analysis / methods*


Substances

  • Chromatin

Grant support

KDY was partially supported by the National Science Foundation (NSF) Graduate Research Fellowship and ONR (N00014-18-1-2765). The GVS laboratory thanks the Mechanobiology Institute (MBI), National University of Singapore (NUS), Singapore, and the Ministry of Education (MOE) Tier-3 Grant Program for funding. CU was partially supported by NSF (DMS-1651995), ONR (N00014-17-1-2147 and N00014-18-1-2765), a Sloan Fellowship, and a Simons Investigator Award. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.