Improved detection of differentially represented DNA barcodes for high-throughput clonal phenomics

Mol Syst Biol. 2020 Mar;16(3):e9195. doi: 10.15252/msb.20199195.

Abstract

Cellular DNA barcoding has become a popular approach to study heterogeneity of cell populations and to identify clones with differential response to cellular stimuli. However, there is a lack of reliable methods for statistical inference of differentially responding clones. Here, we used mixtures of DNA-barcoded cell pools to generate a realistic benchmark read count dataset for modelling a range of outcomes of clone-tracing experiments. By accounting for the statistical properties intrinsic to DNA barcode read count data, we implemented an improved algorithm that yields a significantly lower false-positive rate than current RNA-seq data analysis algorithms, especially when detecting differentially responding clones in experiments with strong selection pressure. Building on this statistical methodology, we illustrate how multidimensional phenotypic profiling enables one to deconvolute phenotypically distinct clonal subpopulations within a cancer cell line. The mixture control dataset and our analysis results provide a foundation for benchmarking and improving algorithms for clone-tracing experiments.

Keywords: DNA barcoding; clone tracing; fate mapping; lineage tracing; phenomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cell Line, Tumor
  • Clone Cells
  • DNA Barcoding, Taxonomic / methods*
  • HEK293 Cells
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Neoplasms / genetics*
  • Phenomics / methods*
  • Selection, Genetic
  • Sequence Analysis, DNA