An information-theoretic approach for measuring the distance of organ tissue samples using their transcriptomic signatures

Bioinformatics. 2021 Jan 29;36(21):5194-5204. doi: 10.1093/bioinformatics/btaa654.

Abstract

Motivation: Recapitulating aspects of human organ function using in vitro (e.g. plates, transwells), in vivo (e.g. mouse, rat) or ex vivo (e.g. organ chips, 3D systems) organ models is of paramount importance for drug discovery and precision medicine. Such models allow potential side effects to be identified and the effectiveness of new therapeutic approaches to be tested early in their design phase, and they inform the development of better disease models. Developing mathematical methods that reliably quantify how close an organ model is to the real human organ it represents is an understudied problem with important applications in biomedicine and tissue engineering.

Results: We introduce the Transcriptomic Signature Distance (TSD), an information-theoretic distance for assessing the transcriptomic similarity of two tissue samples or of two groups of tissue samples. TSD leverages next-generation sequencing data together with information retrieved from well-curated databases that provide signature gene sets characteristic of human organs. We present the justification and mathematical development of the new distance and demonstrate its effectiveness and advantages in several scenarios of practical importance using publicly available RNA-seq datasets.
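
The abstract does not state the TSD formula, so the following is not TSD. It is only a minimal sketch of the general idea of an information-theoretic distance between two samples restricted to an organ signature gene set, using the Jensen-Shannon distance as a stand-in. The function names, the gene list and the expression values are all hypothetical and chosen purely for illustration.

```python
import math


def normalize(values):
    """Turn non-negative expression values into a probability distribution."""
    total = sum(values)
    if total <= 0:
        raise ValueError("expression profile has no signal on the signature genes")
    return [v / total for v in values]


def kl_bits(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(sample_a, sample_b, signature_genes):
    """Jensen-Shannon distance between two samples over a signature gene set.

    ASSUMPTION: this is a generic stand-in, not the TSD defined in the paper.
    Samples are dicts mapping gene symbol -> non-negative expression value.
    """
    # Keep only signature genes measured in both samples.
    genes = [g for g in signature_genes if g in sample_a and g in sample_b]
    if not genes:
        raise ValueError("no signature genes shared by both samples")
    p = normalize([sample_a[g] for g in genes])
    q = normalize([sample_b[g] for g in genes])
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)
    # With base-2 logs the divergence lies in [0, 1]; its square root is a metric.
    return math.sqrt(max(jsd, 0.0))


# Hypothetical liver signature and made-up expression values.
signature = ["ALB", "APOA1", "CYP3A4"]
human_liver = {"ALB": 900.0, "APOA1": 300.0, "CYP3A4": 120.0, "ACTB": 500.0}
organ_model = {"ALB": 400.0, "APOA1": 250.0, "CYP3A4": 10.0, "ACTB": 700.0}

distance = js_distance(human_liver, organ_model, signature)
print(f"Jensen-Shannon distance on signature genes: {distance:.3f}")
```

Under these assumptions the distance is 0 for identical signature profiles and 1 for profiles with no overlapping expression. Genes outside the signature (here the hypothetical ACTB entry) do not affect the result.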

Availability and implementation: Both TSD versions (simple and weighted) have been implemented in R, and the code can be downloaded from https://github.com/Cod3B3nd3R/Transcriptomic-Signature-Distance.

Contact: dimitris.manatakis@emulatebio.com.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Factual
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Mice
  • RNA-Seq
  • Rats
  • Software
  • Transcriptome*
  • Whole Exome Sequencing