Approaches for integrating heterogeneous RNA-seq data reveal cross-talk between microbes and genes in asthmatic patients

Genome Biol. 2020 Jun 22;21(1):150. doi: 10.1186/s13059-020-02033-z.

Abstract

Sputum induction is a non-invasive method to evaluate the airway environment, particularly for asthma. RNA sequencing (RNA-seq) of sputum samples can be challenging to interpret due to the complex and heterogeneous mixtures of human cells and exogenous (microbial) material. In this study, we develop a pipeline that integrates dimensionality reduction and statistical modeling to grapple with the heterogeneity. LDA-link (LDA: latent Dirichlet allocation) connects microbes to genes using reduced-dimensionality LDA topics. We validate our method with single-cell RNA-seq and microscopy and then apply it to the sputum of asthmatic patients to find known and novel relationships between microbes and genes.
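
To make the idea of linking microbes to genes through LDA topics more concrete, here is a minimal sketch in Python. It is not the authors' LDA-link implementation and may differ from the published pipeline in every detail. It assumes a toy collapsed Gibbs sampler as the LDA step and a plain Pearson correlation as a stand-in for the statistical modeling step. The function names (`lda_gibbs`, `pearson`), the hyperparameters, and the six-sample data set are all hypothetical and invented for illustration.

```python
import math
import random


def lda_gibbs(docs, n_topics, n_genes, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not the paper's code).

    docs: one list of gene ids per sample, one id per read.
    Returns theta, the per-sample topic proportions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_gene = [[0] * n_genes for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Give every read a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for g, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_gene[z][g] += 1
            topic_total[z] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, g in enumerate(doc):
                # Remove the read's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_gene[z][g] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_gene[k][g] + beta)
                    / (topic_total[k] + n_genes * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_gene[z][g] += 1
                topic_total[z] += 1
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is (near) constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return sxy / den if den > 1e-12 else 0.0


# Hypothetical data: samples 0-2 express genes 0-2, samples 3-5 express genes 3-5.
docs = [[0, 1, 2] * 10 for _ in range(3)] + [[3, 4, 5] * 10 for _ in range(3)]
# Hypothetical read counts of one microbe in the same six samples.
microbe = [50, 60, 55, 2, 1, 3]

theta = lda_gibbs(docs, n_topics=2, n_genes=6)
links = [pearson([row[k] for row in theta], microbe) for k in range(2)]
for k, r in enumerate(links):
    print(f"topic {k}: correlation with microbe abundance = {r:.2f}")
```

In this sketch the microbe is compared against a handful of topic proportions rather than against every gene individually, which is the sense in which the topics act as a reduced-dimensionality intermediate. A topic that tracks the microbe's abundance then points to the genes that carry the most weight in that topic.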

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Asthma / genetics
  • Asthma / microbiology*
  • Case-Control Studies
  • Computational Biology / methods*
  • Female
  • Humans
  • Male
  • Microbiota*
  • Middle Aged
  • Sequence Analysis, RNA*
  • Sputum / chemistry*
  • Sputum / cytology
  • Unsupervised Machine Learning