Pathogen discovery from human tissue by sequence-based computational subtraction

Genomics. 2003 Mar;81(3):329-35. doi: 10.1016/s0888-7543(02)00043-5.


We recently reported a new approach to pathogen discovery, "computational subtraction". In this approach, non-human transcripts are detected by sequencing cDNA libraries from infected tissue and eliminating those transcripts that match the human genome. We now show that this method is experimentally feasible. We generated a cDNA library from a tissue sample of post-transplant lymphoproliferative disorder (PTLD). A total of 27,840 independent cDNA sequences were filtered by computational subtraction against the known human sequence, leaving 32 nonmatching transcripts. Of these, 22 (0.1%) could be amplified from both infected and noninfected samples and were inferred to be human DNA not yet contained in the available human genome sequence. The remaining 10 sequences could be amplified only from Epstein-Barr virus (EBV)-infected tissues, and all 10 corresponded to the known EBV sequence. This proof-of-principle experiment demonstrates that computational subtraction can detect pathogenic microbes in primary diseased human tissue.
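
The filtering step described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which alignment tool or thresholds were used, so the sketch substitutes simple exact k-mer matching for a real sequence aligner. The function names, the toy sequences, `k`, and `min_shared_fraction` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_reference_index(reference_seqs: Iterable[str], k: int) -> Set[str]:
    """Collect every k-mer that occurs in any reference (e.g. human) sequence."""
    index: Set[str] = set()
    for ref in reference_seqs:
        index |= kmers(ref, k)
    return index


def subtract(
    reads: Dict[str, str],
    reference_index: Set[str],
    k: int,
    min_shared_fraction: float = 0.5,  # hypothetical cutoff, not from the paper
) -> Dict[str, str]:
    """Return the reads that do NOT match the reference.

    A read counts as matching when at least min_shared_fraction of its
    k-mers occur in the reference index. Exact k-mer matching is a crude
    stand-in for a real aligner: it ignores mismatches and the reverse
    complement strand.
    """
    nonmatching: Dict[str, str] = {}
    for name, seq in reads.items():
        read_kmers = kmers(seq, k)
        if not read_kmers:
            continue  # read shorter than k cannot be classified; it is dropped
        shared = len(read_kmers & reference_index) / len(read_kmers)
        if shared < min_shared_fraction:
            nonmatching[name] = seq
    return nonmatching


# Toy data (invented for this example, not real genomic sequence).
human_reference = ["ACGTACGTTAGCCGATAGGCTAACGT"]
reads = {
    "read_human": "ACGTTAGCCGATAGG",         # exact substring of the reference
    "read_foreign": "TTTTGGGGCCCCAAAATTTT",  # shares no 8-mer with the reference
}

K = 8
index = build_reference_index(human_reference, K)
survivors = subtract(reads, index, K)
print(sorted(survivors))
```

In this toy run only `read_foreign` survives subtraction. As in the study, surviving sequences are only candidates: the abstract reports that 22 of the 32 nonmatching transcripts turned out to be human DNA missing from the reference, so an experimental check such as amplification from infected and noninfected samples is still needed.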

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • DNA, Complementary
  • Herpesvirus 4, Human / genetics
  • Herpesvirus 4, Human / isolation & purification*
  • Humans
  • Polymerase Chain Reaction
  • Subtraction Technique*

