Genetic basis of virulence attenuation revealed by comparative genomic analysis of Mycobacterium tuberculosis strain H37Ra versus H37Rv

PLoS One. 2008 Jun 11;3(6):e2375. doi: 10.1371/journal.pone.0002375.


Tuberculosis, caused by Mycobacterium tuberculosis, remains a leading infectious disease despite the availability of chemotherapy and the BCG vaccine. The commonly used avirulent M. tuberculosis strain H37Ra was derived from the virulent strain H37 in 1935, but the basis of its virulence attenuation has remained obscure despite numerous studies. We determined the complete genomic sequence of H37Ra ATCC25177 and compared it with those of its virulent counterpart H37Rv and a clinical isolate, CDC1551. The H37Ra genome is highly similar to that of H37Rv with respect to gene content and order but is 8,445 bp larger as a result of 53 insertions and 21 deletions in H37Ra relative to H37Rv. Variations in repetitive sequences such as IS6110 and PE/PPE/PE-PGRS family genes are responsible for most of the gross genetic changes. A total of 198 single nucleotide variations (SNVs) that differ between H37Ra and H37Rv were identified, yet 119 of them are identical between H37Ra and CDC1551 and 3 are due to H37Rv strain variation, leaving only 76 H37Ra-specific SNVs that affect only 32 genes. The biological impact of missense mutations in protein coding sequences was analyzed in silico, while nucleotide variations in potential promoter regions of several important genes were verified by quantitative RT-PCR. Mutations affecting transcription factors and/or global metabolic regulation related to in vitro survival under aging stress, and mutations affecting cell envelope, primary metabolism, and in vivo growth, as well as variations in the PE/PPE/PE-PGRS family genes, may underlie the basis of virulence attenuation. These findings have implications not only for an improved understanding of the pathogenesis of M. tuberculosis but also for the development of new vaccines and new therapeutic agents.
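The SNV filtering arithmetic in the abstract can be checked with a minimal sketch. The function names and the toy genome positions below are hypothetical and chosen only for illustration; the abstract reports counts, not positions, and this is not the authors' pipeline.

```python
def strain_specific_count(total, shared_with_clinical_isolate, reference_variation):
    """Count SNVs left after removing those explained by other strains."""
    remaining = total - shared_with_clinical_isolate - reference_variation
    if remaining < 0:
        raise ValueError("subtracted categories exceed the total")
    return remaining


def strain_specific_positions(snvs_vs_reference, shared_with_clinical_isolate,
                              reference_variation):
    """Set-based version of the same filter, applied to SNV positions."""
    return snvs_vs_reference - shared_with_clinical_isolate - reference_variation


# Counts reported in the abstract: 198 SNVs between H37Ra and H37Rv,
# 119 shared with CDC1551, 3 attributed to H37Rv strain variation.
h37ra_specific = strain_specific_count(198, 119, 3)
print(h37ra_specific)

# Toy positions (hypothetical, not real coordinates) to show the set logic.
toy_specific = strain_specific_positions({10, 20, 30, 40, 50}, {20, 40}, {50})
print(sorted(toy_specific))
```

The count-based function reproduces the 76 H37Ra-specific SNVs stated in the abstract; the set-based function shows how the same subtraction would look when each SNV is tracked individually.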

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Cell Membrane / genetics
  • Chromosomes, Bacterial / genetics
  • Genes, Bacterial
  • Genetic Variation
  • Genome, Bacterial / genetics*
  • Molecular Sequence Data
  • Mutation / genetics
  • Mycobacterium tuberculosis / classification
  • Mycobacterium tuberculosis / genetics*
  • Mycobacterium tuberculosis / metabolism
  • Mycobacterium tuberculosis / pathogenicity*
  • Phylogeny
  • Repetitive Sequences, Nucleic Acid
  • Sequence Alignment
  • Transcription Factors / genetics
  • Virulence / genetics


Substances

  • Bacterial Proteins
  • Transcription Factors