Mosaic graphs and comparative genomics in phage communities

J Comput Biol. 2010 Sep;17(9):1315-26. doi: 10.1089/cmb.2010.0108.


Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteria / virology
  • Bacteriophages / genetics*
  • Base Sequence
  • Evolution, Molecular
  • Genome, Viral*
  • Genomics / methods*
  • Molecular Sequence Data
  • Multigene Family
  • Recombination, Genetic
  • Sequence Alignment