Comparative genomics tools applied to bioterrorism defence

Brief Bioinform. 2003 Jun;4(2):133-49. doi: 10.1093/bib/4.2.133.


Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale pathogen detection effort in early 2000, when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Laboratory (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control and Prevention (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies. Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application.
It is hoped that a discussion of a real-life practical application of comparative genomics algorithms may help spur algorithm developers to tackle some of the many remaining problems that need to be addressed. Solutions to these problems will advance a wide range of biological disciplines, only one of which is pathogen detection. For example, exploring evolution and phylogenetics, annotating gene coding regions, predicting and understanding gene function and regulation, and untangling gene networks all rely on tools for aligning multiple sequences, detecting gene rearrangements and duplications, and visualising genomic data. Two key problems currently needing improved solutions are: (1) aligning incomplete, fragmentary sequence (eg draft genome contigs or arbitrary genome regions) with both complete genomes and other fragmentary sequences; and (2) ordering, aligning and visualising non-colinear gene rearrangements and inversions in addition to the colinear alignments handled by current tools.
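
The idea of comparing a target genome against its near-neighbours to focus on regions likely to be unique can be illustrated with a toy sketch. This is not the LLNL system, which the abstract does not describe in detail. It swaps in exact k-mer matching for the whole-genome alignment tools the abstract refers to, and the function names, the value of k, and the example sequences are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Collect every length-k substring of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_unique_regions(
    target: str, neighbours: Iterable[str], k: int = 8
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans of target covered by k-mers
    that occur in no neighbour sequence on either strand.

    Assumption: exact k-mer matching stands in for a real genome
    aligner, so near-identical regions are not filtered out.
    """
    target = target.upper()
    background: Set[str] = set()
    for seq in neighbours:
        seq = seq.upper()
        background |= kmers(seq, k)
        background |= kmers(reverse_complement(seq), k)

    regions: List[Tuple[int, int]] = []
    start, end = None, 0
    for i in range(len(target) - k + 1):
        if target[i:i + k] in background:
            continue
        if start is not None and i <= end:
            end = i + k  # window overlaps or abuts the open span: extend it
        else:
            if start is not None:
                regions.append((start, end))
            start, end = i, i + k
    if start is not None:
        regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Hypothetical toy sequences, not real genomic data.
    print(candidate_unique_regions("AAAAAACGCGAAAAAA", ["AAAAAAAAAA"], k=4))
```

In practice such candidate spans would only be a starting point: a real pipeline would also have to tolerate mismatches, handle draft contigs, and validate the resulting assays in the laboratory, as the abstract stresses.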

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Bacterial Proteins / metabolism
  • Base Sequence
  • Bioterrorism*
  • Genes, Bacterial
  • Genes, Viral
  • Genome
  • Genomics / methods*
  • Humans
  • Models, Molecular
  • Protein Structure, Tertiary
  • Sequence Alignment
  • Software
  • United States
  • Viral Proteins / metabolism


Substances

  • Bacterial Proteins
  • Viral Proteins