Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes

Trends Microbiol. 2001 Jul;9(7):335-43. doi: 10.1016/s0966-842x(01)02079-0.


A gene in a genome is defined as putative alien (pA) if its codon usage difference from the average gene exceeds a high threshold and codon usage differences from ribosomal protein genes, chaperone genes and protein-synthesis-processing factors are also high. pA gene clusters in bacterial genomes are relevant for detecting genomic islands (GIs), including pathogenicity islands (PAIs). Four other analyses appropriate to this task are G+C genome variation (the standard method); genomic signature divergences (dinucleotide bias); extremes of codon bias; and anomalies of amino acid usage. For example, the cagA domain of Helicobacter pylori is highly deviant in its genome signature and codon bias from the rest of the genome. Using these methods we can detect two potential PAIs in the Neisseria meningitidis genome, which contain hemagglutinin and/or hemolysin-related genes. Additionally, G+C variation and genome signature differences of the Mycobacterium tuberculosis genome indicate two pA gene clusters.
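Two of the analyses named above, G+C variation and genomic signature (dinucleotide bias) divergence, can be illustrated with a short window scan. The following is a minimal sketch, not the authors' implementation. It assumes the commonly used strand-symmetrized dinucleotide relative abundance, rho*_XY = f*_XY / (f*_X f*_Y), and a mean absolute difference between a window and the whole genome, delta* = (1/16) sum |rho*_XY(window) - rho*_XY(genome)|. The function names, window size and step are hypothetical choices for illustration, and the abstract does not state any thresholds, so none are applied here.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of G+C among unambiguous bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def signature(seq):
    """Strand-symmetrized dinucleotide relative abundances rho*_XY."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    # Count on the sequence and its reverse complement (symmetrization).
    for strand in (seq, reverse_complement(seq)):
        for i, base in enumerate(strand):
            if base in BASES:
                mono[base] += 1
                nxt = strand[i + 1:i + 2]
                if nxt and nxt in BASES:
                    di[base + nxt] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    rho = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        # Assumption of this sketch: undefined ratios are reported as 0.0.
        rho[x + y] = observed / expected if expected else 0.0
    return rho


def signature_difference(rho_a, rho_b):
    """Mean absolute difference over the 16 dinucleotides (delta*)."""
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16


def scan_windows(genome, window=500, step=500):
    """Compare each window with the whole genome.

    Returns a list of (start, gc_deviation, delta_star) tuples.
    """
    genome_gc = gc_content(genome)
    genome_rho = signature(genome)
    results = []
    for start in range(0, len(genome) - window + 1, step):
        chunk = genome[start:start + window]
        results.append((
            start,
            abs(gc_content(chunk) - genome_gc),
            signature_difference(signature(chunk), genome_rho),
        ))
    return results


if __name__ == "__main__":
    # Toy data only: an AT-rich "host" with a GC-rich insert at 2000-3000.
    host = "ATGCATTA" * 500
    island = "GGCCGCGC" * 125
    toy_genome = host[:2000] + island + host[2000:]
    for start, gc_dev, delta in scan_windows(toy_genome):
        print(start, round(gc_dev, 3), round(delta, 3))
```

Windows that stand out on both measures are only candidates for closer inspection. On real genomes the window size needs to be large enough for stable dinucleotide counts, and the pA definition in the abstract additionally relies on codon usage differences, which this sketch does not compute.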

Publication types

  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Bacteria / genetics*
  • Bacteria / pathogenicity
  • Base Sequence
  • DNA, Bacterial
  • Genome, Bacterial*
  • Multigene Family*
  • Sequence Analysis, DNA

