A genome-based approach for the identification of essential bacterial genes

Nat Biotechnol. 1998 Sep;16(9):851-6. doi: 10.1038/nbt0998-851.


We have used comparative genomics to identify 26 Escherichia coli open reading frames that are both of unknown function (hypothetical open reading frames or y-genes) and conserved in the compact genome of Mycoplasma genitalium. Not surprisingly, these genes are broadly conserved in the bacterial world. We used a markerless knockout strategy to screen for essential E. coli genes. To verify this phenotype, we constructed conditional mutants in genes for which no null mutants could be obtained. In total we identified six genes that are essential for E. coli (yhbZ, ygjD, ycfB, yfil, yihA, and yjeQ). The respective orthologs of the genes yhbZ, ygjD, ycfB, yjeQ, and yihA are also essential in Bacillus subtilis. This low number of essential genes was unexpected and might be due to a characteristic of the versatile genomes of E. coli and B. subtilis that is comparable to the phenomenon of nonorthologous gene displacement. The gene ygjD, encoding a sialoglycoprotease, was eliminated from a minimal genome computationally derived from a comparison of the Haemophilus influenzae and M. genitalium genomes. We show that ygjD and its ortholog ydiE are essential in E. coli and B. subtilis, respectively. Thus, we include this gene in a minimal genome. This study systematically integrates comparative genomics and targeted gene disruptions to identify broadly conserved bacterial genes of unknown function required for survival on complex media.
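The selection step described in the abstract (keep E. coli open reading frames that are both of unknown function and conserved in M. genitalium) can be sketched as a simple filter. This is a minimal illustration and not the authors' pipeline: the `Orf` record, the `candidate_orfs` function, and every gene name and flag in the toy list are hypothetical placeholders, and the ortholog assignments are assumed to come from a separate sequence comparison that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one E. coli open reading frame."""
    name: str
    function_known: bool      # False for hypothetical ORFs (y-genes)
    has_mgen_ortholog: bool   # True if an M. genitalium ortholog was assigned


def candidate_orfs(orfs):
    """Return the sorted names of ORFs of unknown function that are
    conserved in M. genitalium."""
    return sorted(
        o.name for o in orfs
        if not o.function_known and o.has_mgen_ortholog
    )


# Toy input: placeholder names and flags, NOT data from the study.
orfs = [
    Orf("geneA", function_known=True, has_mgen_ortholog=True),
    Orf("yxxA", function_known=False, has_mgen_ortholog=True),
    Orf("yxxB", function_known=False, has_mgen_ortholog=False),
    Orf("yxxC", function_known=False, has_mgen_ortholog=True),
]

print(candidate_orfs(orfs))
```

In the study, the genes passing this kind of filter were then tested experimentally by knockout. The sketch covers only the computational preselection.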

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Bacillus subtilis / genetics
  • Base Sequence
  • DNA Primers
  • Escherichia coli / genetics*
  • Genome, Bacterial*
  • Molecular Sequence Data
  • Open Reading Frames
  • Saccharomyces cerevisiae / genetics
  • Sequence Homology, Amino Acid

Substances

  • DNA Primers