Absent sequences: nullomers and primes

Pac Symp Biocomput. 2007:355-66. doi: 10.1142/9789812772435_0034.

Abstract

We describe a new publicly available algorithm for identifying absent sequences, and demonstrate its use by listing the smallest oligomers not found in the human genome (human "nullomers"), and those not found in any reported genome or GenBank sequence ("primes"). These absent sequences define the maximum set of potentially lethal oligomers. They also provide a rational basis for choosing artificial DNA sequences for molecular barcodes, show promise for species identification and environmental characterization based on absence, and identify potential targets for therapeutic intervention and suicide markers.
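The idea of an absent sequence can be illustrated with a minimal brute-force sketch. This is not the authors' published algorithm, which the abstract does not detail. It simply enumerates every possible oligomer of length k and subtracts those observed in the input. The function names (`observed_kmers`, `absent_kmers`, `smallest_absent`) are hypothetical. Counting the reverse-complement strand as "present" is an assumption made here, and so is skipping windows that contain non-ACGT characters. The approach only suits small k, since it materializes all 4^k candidates in memory.

```python
from itertools import product

ALPHABET = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an uppercase ACGT string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def observed_kmers(seq, k):
    """Collect every length-k window of seq made only of A, C, G, T."""
    seq = seq.upper()
    valid = set(ALPHABET)
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:  # skip windows containing N or other symbols
            seen.add(kmer)
    return seen


def absent_kmers(seqs, k, both_strands=True):
    """Return the sorted k-mers that occur in none of the sequences.

    Assumption: with both_strands=True, a k-mer counts as present if it
    or its reverse complement occurs in the input.
    """
    seen = set()
    for seq in seqs:
        kmers = observed_kmers(seq, k)
        seen |= kmers
        if both_strands:
            seen |= {reverse_complement(x) for x in kmers}
    universe = {"".join(p) for p in product(ALPHABET, repeat=k)}
    return sorted(universe - seen)


def smallest_absent(seqs, max_k=12):
    """Find the smallest k (up to max_k) with at least one absent k-mer."""
    for k in range(1, max_k + 1):
        missing = absent_kmers(seqs, k)
        if missing:
            return k, missing
    return None


if __name__ == "__main__":
    k, missing = smallest_absent(["ACGT"])
    print(k, len(missing))
```

Applied to a single genome, `smallest_absent` would yield that genome's shortest absent oligomers, in the sense of the "nullomers" above. Applied to every available sequence at once, it would yield candidates in the sense of "primes". At genome scale, a bit array indexed by a 2-bit encoding of each k-mer would be a more realistic data structure than Python sets.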

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computational Biology
  • DNA / genetics
  • Databases, Genetic*
  • Genome, Human
  • Humans
  • Sequence Analysis / statistics & numerical data
  • Sequence Analysis, DNA / statistics & numerical data*
  • Software

Substances

  • DNA