A fast and flexible approach to oligonucleotide probe design for genomes and gene families

Bioinformatics. 2007 May 15;23(10):1195-202. doi: 10.1093/bioinformatics/btm114. Epub 2007 Mar 28.

Abstract

Motivation: With hundreds of completely sequenced microbial genomes available, and advancements in DNA microarray technology, the detection of genes in microbial communities consisting of hundreds of thousands of sequences may be possible. The existing strategies developed for DNA probe design, geared toward identifying specific sequences, are not suitable due to the lack of coverage, flexibility and efficiency necessary for applications in metagenomics.

Methods: ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, those sharing some common words are re-clustered into new families, then probes specific for the new families are generated.

Results: The program is very flexible in that it can be used for designing probes for detecting many genes families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays.

Availability: ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. A web server for ProDesign is available at http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bacteria / genetics
  • Computational Biology / methods*
  • Genome, Bacterial
  • Microarray Analysis
  • Multigene Family*
  • Oligonucleotide Array Sequence Analysis
  • Oligonucleotide Probes / genetics*
  • Software

Substances

  • Oligonucleotide Probes