Statistical mutation calling from sequenced overlapping DNA pools in TILLING experiments

BMC Bioinformatics. 2011 Jul 14;12:287. doi: 10.1186/1471-2105-12-287.

Abstract

Background: TILLING (Targeting Induced Local Lesions IN Genomes) is an efficient reverse genetics approach for detecting induced mutations in pools of individuals. Combined with the high throughput of next-generation sequencing technologies and the resolving power of overlapping pool designs, TILLING provides an efficient and economical platform for functional genomics across thousands of organisms.

Results: We propose a probabilistic method for calling TILLING-induced mutations, and their carriers, from high-throughput sequencing data of overlapping population pools, where each individual occurs in two pools. We assign a probability score to each sequence position by applying Bayes' Theorem to a simplified binomial model of sequencing error and expected mutations, taking into account the coverage level. We test the performance of our method on variable-quality, high-throughput sequences from wheat and rice mutagenized populations.
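The scoring idea described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prior, the error rate, the pool size, and the row/column grid layout used for carrier lookup are all assumptions made for illustration. The sketch compares two binomial models for the number of non-reference reads at one position (sequencing error only versus one heterozygous mutant diluted in the pool) and combines them with Bayes' Theorem.

```python
from math import exp, lgamma, log, log1p


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of observing k successes (0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log1p(-p)


def mutation_posterior(alt_reads, coverage, pool_size=8,
                       error_rate=0.005, prior=1e-4):
    """Posterior probability that a pool carries a mutation at one position.

    All defaults are hypothetical values, not taken from the paper.
    Assumes diploid individuals and a single heterozygous carrier, so the
    mutant allele fraction in the pool is 1 / (2 * pool_size).
    """
    allele_fraction = 1.0 / (2 * pool_size)
    # A read shows the mutant base if it comes from the mutant allele and is
    # read correctly, or comes from a reference allele and is misread.
    p_mut = allele_fraction * (1 - error_rate) + (1 - allele_fraction) * error_rate

    log_m = log(prior) + log_binom_pmf(alt_reads, coverage, p_mut)
    log_e = log(1 - prior) + log_binom_pmf(alt_reads, coverage, error_rate)

    # Normalise in log space to avoid underflow at high coverage.
    top = max(log_m, log_e)
    num = exp(log_m - top)
    return num / (num + exp(log_e - top))


def candidate_carriers(row_pool_calls, column_pool_calls):
    """Intersect pool calls to find carriers in an assumed row/column grid.

    Each individual is assumed to sit in exactly one row pool and one column
    pool, so a mutation called in row r and column c points at individual (r, c).
    """
    return [(r, c) for r in sorted(row_pool_calls) for c in sorted(column_pool_calls)]


if __name__ == "__main__":
    print(mutation_posterior(alt_reads=27, coverage=400))
    print(mutation_posterior(alt_reads=2, coverage=400))
    print(candidate_carriers({1}, {3}))
```

Because coverage enters both likelihoods, the same count of non-reference reads produces a different score at different coverage levels, which is the sense in which such a score accounts for coverage.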

Conclusions: We show that our method effectively discovers mutations in large populations, with a sensitivity of 92.5% and a specificity of 99.8%. It also outperforms existing SNP detection methods in detecting real mutations, especially at higher levels of coverage variability across sequenced pools and in lower-quality short-read sequence data. The implementation of our method is available from: http://www.cs.ucdavis.edu/filkov/CAMBa/.
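For readers less familiar with the two reported metrics, the following sketch shows how sensitivity and specificity are conventionally computed from confusion counts. The counts in the usage lines are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real mutations that were called (true positive rate)."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-mutant positions correctly left uncalled (true negative rate)."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    print(sensitivity(true_positives=9, false_negatives=1))
    print(specificity(true_negatives=95, false_positives=5))
```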

Publication types

  • Evaluation Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bayes Theorem
  • DNA, Plant / genetics
  • Genome, Plant
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing
  • Mutagenesis*
  • Mutation
  • Oryza / genetics*
  • Sensitivity and Specificity
  • Sequence Analysis, DNA
  • Triticum / genetics

Substances

  • DNA, Plant