Accounting for bias from sequencing error in population genetic estimates

Mol Biol Evol. 2008 Jan;25(1):199-206. doi: 10.1093/molbev/msm239. Epub 2007 Nov 2.

Abstract

Sequencing error presents a significant challenge to population genetic analyses using low-coverage sequence in general and single-pass reads in particular. Bias in parameter estimates becomes severe when the level of polymorphism (signal) is low relative to the amount of error (noise). Choosing an arbitrary quality score cutoff yields biased estimates, particularly with newer, non-Sanger sequencing technologies that have different quality score distributions. We propose a rule of thumb to judge when a given threshold will lead to significant bias and suggest alternative approaches that reduce bias.
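
The signal-to-noise point can be illustrated with a back-of-the-envelope calculation. The sketch below is not the rule of thumb proposed in the paper. It rests on textbook assumptions that the abstract does not state: a per-base error rate derived from a Phred quality score (Q = -10 log10 p), every sequencing error counted as a new segregating site, and Watterson's estimator of per-site theta under the standard neutral model. All function names are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score Q to a per-base error probability."""
    return 10 ** (-q / 10)


def harmonic(n):
    """Watterson's constant a_n = sum_{i=1}^{n-1} 1/i for n sampled sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_bias(n, error_rate):
    """Approximate upward bias in per-site Watterson's theta.

    Assumption: each of the n reads covering a site carries an error with
    probability error_rate, and every error appears as a segregating site.
    The expected excess of segregating sites per base is then about
    n * error_rate, and dividing by a_n puts it on the theta scale.
    """
    return n * error_rate / harmonic(n)


def relative_bias(theta, n, q_cutoff):
    """Bias relative to the true per-site theta.

    Assumption: every retained base has the error probability implied by the
    cutoff itself, so this is a pessimistic figure for bases that pass.
    """
    return watterson_bias(n, phred_to_error(q_cutoff)) / theta


if __name__ == "__main__":
    # Illustrative values only: theta = 0.001 per site, 10 sequences.
    for q in (20, 30, 40):
        print(q, relative_bias(theta=0.001, n=10, q_cutoff=q))
```

Under these assumptions, a cutoff of Q30 with 10 sequences and a true theta of 0.001 per site gives a bias about 3.5 times the true value, which shows why a cutoff that looks conservative can still swamp the signal when polymorphism is low.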

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genetics, Population*
  • Polymorphism, Genetic*
  • Reproducibility of Results
  • Selection Bias
  • Sequence Analysis, DNA*