Fast and flexible simulation of DNA sequence data

Genome Res. 2009 Jan;19(1):136-42. doi: 10.1101/gr.083634.108. Epub 2008 Nov 24.

Abstract

Simulation of genomic sequences under the coalescent with recombination has conventionally been impractical for regions beyond tens of megabases. This work presents an algorithm, implemented as the program MaCS (Markovian Coalescent Simulator), that can efficiently simulate haplotypes under an arbitrary model of population history. We present several metrics comparing the performance of MaCS with other available simulation programs. Practical usage of MaCS is demonstrated by comparing measures of linkage disequilibrium in simulated output with those in real genotype data from populations considered to be structured.
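
To illustrate what a measure of linkage disequilibrium looks like in practice, here is a minimal Python sketch that computes two standard pairwise statistics, D and r^2, for a pair of biallelic sites. It is not taken from MaCS or from the paper: the function name ld_stats, the 0/1 allele coding, and the toy haplotypes are assumptions made for illustration, and the abstract does not state which LD measures the authors used.

```python
from typing import Sequence, Tuple


def ld_stats(site_a: Sequence[int], site_b: Sequence[int]) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic sites.

    Each sequence holds one allele per haplotype, coded 0 or 1
    (hypothetical coding chosen for this example).
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")

    n = len(site_a)
    p_a = sum(site_a) / n  # frequency of allele 1 at site A
    p_b = sum(site_b) / n  # frequency of allele 1 at site B
    # frequency of haplotypes carrying allele 1 at both sites
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # D = p_AB - p_A * p_B
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, (d * d) / denom  # r^2 = D^2 / (p_A q_A p_B q_B)


# Toy haplotypes: the two sites are perfectly correlated, so r^2 is 1.
d, r2 = ld_stats([1, 1, 0, 0], [1, 1, 0, 0])
print(d, r2)
```

In a comparison of the kind the abstract describes, statistics like these would be computed over site pairs in both the simulated haplotypes and the real data, typically as a function of the physical distance between sites.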

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computer Simulation
  • DNA / genetics*
  • Genome-Wide Association Study / statistics & numerical data
  • Genomics / statistics & numerical data
  • Humans
  • Linkage Disequilibrium
  • Markov Chains
  • Polymorphism, Single Nucleotide
  • Recombination, Genetic
  • Sequence Analysis, DNA / statistics & numerical data*
  • Software

Substances

  • DNA