Whole genome sequencing

Methods Mol Biol. 2010;628:215-26. doi: 10.1007/978-1-60327-367-1_12.


Whole genome sequencing provides the most comprehensive collection of an individual's genetic variation. With the falling costs of sequencing technology, we envision a paradigm shift from microarray-based genotyping studies to whole genome sequencing. We review methodologies for whole genome sequencing. There are two approaches for assembling short shotgun sequence reads into longer contiguous genomic sequences. In the de novo assembly approach, sequence reads are compared to each other, and then overlapped to build longer contiguous sequences. The reference-based assembly approach involves mapping each read to a reference genome sequence. We discuss methods for identifying genetic variation (single nucleotide polymorphisms, small indels, and copy number variants) and building haplotypes from genome assemblies, and discuss potential pitfalls. We expect methodologies to evolve rapidly as sequencing technologies improve and more human genomes are sequenced.
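The two assembly approaches named in the abstract can be contrasted with a toy sketch. This is an illustration only, not a method from the chapter: the function names (`overlap`, `greedy_assemble`, `map_reads`), the greedy merging strategy, the `min_len` threshold, and the exact-match lookup are all assumptions chosen for brevity. Real assemblers and read mappers must handle sequencing errors, repeats, and both strands.

```python
from typing import Dict, List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Toy de novo assembly: repeatedly merge the pair with the longest overlap.

    Assumes error-free reads from a single strand (hypothetical simplification).
    """
    contigs = list(reads)
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        # Compare every ordered pair of sequences against each other.
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


def map_reads(reads: List[str], reference: str) -> Dict[str, int]:
    """Toy reference-based mapping: first exact-match position of each read.

    A position of -1 means the read does not occur in the reference.
    """
    return {read: reference.find(read) for read in reads}


if __name__ == "__main__":
    reference = "ATGCGTACGTTAGC"  # made-up sequence for illustration
    reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
    print(greedy_assemble(reads))
    print(map_reads(reads, reference))
```

In the de novo function the reads are only ever compared with one another, whereas the mapping function needs a reference sequence but treats each read independently. That difference is what makes reference-based assembly much cheaper for large read sets.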

Publication types

  • Review

MeSH terms

  • Genetic Variation
  • Genome, Human*
  • Genomics
  • Humans
  • Sequence Analysis, DNA / methods*