Sequence variants from whole genome sequencing a large group of Icelanders

Sci Data. 2015 Mar 25;2:150011. doi: 10.1038/sdata.2015.11. eCollection 2015.

Abstract

We have accumulated considerable data on the genetic makeup of the Icelandic population by sequencing the whole genomes of 2,636 Icelanders to a depth of at least 10X and by chip genotyping 101,584 more. The sequencing was done with Illumina technology. The median sequencing depth was 20X, and 909 individuals were sequenced to a depth of at least 30X. We found 20 million single nucleotide polymorphisms (SNPs) and 1.5 million insertions/deletions (indels) that passed stringent quality control. Almost all the common SNPs (derived allele frequency (DAF) over 2%) that we identified in Iceland have been observed in either dbSNP (build 137) or the Exome Sequencing Project (ESP), while only 60% of rare (DAF<0.5%) SNPs and 20% of rare indels in coding regions, the most heavily studied parts of the genome, have been observed in the public databases. Features of our variant data, such as the transition/transversion ratio and the length distribution of indels, are similar to published reports.
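
The two summary quantities named in the abstract, the DAF classes and the transition/transversion ratio, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the "intermediate" label for frequencies between the two thresholds, and the toy SNP list are assumptions made for illustration; only the 2% and 0.5% cut-offs come from the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """Return True for a purine<->purine or pyrimidine<->pyrimidine change.

    Assumes ref and alt are distinct single bases from A, C, G, T.
    """
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Transition/transversion ratio over a list of (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    # With no transversions the ratio is undefined; report infinity.
    return ts / tv if tv else float("inf")


def daf_class(daf):
    """Bin a derived allele frequency using the abstract's cut-offs.

    'common' is DAF over 2% and 'rare' is DAF under 0.5%; the
    'intermediate' label for everything in between is an assumption.
    """
    if daf > 0.02:
        return "common"
    if daf < 0.005:
        return "rare"
    return "intermediate"


# Toy data, invented for illustration only.
toy_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
toy_ratio = ts_tv_ratio(toy_snps)
toy_classes = [daf_class(d) for d in (0.30, 0.01, 0.001)]
```

On the toy list, two of the four changes are transitions (A>G and C>T) and two are transversions, so the ratio works out to 1.0; real whole-genome call sets are expected to show a clearly higher value.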

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Gene Frequency
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • INDEL Mutation
  • Iceland
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA*