A standard variation file format for human genome sequences

Genome Biol. 2010;11(8):R88. doi: 10.1186/gb-2010-11-8-r88. Epub 2010 Aug 26.


Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files that uses the Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and as an Amazon Elastic Block Store (EBS) snapshot for use in Amazon's EC2 cloud computing environment.
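
As a rough illustration of what a tab-delimited extension of GFF3 looks like in practice, the following Python sketch parses a single feature line. It assumes the standard nine-column GFF3 layout with semicolon-separated tag=value attributes in the last column. The function name parse_gvf_line and the sample line (including its coordinates and source) are hypothetical and are not taken from the 10Gen dataset. The attribute tags Variant_seq and Reference_seq are assumed here to follow GVF conventions.

```python
from urllib.parse import unquote

# The first eight GFF3 columns; the ninth column holds the attributes.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase")


def parse_gvf_line(line):
    """Parse one tab-delimited GFF3/GVF feature line into a dict.

    Hypothetical helper for illustration only; not part of any GVF tool.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields[:8]))
    # GFF3 coordinates are 1-based and inclusive.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Attributes are tag=value pairs separated by ';'.
    # A value may be a comma-separated list; GFF3 percent-encodes reserved characters.
    attributes = {}
    for pair in fields[8].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Made-up example line: a heterozygous single nucleotide variant.
sample = ("chr16\tsamtools\tSNV\t49291141\t49291141\t.\t+\t.\t"
          "ID=ID_1;Variant_seq=A,G;Reference_seq=G")
record = parse_gvf_line(sample)
print(record["type"], record["start"], record["attributes"]["Variant_seq"])
```

In this sketch the type column carries a Sequence Ontology term (here SNV), which is how the format ties each variant to the ontology. A full reader would also need to handle the "##" pragma lines and "#" comments that precede the feature lines.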

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Sequence
  • Databases, Nucleic Acid*
  • Genetic Variation
  • Genome, Human / genetics*
  • Humans
  • Information Storage and Retrieval*
  • Internet