11,670 whole-genome sequences representative of the Han Chinese population from the CONVERGE project

Sci Data. 2017 Feb 14;4:170011. doi: 10.1038/sdata.2017.11.

Abstract

The China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE) project on Major Depressive Disorder (MDD) sequenced 11,670 female Han Chinese at low coverage (1.7X), providing the first large-scale whole-genome sequencing resource representative of the largest ethnic group in the world. Samples were collected at 58 hospitals in 23 provinces across China. We called 22 million high-quality single nucleotide polymorphisms (SNPs) from the nuclear genome, representing the largest SNP call set from an East Asian population to date. We used these variants to impute genotypes across all samples, which allowed us to perform a successful genome-wide association study (GWAS) on MDD. The utility of these data can be extended to studies of genetic ancestry in the Han Chinese and, when integrated with data from other populations, to evolutionary genetics. Molecular phenotypes, such as copy number variations and structural variations, can be detected, quantified and analysed in similar ways.

Publication types

  • Dataset
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Retracted Publication

MeSH terms

  • Asian Continental Ancestry Group
  • China
  • DNA Copy Number Variations
  • Female
  • Genome, Human*
  • Genome-Wide Association Study
  • Humans