Whole-genome sequencing in French Canadians from Quebec

Hum Genet. 2016 Nov;135(11):1213-1221. doi: 10.1007/s00439-016-1702-6. Epub 2016 Jul 4.


Genome-wide association studies (GWAS) have had tremendous success in identifying common DNA sequence variants associated with complex human diseases and traits. However, because of their design, GWAS are poorly suited to characterizing the role of rare and low-frequency DNA variants in human phenotypic variation. Rarer genetic variation is geographically more restricted, supporting the need for local whole-genome sequencing (WGS) efforts to study these variants in specific populations. Here, we present the first large-scale low-pass WGS of the French-Canadian population. Specifically, we sequenced at ~5.6× coverage the whole genome of 1970 French Canadians recruited by the Montreal Heart Institute Biobank and identified 29 million bi-allelic variants (31 % novel), including 19 million variants with a minor allele frequency (MAF) <0.5 %. Genotypes from the WGS data are highly concordant with genotypes obtained by exome array on the same individuals (99.8 %), even when restricting this analysis to rare variants (MAF <0.5 %, 99.9 %) or heterozygous sites (98.9 %). To further validate our data set, we show that it can be used to replicate several genetic associations with myocardial infarction risk and blood lipid levels. Furthermore, we analyze the utility of our WGS data set to generate a French-Canadian-specific imputation reference panel and to infer population structure in the Province of Quebec. Our results illustrate the value of low-pass WGS to study the genetics of human diseases in the founder French-Canadian population.
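
The two quantities the abstract relies on, minor allele frequency and genotype concordance between WGS and array calls, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes genotypes are coded as the number of alternate alleles (0, 1 or 2) with None for a missing call, and it assumes that "heterozygous sites" are those called heterozygous by the array; the function names are hypothetical.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Return the MAF of one bi-allelic variant across individuals."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid individual carries two alleles.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def concordance(
    wgs: Sequence[Genotype],
    array: Sequence[Genotype],
    het_only: bool = False,
) -> float:
    """Fraction of genotype pairs that agree between WGS and array calls.

    Pairs with a missing call on either platform are skipped. With
    het_only=True, only pairs where the array call is heterozygous are
    compared (an assumption of this sketch, not stated in the abstract).
    """
    if len(wgs) != len(array):
        raise ValueError("genotype vectors must have the same length")
    compared = 0
    matched = 0
    for w, a in zip(wgs, array):
        if w is None or a is None:
            continue
        if het_only and a != 1:
            continue
        compared += 1
        if w == a:
            matched += 1
    if compared == 0:
        raise ValueError("no comparable genotype pairs")
    return matched / compared
```

Restricting the comparison to rare variants, as in the abstract, would amount to keeping only variants for which minor_allele_frequency returns a value below 0.005 before calling concordance.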

MeSH terms

  • Canada
  • Exome / genetics*
  • Gene Frequency
  • Genetic Diseases, Inborn / epidemiology
  • Genetic Diseases, Inborn / genetics*
  • Genetic Variation*
  • Genome, Human
  • Genotype
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Phenotype
  • Quebec