Analysis of population-specific pharmacogenomic variants using next-generation sequencing data

Sci Rep. 2017 Sep 4;7(1):8416. doi: 10.1038/s41598-017-08468-y.


Functional rare variants in drug-related genes are believed to be highly differentiated between ethnic or racial populations. However, knowledge of the population differentiation (PD) of rare single-nucleotide variants (SNVs) remains largely lacking: the highest fixation indices (Fst values) among the rare and common variants annotated to a specific gene have been used only marginally to understand PD at the gene level. In this study, we propose a new gene-based PD method for analyzing rare variants, PD of Rare and Common variants (PDRC), inspired by Generalized Cochran-Mantel-Haenszel (GCMH) statistics, to identify highly population-differentiated drug response-related genes ("pharmacogenes"). Through simulation studies, we show that PDRC adequately summarizes the PD of rare and common variants across a specific gene. We also applied the proposed method to a real whole-exome sequencing dataset, consisting of 10,000 datasets from the Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES) initiative and 3,000 datasets from the Genetics of Type 2 diabetes (Go-T2D) repository. Among the 48 genes annotated with Very Important Pharmacogenetic summaries (VIPgenes) in the PharmGKB database, our PD method successfully identified candidate genes with high PD, including ACE, CYP2B6, DPYD, F5, MTHFR, and SCN5A.
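
As an illustration of the gene-level baseline mentioned above (taking the highest per-variant Fst within a gene), the following is a minimal sketch, not the PDRC method itself. It assumes biallelic variants, per-population allele frequencies as input, and Wright's heterozygosity-based definition of Fst; the function names and the gene labels in the usage note are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def wright_fst(freqs, sizes=None):
    """Wright's Fst for one biallelic variant.

    freqs: allele frequency of the variant in each population.
    sizes: optional population weights (assumed equal if omitted).
    """
    if sizes is None:
        sizes = [1.0] * len(freqs)
    total = float(sum(sizes))
    weights = [s / total for s in sizes]
    # Pooled allele frequency and expected heterozygosity in the total population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic everywhere: no differentiation to measure
    # Weighted mean of the within-population expected heterozygosities.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return (h_total - h_within) / h_total


def max_fst_per_gene(variants):
    """Summarize each gene by the largest Fst among its annotated variants.

    variants: iterable of (gene, per-population allele frequencies) pairs.
    """
    best = defaultdict(float)
    for gene, freqs in variants:
        best[gene] = max(best[gene], wright_fst(freqs))
    return dict(best)
```

For example, `max_fst_per_gene([("GENE_A", [0.0, 1.0]), ("GENE_A", [0.3, 0.3]), ("GENE_B", [0.1, 0.5])])` summarizes two made-up genes from three made-up variants. A maximum of this kind uses only one variant per gene, which is the limitation a gene-based statistic aggregating rare and common variants is meant to address.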

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Databases, Genetic
  • Diabetes Mellitus, Type 2 / genetics
  • Gene Frequency
  • Genetic Predisposition to Disease / genetics
  • Genetics, Population / methods*
  • Genetics, Population / statistics & numerical data
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Genetic
  • Pharmacogenetics / methods*
  • Pharmacogenetics / statistics & numerical data
  • Pharmacogenomic Variants / genetics*
  • Polymorphism, Single Nucleotide