TS: a powerful truncated test to detect novel disease-associated genes using publicly available GWAS summary data

BMC Bioinformatics. 2020 May 4;21(1):172. doi: 10.1186/s12859-020-3511-0.

Abstract

Background: In the last decade, a large number of common variants underlying complex diseases have been identified through genome-wide association studies (GWASs). Summary data from these GWASs are freely and publicly available and are usually obtained through single-marker analysis. Gene-based analysis offers a useful alternative and complement to single-marker analysis, and results from gene-level association tests can be more readily integrated with downstream functional and pathogenic investigations. Most existing gene-based methods fall into two categories: burden tests and quadratic tests. Burden tests are usually powerful when the causal variants all have effects in the same direction, but they may lose statistical power when the causal variants have effects in different directions. Quadratic tests are not affected by the directions of effects, but they can be less powerful because of issues such as a large number of degrees of freedom. These drawbacks of existing gene-based methods motivated us to develop a new, powerful method to identify disease-associated genes using existing GWAS summary data.
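The contrast between burden and quadratic tests can be illustrated with a minimal sketch. It is not the TS method and not the authors' code. It assumes a hypothetical gene with four uncorrelated variants (identity LD matrix), equal burden weights, and made-up z-scores, and it uses the textbook forms of the two statistics. The function names are illustrative only.

```python
import math

def burden_stat(z):
    """Equal-weight burden statistic, assuming uncorrelated variants.

    Under the null it follows a chi-square distribution with 1 df.
    """
    return sum(z) ** 2 / len(z)

def quadratic_stat(z):
    """Sum-of-squares (quadratic) statistic, assuming uncorrelated variants.

    Under the null it follows a chi-square distribution with len(z) df.
    """
    return sum(v * v for v in z)

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square with an even number of df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Hypothetical z-scores: same-direction effects vs. mixed-direction effects.
z_same = [2.0, 2.0, 2.0, 2.0]
z_mixed = [2.0, -2.0, 2.0, -2.0]

for label, z in (("same direction", z_same), ("mixed direction", z_mixed)):
    p_burden = chi2_sf_1df(burden_stat(z))
    p_quad = chi2_sf_even_df(quadratic_stat(z), len(z))
    print(f"{label}: burden p = {p_burden:.3g}, quadratic p = {p_quad:.3g}")
```

With same-direction z-scores the burden statistic gives the smaller p-value, because it spends a single degree of freedom. With mixed directions the signed sum cancels and the burden signal disappears, while the quadratic statistic is unchanged but pays for four degrees of freedom.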

Methods and results: In this paper, we propose a new truncated statistic method (TS), which uses a truncation approach to find the genes that truly contribute to the genetic association. Extensive simulation studies demonstrate that our proposed test outperforms other comparable tests. We applied TS and other comparable methods to schizophrenia GWAS data and to type 2 diabetes (T2D) GWAS meta-analysis summary data. TS identified more disease-associated genes than the comparable methods. Many of the significant genes identified by TS may have important mechanisms relevant to the associated traits. TS is implemented in a C program, also named TS, which is freely and publicly available online.

Conclusions: In the simulations and real-data applications, the proposed truncated statistic outperformed the comparable existing methods. It can be employed to detect novel trait-associated genes using GWAS summary data.

Keywords: Burden tests; Genome-wide association studies (GWAS); Quadratic test methods; Truncated statistic method.

MeSH terms

  • Diabetes Mellitus, Type 2 / genetics*
  • Genome-Wide Association Study
  • Humans
  • Models, Statistical
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Schizophrenia / genetics*