Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies
- PMID: 31883270
- PMCID: PMC8643141
- DOI: 10.1111/biom.13214
Abstract
Quantitative traits analyzed in Genome-Wide Association Studies (GWAS) are often nonnormally distributed. For such traits, association tests based on standard linear regression are subject to reduced power and inflated type I error in finite samples. Applying the rank-based inverse normal transformation (INT) to nonnormally distributed traits has become common practice in GWAS. However, the different variations on INT-based association testing have not been formally defined, and guidance is lacking on when to use which approach. In this paper, we formally define and systematically compare the direct (D-INT) and indirect (I-INT) INT-based association tests. We discuss their assumptions, underlying generative models, and connections. We demonstrate that the relative powers of D-INT and I-INT depend on the underlying data generating process. Since neither approach is uniformly most powerful, we combine them into an adaptive omnibus test (O-INT). O-INT is robust to model misspecification, protects the type I error, and is well powered against a wide range of nonnormally distributed traits. Extensive simulations were conducted to examine the finite sample operating characteristics of these tests. Our results demonstrate that, for nonnormally distributed traits, INT-based tests outperform the standard untransformed association test, both in terms of power and type I error rate control. We apply the proposed methods to GWAS of spirometry traits in the UK Biobank. O-INT has been implemented in the R package RNOmni, which is available on CRAN.
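The rank-based inverse normal transformation itself can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the form below is an assumption: each observation is replaced by the standard normal quantile of (rank - k) / (n - 2k + 1), with the commonly used Blom offset k = 3/8 and average ranks for ties. The function name `rank_int` is hypothetical, and this is not the RNOmni implementation.

```python
from statistics import NormalDist


def rank_int(values, k=0.375):
    """Rank-based inverse normal transformation (illustrative sketch).

    Assumes the offset form Phi^{-1}((rank - k) / (n - 2k + 1)),
    with k = 3/8 (Blom offset) as the default. Ties get average ranks.
    """
    n = len(values)
    # Indices of the observations, sorted by trait value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average of the 1-based ranks i+1, ..., j+1.
        avg_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - k) / (n - 2 * k + 1)) for r in ranks]


# A right-skewed toy trait: the extreme value 100.0 is pulled in
# to an ordinary normal quantile after transformation.
skewed = [1.0, 2.0, 4.0, 8.0, 100.0]
transformed = rank_int(skewed)
```

Because the transformation depends on the data only through their ranks, the transformed values have the same spread regardless of how extreme the original outliers are. In the sketch, it is applied to a raw trait vector; whether it is applied to the trait or to regression residuals is the distinction the abstract draws between the direct and indirect tests.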
Keywords: direct and indirect rank-based inverse normal transformation; nonnormality; omnibus test; quantitative traits; transformation; type I error rate.
© 2019 The International Biometric Society.
Grants and funding
- R35 CA197449/CA/NCI NIH HHS/United States
- F31 HL140822/NH/NIH HHS/United States
- U19 CA203654/CA/NCI NIH HHS/United States
- R01 HL113338/HL/NHLBI NIH HHS/United States
- R35 HL135818/HL/NHLBI NIH HHS/United States
- F31 HL140822/HL/NHLBI NIH HHS/United States
- MC_QA137853/MRC_/Medical Research Council/United Kingdom
- R35 HL135818/NH/NIH HHS/United States
- R35 CA197449/NH/NIH HHS/United States
- MC_PC_12028/MRC_/Medical Research Council/United Kingdom
- P01 CA134294/CA/NCI NIH HHS/United States
- MC_PC_17228/MRC_/Medical Research Council/United Kingdom
- U01 HG009088/HG/NHGRI NIH HHS/United States
