Variance-components tests for genetic association with multiple interval-censored outcomes

Stat Med. 2024 Apr 18. doi: 10.1002/sim.10081. Online ahead of print.

Abstract

Massive genetic compendiums such as the UK Biobank have become an invaluable resource for identifying genetic variants that are associated with complex diseases. Because collecting data at this scale is difficult, these compendiums commonly record interval-censored data, in which the time of an event is known only to fall between two observation times. One challenge in analyzing such data is the lack of available methodology for genetic association studies with interval-censored outcomes. Genetic effects are difficult to detect because they are often rare and weak, and time-to-event outcomes are frequently transformed to binary phenotypes to gain access to more powerful signal detection approaches. However, transforming the data to binary outcomes can result in the loss of valuable information. To alleviate these challenges, this work develops methodology to associate genetic variant sets with multiple interval-censored outcomes. Testing sets of variants, such as genes or pathways, is a common approach in genetic association settings to lower the multiple testing burden, aggregate small effects, and improve the interpretability of results. Instead of performing inference with only a single outcome, utilizing multiple outcomes can increase statistical power by aggregating information across multiple correlated phenotypes. Simulations show that the proposed strategy can offer substantial power gains over a single-outcome approach. We apply the proposed test to the investigation that motivated this study, a search for the genes that perturb risks of bone fractures and falls in the UK Biobank.
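
As an illustration of the information loss mentioned above, the following minimal sketch represents interval-censored observations and then dichotomizes them. It is not the paper's method: the class `IntervalCensored`, the function `to_binary`, the 10-year horizon, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalCensored:
    """Hypothetical container: the event time lies in (left, right]."""
    left: float   # last visit at which the event had not yet occurred
    right: float  # first visit at which the event had occurred;
                  # math.inf if the event was never observed


def to_binary(obs: IntervalCensored, horizon: float) -> int:
    """Dichotomize: 1 if the event is known to have occurred by `horizon`."""
    return 1 if obs.right <= horizon else 0


# Toy data (assumed, not from the study): times in years since enrollment.
subjects = [
    IntervalCensored(2.0, 4.0),        # early event, narrow interval
    IntervalCensored(6.0, 8.0),        # later event, narrow interval
    IntervalCensored(0.0, 9.0),        # event somewhere in a wide interval
    IntervalCensored(10.0, math.inf),  # no event seen by the last visit
]

binary = [to_binary(s, horizon=10.0) for s in subjects]
print(binary)
```

The first three subjects carry different timing information, yet all three map to the same binary phenotype, which is the kind of loss that an analysis working directly with the intervals is meant to avoid.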

Keywords: genome‐wide association studies; interval‐censored; multiple outcomes; set‐based inference; time‐to‐event.