Whole genome sequence analysis of low-density lipoprotein cholesterol across 246 K individuals

Genome Biol. 2025 Sep 9;26(1):273. doi: 10.1186/s13059-025-03698-0.

Abstract

Background: The contribution of rare genetic variation captured by whole genome sequencing datasets to human traits remains relatively unexplored. Meta-analysis of sequencing data integrates larger sample sizes from diverse cohorts, thereby increasing the likelihood of discovering novel insights into complex traits. In addition, emerging methods for genome-wide rare variant association testing improve power and interpretability.

Results: Here, we conduct the largest meta-analysis of whole genome sequencing for low-density lipoprotein cholesterol (LDL-C), a therapeutic target for coronary artery disease, analyzing data from 246 K participants and integrating 1.23 billion variants from the UK Biobank and the Trans-Omics for Precision Medicine (TOPMed) program. We identify numerous rare coding and non-coding gene associations with LDL-C, with replication across 86 K participants in All of Us. Our findings are based on single-variant analyses, rare coding and non-coding variant aggregation tests, and sliding window approaches. Through this comprehensive analysis, we identify 704 novel single-variant associations, 25 novel rare coding variant aggregates, 28 novel rare non-coding variant aggregates, and one novel sliding window aggregate.

Conclusions: This study provides a meta-analysis framework for large-scale whole genome sequence association analyses across diverse population groups, yielding novel rare non-coding variant associations.
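
Illustrative note: the abstract does not state which statistical model was used to combine cohorts. As a minimal sketch of how per-cohort single-variant results can be pooled, the example below uses fixed-effect inverse-variance weighting, a common choice that is assumed here and not confirmed by the source. The function name `inverse_variance_meta` and the input values are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-cohort effect estimates for one variant.

    Assumption: fixed-effect inverse-variance weighting. The abstract does
    not say which meta-analysis model the authors used.
    betas: per-cohort effect sizes; ses: matching standard errors.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical inputs for two cohorts (not values from the study).
beta, se, z, p = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
print(f"beta={beta:.4f} se={se:.4f} z={z:.3f} p={p:.3g}")
```

With equal standard errors the pooled effect is the simple mean of the cohort effects, and the pooled standard error shrinks by a factor of the square root of the number of cohorts, which is the source of the power gain from combining datasets.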

Publication types

  • Meta-Analysis

MeSH terms

  • Cholesterol, LDL* / blood
  • Cholesterol, LDL* / genetics
  • Genetic Variation
  • Genome, Human
  • Genome-Wide Association Study
  • Humans
  • Polymorphism, Single Nucleotide
  • Whole Genome Sequencing*

Substances

  • Cholesterol, LDL