The power of TOPMed imputation for the discovery of Latino-enriched rare variants associated with type 2 diabetes

Diabetologia. 2023 Jul;66(7):1273-1288. doi: 10.1007/s00125-023-05912-9. Epub 2023 May 6.


Aims/hypothesis: The Latino population has been systematically underrepresented in large-scale genetic analyses, and previous studies have relied on the imputation of ungenotyped variants based on the 1000 Genomes (1000G) imputation panel, which results in suboptimal capture of low-frequency or Latino-enriched variants. The National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) programme released the largest multi-ancestry genotype reference panel, which represents a unique opportunity to analyse rare genetic variation in the Latino population. We hypothesise that a more comprehensive analysis of low-frequency and rare variation using the TOPMed panel would improve our knowledge of the genetics of type 2 diabetes in the Latino population.

Methods: We evaluated the TOPMed imputation performance using genotyping array and whole-exome sequence data in six Latino cohorts. To evaluate the ability of TOPMed imputation to increase the number of identified loci, we performed a Latino type 2 diabetes genome-wide association study (GWAS) meta-analysis in 8150 individuals with type 2 diabetes and 10,735 control individuals and replicated the results in six additional cohorts including whole-genome sequence data from the All of Us cohort.
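The abstract does not state which meta-analysis model was used. As an illustration only, the following minimal sketch shows one common approach to combining per-cohort GWAS results for a single variant: an inverse-variance-weighted fixed-effect meta-analysis of log odds ratios. The function name `fixed_effect_meta` and the per-cohort effect sizes and standard errors are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    betas: per-cohort log odds ratios for one variant
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]        # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p = 2.0 * NormalDist().cdf(-abs(z))            # two-sided normal p value
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort estimates (NOT values from the paper)
cohort_betas = [0.30, 0.35, 0.28]
cohort_ses = [0.10, 0.12, 0.09]

pooled_beta, pooled_se, z, p = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled OR = {math.exp(pooled_beta):.2f}, SE = {pooled_se:.3f}, "
      f"z = {z:.2f}, p = {p:.2e}")
```

Cohorts with smaller standard errors (typically the larger cohorts) contribute more to the pooled estimate, which is why the pooled standard error is smaller than that of any single cohort.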

Results: Compared with imputation with 1000G, the TOPMed panel improved the identification of rare and low-frequency variants. We identified 26 genome-wide significant signals including a novel variant (minor allele frequency 1.7%; OR 1.37, p=3.4 × 10⁻⁹). A Latino-tailored polygenic score constructed from our data and GWAS data from East Asian and European populations improved the prediction accuracy in a Latino target dataset, explaining up to 7.6% of the type 2 diabetes risk variance.
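To make the reported effect size concrete, the sketch below back-calculates an approximate z-score and standard error from the reported OR (1.37) and p value (3.4 × 10⁻⁹). It assumes the p value comes from a two-sided Wald test on the log odds ratio, which the abstract does not confirm, so the derived standard error and confidence interval are rough approximations rather than results reported by the authors. The function name `wald_from_or_and_p` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_or_and_p(odds_ratio, p_value):
    """Recover log(OR), |z| and an approximate SE from an OR and a p value.

    Assumption: p_value is a two-sided Wald-test p value for log(OR).
    """
    log_or = math.log(odds_ratio)
    z_score = -NormalDist().inv_cdf(p_value / 2.0)  # |z| for a two-sided p
    approx_se = abs(log_or) / z_score
    return log_or, z_score, approx_se


# OR and p value as reported in the abstract for the novel variant
log_or, z_score, approx_se = wald_from_or_and_p(1.37, 3.4e-9)

# Approximate 95% interval implied by the Wald assumption (not a reported CI)
ci_low = math.exp(log_or - 1.96 * approx_se)
ci_high = math.exp(log_or + 1.96 * approx_se)

print(f"log(OR) = {log_or:.3f}, |z| = {z_score:.2f}, SE = {approx_se:.3f}")
print(f"implied 95% interval for OR: {ci_low:.2f} to {ci_high:.2f}")
```

For reference, the conventional genome-wide significance threshold of p < 5 × 10⁻⁸ corresponds to |z| of about 5.45, so the reported association sits comfortably beyond it.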

Conclusions/interpretation: Our results demonstrate the utility of TOPMed imputation for identifying low-frequency variants in understudied populations, leading to the discovery of novel disease associations and the improvement of polygenic scores.

Data availability: Full summary statistics are available through the Common Metabolic Diseases Knowledge Portal and through the GWAS catalog (accession ID: GCST90255648). Polygenic score (PS) weights for each ancestry are available via the PGS catalog (publication ID: PGP000445; score IDs: PGS003443, PGS003444 and PGS003445).

Keywords: GWAS meta-analysis; Latino population; Polygenic score; TOPMed imputation; Type 2 diabetes.

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Diabetes Mellitus, Type 2* / genetics
  • Genome-Wide Association Study
  • Genotype
  • Hispanic or Latino / genetics
  • Humans
  • Polymorphism, Single Nucleotide / genetics
  • Population Health*
  • Precision Medicine

Grants and funding