Neuroimage. 2015 Sep;118:613-27.
doi: 10.1016/j.neuroimage.2015.05.043. Epub 2015 May 27.

FVGWAS: Fast Voxelwise Genome Wide Association Analysis of Large-Scale Imaging Genetic Data

Free PMC article

Meiyan Huang et al. Neuroimage. 2015.

Abstract

Large-scale imaging genetic studies are increasingly being conducted to collect rich sets of imaging, genetic, and clinical data in order to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. Several major big-data challenges arise from testing genome-wide associations (N_C > 12 million known variants) with signals at millions of locations (N_V ~ 10^6) in the brain from thousands of subjects (n ~ 10^3). The aim of this paper is to develop a Fast Voxelwise Genome Wide Association analysiS (FVGWAS) framework to efficiently carry out whole-genome analyses of whole-brain data. FVGWAS consists of three components: a heteroscedastic linear model, a global sure independence screening (GSIS) procedure, and a detection procedure based on wild bootstrap methods. Specifically, for standard linear association, the computational complexity is O(n N_V N_C) for the voxelwise genome wide association analysis (VGWAS) method, compared with O((N_C + N_V) n^2) for FVGWAS. Simulation studies show that FVGWAS is an efficient method of searching for sparse signals in an extremely large search space while controlling the family-wise error rate. Finally, we have successfully applied FVGWAS to a large-scale imaging genetic data analysis of ADNI data with 708 subjects, 193,275 voxels in RAVENS maps, and 501,584 SNPs; the total processing time was 203,645 s on a single CPU. Our FVGWAS may be a valuable statistical toolbox for large-scale imaging genetic analysis as the field rapidly advances with ultra-high-resolution imaging and whole-genome sequencing.

Keywords: Computational complexity; Family-wise error rate; Heteroscedastic linear model; Voxelwise genome wide association; Wild bootstrap.
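To give a feel for the two complexity orders quoted in the abstract, the following minimal sketch plugs the ADNI dimensions (708 subjects, 193,275 voxels, 501,584 SNPs) into O(n N_V N_C) and O((N_C + N_V) n^2). The function names are hypothetical, and the counts ignore constant factors and lower-order terms, so the ratio is only an order-of-magnitude indication, not a predicted speedup.

```python
def vgwas_ops(n: int, n_v: int, n_c: int) -> int:
    """Leading-order operation count for standard VGWAS: n * N_V * N_C."""
    return n * n_v * n_c


def fvgwas_ops(n: int, n_v: int, n_c: int) -> int:
    """Leading-order operation count quoted for FVGWAS: (N_C + N_V) * n^2."""
    return (n_c + n_v) * n ** 2


# ADNI dimensions reported in the abstract.
n, n_v, n_c = 708, 193_275, 501_584

standard = vgwas_ops(n, n_v, n_c)
fast = fvgwas_ops(n, n_v, n_c)
ratio = standard / fast  # equals N_V * N_C / ((N_C + N_V) * n)

print(f"VGWAS  ~ {standard:.3e} operations")
print(f"FVGWAS ~ {fast:.3e} operations")
print(f"ratio  ~ {ratio:.1f}")
```

The ratio simplifies to N_V N_C / ((N_C + N_V) n), so under these orders the advantage grows with the number of voxels and SNPs and shrinks as the sample size increases.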

Figures

Fig. 1
Schematic overview of FVGWAS
Fig. 2
Simulation settings: the dark, gray, and white regions in each panel represent, respectively, the background, the brain region, and the affected ROI associated with the causal SNPs. From left to right, the sizes of the affected ROI are set to 5 × 5, 10 × 10, and 20 × 20.
Fig. 3
Simulation results for the association between SNPs and voxels: the first row contains ROC curves with varying γ values (corresponding to the causal SNPs' effect magnitude) and varying numbers of top N0 SNPs included in the selection procedure. Parameters r, σ^2, and n are set to 10, 1, and 1000, respectively. The second row contains ROC curves with different ROIs. Parameters γ, σ^2, and n are set to 0.01, 1, and 1000, respectively. The third row contains ROC curves with varying σ. Parameters γ, r, and n are set to 0.01, 10, and 1000, respectively. The fourth row contains ROC curves with varying n. Parameters γ, σ^2, and r are set to 0.01, 1, and 10, respectively.
Fig. 4
Simulation results for the association between SNPs and clusters: (a) the size, in number of pixels, of false positive clusters for each causal SNP; (b) the number of false positive clusters for each causal SNP; and (c) the Dice overlap ratio (DOR) for each causal SNP. Parameters γ, σ^2, n, and r are set to 0.01, 1, 1000, and 10, respectively.
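The overlap measure in panel (c) can be illustrated with a small sketch. It assumes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), between a detected cluster A and the true ROI B; this page does not spell out the paper's exact definition, so treat the formula, the function name, and the toy pixel sets as assumptions.

```python
def dice_overlap(detected, truth):
    """Standard Dice coefficient between two collections of pixel coordinates."""
    detected, truth = set(detected), set(truth)
    if not detected and not truth:
        return 1.0  # convention chosen here: two empty sets overlap perfectly
    return 2 * len(detected & truth) / (len(detected) + len(truth))


# Toy example: a 10 x 10 true ROI and a detected cluster shifted down by two rows.
truth = {(row, col) for row in range(10) for col in range(10)}
detected = {(row, col) for row in range(2, 12) for col in range(10)}

score = dice_overlap(detected, truth)
print(f"Dice overlap ratio: {score:.2f}")
```

A value of 1 means the detected cluster coincides with the true ROI, and 0 means the two do not overlap at all.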
Fig. 5
Simulation results for comparisons between FVGWAS and Matrix eQTL in identifying significant voxel-locus pairs: ROC curves of the proposed method with N0 = 100, 500, and 1,000, and of the Matrix eQTL method, at γ = 0.005 and γ = 0.01. Parameters σ^2, n, and r are set to 1, 1000, and 10, respectively.
Fig. 6
ADNI ROI volume GWAS: (a) Manhattan plot; (b) QQ plot; and the numbers of significant SNP-ROI pairs based on the corrected p-values of W(c, v) at the 0.5 significance level corresponding to the top (c) N0 = 1,000 and (d) N0 = 2,000 SNPs.
Fig. 7
ADNI whole-brain GWAS: (a) Manhattan plot; (b) QQ plot; the numbers of significant voxel-locus pairs based on the raw p-values of W(c, v) at the 10^-5 significance level corresponding to the top (c) N0 = 1,000 and (d) N0 = 2,000 SNPs; the numbers of significant voxel-locus pairs based on the corrected p-values of W(c, v) at the (e) 0.5 or (f) 0.8 significance level corresponding to the top N0 = 1,000 SNPs.
Fig. 8
ADNI whole-brain GWAS: (a) the density plot of W(c) and its χ^2 approximation; (b) the density plot of W_{V,C~0} and its χ^2 approximation; the density plots of W_{V,C~0} for (c) N0 = 1,000 and (d) N0 = 2,000.
Fig. 9
ADNI whole-brain GWAS: density plots of N(C~0, α_I = 0.005) for (a) N0 = 1,000 and (b) N0 = 2,000; the numbers of significant voxel-locus pairs based on the corrected p-values of W(c, v) at the 0.5 significance level corresponding to the top (c) N0 = 1,000 and (d) N0 = 2,000 SNPs.
Fig. 10
ADNI whole-brain GWAS: selected slices of -log10(p) for significant clusters corresponding to the SNP rs2480271.
Fig. 11
ADNI FVGWAS versus VGWAS: (a) raw -log10(p) values of all selected voxel-locus pairs obtained with our method and with the standard t test; (b) the number of significant voxel-locus pairs at the 0.5 significance level; (c) maximum cluster sizes of all selected SNPs obtained from the Matrix eQTL results; (d) the p-values of the maximum clusters corresponding to all selected SNPs.
