Objective: Genome-wide association studies have to date identified 159 significant and suggestive loci for coronary artery disease (CAD). We now report comprehensive bioinformatics analyses of sequence variation in these loci to predict candidate causal genes.
Approach and results: All annotated genes in the loci were evaluated with respect to protein-coding single-nucleotide polymorphisms and gene expression parameters. The latter included expression quantitative trait loci, tissue specificity, and miRNA binding. High-priority candidate genes were further identified based on literature searches and our experimental data. We conclude that the great majority of causal variations affecting CAD risk occur in noncoding regions, with 41% affecting gene expression robustly versus 6% leading to amino acid changes. Many of these genes differed from the traditionally annotated genes, whose annotation was usually based on proximity to the lead single-nucleotide polymorphism. Indeed, we obtained evidence that genetic variants at CAD loci affect 98 genes that had not been linked to CAD previously.
Conclusions: Our results substantially revise the list of likely candidates for CAD and suggest that genome-wide association study efforts in other diseases may benefit from similar bioinformatics analyses.
Keywords: coronary artery disease; genome-wide association study; microRNAs; single-nucleotide polymorphism; systems biology.
© 2015 American Heart Association, Inc.