Nucleic Acids Res. 2008 Mar;36(4):e26.
doi: 10.1093/nar/gkn007. Epub 2008 Feb 7.

Commonality of Functional Annotation: A Method for Prioritization of Candidate Genes From Genome-Wide Linkage Studies

Free PMC article

Daniel Shriner et al. Nucleic Acids Res.


Linkage studies of complex traits frequently yield multiple linkage regions covering hundreds of genes. Testing each candidate gene from every region is prohibitively expensive, and computational methods that simplify this process would benefit genetic research. We present a new method based on commonality of functional annotation (CFA) that aids dissection of complex traits for which multiple causal genes act in a single pathway or process. CFA tests individual Gene Ontology (GO) terms for enrichment among candidate gene pools, adjusts for multiple hypothesis testing using an estimate of the number of independent tests based on correlation of GO terms, and then scores and ranks genes annotated with significantly enriched terms based on the number of quantitative trait loci (QTL) regions in which genes bearing those annotations appear. We evaluate CFA using simulated linkage data and show that CFA has good power despite being conservative. We apply CFA to published linkage studies investigating age-of-onset of Alzheimer's disease and body mass index and obtain previously known and new candidate genes. CFA provides a new tool for studies in which causal genes are expected to participate in a common pathway or process and can easily be extended to utilize annotation schemes in addition to the GO.
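The enrichment step can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test (the test named in the Figure 1 caption), computed as the upper tail of the hypergeometric distribution. The function name, argument names, and the choice of background gene set are assumptions made for illustration and are not taken from the article.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test P-value for over-representation.

    k: candidate genes annotated with the GO term
    n: total candidate genes (study-wide gene list)
    K: background genes annotated with the GO term
    N: total background genes (assumed here to be all annotated genes)

    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of seeing k or more annotated genes by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 2 of 4 candidates carry a term held by 3 of 10 genes.
p_value = fisher_enrichment_p(k=2, n=4, K=3, N=10)
```

With these toy counts the tail probability is 70/210, that is, one third. Exact integer arithmetic is used so that the sketch needs only the standard library; a statistics package would be the more practical choice for genome-scale counts.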


Figure 1.
Workflow diagram. The flow of data from each step is schematically depicted. The genome-wide correlation matrix is computed for all GO terms and saved for subsequent analysis with different data sets. Genes overlapping with QTL regions and their associated GO annotations are obtained from the UCSC Genome Informatics DAS/1 server and the Entrez Gene database, respectively. For each data set, a study-specific correlation matrix is blocked from the genome-wide correlation matrix. Genes from each QTL are combined to form a study-wide gene list and each term is then tested for over-representation using Fisher's exact test. The effective number of independent tests is estimated using Velicer's minimum average partial test, and P-values obtained are adjusted upward based on the effective number of independent tests. Genes are then scored using weights computed from principal components of the study-specific correlation matrix and the number of QTL containing genes with enriched annotations. Rectangles indicate products of data processing and cylinders indicate databases.
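The last two steps of the workflow can be sketched as follows, under stated assumptions. The caption does not give the adjustment formula, so a Šidák-style correction with the effective number of tests is used here as a stand-in. Velicer's minimum average partial test is not implemented; the effective number of tests is taken as an input. The scoring is an unweighted simplification that omits the principal-component weights, and it assumes each gene lies in exactly one QTL region. All function and variable names are hypothetical.

```python
def sidak_adjust(p, m_eff):
    """Adjust a raw P-value upward for m_eff effective independent tests.

    Assumption: Sidak-style correction; the article's exact formula may differ.
    """
    return 1.0 - (1.0 - p) ** m_eff


def score_genes(gene_terms, gene_qtl, enriched_terms):
    """Rank genes by the number of QTL regions sharing their enriched terms.

    gene_terms: dict mapping gene -> set of GO terms
    gene_qtl: dict mapping gene -> QTL region identifier (one per gene)
    enriched_terms: set of terms that remained significant after adjustment
    """
    # For each enriched term, collect the distinct QTL regions that
    # contain at least one gene annotated with that term.
    term_qtls = {term: set() for term in enriched_terms}
    for gene, terms in gene_terms.items():
        for term in terms:
            if term in term_qtls:
                term_qtls[term].add(gene_qtl[gene])

    # Unweighted score: sum of QTL counts over a gene's enriched terms.
    scores = {}
    for gene, terms in gene_terms.items():
        hits = [term for term in terms if term in term_qtls]
        if hits:
            scores[gene] = sum(len(term_qtls[term]) for term in hits)

    # Highest score first; ties broken alphabetically by gene name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: four genes in three QTL regions.
gene_terms = {"g1": {"T1"}, "g2": {"T1", "T2"}, "g3": {"T2"}, "g4": {"T3"}}
gene_qtl = {"g1": "q1", "g2": "q2", "g3": "q2", "g4": "q3"}
ranking = score_genes(gene_terms, gene_qtl, enriched_terms={"T1", "T2"})
```

In this toy example the gene annotated with both enriched terms ranks first, and the gene whose only term is not enriched receives no score. Replacing the plain sum with a weighted sum is where principal-component weights would enter.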
Figure 2.
Distributions of positive correlation coefficients. (A) Correlation coefficients >0.2 for term pairs in which both terms belong to the same sub-ontology. (B) Correlation coefficients >0.2 for term pairs in which the terms belong to different sub-ontologies.
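As an illustration of what a correlation coefficient between two GO terms could look like, the sketch below computes Pearson's correlation between binary annotation indicators (the phi coefficient) from the sets of genes carrying each term. The correlation measure is not specified in this excerpt, so the choice of phi is an assumption, and the function and argument names are hypothetical.

```python
from math import sqrt


def term_correlation(genes_a, genes_b, n_genes):
    """Phi coefficient between two terms' binary gene-annotation vectors.

    genes_a, genes_b: sets of genes annotated with each term
    n_genes: total number of genes in the background
    """
    n_a, n_b = len(genes_a), len(genes_b)
    n_ab = len(genes_a & genes_b)  # genes annotated with both terms
    denom = sqrt(n_a * n_b * (n_genes - n_a) * (n_genes - n_b))
    if denom == 0:
        # A term annotating no genes or every gene has zero variance.
        raise ValueError("correlation undefined for a constant indicator")
    return (n_genes * n_ab - n_a * n_b) / denom


# Hypothetical gene identifiers in a background of four genes.
identical = term_correlation({1, 2}, {1, 2}, 4)
disjoint = term_correlation({1, 2}, {3, 4}, 4)
```

Two terms annotating exactly the same genes give a coefficient of 1, and two terms that split the background between them give -1. Term pairs exceeding a threshold such as the 0.2 used in the figure could then be tabulated by sub-ontology.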


