PLoS Genet. 2015 Feb 13;11(2):e1004967.
doi: 10.1371/journal.pgen.1004967. eCollection 2015 Feb.

Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks

Nolan Priedigkeit et al. PLoS Genet.

Abstract

Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases, including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then applied this strategy to a melanoma-associated region on chromosome 1 and identified MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. ERC values between complement deficiency genes.
A) Complement genes C1S and CFI show variation in their evolutionary rates between branches of the mammalian phylogeny. Branches are color-coded according to rate. (Red is for rapid evolution, blue for slow, and intermediate shades for rates in between.) Tree topology and distances between species are the same for each gene. B) The same evolutionary rates for C1S and CFI are plotted against each other. Their correlation is apparent here in the best-fit line and correlation coefficient of 0.806. C) This matrix contains all pairwise ERC values between the OMIM genes for complement deficiency. Cells are shaded red according to the intensity of their departure from the null expectation. Blue arrows indicate the genes C1S and CFI. It is notable that most values are positive, whereas a random collection of genes would contain equal proportions of positive and negative values. There are also many clusters of functionally related complement proteins that contain very strong signals of ERC. The C1-related proteins in the upper left corner are a prime example of such an ERC hotspot.
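The idea in panels B and C can be illustrated with a minimal Python sketch: treat each gene as a vector of per-branch evolutionary rates, correlate two such vectors, and repeat for every gene pair to fill a matrix. This is only an illustration under stated assumptions. It uses a plain Pearson correlation, the gene names and rate values are hypothetical, and the published ERC procedure may normalize rates differently before correlating them.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rate vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def erc_matrix(rates):
    """Return {(gene_a, gene_b): correlation} for every unordered gene pair.

    `rates` maps a gene name to its list of per-branch rates; every list
    must follow the same branch order (same tree topology for each gene).
    """
    genes = sorted(rates)
    return {(g, h): pearson(rates[g], rates[h]) for g, h in combinations(genes, 2)}


# Hypothetical per-branch rates for three made-up genes (not data from the paper).
rates = {
    "GENE_A": [0.2, 0.5, 0.9, 1.4, 0.7],
    "GENE_B": [0.3, 0.6, 1.0, 1.5, 0.8],  # tracks GENE_A branch by branch
    "GENE_C": [1.4, 0.9, 0.5, 0.2, 0.7],  # fast where GENE_A is slow
}

matrix = erc_matrix(rates)
for pair, value in matrix.items():
    print(pair, round(value, 3))
```

In this toy input, GENE_A and GENE_B rise and fall together across branches and so correlate strongly, while GENE_C runs in the opposite direction and yields a negative value, the kind of cell that would not be shaded in a matrix like panel C.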
Figure 2. Disease gene groupings P-value distribution.
P-values represent the significance of elevated mean ERC within a particular disease. There is a notable excess of low p-values, indicating a large number of diseases with an ERC signature between their genes. False discovery rate analyses show that approximately 55% of disease states interrogated have significantly elevated ERC values.
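One way to obtain a p-value for "elevated mean ERC" within a disease is a permutation test: compare the mean pairwise ERC of the disease genes with the means of randomly drawn gene sets of the same size. The sketch below assumes that approach; the caption does not specify the null model the authors used, and the function names, gene labels, and ERC values here are all hypothetical.

```python
import random
from itertools import combinations


def mean_pairwise(genes, erc):
    """Mean ERC over all unordered pairs in `genes`.

    `erc` is keyed by (gene_a, gene_b) tuples with gene_a < gene_b.
    """
    pairs = list(combinations(sorted(genes), 2))
    return sum(erc[p] for p in pairs) / len(pairs)


def permutation_p_value(disease_genes, all_genes, erc, n_perm=1000, seed=0):
    """Fraction of random same-size gene sets whose mean ERC is at least
    as high as the disease set's mean (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = mean_pairwise(disease_genes, erc)
    k = len(disease_genes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, k)
        if mean_pairwise(sample, erc) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy background of ten made-up genes; only g0, g1, g2 covary with each other.
all_genes = [f"g{i}" for i in range(10)]
disease_genes = ["g0", "g1", "g2"]
erc = {
    pair: (0.8 if set(pair) <= set(disease_genes) else 0.0)
    for pair in combinations(all_genes, 2)
}

p = permutation_p_value(disease_genes, all_genes, erc)
print(p)
```

A histogram of such p-values across many diseases is the kind of distribution the figure describes; an excess of small values suggests more ERC-enriched diseases than chance would produce.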
Figure 3. ERC disease gene prioritization.
The prioritization of the true disease gene relative to its chromosomal neighbors improves with a stronger ERC signal within the training set. A low p-value (x-axis) indicates strong ERC within a training set. Prioritization (y-axis) is presented as the proportion of candidate genes scoring lower than the true disease gene, i.e. higher represents better prioritization. The blue series is for diseases with training sets with 20 or fewer genes, representing the majority (70%) of OMIM diseases interrogated. The dotted green line is for those diseases with larger training sets.
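The y-axis metric defined above, the proportion of candidate genes scoring lower than the true disease gene, is straightforward to compute once every gene has a score. The sketch below assumes each score is already available (for example, a candidate's mean ERC with the training-set genes, which is an assumption about the scoring rather than a detail given in the caption); the numbers are hypothetical.

```python
def prioritization(true_score, candidate_scores):
    """Proportion of candidates scoring strictly lower than the true gene.

    1.0 means the true disease gene outranks every candidate;
    0.0 means no candidate scores below it.
    """
    if not candidate_scores:
        raise ValueError("need at least one candidate score")
    lower = sum(1 for s in candidate_scores if s < true_score)
    return lower / len(candidate_scores)


# Hypothetical scores: one true disease gene and ten chromosomal neighbors.
true_score = 0.42
neighbor_scores = [0.10, -0.05, 0.30, 0.55, 0.02, 0.18, 0.25, -0.12, 0.08, 0.33]

print(prioritization(true_score, neighbor_scores))  # 9 of 10 neighbors score lower
```

Under this definition, the abstract's statement that the true gene lands in the top 6% of candidates on average corresponds to a prioritization value of roughly 0.94.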
Figure 4. Evolution-based disease map.
ERC signatures between diseases were used to draw connections between separate diseases at a false discovery rate of 5%. The 12 clusters represent diseases that involve common genetic mechanisms as inferred by ERC. The largest cluster (pink network) contains several blood-related pathologies, while the light blue network contains mitochondrial diseases and ciliopathies. The remaining clusters contain many novel disease-disease relationships and are addressed fully in the Discussion section.
