Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes

BioData Min. 2008 Sep 19;1(1):8. doi: 10.1186/1756-0381-1-8.


Background: The tissue specificity of gene expression has been linked to a number of significant outcomes, including level of expression and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study has related gene interactions to tissue specificity and disease association.

Methods: We adopted an a priori approach, making as few assumptions as possible, to analyse the interplay between gene-gene interactions and tissue specificity, and the resulting likelihood of association with disease. We mined three large datasets: expression data drawn from massively parallel signature sequencing across 32 tissues; a set of 55,606 true positive interactions for 7,197 genes; and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported.

Results: Amongst the myriad complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping genes and in disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue-specific expression and most frequently interact with other disease genes. Using the thresholds defined by these observations, we developed a guilt-by-association algorithm and discovered a group of 112 non-disease-annotated genes that predominantly interact with disease-associated genes, impacting disease outcomes.
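
The general idea behind a guilt-by-association screen can be sketched as follows. This is a minimal illustration only, not the algorithm used in the study: the function name, the gene labels, and the two thresholds (`min_fraction`, `min_partners`) are hypothetical, and the abstract does not state the thresholds or scoring actually applied. The sketch flags genes that are not annotated as disease-associated but whose interaction partners are mostly disease-associated.

```python
from collections import defaultdict


def guilt_by_association(interactions, disease_genes,
                         min_fraction=0.5, min_partners=2):
    """Flag non-disease genes whose partners are mostly disease genes.

    interactions  : iterable of (gene_a, gene_b) pairs, treated as undirected
    disease_genes : collection of genes already annotated as disease-associated
    min_fraction  : hypothetical cut-off on the fraction of disease partners
    min_partners  : hypothetical minimum number of distinct partners
    """
    disease_genes = set(disease_genes)

    # Build an undirected adjacency map, ignoring self-interactions.
    neighbours = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)

    candidates = {}
    for gene, partners in neighbours.items():
        # Skip genes already annotated, and genes with too few partners.
        if gene in disease_genes or len(partners) < min_partners:
            continue
        fraction = len(partners & disease_genes) / len(partners)
        if fraction > min_fraction:
            candidates[gene] = fraction
    return candidates


# Toy network: D1 and D2 are annotated disease genes, G1..G4 are not.
interactions = [
    ("G1", "D1"), ("G1", "D2"), ("G1", "G2"),
    ("G2", "D1"), ("G2", "G3"),
    ("G3", "G4"),
    ("D1", "D2"),
]
result = guilt_by_association(interactions, {"D1", "D2"})
```

In this toy network only G1 is flagged, because two of its three partners are disease genes. In practice, a screen of this kind would also need to account for network size and annotation bias, which the sketch ignores.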

Conclusion: We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease-causing, that are involved in interactions with disease-causing genes. Our guilt-by-association algorithm should be useful for discovering additional modifiers of genetic diseases and, more generally, for associating genes of unknown function with clusters of genes with defined functions, allowing novel biological inferences that can subsequently be validated.