Comparative Study
Bioinformatics. 2015 Feb 15;31(4):471-83.
doi: 10.1093/bioinformatics/btu611. Epub 2014 Sep 17.

Understanding the Limits of Animal Models as Predictors of Human Biology: Lessons Learned From the sbv IMPROVER Species Translation Challenge

Free PMC article

Kahn Rhrissorrakrai et al. Bioinformatics. 2015.

Abstract

Motivation: Inferring how humans respond to external cues such as drugs, chemicals, viruses or hormones is an essential question in biomedicine. Very often, however, this question cannot be addressed because it is not possible to perform experiments in humans. A reasonable alternative consists of generating responses in animal models and 'translating' those results to humans. The limitations of such translation, however, are far from clear, and systematic assessments of its actual potential are urgently needed. sbv IMPROVER (systems biology verification for Industrial Methodology for PROcess VErification in Research) was designed as a series of challenges to address translatability between humans and rodents. This collaborative crowd-sourcing initiative invited scientists from around the world to apply their own computational methodologies to a multilayer systems biology dataset composed of phosphoproteomics, transcriptomics and cytokine data derived from normal human and rat bronchial epithelial cells exposed in parallel to 52 different stimuli under identical conditions. Our aim was to understand the limits of species-to-species translatability at different levels of biological organization: signaling, transcription and release of secreted factors (such as cytokines). Participating teams submitted 49 different solutions across the sub-challenges, two-thirds of which were statistically significantly better than random. Additionally, similar computational methods were found to range widely in their performance within the same challenge, and no single method emerged as a clear winner across all sub-challenges. Finally, for some specific stimuli and biological processes in the lung epithelial system, such as DNA synthesis, cytoskeleton and extracellular matrix, translation, immune/inflammation and growth factor/proliferation pathways, computational methods translated responses better than the expected response similarity between species.

Contact: pmeyerr@us.ibm.com or Julia.Hoeng@pmi.com

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Overview of the STC: (A) Schematic of predictions to be made for each sub-challenge. Each sub-challenge required the prediction of the different sets of responses, indicated in red. (B) Schematic of SC4 to indicate utilization of a provided reference network with species-specific information from the training dataset to generate species-specific networks through the addition and removal of edges. Though cytokine measurements were made available to participants, they were not used in scoring, and for simplicity, were not included in this overview figure
Fig. 2.
Scores and computational methods used for solving the STC. The null hypothesis simulation was used to compute and plot team Z-scores of the area under the precision-recall curve (AUPR), balanced accuracy (BAC) and PCC for SC1 (A), SC2 (B) and SC3 (C). Z-scores are used to compare the apparent difficulty of each of the sub-challenges. Panels (D–G) reflect actual performance differences, as measured by overall rank across the three metrics, for different methodological approaches. Teams’ rank distributions are plotted separately by the type of approach for SC1 (D), SC2 (E) and SC3 (F). (G) In SC2, teams’ rank distribution is separated by usage of protein phosphorylation data alone or in combination with gene expression data. SVM: support vector machines, Trees: random forest and other tree-based methods, NN: neural networks, GA: genetic algorithm
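As an illustration of scoring a prediction against a simulated null, the following is a minimal sketch only. It assumes the null distribution is built by randomly shuffling a team's binary predictions, which the caption does not specify; the function names `balanced_accuracy` and `null_z_score` are hypothetical and are not taken from the challenge's scoring code.

```python
import random
import statistics


def balanced_accuracy(truth, pred):
    """Mean of sensitivity and specificity for binary (0/1) labels.

    Assumes both classes occur at least once in `truth`.
    """
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    pos = sum(truth)
    neg = len(truth) - pos
    return 0.5 * (tp / pos + tn / neg)


def null_z_score(truth, pred, n_sim=1000, seed=0):
    """Z-score of the observed BAC against BACs of shuffled predictions.

    Shuffling is an assumed null model, chosen here for illustration.
    Assumes `pred` contains both classes so the null has nonzero spread.
    """
    rng = random.Random(seed)
    observed = balanced_accuracy(truth, pred)
    null_scores = []
    for _ in range(n_sim):
        shuffled = list(pred)
        rng.shuffle(shuffled)
        null_scores.append(balanced_accuracy(truth, shuffled))
    mean = statistics.mean(null_scores)
    spread = statistics.stdev(null_scores)
    return (observed - mean) / spread


if __name__ == "__main__":
    # Toy labels, invented for this example.
    truth = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    pred = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]
    print(balanced_accuracy(truth, pred), null_z_score(truth, pred))
```

Expressing each metric as a Z-score puts AUPR, BAC and PCC on a common scale relative to chance, which is what allows difficulty to be compared across sub-challenges.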
Fig. 3.
Predictability versus species similarity for stimuli. (A) The y-axis indicates for each stimulus the mean predictability Prs of all team predictions when considering gene set activation in SC3. The x-axis is species similarity Ss of gene set activation. In red are stimuli where Prs > Ss > 0. (B) The y-axis indicates for each stimulus the mean predictability Prs of all team predictions when considering protein phosphorylation activation in SC2. The x-axis is Sp of phosphoprotein activation. In red are stimuli where Prs > Ss > 0. (C, D) Plots showing the percentage of teams where Prs > Ss for each stimulus when predicting gene set activation (C) or phosphoprotein activation (D). Stimuli are ordered by percentage of teams and the number of activated gene sets or phosphorylated proteins is indicated on top of each stimulus. The number of active calls per gene set is shown on the top of the graph. Nineteen stimuli are not shown in (B) and (D) because no proteins were measured as phosphorylated
Fig. 4.
Predictability versus species similarity for gene sets and phosphoproteins. (A) The y-axis indicates for each gene set the mean Prg of all team predictions when considering response to 26 stimuli in SC3. The x-axis is Sg of gene set activation. In red are gene sets where Prg > Sg > 0. (B) The y-axis indicates for each protein the mean Prp of all team predictions when considering response to 26 stimuli in SC2. The x-axis is Sp for phosphoprotein activation. (C and D) Plots showing the percentage of teams where Prg > Sg (C) and Prp > Sp (D). Gene sets and phosphoproteins are ordered by number of active calls, indicated on top of each black dot
Fig. 5.
Best translated gene sets representative of different pathways. (A) Histogram of the percentage of active gene set/stimulus pairs [560 pairs from 6396 (246 gene sets × 26 stimuli)] correctly predicted by N teams. Blue line represents the cumulative sum of the histogram values. (B) Distribution of teams’ Prg (blue) and Prs (red) values. (C and D) Best predicted gene sets as measured by Prg. (C) Barplot of 25 gene sets having a Prg Z-score ≥ 1.9. Blue star indicates a Sg Z-score ≥ 1.5. All gene sets are originally derived from Reactome unless otherwise indicated, according to MSigDB. (D) Hierarchical clustering of gene sets and genes that are present in at least 4 of the top 25 best predicted gene sets. Each cell is valued according to gene set membership and the frequency with which the gene is found in that gene set’s GSEA CORE enrichment set. Gene/gene set pairs are assigned 0 if the gene is not a member, 1 if it is only a member, or 1 + C, where C is the number of stimuli under which the gene is found to be part of the CORE enrichment. Cells have a theoretical maximum value of 27. Cells are represented by a blue scale ranging from dark blue for 0 to white for the maximum value reached, here 7. Significantly overrepresented genes among these gene sets are labeled red (P-value < 0.01) or yellow (P-value < 0.05)
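The cell valuation described for panel (D) can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the names `cell_value` and `build_matrix`, the input layout, and the placeholder gene and gene set labels are all assumptions.

```python
N_STIMULI = 26  # number of stimuli stated in the caption


def cell_value(is_member, core_count):
    """Value of one gene/gene set cell.

    0 if the gene is not a member of the gene set, otherwise 1 + C,
    where C is the number of stimuli under which the gene is in the
    CORE enrichment set (so a member with C == 0 scores 1).
    """
    if not 0 <= core_count <= N_STIMULI:
        raise ValueError("core_count must lie between 0 and N_STIMULI")
    if not is_member:
        return 0
    return 1 + core_count


def build_matrix(membership, core_counts):
    """Build a nested dict {gene_set: {gene: value}}.

    membership:  dict mapping gene set name -> set of member genes
    core_counts: dict mapping (gene set name, gene) -> C
    Both input layouts are hypothetical.
    """
    genes = sorted({g for members in membership.values() for g in members})
    return {
        gene_set: {
            g: cell_value(g in members, core_counts.get((gene_set, g), 0))
            for g in genes
        }
        for gene_set, members in membership.items()
    }


if __name__ == "__main__":
    # Placeholder labels, invented for this example.
    membership = {"SET_1": {"GENE_A", "GENE_B"}, "SET_2": {"GENE_B"}}
    core_counts = {("SET_1", "GENE_A"): 6, ("SET_2", "GENE_B"): 2}
    print(build_matrix(membership, core_counts))
```

With 26 stimuli, a gene in the CORE enrichment under every stimulus would score 1 + 26 = 27, matching the theoretical maximum given in the caption.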


