Genome Biol, 20 (1), 244

The CAFA Challenge Reports Improved Protein Function Prediction and New Functional Annotations for Hundreds of Genes Through Experimental Screens

Naihui Zhou  1   2 Yuxiang Jiang  3 Timothy R Bergquist  4 Alexandra J Lee  5 Balint Z Kacsoh  6   7 Alex W Crocker  8 Kimberley A Lewis  8 George Georghiou  9 Huy N Nguyen  1   10 Md Nafiz Hamid  1   2 Larry Davis  2 Tunca Dogan  11   12 Volkan Atalay  13 Ahmet S Rifaioglu  13   14 Alperen Dalkıran  13 Rengul Cetin Atalay  15 Chengxin Zhang  16 Rebecca L Hurto  17 Peter L Freddolino  16   17 Yang Zhang  16   17 Prajwal Bhat  18 Fran Supek  19   20 José M Fernández  21   22 Branislava Gemovic  23 Vladimir R Perovic  23 Radoslav S Davidović  23 Neven Sumonja  23 Nevena Veljkovic  23 Ehsaneddin Asgari  24   25 Mohammad R K Mofrad  26 Giuseppe Profiti  27   28 Castrense Savojardo  27 Pier Luigi Martelli  27 Rita Casadio  27 Florian Boecker  29 Heiko Schoof  30 Indika Kahanda  31 Natalie Thurlby  32 Alice C McHardy  33   34 Alexandre Renaux  35   36   37 Rabie Saidi  12 Julian Gough  38 Alex A Freitas  39 Magdalena Antczak  40 Fabio Fabris  39 Mark N Wass  40 Jie Hou  41   42 Jianlin Cheng  42 Zheng Wang  43 Alfonso E Romero  44 Alberto Paccanaro  44 Haixuan Yang  45   46 Tatyana Goldberg  47 Chenguang Zhao  48   49   50 Liisa Holm  51 Petri Törönen  51 Alan J Medlar  51 Elaine Zosa  52 Itamar Borukhov  53 Ilya Novikov  54 Angela Wilkins  55 Olivier Lichtarge  55 Po-Han Chi  56 Wei-Cheng Tseng  57 Michal Linial  58 Peter W Rose  59 Christophe Dessimoz  60   61   62 Vedrana Vidulin  63 Saso Dzeroski  64   65 Ian Sillitoe  66 Sayoni Das  67 Jonathan Gill Lees  67   68 David T Jones  69   70 Cen Wan  71   69 Domenico Cozzetto  71   69 Rui Fa  71   69 Mateo Torres  44 Alex Warwick Vesztrocy  70   72 Jose Manuel Rodriguez  73 Michael L Tress  74 Marco Frasca  75 Marco Notaro  75 Giuliano Grossi  75 Alessandro Petrini  75 Matteo Re  75 Giorgio Valentini  75 Marco Mesiti  75   76 Daniel B Roche  77 Jonas Reeb  77 David W Ritchie  78 Sabeur Aridhi  78 Seyed Ziaeddin Alborzi  78   79 Marie-Dominique Devignes  78   80   79 Da Chen Emily Koo  81 Richard Bonneau  82   83 Vladimir 
Gligorijević  84 Meet Barot  85 Hai Fang  86 Stefano Toppo  87 Enrico Lavezzo  87 Marco Falda  88 Michele Berselli  87 Silvio C E Tosatto  89   90 Marco Carraro  90 Damiano Piovesan  90 Hafeez Ur Rehman  91 Qizhong Mao  92   93 Shanshan Zhang  92 Slobodan Vucetic  92 Gage S Black  94   95 Dane Jo  94   95 Erica Suh  94 Jonathan B Dayton  94   95 Dallas J Larsen  94   95 Ashton R Omdahl  94   95 Liam J McGuffin  96 Danielle A Brackenridge  96 Patricia C Babbitt  97   98 Jeffrey M Yunes  99   98 Paolo Fontana  100 Feng Zhang  101   102 Shanfeng Zhu  103   104   105 Ronghui You  103   104   105 Zihan Zhang  103   105 Suyang Dai  103   105 Shuwei Yao  103   104 Weidong Tian  106   107 Renzhi Cao  108 Caleb Chandler  108 Miguel Amezola  108 Devon Johnson  108 Jia-Ming Chang  109 Wen-Hung Liao  109 Yi-Wei Liu  109 Stefano Pascarelli  110 Yotam Frank  111 Robert Hoehndorf  112 Maxat Kulmanov  112 Imane Boudellioua  113   114 Gianfranco Politano  115 Stefano Di Carlo  115 Alfredo Benso  115 Kai Hakala  116   117 Filip Ginter  116   118 Farrokh Mehryary  116   117 Suwisa Kaewphan  116   117   119 Jari Björne  120   121 Hans Moen  118 Martti E E Tolvanen  122 Tapio Salakoski  120   121 Daisuke Kihara  123   124 Aashish Jain  125 Tomislav Šmuc  126 Adrian Altenhoff  127   128 Asa Ben-Hur  129 Burkhard Rost  47   130 Steven E Brenner  131 Christine A Orengo  67 Constance J Jeffery  132 Giovanni Bosco  133 Deborah A Hogan  6   8 Maria J Martin  9 Claire O'Donovan  9 Sean D Mooney  4 Casey S Greene  134   135 Predrag Radivojac  136 Iddo Friedberg  137

Naihui Zhou et al. Genome Biol.

Abstract

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in the Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

Keywords: Biofilm; Community challenge; Critical assessment; Long-term memory; Protein function prediction.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
A comparison in Fmax of the top 5 CAFA2 models against the top 5 CAFA3 models. Colored boxes encode the results such that (1) the colors indicate the margin of a CAFA3 method over a CAFA2 method in Fmax and (2) the numbers in the boxes indicate the percentage of wins. a: CAFA2 top 5 models (rows, from top to bottom) against CAFA3 top 5 models (columns, from left to right). b: Comparison of the performance (Fmax) of Naïve baselines trained on SwissProt2014 and SwissProt2017, respectively. The colored box between the two bars shows the percentage of wins and margin of wins as in a. c: Comparison of the performance (Fmax) of BLAST baselines trained on SwissProt2014 and SwissProt2017. The colored box between the two bars shows the percentage of wins and margin of wins as in a. Statistical significance was assessed using 10,000 bootstrap samples of benchmark proteins
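The "percentage of wins" over bootstrap samples can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `bootstrap_win_percentage` and the callables `metric_a` and `metric_b` are hypothetical, and each callable is assumed to recompute a method's score (for example Fmax) on a resampled list of benchmark protein indices.

```python
import random


def bootstrap_win_percentage(n_proteins, metric_a, metric_b, n_boot=10_000, seed=0):
    """Percentage of bootstrap resamples in which method A scores above method B.

    metric_a / metric_b are hypothetical callables that take a list of
    benchmark protein indices (sampled with replacement) and return a score.
    """
    if n_proteins <= 0:
        raise ValueError("the benchmark must contain at least one protein")
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_boot):
        # Resample the benchmark proteins with replacement.
        sample = [rng.randrange(n_proteins) for _ in range(n_proteins)]
        if metric_a(sample) > metric_b(sample):
            wins += 1
    return 100.0 * wins / n_boot


# Toy stand-in metric: mean of hypothetical per-protein scores.
a_scores = [0.9, 0.8, 0.7, 0.6]
b_scores = [0.5, 0.4, 0.3, 0.2]
win_pct = bootstrap_win_percentage(
    len(a_scores),
    lambda idx: sum(a_scores[i] for i in idx) / len(idx),
    lambda idx: sum(b_scores[i] for i in idx) / len(idx),
    n_boot=200,
)
```

Passing the metric as a callable keeps the resampling loop independent of how the score is defined, since Fmax is not a simple average of per-protein values.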
Fig. 2
Performance evaluation based on the Fmax for the top CAFA1, CAFA2, and CAFA3 methods. The top 12 methods are shown in this bar plot, ranked in descending order from left to right. The baseline methods are appended to the right; they were trained on training data from 2017, 2014, and 2011, respectively. Coverage of the methods is shown as text inside the bars. Coverage is defined as the percentage of proteins in the benchmark that are predicted by the methods. Color scheme: CAFA2, ivory; CAFA3, green; Naïve, red; BLAST, blue. Note that in MFO and BPO, CAFA1 methods were ranked, but since none made it into the top 12 of all 3 CAFA challenges, they are not displayed. The CAFA1 challenge did not collect predictions for CCO. a: Molecular Function; b: Biological Process; c: Cellular Component
Fig. 3
Performance evaluation based on the Fmax for the top-performing methods in 3 ontologies. Evaluation was carried out on no-knowledge benchmarks in full mode. a–c: bar plots showing the Fmax of the top 10 methods. The 95% confidence interval was estimated using 10,000 bootstrap iterations on the benchmark set. Coverage of the methods is shown as text inside the bars. Coverage is defined as the percentage of proteins in the benchmark that are predicted by the methods. d–f: precision-recall curves for the top 10 methods. A perfect prediction would have Fmax = 1, at the top right corner of the plot. The dot on each curve indicates where the maximum F score is achieved
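The caption does not spell out how Fmax is computed, so the following is only a sketch under an assumed, commonly used protein-centric definition: at each score threshold, precision is averaged over proteins with at least one predicted term, recall is averaged over all benchmark proteins (full mode), and Fmax is the largest harmonic mean across thresholds. The function name, the data layout, and the threshold grid are illustrative assumptions.

```python
def fmax_and_coverage(predictions, truth, thresholds=None):
    """Return (fmax, best_threshold, coverage_percent).

    predictions: {protein: {term: score}} with scores in (0, 1] (assumed layout).
    truth: {protein: non-empty set of true terms}, one key per benchmark protein.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]  # assumed grid
    n = len(truth)
    best_f, best_t = 0.0, None
    for t in thresholds:
        precisions, recall_sum = [], 0.0
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                # Precision only counts proteins with a prediction at this threshold.
                precisions.append(hits / len(predicted))
            # Recall counts every benchmark protein (full mode).
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / n
        if pr + rc > 0:
            f = 2 * pr * rc / (pr + rc)
            if f > best_f:
                best_f, best_t = f, t
    # Coverage: percentage of benchmark proteins with at least one prediction.
    covered = sum(1 for protein in truth if predictions.get(protein))
    return best_f, best_t, 100.0 * covered / n
```

Under this reading, a method that predicts perfectly for half of the benchmark proteins and nothing for the rest has precision 1 but recall 0.5, so low coverage caps its Fmax.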
Fig. 4
Performance evaluation based on Smin for the top-performing methods in 3 ontologies. Evaluation was carried out on no-knowledge benchmarks in full mode. a–c: bar plots showing the Smin of the top 10 methods. The 95% confidence interval was estimated using 10,000 bootstrap iterations on the benchmark set. Coverage of the methods is shown as text inside the bars. Coverage is defined as the percentage of proteins in the benchmark that are predicted by the methods. d–f: remaining uncertainty-missing information (RU-MI) curves for the top 10 methods. A perfect prediction would have Smin = 0, at the bottom left corner of the plot. The dot on each curve indicates where the minimum semantic distance is achieved
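As a rough illustration of the RU-MI curve and its minimum, the sketch below assumes a common formulation that the caption itself does not state: remaining uncertainty (RU) sums the information content of true terms that were missed, misinformation (MI) sums the information content of predicted terms that are not true, both are averaged over benchmark proteins, and the semantic distance at a threshold is the Euclidean norm of the pair. The `ic` mapping of per-term information content and the threshold grid are hypothetical inputs.

```python
import math


def smin(predictions, truth, ic, thresholds=None):
    """Minimum semantic distance over thresholds (assumed formulation).

    predictions: {protein: {term: score}}; truth: {protein: set of terms};
    ic: {term: information content}, a hypothetical precomputed mapping.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]  # assumed grid
    n = len(truth)
    best = math.inf
    for t in thresholds:
        ru = mi = 0.0
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            ru += sum(ic[term] for term in true_terms - predicted)  # missed terms
            mi += sum(ic[term] for term in predicted - true_terms)  # wrong terms
        best = min(best, math.hypot(ru / n, mi / n))
    return best
```

Lower is better here, which is why the ideal point sits at the bottom left corner of an RU-MI plot rather than the top right.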
Fig. 5
Evaluation based on the Fmax for the top-performing methods in eukaryotic and bacterial species
Fig. 6
Number of proteins in each benchmark species and ontology
Fig. 7
Heatmap of similarity for the top 10 methods in CAFA1, CAFA2, and CAFA3. Similarity is represented by the Euclidean distance between the prediction scores of each pair of methods, using the intersection set of benchmarks in the “Top methods have improved from CAFA2 to CAFA3, but improvement was less dramatic than from CAFA1 to CAFA2” section. The larger the Euclidean distance (darker red), the less similar the methods are. The top 10 methods from each of the CAFA challenges are displayed and ranked by their performance in Fmax. Cells highlighted by black borders are between a pair of methods that come from the same PI. a: Molecular Function; b: Biological Process; c: Cellular Component
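The pairwise distance behind such a heatmap can be sketched as follows. The data layout (a mapping from hypothetical (protein, term) pairs to scores) and the choice to treat a missing prediction as a score of 0 are assumptions made for illustration, not details given in the caption.

```python
import math


def method_distance(scores_a, scores_b):
    """Euclidean distance between two methods' prediction score vectors.

    scores_a / scores_b: {(protein, term): score}. A pair absent from one
    method is treated as score 0.0 (an assumption of this sketch).
    """
    keys = set(scores_a) | set(scores_b)
    return math.sqrt(
        sum((scores_a.get(k, 0.0) - scores_b.get(k, 0.0)) ** 2 for k in keys)
    )
```

Two methods that assign identical scores to every pair have distance 0, and the distance grows as their scores diverge.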
Fig. 8
Keyword analysis of all CAFA3 participating methods. a–c: both the relative frequency of the keywords and the weighted frequencies are provided for the three respective GO ontologies. The weighted frequencies account for the performance of the particular model using the given keyword. If that model performs well (with high Fmax), then it contributes more weight to the calculation of the total weighted average of that keyword. d: the ratio of relative frequency between the Fmax-weighted and the equal-weighted. Red indicates that the ratio is greater than one, while blue indicates that the ratio is less than one. Only the top five keywords ranked by ratio are shown. The larger the ratio, the greater the difference between the Fmax-weighted and the equal-weighted
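One plausible reading of this weighting, offered only as a sketch: each method adds 1 to every keyword it uses in the equal-weighted count and adds its Fmax in the weighted count; each count is normalized to a relative frequency, and the ratio is taken as weighted over equal. The exact formula, the ratio direction, and the function name `keyword_ratios` are assumptions rather than details confirmed by the caption.

```python
from collections import defaultdict


def keyword_ratios(methods):
    """Ratio of Fmax-weighted to equal-weighted relative keyword frequency.

    methods: list of (fmax, keywords) tuples, a hypothetical input layout.
    Assumes at least one method has a positive Fmax.
    """
    equal = defaultdict(float)
    weighted = defaultdict(float)
    for fmax, keywords in methods:
        for kw in set(keywords):  # count each keyword once per method
            equal[kw] += 1.0
            weighted[kw] += fmax
    equal_total = sum(equal.values())
    weighted_total = sum(weighted.values())
    return {
        kw: (weighted[kw] / weighted_total) / (equal[kw] / equal_total)
        for kw in equal
    }
```

With this definition, a ratio above one means the keyword is used disproportionately by better-performing methods.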
Fig. 9
AUROC of the top five teams in CAFA-π. The best-performing model from each team is picked for the top five teams, regardless of whether that model was submitted as model 1. Four baseline models, all based on BLAST, were computed for Candida, while six baseline models were computed for Pseudomonas, including two based on expression profiles. Team methods are shown in gray, BLAST methods in red, BLAST computational methods in blue, and expression-based methods in yellow. See Table 3 for the description of the baselines
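For a single term, AUROC can be computed directly from gene scores and binary experimental labels. The sketch below uses the standard pairwise (Mann-Whitney) formulation with ties counted as half; the function name and input layout are illustrative and are not taken from the CAFA evaluation code.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted scores per gene; labels: truthy for genes with the
    phenotype, falsy otherwise. Ties between a positive and a negative count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking, which is the reference point for judging how far a method or baseline sits above chance.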
Fig. 10
AUROC of the top five teams in CAFA3. The best-performing model from each team is picked for the top five teams, regardless of whether that model was submitted as model 1. Team methods are shown in gray, BLAST methods in red, and BLAST computational methods in blue. See Table 3 for the description of the baselines
Fig. 11
CAFA participation has been growing. Each principal investigator is allowed to head multiple teams, but each member can only belong to one team. Each team can submit up to three models
Fig. 12
CAFA3 timeline
Fig. 13
Experimental procedure of determining genes associated with the functions biofilm formation (a) and motility (b) in P. aeruginosa
Fig. 14
a: different phenotypes in response to doxycycline treatment: low growth, smooth, no growth and intermediate. b: adherence phenotypes. See text for details
Fig. 15
AUROC of the top five teams in CAFA-π. The best-performing model from each team is picked for the top five teams, regardless of whether that model was submitted as model 1. Team methods are shown in gray, BLAST methods in red, BLAST computational methods in blue, and expression-based methods in yellow. See Table 3 for the description of the baselines
