Background: C. elegans is an important model for genetic studies relevant to human biology and disease. We sought to assess the orthology between C. elegans and human genes, both to better understand the relationship between their genomes and to generate a compelling list of candidates for streamlining RNAi-based screens in this model.
Results: We performed a meta-analysis of results from four orthology prediction programs and generated a compendium, "OrthoList", containing 7,663 C. elegans protein-coding genes. Various assessments indicate that OrthoList has extensive coverage with low false-positive and false-negative rates. Part of this evaluation examined the conservation of components of the receptor tyrosine kinase, Notch, Wnt, TGF-β and insulin signaling pathways, and led us to update compendia of conserved C. elegans kinases, nuclear hormone receptors, F-box proteins, and transcription factors. Comparison with two published genome-wide RNAi screens indicated that virtually all of the conserved hits would have been obtained had just the OrthoList set (∼38% of the genome) been targeted. We organized OrthoList by InterPro domains and Gene Ontology annotation, making it easy to identify C. elegans orthologs of human disease genes for potential functional analysis.
Conclusions: We anticipate that OrthoList will be of considerable utility to C. elegans researchers for streamlining RNAi screens: focusing on genes with apparent human orthologs reduces screening effort by ∼60%. Moreover, we find that OrthoList provides a useful basis for annotating orthology and reveals more C. elegans orthologs of human genes in various functional groups, such as transcription factors, than previously described.