Review
WormBook. 2005 Sep 23;1-23.
doi: 10.1895/wormbook.1.29.1.

Genomic Classification of Protein-Coding Gene Families

Free PMC article

Erich M Schwarz. WormBook.

Abstract

This chapter reviews analytical tools currently in use for protein classification, and gives an overview of the C. elegans proteome. Computational analysis of proteins relies heavily on hidden Markov models of protein families. Proteins can also be classified by predicted secondary or tertiary structures, hydrophobic profiles, compositional biases, or size ranges. Strictly orthologous protein families remain difficult to identify, except by skilled human labor. The InterPro and NCBI KOG classifications encompass 79% of C. elegans protein-coding genes; in both classifications, a small number of protein families account for a disproportionately large number of genes. C. elegans protein-coding genes include at least approximately 12,000 orthologs of C. briggsae genes, and at least approximately 4,400 orthologs of non-nematode eukaryotic genes. Some metazoan proteins conserved in other nematodes are absent from C. elegans. Conversely, 9% of C. elegans protein-coding genes are conserved among all metazoa or eukaryotes, yet have no known functions.
