41 (Database issue), D536-44

DcGO: Database of Domain-Centric Ontologies on Functions, Phenotypes, Diseases and More



Hai Fang et al. Nucleic Acids Res.


We present 'dcGO', a comprehensive ontology database for protein domains. Domains are often the functional units of proteins, so instead of associating ontological terms only with full-length proteins, it sometimes makes more sense to associate terms with individual domains. Domain-centric GO, 'dcGO', provides associations between ontological terms and protein domains at the superfamily and family levels. Some functional units consist of more than one domain acting together or acting at an interface between domains; therefore, ontological terms associated with pairs of domains, triplets and longer supra-domains are also provided. At the time of writing, the ontologies in dcGO include the Gene Ontology (GO); Enzyme Commission (EC) numbers; pathways from UniPathway; the human phenotype ontology and phenotype ontologies from five model organisms, including plants; anatomy ontologies from three organisms; the human disease ontology; and drugs from DrugBank. All ontological terms have probabilistic scores for their associations. In addition to associations with domains and supra-domains, the ontological terms have been transferred to proteins through homology, providing annotations of more than 80 million sequences covering 2414 complete genomes, hundreds of meta-genomes, thousands of viruses and so forth. The dcGO database is updated fortnightly, and its website provides downloads, search, browse, phylogenetic context and other data-mining facilities.
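
The idea of annotating a protein from its domains and supra-domains can be illustrated with a minimal Python sketch. It is not the dcGO implementation: the association table, the identifiers (`sf_A`, `term_binding` and so on), the scores, and the rule of keeping the highest score per term are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical association table: a tuple of domain identifiers (a single
# domain or a supra-domain) maps to {ontology term: score}. All identifiers,
# terms, and scores are invented for illustration.
ASSOCIATIONS = {
    ("sf_A",): {"term_binding": 0.9},
    ("sf_B",): {"term_catalysis": 0.7},
    ("sf_A", "sf_B"): {"term_signalling": 0.8, "term_binding": 0.95},
}


def supra_domains(architecture):
    """Yield every contiguous run of domains: singles, pairs, triplets, longer."""
    n = len(architecture)
    for start in range(n):
        for end in range(start + 1, n + 1):
            yield tuple(architecture[start:end])


def transfer_terms(architecture, associations=ASSOCIATIONS):
    """Collect terms from all matching domains and supra-domains.

    Keeping the highest score per term is an assumption of this sketch,
    not a rule taken from the abstract.
    """
    best = defaultdict(float)
    for combo in supra_domains(architecture):
        for term, score in associations.get(combo, {}).items():
            best[term] = max(best[term], score)
    return dict(best)


if __name__ == "__main__":
    # A protein whose (hypothetical) architecture is sf_A followed by sf_B.
    print(transfer_terms(["sf_A", "sf_B"]))
```

In this sketch, a two-domain protein receives terms from each single domain and from the pair, which mirrors the abstract's point that some functions belong to domain combinations and not to any one domain.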


Figure 1.
The dcGO website has the ‘Faceted Search’ interface as a hub to mine the resource. By searching against keywords of interest, the user can access the resource in an organized manner and can link to additional analysis tools.
Figure 2.
Using ‘PSnet’ to cross-link phenotypes and other ontologies based on shared domain-centric annotations. (A) A list of superfamilies and families annotated by the disease term ‘immune system cancer’. (B) The most strongly correlated ontological terms are returned for the disease term in this query.
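
Cross-linking terms through shared domain annotations can be sketched as follows. The caption does not state which correlation measure PSnet uses, so this example substitutes Jaccard overlap as a stand-in; the term names, domain identifiers, and the `rank_related_terms` helper are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of domain identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_related_terms(query, term_to_domains, top=5):
    """Rank other terms by how many annotated domains they share with `query`.

    Terms with no shared domain are dropped. Jaccard overlap is an assumed
    stand-in for whatever measure the real tool applies.
    """
    query_domains = term_to_domains[query]
    scored = [
        (other, jaccard(query_domains, domains))
        for other, domains in term_to_domains.items()
        if other != query
    ]
    scored = [(term, score) for term, score in scored if score > 0]
    # Highest overlap first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top]


# Invented example data: each term maps to the domains annotated with it.
TERM_TO_DOMAINS = {
    "disease_X": {"sf_1", "sf_2", "sf_3"},
    "phenotype_Y": {"sf_2", "sf_3"},
    "function_Z": {"sf_3", "sf_4"},
    "pathway_W": {"sf_5"},
}

if __name__ == "__main__":
    for term, score in rank_related_terms("disease_X", TERM_TO_DOMAINS):
        print(term, round(score, 3))
```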
Figure 3.
Converting genome sequences to knowledge about function, phenotype and disease using the ‘dcGO Predictor’. (A) A batch query facility allows the user to upload up to 1000 sequences for the prediction of function, disease, phenotype and other information, such as enzyme classification, drugs and pathways. (B) The result page provides a summary of the prediction content. The user can obtain new predictions by instantly switching to other ontologies. In addition to downloading the results, the user can explore predictions for each of the input sequences, such as Q01826 (human SATB1 protein; see panel C). (C) The domain architecture of the human SATB1 protein is graphically displayed using the SCOP domains at the superfamily level, whereas the bottom panel shows the predicted Disease Ontology terms.
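
Because the batch query accepts at most 1000 sequences, a larger FASTA file would have to be split before upload. The sketch below shows one way to do that in Python. Only the 1000-sequence limit comes from the caption; the helper names `read_fasta` and `batches` are hypothetical, and the code prepares batches locally without calling any dcGO service.

```python
from itertools import islice

MAX_BATCH = 1000  # upload limit stated in the Figure 3 caption


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records, size=MAX_BATCH):
    """Group records into lists of at most `size` items, preserving order."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Build a synthetic FASTA input with 2500 short records.
    fasta_lines = []
    for i in range(2500):
        fasta_lines += [f">seq{i}", "MKT", "AYIA"]
    sizes = [len(b) for b in batches(read_fasta(fasta_lines))]
    print(sizes)
```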


