Nucleic Acids Res. 40 (Database issue), D136-43

The NCBI Taxonomy Database


Scott Federhen. Nucleic Acids Res.


The NCBI Taxonomy database is the standard nomenclature and classification repository for the International Nucleotide Sequence Database Collaboration (INSDC), comprising the GenBank, ENA (EMBL) and DDBJ databases. It includes organism names and taxonomic lineages for each of the sequences represented in the INSDC's nucleotide and protein sequence databases. The taxonomy database is manually curated by a small group of scientists at the NCBI who use the current taxonomic literature to maintain a phylogenetic taxonomy for the source organisms represented in the sequence databases. The taxonomy database is a central organizing hub for many of the resources at the NCBI, and provides a means for clustering elements within other domains of the NCBI web site, for internal linking between domains of the Entrez system and for linking out to taxon-specific external resources on the web. Our primary purpose is to index the domain of sequences as conveniently as possible for our user community.


Figure 1.
(a) Total growth of the taxonomy database. This includes formal and informal taxa at all levels, from unranked isolate-level taxids added for the influenza genome project to genera, families and higher taxa. (b) Valid species in the taxonomy database. This includes only valid binomial and trinomial species, subspecies, varietas and forma (infraspecific taxa with standing in the nomenclature). The viruses and bacteria are basically flat in this figure, since the rate-limiting step is the description of new species, not the sequencing.
Figure 2.
The taxonomy portlet in Nucleotide Entrez. This particular display summarizes the taxonomic distribution of plant sequences released in 2011, given by the Entrez query viridiplantae[orgn] AND 2011[pdat]. The taxonomy portlet toggles between a list of top taxa by entry count in the Entrez results list, and the taxonomic overview shown above.
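The Entrez query quoted in the Figure 2 caption can also be assembled programmatically. The sketch below is one possible illustration, not something described in the article: it assumes the public E-utilities ESearch endpoint and the database name `nucleotide`, and the helper names `build_entrez_term` and `build_esearch_url` are hypothetical. It only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint (not named in the article).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_term(organism: str, year: int) -> str:
    """Combine an organism filter and a publication-date filter with AND."""
    return f"{organism}[orgn] AND {year}[pdat]"


def build_esearch_url(term: str, db: str = "nucleotide") -> str:
    """Return an ESearch URL for the given term; no request is sent."""
    # Assumption: db="nucleotide" and JSON output are illustrative choices.
    params = {"db": db, "term": term, "retmode": "json"}
    # urlencode percent-encodes the square brackets and turns spaces into '+'.
    return f"{ESEARCH_URL}?{urlencode(params)}"


term = build_entrez_term("viridiplantae", 2011)
url = build_esearch_url(term)
print(term)
print(url)
```

Once URL-encoded, the spaces in the query become `+` and the square brackets become `%5B` and `%5D`, which is the form the query takes inside a link.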
Figure 3.
Taxonomy browser page for the Mammalia. Exploded and unexploded links to other Entrez databases are shown in ‘Entrez records’. LinkOut links to external databases are displayed below the Comments and References (data not shown).


