The NCBI RefSeq genome collection (http://www.ncbi.nlm.nih.gov/genome) represents all three major domains of life (Eukarya, Bacteria and Archaea) as well as viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During 2014, more than 10,000 microbial genome assemblies were publicly released, bringing the total number of prokaryotic genomes close to 30,000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and to the results of pre-computed analyses, and by improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to the RefSeq prokaryotic genome data processing pipeline, including the calculation of genome groups (clades) and the optimization of protein cluster generation using a pan-genome approach.
Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.