Variation, evolution, and correlation analysis of C+G content and genome or chromosome size in different kingdoms and phyla

PLoS One. 2014 Feb 13;9(2):e88339. doi: 10.1371/journal.pone.0088339. eCollection 2014.


C+G content (also called GC content or G+C content) is known to be correlated with genome or chromosome size in bacteria, but the relationship in other kingdoms remains unclear. This study analyzed genome size, chromosome size, and base composition in most of the available sequenced genomes across various kingdoms. Genome size tends to increase during evolution in plants and animals, and the same is likely true for bacteria. Genomic C+G content was found to vary greatly among microorganisms but to be quite similar within each animal or plant subkingdom. In animals and plants, C+G content ranks as follows: monocot plants > mammals > non-mammalian animals > dicot plants. The variation in C+G content between chromosomes within a species is greater in animals than in plants. The correlation between average chromosome C+G content and chromosome length was found to be positive in Proteobacteria and Actinobacteria (but not in the other bacterial phyla analyzed), in Ascomycota fungi, and likely also in some plants; negative in some animals; insignificant in two protist phyla; and likely very weak in Archaea. The pattern of correlation between chromosome size and C+G content therefore differs among kingdoms, phyla, and species. Most chromosomes within a species have a similar pattern of variation in C+G content, but outliers are common. The data presented in this study suggest that C+G content is under genetic control by both trans- and cis-factors and that the correlation between C+G content and chromosome length can be positive, negative, or not significant in different phyla.
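
The kind of analysis summarized above can be illustrated with a minimal Python sketch. It computes the C+G fraction of each chromosome and a rank correlation between chromosome length and C+G content. This is not the authors' pipeline: the abstract does not name the statistic used, so Spearman's rank correlation is assumed here only because the record lists nonparametric statistics among its MeSH terms. The function names and the three toy sequences are hypothetical and carry no biological meaning.

```python
from collections import Counter


def gc_content(seq):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / total


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy "chromosomes" (made-up sequences, not real data).
toy_chromosomes = {
    "chrA": "ATAT",
    "chrB": "ATGCAT",
    "chrC": "GCGCATGC",
}

lengths = [len(s) for s in toy_chromosomes.values()]
gc_values = [gc_content(s) for s in toy_chromosomes.values()]
rho = spearman(lengths, gc_values)
print(f"Spearman rho between length and C+G content: {rho:.2f}")
```

In the toy data the longer sequences also have higher C+G content, so the coefficient is positive. With real genomes, each sequence would be one chromosome of a species, and the sign of the coefficient would correspond to the positive, negative, or non-significant patterns described above. A significance test would also be needed, which this sketch omits.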

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Archaea / classification
  • Archaea / genetics
  • Bacteria / classification
  • Bacteria / genetics
  • Base Composition*
  • Biological Evolution*
  • Chromosomes*
  • Genome Size*
  • Phylogeny
  • Plants / classification
  • Plants / genetics
  • Statistics, Nonparametric

Grant support

The research was supported by LXQ’s research funding from the Agriculture and Agri-Food Canada potato genomics project, the Canadian Food Inspection Agency CHA-P-1101 project, and the New Brunswick EARI Growing Forward Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.