Although strand-biased gene distribution (SGD) was described some two decades ago, the underlying molecular mechanisms and their relationship remain elusive. Its facets include, but are not limited to, the degree of biases, the strand-preference of genes, and the influence of background nucleotide composition variations. Using a dataset composed of 364 non-redundant bacterial genomes, we sought to illustrate our current understanding of SGD. First, when we divided the collection of bacterial genomes into non-polC and polC groups according to their possession of DnaE isoforms that correlate closely with taxonomy, the SGD of the polC group stood out more significantly than that of the non-polC group. Second, when examining horizontal gene transfer, coupled with gene functional conservation (essentiality) and expressivity (level of expression), we realized that they all contributed to SGD. Third, we further demonstrated a weaker G-dominance on the leading strand of the non-polC group but strong purine dominance (both G and A) on the leading strand of the polC group. We propose that strand-biased nucleotide composition plays a decisive role for SGD since the polC-bearing genomes are not only AT-rich but also have pronounced purine-rich leading strands, and we believe that a special mutation spectrum that leads to a strong purine asymmetry and a strong strand-biased nucleotide composition coupled with functional selections for genes and their functions are both at work.
Copyright © 2012. Published by Elsevier Ltd.