BMC Genomics. 2011 Jan 11:12:19.
doi: 10.1186/1471-2164-12-19.

Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes

Nobuaki Kono et al. BMC Genomics.

Abstract

Background: During the replication process of bacteria with circular chromosomes, an odd number of homologous recombination events results in concatenated dimer chromosomes that cannot be partitioned into daughter cells. However, many bacteria harbor a conserved dimer resolution machinery consisting of one or two tyrosine recombinases, XerC and XerD, and their 28-bp target site, dif.

Results: To study the evolution of the dif/XerCD system and its relationship with replication termination, we report the comprehensive prediction of dif sequences in silico using a phylogenetic prediction approach based on iterated hidden Markov modeling. Using this method, dif sites were identified in 641 organisms among 16 phyla, with a 97.64% identification rate for single-chromosome strains. The dif sequence positions were shown to be strongly correlated with the GC skew shift-point that is induced by replicational mutation/selection pressures, but the difference in the positions of the predicted dif sites and the GC skew shift-points did not correlate with the degree of replicational mutation/selection pressures.
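The GC skew shift-point referred to in the Results can be located from a cumulative GC skew curve. The following is a minimal sketch of that common convention, not necessarily the procedure used in this study. The function name `gc_skew_shift_points`, the window size, and the toy sequence are assumptions made for illustration. It also assumes the usual convention that the cumulative skew reaches its minimum near the replication origin and its maximum near the terminus.

```python
def gc_skew_shift_points(sequence, window=1000):
    """Return (origin_estimate, terminus_estimate) as 0-based coordinates.

    GC skew per window is (G - C) / (G + C). The running sum of these values
    is tracked, and the window boundaries where the sum is smallest and
    largest are reported. Assumption: minimum ~ origin, maximum ~ terminus.
    """
    sequence = sequence.upper()
    cumulative = 0.0
    lowest = (0.0, 0)    # (cumulative skew, coordinate)
    highest = (0.0, 0)
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c > 0:
            cumulative += (g - c) / (g + c)
        end = min(start + window, len(sequence))
        if cumulative < lowest[0]:
            lowest = (cumulative, end)
        if cumulative > highest[0]:
            highest = (cumulative, end)
    return lowest[1], highest[1]


# Toy sequence (invented): G-rich, then C-rich, then G-rich again.
toy_genome = "G" * 50 + "C" * 100 + "G" * 50
origin_estimate, terminus_estimate = gc_skew_shift_points(toy_genome, window=10)
print(origin_estimate, terminus_estimate)
```

On a real genome the window size would need tuning, and a noisy curve may call for smoothing before the extrema are taken.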

Conclusions: The sequences of dif sites are widely conserved among many bacterial phyla, and they can be computationally identified using our method. The lack of correlation between dif position and the degree of GC skew suggests that replication termination does not occur strictly at dif sites.

Figures

Figure 1
The phylogenetic distance of XerCD in each organism. The phylogenetic distances of bacterial genomes to three seed organisms, Escherichia coli (Proteobacteria), Bacillus subtilis (Firmicutes) and Frankia alni (Actinobacteria), were calculated as the average of the phylogenetic distances of XerC and XerD. A detailed example is given in Additional file 1, Figure S1. A to C are scatter plots of the distances of these genomes to the seed organisms. Axes represent average distances as calculated by ClustalW. A, distances from Escherichia coli K-12 and Bacillus subtilis 168; B, distances from Escherichia coli K-12 and Frankia alni ACN14a; and C, distances from Bacillus subtilis 168 and Frankia alni ACN14a. Blue marks represent the genomes of Proteobacteria, green marks represent Firmicutes, yellow marks represent Actinobacteria, and gray marks represent other phyla. All phyla show strong preferences for seeds from the same phylum.
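The averaging described in this caption can be sketched as follows. This is an illustrative sketch only: the function names and the distance values are invented, and picking the seed with the smallest average distance is an assumption about how such distances might be used, not a statement of the authors' procedure.

```python
def mean_xer_distance(xerc_distance, xerd_distance):
    """Average the XerC and XerD phylogenetic distances to one seed."""
    return (xerc_distance + xerd_distance) / 2.0


def closest_seed(distances_by_seed):
    """Return the seed whose averaged XerC/XerD distance is smallest.

    distances_by_seed maps a seed name to a (XerC distance, XerD distance) pair.
    """
    return min(
        distances_by_seed,
        key=lambda seed: mean_xer_distance(*distances_by_seed[seed]),
    )


# Hypothetical distances for one genome (made-up numbers, not from the paper).
toy_distances = {
    "Escherichia coli K-12": (0.20, 0.30),
    "Bacillus subtilis 168": (0.60, 0.70),
    "Frankia alni ACN14a": (0.80, 0.75),
}
print(closest_seed(toy_distances))
```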
Figure 2
The relationship between dif sites and GC skew. A. Correlation of the GC skew shift-point (corresponding to the replication terminus region, Y-axis) and the locations of dif sequences (X-axis) for genomes with predicted dif sequences. Genomes with no visible GC skew, as indicated by a GC skew index (GCSI) ≤ 0.05, are omitted. Both axes are shown as the relative distance, in percentage of half of the genome size (replichore size), from the position directly opposite the replication origin. For example, 0% means that the position is directly opposite the replication origin identified by the GC skew shift-point, and 100% means that it is at the replication origin. In other words, the higher the percentage, the closer the position is to the replication origin. Here the positions of GC skew shift-points and dif sites are strongly correlated in all three phyla. B. Lack of correlation between the difference in the positions of GC skew shift-points and dif sites (Y-axis) and the GCSI (X-axis). GCSI is a quantitative measure of the degree of GC skew, where GCSI = 0 is no observable skew, and GCSI = 1 is extremely pronounced skew. Typically GC skew is visible at GCSI ≥ 0.1, and it is pronounced when GCSI ≥ 0.3. Since we see no correlation in these plots, stronger replication-related mutation bias (i.e. larger GCSI) does not necessarily result in closer positions of the GC skew shift-point and the dif site. These results suggest that replication termination occurs near the dif site, but not at the dif site. The number of dif sites is 517 in all bacteria, 438 in Proteobacteria and 97 in Firmicutes. The ρ in this figure is Spearman's rank-correlation coefficient.
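The axis scaling and the rank correlation described in this caption can be sketched as below. This is a sketch under stated assumptions: the function names are hypothetical, coordinates are assumed to be 0-based on a circular chromosome, and the Spearman coefficient is computed here as the Pearson correlation of average ranks rather than with any particular statistics package.

```python
import math


def relative_position_percent(position, origin, genome_size):
    """Distance from the point opposite the origin, as % of replichore size.

    0 means directly opposite the origin; 100 means at the origin.
    """
    half = genome_size / 2
    opposite = (origin + half) % genome_size
    offset = abs(position - opposite)
    circular_distance = min(offset, genome_size - offset)
    return 100.0 * circular_distance / half


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical example: a 1000-bp circle with the origin at coordinate 0.
print(relative_position_percent(250, origin=0, genome_size=1000))
```

With real data, `spearman_rho` would be applied to the per-genome relative positions of the dif sites and of the GC skew shift-points.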
Figure 3
Phylogenetic tree based on rRNA for the comparison of XerCD- and XerH-containing genomes. This phylogenetic tree was constructed using the maximum-likelihood method and is based on the 16S rRNAs of 14 organisms in ε-Proteobacteria whose dif sequences were predicted in this study. The outgroup is Escherichia coli K-12.
Figure 4
Prediction strategy. A. Example of the iterated HMM in Proteobacteria. The first seed profile hidden Markov model is created from the seed dif sequence of Escherichia coli by searching for dif sequences in 28 genomes belonging to the genus Escherichia by means of fuzzy matching. Based on this initial profile hidden Markov model, dif sequences are predicted in the genomes of the genus closest to Escherichia (in this case, Shigella) according to XerCD amino acid sequences. Subsequently, a new profile is created using the previous profile and the newly predicted dif sequences, and this new profile is used to predict dif sequences in the second-closest genus (in this case, Salmonella). Profile creation and dif sequence prediction are repeated in this way, in decreasing order of XerCD similarity to the Escherichia sequence. The iterated HMM is conducted separately for each phylum. B. Flow chart of the overall strategy.
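The iterative loop in panel A can be illustrated with a much simpler stand-in. The sketch below replaces the profile hidden Markov model with a log-odds position weight matrix, scans the forward strand only, applies no score threshold, and uses invented 8-bp toy sequences rather than real 28-bp dif sites. All function names are hypothetical. It shows only the control flow (fuzzy seed search, profile building, prediction, profile update) and is not the authors' implementation.

```python
import math

BASES = "ACGT"


def fuzzy_find(genome, seed, max_mismatches):
    """Return (position, site) for windows within max_mismatches of the seed."""
    hits = []
    for i in range(len(genome) - len(seed) + 1):
        window = genome[i:i + len(seed)]
        mismatches = sum(a != b for a, b in zip(window, seed))
        if mismatches <= max_mismatches:
            hits.append((i, window))
    return hits


def build_profile(sites, pseudocount=1.0):
    """Build a log-odds matrix (uniform background) from aligned sites."""
    profile = []
    for column in zip(*sites):
        total = len(column) + pseudocount * len(BASES)
        profile.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in BASES
        })
    return profile


def best_hit(genome, profile):
    """Return (score, position, site) of the highest-scoring window, or None."""
    width = len(profile)
    best = None
    for i in range(len(genome) - width + 1):
        window = genome[i:i + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(column[base] for column, base in zip(profile, window))
        if best is None or score > best[0]:
            best = (score, i, window)
    return best


def iterate_profile(seed_sites, genomes_by_similarity):
    """Predict one site per genome, rebuilding the profile after each hit.

    genomes_by_similarity is a list of (name, sequence) pairs, assumed to be
    ordered from most to least similar to the seed organism.
    """
    sites = list(seed_sites)
    predictions = {}
    for name, genome in genomes_by_similarity:
        hit = best_hit(genome, build_profile(sites))
        if hit is None:
            continue
        _, position, site = hit
        predictions[name] = (position, site)
        sites.append(site)  # the new site feeds the next profile
    return predictions


# Toy data (invented): a seed site and two "genomes" with diverged copies.
seed_sites = [site for _, site in fuzzy_find("GGACGTACGTGG", "ACGTACGT", 1)]
toy_genomes = [("g1", "TTTTACGTACGATTTT"), ("g2", "CCCCACGTACCAGGGG")]
print(iterate_profile(seed_sites, toy_genomes))
```

A practical version would also scan the reverse complement and reject hits below a score cutoff, so that genomes without a dif site do not contaminate later profiles.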
Figure 5
The conservation of dif sequences. This figure shows the conservation quantity at each position of the dif sequence in each phylum or class (Proteobacteria, Firmicutes, Actinobacteria, Bacteroidetes, α-Proteobacteria, β-Proteobacteria, γ-Proteobacteria, and δ-Proteobacteria). The black bars represent the degree of conservation in single-chromosome genomes, and the gray bars represent that in organisms harboring multiple chromosomes. The labels "XerC domain" and "XerD domain" in these graphs mark the binding sites of these proteins. The X-axis represents the nucleotide position in the dif sequence, and the Y-axis represents the nucleotide conservation quantity. Y-axis values were normalized to percentages.
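One way to obtain a per-position conservation value normalized to a percentage is sketched below. The caption does not define the conservation quantity, so the measure used here (information content, that is, 2 bits minus the Shannon entropy of the column, scaled to 0-100%) is an assumption, as are the function name and the toy sites.

```python
import math
from collections import Counter


def conservation_percent(sites):
    """Per-column information content as a percentage of the 2-bit maximum.

    sites is a list of equal-length, aligned DNA strings.
    """
    percentages = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        percentages.append(100.0 * (2.0 - entropy) / 2.0)
    return percentages


# Toy aligned sites (invented, 4 bp): column 0 is invariant, column 3 is not.
toy_sites = ["ACGT", "ACGA", "ACCC", "ATTG"]
print(conservation_percent(toy_sites))
```

No small-sample correction is applied, so values from only a few sites will overstate conservation.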
