J Bacteriol, 193 (15), 3964-77

Lateral Transfer of Genes and Gene Fragments in Staphylococcus Extends Beyond Mobile Elements

Cheong Xin Chan et al. J Bacteriol.

Abstract

The widespread presence of antibiotic resistance and virulence among Staphylococcus isolates has been attributed in part to lateral genetic transfer (LGT), but little is known about the broader extent of LGT within this genus. Here we report the first systematic study of the modularity of genetic transfer among 13 Staphylococcus genomes covering four distinct named species. Using a topology-based phylogenetic approach, we found, among 1,354 sets of homologous genes examined, strong evidence of LGT in 368 (27.1%) gene sets, and weaker evidence in another 259 (19.1%). Within-gene and whole-gene transfer contribute almost equally to the topological discordance of these gene sets against a reference phylogeny. Comparing genetic transfer in single-copy and in multicopy gene sets, we observed a higher frequency of LGT in the latter, and a substantial functional bias in cases of whole-gene transfer (little such bias was observed in cases of fragmentary genetic transfer). We found evidence that lateral transfer, particularly of entire genes, impacts not only functions related to antibiotic, drug, and heavy-metal resistance, as well as membrane transport, but also core informational and metabolic functions not associated with mobile elements. Although patterns of sequence similarity support the cohesion of recognized species, LGT within S. aureus appears frequently to disrupt clonal complexes. Our results demonstrate that LGT and gene duplication play important parts in functional innovation in staphylococcal genomes.
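As a rough illustration of what topological discordance against a reference phylogeny means, the sketch below compares the clades of a gene tree with those of a reference tree. It is a simplified stand-in, not the authors' pipeline: trees are written as nested tuples, the taxon labels A to D are hypothetical, and node support (such as Bayesian posterior probabilities) is ignored.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a rooted tree.

    A tree is either a leaf name (str) or a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def is_concordant(gene_tree, reference_tree):
    """True if every clade of the gene tree also occurs in the reference tree
    once the reference clades are restricted to the taxa in the gene tree."""
    gene_clades = clades(gene_tree)
    taxa = max(gene_clades, key=len)  # the root clade holds all gene-tree taxa
    restricted = {clade & taxa for clade in clades(reference_tree)}
    return all(clade in restricted for clade in gene_clades if len(clade) > 1)


# Hypothetical four-taxon example (labels are placeholders, not real isolates).
reference = ((("A", "B"), "C"), "D")
same_history = (("A", "B"), "C")            # agrees with the reference
conflicting = (("A", "B"), ("C", "D"))      # groups C with D, unlike the reference
print(is_concordant(same_history, reference))
print(is_concordant(conflicting, reference))
```

A gene set whose tree fails this kind of check is a candidate for lateral transfer; in practice, only well-supported conflicts would be counted.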

Figures

Fig. 1.
Size distribution of the 1,354 gene sets examined in the present study.
Fig. 2.
Testing the source of synologs in multicopy gene sets. The illustrated example is a set with a single synolog, in which genome A has two gene copies, a1 and a2. The gene copies are dereplicated in all combinations, yielding two sets of alignments in which each genome is represented only once. A Bayesian phylogenetic tree was constructed for each of these alignments, and the tree was compared against the reference species phylogeny. There are four possible outcomes in such a comparison with respect to the reference phylogeny: (a) both trees of the dereplicated alignments are concordant, suggesting that both a1 and a2 are paralogs; (b and c) one of the trees is concordant, while the other is discordant, suggesting that a1 and a2 have different evolutionary trajectories, i.e., a1 is a paralog and a2 is a xenolog, or vice versa; and (d) both trees are discordant, suggesting that both a1 and a2 have a complex history, i.e., these copies are xenologs following gene duplication or paralogs following gene transfer, with the order of the events undetermined. The labels for each inference of synology are shown for multicopy gene sets (MC) and in cases of within-gene transfer (FragGT) and whole-gene transfer (WholeGT). (Modified from Fig. 2 in reference .)
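The dereplication step and the four outcomes described for Fig. 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and genome labels are hypothetical, and the reading that a concordant tree marks the retained copy as the paralog is an interpretation of the caption, not a detail confirmed by it.

```python
from itertools import product


def dereplicate(gene_set):
    """Yield every way of keeping exactly one gene copy per genome.

    gene_set maps a genome label to the list of its gene copies.
    """
    genomes = sorted(gene_set)
    for choice in product(*(gene_set[genome] for genome in genomes)):
        yield dict(zip(genomes, choice))


def classify_pair(a1_concordant, a2_concordant):
    """Map the two tree comparisons to outcomes (a) through (d).

    Each flag says whether the tree built from the alignment that retains
    that copy is concordant with the reference phylogeny (assumed reading).
    """
    if a1_concordant and a2_concordant:
        return "a: a1 and a2 are both paralogs"
    if a1_concordant:
        return "b/c: a1 is a paralog, a2 is a xenolog"
    if a2_concordant:
        return "b/c: a2 is a paralog, a1 is a xenolog"
    return "d: complex history, order of duplication and transfer undetermined"


# Hypothetical gene set: genome A carries two copies, the others one each.
gene_set = {"A": ["a1", "a2"], "B": ["b1"], "C": ["c1"]}
for alignment in dereplicate(gene_set):
    print(alignment)
print(classify_pair(True, False))
```

With one duplicated genome the product yields two alignments, matching the two trees compared in the figure; with more synologs the number of combinations grows multiplicatively.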
Fig. 3.
Representation of functional categories assigned to protein sequences corresponding to gene sets in Staphylococcus species that show evidence of within-gene (fragmentary) genetic transfer (open bars) for single-copy gene sets (SC-FragGT) (a) and multicopy gene sets (MC-FragGT) (b). The solid bars show these same functional categories in the full data set (1,354 sets, 13,297 proteins). Categories are numbered differently for panels a and b as shown in the boxes. Functional categories that are over-represented in panel a but under-represented in panel b, or vice versa, are indicated in boldface. The significance of over- or under-representation is represented by single (P ≤ 0.05) and double (P ≤ 0.01) asterisks.
Fig. 4.
Reference tree used in the present study, and three instances of tree topologies showing a history of whole-gene transfer: the supertree for the 13 staphylococcal genomes (methicillin-sensitive S. aureus isolates are marked with an asterisk), rooted based on a previous phylogenetic study of Staphylococcus species using small subunit rRNA genes (99) that gives S. saprophyticus as the outgroup (a), and tree topologies for gene sets encoding glyoxalase/bleomycin resistance protein/dioxygenases (gene set 445), an instance of MC-WholeGT-P (the synolog is a paralog) (b); glucosamine-6-phosphate isomerase (gene set 1167), an instance of MC-WholeGT-P/X (the synolog is either a paralog or a xenolog, but not both) (c); and phage transcriptional activator (gene set 267), an instance of MC-WholeGT-PX (the synolog is a paralog and/or a xenolog) (d). GenBank GI numbers are shown for all sequences implicated in each of the phylogenies b through d, in which a sequence from Bacillus cereus or B. thuringiensis is used as an outgroup. Labels of different genome isolates follow the description in Table 1. Bayesian posterior probability values of ≥0.50 are shown at the internal nodes.
Fig. 5.
Representation of functional categories assigned to protein sequences corresponding to gene sets in Staphylococcus species that show evidence of whole-gene transfer (□) for single-copy gene sets (SC-WholeGT) (a) and multicopy gene sets (MC-WholeGT) (b). The solid bars (▩) show the same functional categories in the full data set (1,354 sets, 13,297 proteins). Categories are numbered differently for panels a and b, as shown in the boxes. Functional categories that are over-represented in panel a but under-represented in panel b, or vice versa, are indicated in boldface. The significance of over- or under-representation is represented by single (P ≤ 0.05) and double (P ≤ 0.01) asterisks.
Fig. 6.
Tendency of domon disruption by LGT as observed in single-copy gene sets (SC-FragGT) (a) and multicopy gene sets (MC-FragGT) (b) in Staphylococcus genomes. Shown, respectively, for each in panels a and b are the observed ρ values (large ρ values indicate little domon disruption) (i) and the distributions of D (ii) and P (iii) values generated via 10,000 Kolmogorov-Smirnov tests to examine statistical differences between the distributions of observed (randomly subsampled) and expected (uniformly distributed) ρ values.
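The repeated Kolmogorov-Smirnov comparison described for Fig. 6 can be sketched as below. This is an illustration under assumptions, not the authors' procedure: it uses the one-sample form of the test against a uniform reference, assumes the ρ values have been scaled to the interval [0, 1], computes only the D statistic (no P values), and the function names, subsample size, and input values are hypothetical.

```python
import random


def ks_statistic_uniform(values):
    """One-sample Kolmogorov-Smirnov D against the uniform distribution on [0, 1]."""
    xs = sorted(values)
    n = len(xs)
    # The uniform CDF is F(x) = x, so each ordered value x is compared with the
    # empirical CDF just after it (i / n) and just before it ((i - 1) / n).
    return max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))


def subsampled_d_values(values, sample_size, repeats, seed=0):
    """Compute D for `repeats` random subsamples drawn without replacement."""
    rng = random.Random(seed)
    return [
        ks_statistic_uniform(rng.sample(values, sample_size))
        for _ in range(repeats)
    ]


# Hypothetical rho values skewed toward 1 (placeholders, not data from the study).
rho = [0.55, 0.62, 0.70, 0.74, 0.81, 0.85, 0.90, 0.93, 0.97, 0.99]
d_values = subsampled_d_values(rho, sample_size=5, repeats=1000)
print(min(d_values), max(d_values))
```

Large D values across the subsamples would indicate that the observed ρ values depart from a uniform distribution.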
Fig. 7.
Genome affinities of the 13 staphylococcal genomes based on analysis of sequence similarity. The bar for each of the 13 genomes depicts the percentage of proteins (numbers are shown in the bar) showing high similarity to sequences in other clonal complexes (CCs; inter-CC), within the same CC (intra-CC), within the same species (intraspecies), and in other species (interspecies), as well as showing no matches to other staphylococcal isolates or species (see Table S5 in the supplemental material). Labels of different genome isolates follow the description in Table 1. Instances of inter-CC and intra-CC genome affinity apply only to S. aureus. In these cases (the top nine bars representing the S. aureus genomes), intraspecies genome affinities equal the sum of cases for inter-CC and intra-CC.
