PLoS One, 8 (7), e68464

An Improved Method for TAL Effectors DNA-binding Sites Prediction Reveals Functional Convergence in TAL Repertoires of Xanthomonas Oryzae Strains



Alvaro L Pérez-Quintero et al. PLoS One.


Transcription Activator-Like Effectors (TALEs) belong to a family of virulence proteins from the Xanthomonas genus of bacterial plant pathogens that are translocated into the plant cell. In the nucleus, TALEs act as transcription factors that induce the expression of susceptibility genes. A code for TALE-DNA binding specificity and high-resolution three-dimensional structures of TALE-DNA complexes were recently reported. Accurate prediction of TAL Effector Binding Elements (EBEs) is essential to elucidate the biological functions of the many sequenced TALEs as well as for robust design of artificial TALE DNA-binding domains in biotechnological applications. In this work, a program with improved EBE prediction performance was developed using an updated specificity matrix and a position weight correction function that accounts for the matching pattern observed in a validation set of TALE-DNA interactions. To gain a systems perspective on the large TALE repertoires from X. oryzae strains, this program was used to predict rice gene targets for 99 sequenced family members. Integrating predictions and available expression data in a TALE-gene network revealed multiple candidate transcriptional targets for many TALEs as well as several possible instances of functional convergence among TALEs.
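The combination of a specificity matrix with a position weight correction can be illustrated with a minimal sketch. This is not the published Talvez scoring function: the matrix values, the step-shaped weight function, and the names SPECIFICITY, position_weight and score_site are all hypothetical and serve only to show how a per-position weight can make mismatches late in the repeat array count less than early ones.

```python
import math

# Hypothetical RVD -> nucleotide preference values, for illustration only.
# They are NOT the updated specificity matrix used by the authors.
SPECIFICITY = {
    "NI": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    "HD": {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    "NG": {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    "NN": {"A": 0.30, "C": 0.05, "G": 0.60, "T": 0.05},
}


def position_weight(index, cutoff=15, tail_weight=0.5):
    """Hypothetical step function: full weight before `cutoff`, reduced after."""
    return 1.0 if index < cutoff else tail_weight


def score_site(rvds, site, cutoff=15, tail_weight=0.5):
    """Sum of position-weighted log preferences; values closer to 0 are better."""
    if len(rvds) != len(site):
        raise ValueError("RVD array and candidate site must have equal length")
    total = 0.0
    for index, (rvd, nucleotide) in enumerate(zip(rvds, site)):
        weight = position_weight(index, cutoff, tail_weight)
        total += weight * math.log(SPECIFICITY[rvd][nucleotide])
    return total
```

With these assumed values, a site that matches every RVD scores higher than one with a mismatch, and lowering `cutoff` so that the mismatch falls in the down-weighted tail reduces its penalty.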

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.


Figure 1. Performance of the EBE prediction software on the TALE-EBE validation set.
(A) Boxplot showing the median (thick line), the lower and upper quartiles (box) and the minimum and maximum (whiskers) of the prediction scores for the sets of positive (+) and negative (−) control TALE-DNA interactions using three programs for EBE prediction. Scores were scaled by the maximum score in the set to facilitate comparison. Talent scores were scaled x−1 because they follow an inverse scale relative to the other programs; this transformation maintains data structure. ** indicates significant differences between the positive and negative sets (one-tailed t-test, p-value<0.001). (B) ROC graph showing the true positive and false positive rates of the three EBE predictors based on validation set screenings. The dashed line indicates the theoretical performance of a random classifier, for which true positive rate = false positive rate. The inset in the upper right corner shows the rates for Talvez and Storyteller at a larger scale to highlight the differences between the two programs.
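The two operations described in this caption, scaling scores by the set maximum and deriving true and false positive rates, can be sketched as follows. The function names scale_by_max and roc_points are hypothetical, the code assumes that higher scores indicate better predictions and that the maximum score is positive, and it is not taken from any of the three programs compared in the figure.

```python
def scale_by_max(scores):
    """Divide every score by the maximum of the set (assumes a positive maximum)."""
    top = max(scores)
    return [s / top for s in scores]


def roc_points(positive_scores, negative_scores):
    """Return (false positive rate, true positive rate) pairs.

    Each distinct score is used once as a threshold, from highest to lowest;
    a pair is called positive when its score is >= the threshold.
    """
    thresholds = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    points = [(0.0, 0.0)]
    for threshold in thresholds:
        tpr = sum(s >= threshold for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= threshold for s in negative_scores) / len(negative_scores)
        points.append((fpr, tpr))
    return points
```

A random classifier would produce points along the diagonal where both rates are equal, which is the dashed line in panel (B).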
Figure 2. Distribution of perfect matches (PM) in the TALE-EBE validation set.
(A) Box plot of the distribution of the number of perfect RVD-nucleotide matches computed for individual negative and positive control TALE-EBE pairs. (B) Distribution of perfect match frequency of individual control TALE-EBE pairs. The frequency corresponds to the ratio of the number of perfect RVD-nucleotide matches to TALE length, expressed in number of RVDs. (C) Frequency of perfect matches across TALE-DNA positions. The frequency corresponds to the ratio of the number of perfect RVD-nucleotide matches at the considered position to the total number of RVD-nucleotide pairs at this position in TALE-EBE pairs of the positive or negative control set. (D) Frequency of perfect RVD-nucleotide matches between positions 1 and 15 ( = number of PM/15). (E) Frequency of perfect matches for TALE-DNA positions beyond 15 ( = number of PM/(length-15)). The p-value of the corresponding two-tailed Wilcoxon test in this comparison is 0.371. ** significant differences, one-tailed Wilcoxon test p-value<0.001; *** significant differences, one-tailed Wilcoxon test p-value<1e-7.
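The ratios defined in panels (B), (D) and (E) can be made concrete with a short sketch. The PREFERRED table below is an assumption: it lists only four common RVDs with one preferred nucleotide each, which may differ from the definition of a perfect match used in the paper. The names perfect_matches and pm_frequencies are hypothetical, and the candidate EBE is assumed to be aligned one nucleotide per RVD.

```python
# Assumed one-to-one pairing of common RVDs with a preferred nucleotide.
PREFERRED = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def perfect_matches(rvds, ebe):
    """One boolean per position: does the RVD face its preferred nucleotide?"""
    if len(rvds) != len(ebe):
        raise ValueError("RVD array and EBE must have equal length")
    return [PREFERRED.get(rvd) == nucleotide for rvd, nucleotide in zip(rvds, ebe)]


def pm_frequencies(rvds, ebe, split=15):
    """Perfect match frequencies following the ratios given in the caption."""
    matches = perfect_matches(rvds, ebe)
    length = len(matches)
    beyond = None
    if length > split:
        # Panel (E): number of PM beyond `split` / (length - split)
        beyond = sum(matches[split:]) / (length - split)
    return {
        "overall": sum(matches) / length,      # panel (B): PM / TALE length
        "first": sum(matches[:split]) / split,  # panel (D): PM / 15
        "beyond": beyond,
    }
```

For a TALE with no more than `split` repeats, the "beyond" entry is left as None because the ratio in panel (E) is undefined in that case.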
Figure 3. Effects of the Talvez position correction parameter on performance.
(A) ROC graph showing true positive and false positive rates obtained by screening the validation set with a range (7–25) of position correction values. The data points for position values above 14, as well as for Talvez without position correction (labeled “None”), are all superimposed on the upper-leftmost point. (B) Rankings of positive control TALE targets among genes predicted to contain EBEs in their promoter regions after screening the Arabidopsis and rice genomes with Talvez, with the position correction parameter value varying between 15 and 19 as well as without position correction. The color coding of the various TALE-target pairs is described in the legend beneath the plot. Rankings for positions above 19 were similar to those without position correction and were omitted here. The dashed line corresponds to rank values equal to 200.
Figure 4. A TALE-candidate target gene network.
Hierarchical representation of the TALE-candidate target gene network; genes are represented by circles and TALEs by polygons. Only the names of genes and targets discussed in the main text are shown. Aliases or common names for TALEs are shown when available; otherwise the GenBank accession number is given. Increasing edge thickness indicates better EBE prediction ranking. Interactions previously reported in the literature are highlighted with red edges.
Figure 5. Comparison of the TALE-candidate target gene network with random networks obtained with shuffled TALEs.
Properties of the TALE-gene network are compared to average values from 100 randomized controls (error bars indicate standard deviation). (A) Percent frequency distribution of Talvez prediction ranks of TALE-gene pairs; the percentage of top-ranking (#1) TALE-gene pairs is indicated for the TALE-gene network. (B) Number of genes and TALEs in the TALE-gene network compared to control random networks.
Figure 6. TALEs from multiple Xoo strains may converge onto three distinct MtN3 gene family members.
Panels (A), (B) and (C) summarize Talvez predictions and expression data for Os11N3, Os12N3 (Xa25) and Os8N3 (Xa13), respectively. From top to bottom: data in bar plots derive from our analysis of microarray data from different rice genotypes at time points 24 hours after infection (hpi). Relevant treatment comparisons are indicated above the graphs. logFC values correspond to log2-transformed fold-change ratios. In the Talvez prediction network snapshots, the rank and score values along the edges represent the Talvez prediction output for the connected gene (EBE) in target searches for the corresponding TALE. The bottom part of each panel contains a manual alignment of the RVD sequences from TALEs that are predicted to target the gene under consideration in the panel. Individual residues highlighted in bold deviate from the consensus at that position. The locations of the predicted EBEs on the upstream sequences of the rice gene are marked by lines colored following the same pattern as in the RVD alignment. Numbers on the left indicate the distance in base pairs between the most upstream nucleotide of the reported sequence and the ATG. TalC from the African Xoo strain BAI3, which has been reported to target Os11N3, was included in panel A to illustrate the notion of convergence on susceptibility gene targets at the level of distinct EBEs.
Figure 7. Possible functional convergence on specific rice TALE targets between X. oryzae pathovars.
Panels (A) and (B) summarize Talvez predictions and expression data for OsHen1 and F3H (LOC_Os03g03034), respectively. See legend of Figure 6 for details.
Figure 8. Overlap of candidate rice target gene sets for TALEs from various X. oryzae strains.
Venn diagram of rice genes from the network assigned to distinct sets according to the strain of origin of their cognate TALE(s).





Grant support

This work was supported by the French Agence Nationale de la Recherche [ANR-2010-GENM-013] and Programme ECOS Nord. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
