Cell. 2019 Aug 22;178(5):1260-1272.e14.
doi: 10.1016/j.cell.2019.07.038.

A Species-Wide Inventory of NLR Genes and Alleles in Arabidopsis thaliana

Free PMC article

Anna-Lena Van de Weyer et al. Cell. 2019.

Abstract

Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes.

Keywords: NLR; RenSeq; SMRT sequencing; disease resistance genes; genomics; innate immunity; integrated domains; plant immunity; sequence capture; targeted enrichment.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Overview of NLR Complements in 64 Accessions (A) Accession provenance. 1001 Genomes relicts, non-relicts, and MAGIC founders. (B) Total number (yellow) as well as number of clustered (rose) and paired (purple) NLRs in each accession. Solid black lines, means; transparent horizontal bands, Bayesian 95% highest density intervals (HDIs); circles, individual data; full densities shown as bean plots. (C) Number of NLRs in different structural classes in accessions. Orange, TNLs; green, NLs; blue, CNLs; purple, RNLs. Related to Figure S1 and Table S1.
Figure S1
NLR Frequency for Different Subclasses, Related to Figure 1 NLRs are grouped by domain content: T (TIR), N (NB), C (CC), R (CCR), and X (all IDs). Domains in parentheses are not present in all members of that group. Domain order is not considered. Mean is shown as a solid black horizontal line and the 95% Highest Density Intervals (HDI; points in the interval have a higher probability than points outside) are shown as transparent bands around the sample mean. Individual data points plotted as open circles and full densities shown as bean plots.
Figure 2
Diversity of IDs and Domain Architectures (A) UpSet intersection of IDs in the Col-0 reference accession, pan-NLRome, and 19 other Brassicaceae. (B) ID distribution, with IDs not reported before from A. thaliana in blue and previously known IDs in green. Asterisks indicate IDs not reported before from other Brassicaceae. (C) Cumulative contribution to the pan-NLRome by different domain architectures, ranked from largest to smallest. (D) UpSet intersection of architectures shared between Col-0 reference accession, pan-NLRome, and 19 other Brassicaceae. Darker colors indicate architectures with IDs. (E) 38 new A. thaliana architectures not found in the Col-0 reference and represented by more than one gene. Asterisks indicate architectures also not found in 19 other Brassicaceae. (F) Newly described (blue) and previously known (green) architectures containing the 27 overlapping Brassicaceae IDs (see A). “a” and “b” indicate IDs as defined in Kroj et al. (2016) and Sarris et al. (2016), respectively. Related to Figure S2 and Table S2.
Figure S2
Schematic Representation of NLR Domain Architecture Diversity and Simplification of Consecutively Repeated Domains, Related to Figure 2 (A) Examples of NLR domain architecture diversity. (B) Reduction of TNL domain combinations by collapsing duplicated/repetitive domains. Analogous strategies were applied to CNL, RNL and NL classes. (C) Full set of NLR architectures not described before for A. thaliana, including architectures found in only one gene. Asterisks indicate 49 architectures not reported from other Brassicaceae, or in the reference accession Col-0.
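The collapsing of consecutively repeated domains described in (B) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes an architecture is written as a hyphen-separated string of domain names (as in "TIR-NB-LRR"), and the function name `collapse_repeats` is hypothetical.

```python
from itertools import groupby


def collapse_repeats(architecture: str, sep: str = "-") -> str:
    """Collapse runs of identical adjacent domains into a single domain.

    Only consecutive repeats are merged; a domain that reappears later,
    separated by other domains, is kept.
    """
    domains = architecture.split(sep)
    # groupby yields one group per run of equal adjacent items
    return sep.join(name for name, _run in groupby(domains))


if __name__ == "__main__":
    print(collapse_repeats("TIR-NB-LRR-LRR-LRR"))  # TIR-NB-LRR
    print(collapse_repeats("TIR-NB-TIR-NB"))       # TIR-NB-TIR-NB
```

Merging only adjacent duplicates keeps domain order intact, so architectures that differ in order remain distinguishable after simplification.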
Figure 3
Orthogroup Sizes, Saturation, and Distribution of Core, Shell, and Cloud NLRs (A) OG size distribution (without singleton OGs). (B) Saturation of pan-NLRome discovery. Blue indicates fractions of pan-NLRome that can be recovered from randomly drawn sets of accessions of different sizes (with 1,000× bootstrapping). Horizontal dashed line indicates 90% of pan-NLRome discovered. Green indicates average sizes of OG that remain undiscovered with accession sets of different sizes. Vertical dashed line indicates that 95% of the pan-NLRome can be recovered with 38 accessions (1,000 bootstraps). (C) OG-type-specific distribution of NLR classes in cloud (brown), shell (green), and the core pan-NLRome (blue). Percentages for each on top. (D) OG-type-specific distribution of paired and unpaired NLRs and NLRs with and without IDs in cloud (brown), shell (green), and core (blue). Percentages on top. (E–H) Comparison of OG size density distributions across different contrasting NLR subsets. The blue and green numbers denote the total number of OGs in the cloud, shell, and core for each of the four contrasting subsets shown. Gray bands indicate the ranges in which the OG size density distributions would not be significantly different from each other, determined with a bootstrap approach.
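The saturation analysis in (B) can be approximated with a simple resampling sketch. The code below is an assumption-laden illustration rather than the authors' method: it takes a hypothetical mapping from accession name to the set of OGs found in that accession, draws random accession subsets without replacement, and averages the fraction of all OGs recovered at each subset size. The names `saturation_curve` and `smallest_set_reaching`, and the toy data, are invented for the example.

```python
import random


def saturation_curve(presence, n_draws=1000, seed=0):
    """Mean fraction of the pan-NLRome recovered per subset size.

    presence: dict mapping accession name -> set of orthogroup IDs.
    Returns a dict mapping subset size k -> mean recovered fraction.
    """
    rng = random.Random(seed)
    accessions = sorted(presence)
    pan = set().union(*presence.values())  # all OGs seen in any accession
    curve = {}
    for k in range(1, len(accessions) + 1):
        total = 0.0
        for _ in range(n_draws):
            subset = rng.sample(accessions, k)  # k distinct accessions
            found = set().union(*(presence[a] for a in subset))
            total += len(found) / len(pan)
        curve[k] = total / n_draws
    return curve


def smallest_set_reaching(curve, threshold):
    """Smallest subset size whose mean recovery meets the threshold."""
    sizes = [k for k, frac in sorted(curve.items()) if frac >= threshold]
    return sizes[0] if sizes else None


if __name__ == "__main__":
    # Toy data only; real input would list the OGs per sequenced accession.
    toy = {
        "acc1": {"OG1", "OG2"},
        "acc2": {"OG1", "OG3"},
        "acc3": {"OG1", "OG2", "OG4"},
    }
    curve = saturation_curve(toy, n_draws=200, seed=1)
    print(curve)
    print(smallest_set_reaching(curve, 0.95))
```

With real data, `smallest_set_reaching(curve, 0.95)` plays the role of the vertical dashed line in the panel, and the 0.90 level corresponds to the horizontal one.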
Figure S3
OG Combinations, Related to Figure 4 (A) Co-occurrence network for NLR (no prefix) and non-NLR (prefix “non_”) OGs on the same contigs in 10 accessions. Similar networks were found for higher or lower thresholds. Blue boxes highlight NLR OGs without a Col-0 allele, orange boxes highlight paired OGs without a Col-0 allele. (B) Co-occurrence of the paired, head-to-head NLRs OG205 (TCP-B3-TIR-NB-LRR-Zf) and OG204 (TIR-NB-LRR), which are not found in Col-0 or in Ler. Gray, non-NLR OGs.
Figure 4
Genomic Location of NLR Genes in the Reference Assembly The five A. thaliana chromosomes are shown as horizontal bars with centromeres in gray, and reference NLRs are shown as black line segments. Text labels are shown only for functionally defined Col-0 NLRs. Anchored OGs found in at least 10 accessions are shown below each chromosome. Orange, paired OGs; blue, other anchored OGs. Related to Figure S3 and Table S3C.
Figure S4
Saturation of Diversity Discovery and PCAs of Population Genetics Statistics, Related to Figure 5 (A and B) Fraction of nucleotide and haplotype diversity that can be recovered from a randomly drawn set of accessions with different set sizes (with 1,000× bootstrapping). Horizontal dashed lines indicate 90% of diversity found. Vertical dashed line indicates number of accessions with which 95% of diversity can be recovered (1,000 bootstraps). (C) Principal component analysis carried out on 10 population genetics statistics, including nucleotide diversity (pi), haplotype diversity, Fu and Li’s D, Fu and Li’s F, Tajima’s D, Rozas’ R2, Strobeck’s S, and number of segregating sites.
Figure 5
Diversity and Selection across the Pan-NLRome RNL OGs are not shown because of the low number of OGs in this class. (A) Nucleotide diversity (average pairwise nucleotide differences) by OG type and NLR class. (B) Haplotype diversity (average pairwise haplotype differences) by OG type and NLR class. Large values indicate a high chance of finding two different haplotypes when two randomly chosen members of a given OG are compared. (C) Nucleotide diversity distribution in different domain types. The NL class included a few OGs where a minority of members had an identifiable CC domain; hence the CC class and the NL class overlapped. (D) Tajima’s D, a measure of genetic selection, by OG type and NLR class. Related to Figure S4.
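The statistics named in this caption can be illustrated with a small, self-contained sketch. It is not the authors' software: it assumes each OG is available as a list of aligned, gap-free sequences of equal length, computes nucleotide diversity as the average pairwise difference per site, treats haplotype diversity as the fraction of sequence pairs that differ, and uses the standard Tajima (1989) formula for D. All function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi) for an equal-length alignment."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def haplotype_diversity(seqs):
    """Chance that two randomly chosen sequences are different haplotypes."""
    pairs = list(combinations(seqs, 2))
    return sum(s != t for s, t in pairs) / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D; None if undefined (fewer than 4 sequences or S == 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of theta, scaled by its SD
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(
        e1 * s + e2 * s * (s - 1)
    )


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT", "ATTT"]  # toy data only
    print(nucleotide_diversity(toy_alignment))
    print(haplotype_diversity(toy_alignment))
    print(tajimas_d(toy_alignment))
```

Real alignments contain gaps and missing data, which this sketch does not handle; sites with gaps would need to be excluded or treated explicitly before applying it.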
Figure 6
Selection Landscape of the Pan-NLRome (A–E) Fraction of different positive selection categories grouped by NLR class (A), OG type (B), ID status (C), paired NLR status (D), or NLR subclass (E). An OG was considered if at least one positively selected site of a given class was detectable. (F–H) Fractions of OGs inferred to be under constant (F), pervasive (G), or episodic (H) selection or without positive selection detected, grouped by annotated protein domains. Related to Figure S5.
Figure S5
Positive Selection Landscape of the Pan-NLRome, Related to Figure 6 (A–E) Number of OGs in different selection classes grouped by NLR class (A), OG type (B), ID status (C), paired NLR status (D), or NLR subclass (E). An OG was considered if at least one positively selected site of a given class was detectable. (F) NLR coverage with different types of positively selected sites. (G–I) Domain coverage with positively selected sites.
Figure 7
Effects of Pathogen Lifestyle on Diversity and NLR Pairing on Selection and Co-evolution (A) Effect of pathogen lifestyle on nucleotide diversity for characterized resistance genes. Gray floating text indicates examples for each category. (B) Correlation of Tajima’s D values in sensor-executor and other pairs. (C) Maximum-likelihood phylogenetic trees of two OGs, 91 and 130, which form a sensor-executor pair (Xu et al., 2015). Bootstrap support (100 iterations) indicated at major nodes. OG130 includes a clade with LIM and DA1-like IDs and a clade without. Scale bar indicates substitutions per site. Genes from the same accession are connected by lines, with solid lines indicating presence on the same assembly contig. Related to Figure S6.
Figure S6
Phylogenetic Tree of NB Domain Alignments of TNLs to Delineate Sensor-Executor Pairs, Related to Figure 7 The paired NLRs RPS4-like (executors, silver) and RRS1-like (sensors, gold) as well as SOC3-like (executors, light purple) and CHS1-like (sensors, brick) defined distinct subclades of TNLs. The NJ phylogeny was built from manually refined MUSCLE alignments of NB domains (∼240 amino acids) of Col-0 proteins plus selected additional representatives of OGs inferred to be paired, but absent from the Col-0 reference. NB domains from human APAF1 (green) and the A. thaliana CNL AT1G58602 (blue) were included as outgroups. The WAG maximum likelihood method allowing for 3 discrete Gamma categories was used. AT4G36140 contains two distinct NB domains, both of which were included; the second NB domain of AT4G36140 groups with other RRS1-like NB domains. Support from 100 bootstraps shown at major nodes. Scale bar indicates substitutions per site.

References

    1. 1001 Genomes Consortium. 1,135 Genomes Reveal the Global Pattern of Polymorphism in Arabidopsis thaliana. Cell. 2016;166:481–491.
    2. Aberer A.J., Kobert K., Stamatakis A. ExaBayes: massively parallel bayesian tree inference for the whole-genome era. Mol. Biol. Evol. 2014;31:2553–2556.
    3. Allen R.L., Bittner-Eddy P.D., Grenville-Briggs L.J., Meitz J.C., Rehmany A.P., Rose L.E., Beynon J.L. Host-parasite coevolutionary conflict between Arabidopsis and downy mildew. Science. 2004;306:1957–1960.
    4. Asai S., Furzer O.J., Cevik V., Kim D.S., Ishaque N., Goritschnig S., Staskawicz B.J., Shirasu K., Jones J.D.G. Publisher Correction: A downy mildew effector evades recognition by polymorphism of expression and subcellular localization. Nat. Commun. 2019;10:174.
    5. Baggs E., Dagdas G., Krasileva K.V. NLR diversity, helpers and integrated domains: making sense of the NLR IDentity. Curr. Opin. Plant Biol. 2017;38:59–67.