Nature. 2018 Nov;563(7729):121-125.
doi: 10.1038/s41586-018-0619-8. Epub 2018 Oct 17.

Genome Organization and DNA Accessibility Control Antigenic Variation in Trypanosomes

Free PMC article

Laura S M Müller et al. Nature. 2018.


Many evolutionarily distant pathogenic organisms have evolved similar survival strategies to evade the immune responses of their hosts. These include antigenic variation, through which an infecting organism prevents clearance by periodically altering the identity of proteins that are visible to the immune system of the host [1]. Antigenic variation requires large reservoirs of immunologically diverse antigen genes, which are often generated through homologous recombination, as well as mechanisms to ensure the expression of one or very few antigens at any given time. Both homologous recombination and gene expression are affected by three-dimensional genome architecture and local DNA accessibility [2,3]. Factors that link three-dimensional genome architecture, local chromatin conformation and antigenic variation have, to our knowledge, not yet been identified in any organism. One of the major obstacles to studying the role of genome architecture in antigenic variation has been the highly repetitive nature and heterozygosity of antigen-gene arrays, which has precluded complete genome assembly in many pathogens. Here we report the de novo haplotype-specific assembly and scaffolding of the long antigen-gene arrays of the model protozoan parasite Trypanosoma brucei, using long-read sequencing technology and conserved features of chromosome folding [4]. Genome-wide chromosome conformation capture (Hi-C) reveals a distinct partitioning of the genome, with antigen-encoding subtelomeric regions that are folded into distinct, highly compact compartments. In addition, we performed a range of analyses (Hi-C, fluorescence in situ hybridization, assays for transposase-accessible chromatin using sequencing and single-cell RNA sequencing) that showed that deletion of the histone variants H3.V and H4.V increases antigen-gene clustering, DNA accessibility across sites of antigen expression and switching of the expressed antigen isoform, via homologous recombination. Our analyses identify histone variants as a molecular link between global genome architecture, local chromatin conformation and antigenic variation.

Conflict of interest statement

The authors declare no competing interests.


Fig. 1. Long-read and Hi-C-based de novo assembly of the T. brucei Lister 427 genome.
Only one of the two homologous chromosomes (chr.) is depicted for the homozygous chromosomal core regions (22.71 Mb). Both chromosomes are shown for the heterozygous subtelomeric regions (19.54 Mb). Relative transcript levels (window size, 5,001 bp; step size, 101 bp) are shown as a black line above each chromosome. BESs and MESs were assigned to the respective subtelomeric region if an unambiguous assignment based on DNA interaction data was possible (see Supplementary Information). Centromeres were assigned based on KKT2 ChIP–seq data.
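The window and step sizes quoted in this caption describe a sliding-window summary of a per-base signal. The following is a minimal sketch of that idea, assuming the per-window value is a plain mean; the caption does not say which statistic was used, and the function name `sliding_window_mean` is hypothetical.

```python
from itertools import accumulate


def sliding_window_mean(values, window=5001, step=101):
    """Mean of `values` over windows of `window` positions, advancing by `step`.

    Assumption: the per-window summary is an arithmetic mean; the source only
    gives the window size (5,001 bp) and step size (101 bp).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Prefix sums let each window be summed in constant time.
    prefix = [0] + list(accumulate(values))
    means = []
    for start in range(0, len(values) - window + 1, step):
        total = prefix[start + window] - prefix[start]
        means.append(total / window)
    return means


# Toy example with a small window so the result is easy to check by hand.
print(sliding_window_mean([1, 2, 3, 4, 5], window=3, step=2))
```

Because the step (101 bp) is much smaller than the window (5,001 bp), neighbouring windows overlap heavily, which smooths the plotted line.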
Fig. 2. Hi-C and ChIP–seq reveal partitioning of the T. brucei genome into distinct domains.
a, Hi-C heat maps of chromosomes 3 and 6 at 20-kb resolution. Horizontal blue, black and red lines mark heterozygous subtelomeric, homozygous core regions and BESs, respectively. Centromeres are marked by asterisks. b, Scatter plot showing inter-chromosomal interaction frequencies among centromeres (cen) (n = 206 bins; P = 0.0029), VSG genes in silent expression sites (VSGs) (n = 54 bins; P = 1.63 × 10^−6) and rRNA genes (n = 40 bins; P = 0.0177) compared to a matching background sample, which was randomly selected from the interaction matrix (50-kb bin size). The background sample (grey) matches the genomic feature (red) in size and number. Selected bins with zero values were removed from both the query and background sample. P values are based on Welch’s t-test (two-sided). Black lines represent the mean. c, ChIP–seq data showing the enrichment (compared to input material) of the cohesin subunit SCC1 (n = 3 biologically independent experiments) across representative tRNA and rRNA genes (window size, 501 bp; step size, 101 bp). Black, red and blue boxes represent protein coding, tRNA and rRNA genes, respectively. Tick marks on the x axis represent 5-kb intervals. 3′B refers to one of the two alternative subtelomeric ends (A or B) at the 3′ end of chromosome 11. d, ChIP–seq data showing cohesin (n = 3 biologically independent experiments), H3.V and H4.V (each in n = 2 biologically independent experiments) enrichment across three transcriptionally repressed BESs (window size, 2,001 bp; step size, 501 bp). Red flags mark BES promoters and black boxes indicate the locations of VSG genes.
Fig. 3. Deletion of histone variants H3.V and H4.V leads to a switch in expression of VSG isoforms.
a, scRNA-seq analysis of wild-type (n = 40) and ΔH3.VΔH4.V (first time point, n = 44) cells. Each row represents data from one cell. The number of sequencing reads was normalized to account for differences in library size, gene length and uniqueness of VSG gene sequence. Only uniquely mapping reads were considered. The total number of VSG transcripts per cell is set to 100% (for details, see Methods, Extended Data Fig. 7). The colour code indicates the contribution of individual VSG transcripts to the pool of VSG transcripts in a single cell. The dominant VSG isoform is depicted with an orange border. For selected cells, the read coverage is shown across VSG-2, VSG-8 and VSG-11 (with 500 bp of surrounding sequence). CDS, coding sequence. b, Outline of the VSG switching mechanisms described for T. brucei. Green and red flags mark the active and repressed promoters, respectively. Green lines and grey bars indicate regions of expected transcription for the two different scenarios. c, Sequencing coverage across BES1 (left, top), BES15 (left, bottom) and a hybrid BES consisting of the 5′-BES1 and 3′-BES15 (right). Coverage is based on SMRT sequencing reads >10 kb from ΔH3.VΔH4.V gDNA that map to BES1 and BES15. The cross represents the site of recombination. Boxes represent expression-site-associated genes and ψ denotes a pseudogene. d, scRNA-seq-based analysis of ΔH3.VΔH4.V cells that exclusively express VSG-2 (n = 42) or VSG-11 (n = 82). The average transcript levels (counts per billion, cpb) based on uniquely mapping reads across BES1 and BES15 are shown. Grey bars represent degree of uniqueness. e, 4C-like inter-chromosomal interaction profiles (based on Hi-C data, 20-kb bin size) showing the average interaction frequencies of BES1 (top) and BES15 (middle) with chromosomes 3, 6 and 8 in wild-type cells and the fold change (log2) in interactions of BES15 (bottom) with chromosomes 3, 6 and 8 after deletion of H3.V and H4.V.
Fig. 4. Histone variants H3.V and H4.V influence global and local chromatin structures.
a, Hi-C heat map showing the fold change (log2) in DNA–DNA interaction frequency between wild-type and ΔH3.V (top two panels) and ΔH4.V cells (bottom two panels). b, Fold change in DNA–DNA interaction frequency among VSG genes located in BESs compared to background DNA–DNA interactions in wild-type, ΔH4.V, ΔH3.V and ΔH3.VΔH4.V cells. The ratios for each cell line were calculated for 100 randomly selected background regions. Mean ± s.d. Significance was determined using Welch’s t-test (two-sided). c, FISH with probes against telomeric repeats (Alexa Fluor 488, green), n = 2 biologically independent experiments. Scale bar, 5 μm. d, Quantification of telomere signal to determine the fraction of cells containing large telomeric clusters (white arrows in c) was performed using Imaris 8 and is based on the analysis of 1,128 cells. Means ± s.d. of two replicates are shown (wild type: n = 116, 221; ΔH3.V: n = 140, 102; ΔH4.V: n = 146, 107 and ΔH3.VΔH4.V: n = 190, 106). e, ATAC-seq data (n = 2 biologically independent experiments) across BES15 (repressed in wild-type cells). gDNA read coverage (bottom) is shown to illustrate mappability of reads (window size, 501 bp; step size, 101 bp). Red flag and black box indicate the position of the promoter and the VSG gene, respectively. Tick marks on x axes represent 20-kb intervals. f, Model illustrating the influence of H3.V and H4.V on genome architecture and local DNA accessibility. H3.V and H4.V single knockouts alone mediate only partial opening of BESs (half open arrow) and H3.V knockout leads to a spatial rearrangement of BESs inside the nucleus, whereas deletion of both histone variants is required to obtain the fully opened BESs (open arrow) and spatial proximity of BESs that facilitate recombination (red cross) and lead to the expression of a new VSG isoform.
Extended Data Fig. 1. Assembly of the T. brucei Lister 427 genome.
a, Outline of the genome-assembly strategy: gDNA of T. brucei Lister 427 was sequenced using SMRT sequencing technology and P6-C4 sequence chemistry. The 10% longest reads were error-corrected using the remaining SMRT reads and assembled into contigs using the HGAPv3 algorithm. Information on spatial contacts between contigs, obtained from Hi-C analyses, was used to position and orient the contigs into scaffolds. b, To scaffold and orient the contigs, Hi-C reads were mapped to 1,232 contigs to generate a heat map of DNA–DNA interactions (left). Scaffolding was performed by placing contigs such that the interaction signal located away from the diagonal could not be further reduced (right). Heterozygous subtelomeric regions displayed strong interactions with the chromosomal core region but not with other subtelomeric regions, which indicates that they belong to independent homologous chromosomes. Note that for the left arm of chromosome 7, the heterozygous subtelomeric regions of the two homologous chromosomes could not be assembled separately. c, Statistics of Hi-C data analysis based on reads mapped to a joined genome version (haploid A-forks joined to the core). This implies an underestimation of cis, and overestimation of trans interactions (marked with asterisks), as the B-forks remain un-joined.
Extended Data Fig. 2. Synteny between homologous chromosomes and between different isolates.
a, Pairwise comparison of corresponding homologous chromosomes using the Artemis Comparison Tool (ACT) of the Wellcome Trust Sanger Institute. Pairs of regions that share a high degree of similarity (BLAST score ≥ 5,000) are connected by boxes in red, or in blue if they are inverted. Chromosome 7 is not shown because the subtelomeric regions of the two homologous chromosomes are very similar and could not be resolved during the assembly. Chromosome 2 is not shown as only one of the two homologous chromosomes contains an extended subtelomeric region. b, Pairwise comparison of the eleven megabase-chromosomes of the TREU 927 isolate (middle black bar) and the corresponding two homologous chromosomes of the Lister 427 isolate (top and bottom black bars) using ACT. Regions that reached a BLAST score of at least 5,000 are drawn in red, or in blue if they are inverted.
Extended Data Fig. 3. Compartmentalization of megabase chromosomes in wild-type cells.
a, Hi-C heat maps of individual chromosomes at 20-kb resolution. Horizontal lines mark subtelomeric regions (blue), core regions (black) and bloodstream-form expression sites (red). A blue vertical line and an asterisk indicate the locations of centromeres. b, Hi-C heat map of the haploid genome with one set of subtelomeric regions joined to the core regions (20-kb resolution). c, Decay of frequency of intra-chromosomal contacts as a function of genomic distance (20-kb bin size) within subtelomeric (blue) and core (black) regions. The median across the core (n = 11) and subtelomeres (n = 32) is shown.
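The distance-decay curve in panel c can be illustrated as follows. This is only a sketch under stated assumptions: each region is represented by a square, symmetric contact matrix (here a list of lists at a fixed bin size), the per-region value at each separation is a mean along the corresponding diagonal, and the median is then taken across regions. The function names are hypothetical and the authors' actual pipeline is not described in the caption.

```python
import statistics


def contact_decay(matrix):
    """Mean contact frequency at each bin separation d of a square matrix.

    Separation d corresponds to the d-th diagonal, i.e. matrix[i][i + d].
    """
    n = len(matrix)
    return [
        statistics.mean(matrix[i][i + d] for i in range(n - d))
        for d in range(n)
    ]


def median_decay(curves):
    """Median across regions at each separation, up to the shortest curve."""
    shortest = min(len(curve) for curve in curves)
    return [
        statistics.median(curve[d] for curve in curves)
        for d in range(shortest)
    ]


# Toy 3-bin region: contacts fall off with distance from the diagonal.
region = [
    [4, 2, 1],
    [2, 4, 2],
    [1, 2, 4],
]
curve = contact_decay(region)
print(curve)
print(median_decay([curve, [6, 4]]))
```

With 20-kb bins, index d in the returned list corresponds to a genomic distance of d × 20 kb.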
Extended Data Fig. 4. Hi-C and ChIP–seq reveal partitioning of the T. brucei genome into distinct domains.
a, Outline of the genome organization. Boundaries of transcription units are marked by nucleosomes that contain different types of histone variants. Black arrows indicate the direction of transcription. b, Scatter plot showing inter-chromosomal interactions among centromeres (n = 206 query bins, n = 292 background bins, P = 0.0029), VSG genes (n = 54 query bins, n = 130 background bins, P = 1.63 × 10^−6), rRNA genes (n = 40 query bins, n = 64 background bins, P = 0.0177), tRNA genes (n = 614 query bins, n = 620 background bins, P = 2.45 × 10^−190) and unidirectional transcription start sites (n = 3,142 query bins, n = 3,682 background bins, P = 6.49 × 10^−91) compared to a background sample, which was randomly selected from the interaction matrix (50-kb bin size). The background sample matches the genomic feature in size and number. Selected bins with zero values were removed from both the query and background sample. P values are based on Welch’s t-test (two-sided). Black lines represent the mean. c, ChIP–seq data showing cohesin, H3.V and H4.V enrichment (compared to input material) averaged across all convergent transcription termination sites (cTTS, n = 51) (window size, 101 bp; step size, 11 bp).
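The comparison in panel b (query bins versus a background sample, zero-valued bins removed, Welch's two-sided t-test) can be sketched as below. The sketch computes only the Welch t statistic and the Welch–Satterthwaite degrees of freedom using the standard library; turning these into a P value requires a Student's t distribution, which is left out here. The function name `welch_t` and the toy values are illustrative assumptions, not the authors' code or data.

```python
import math
import statistics


def welch_t(query, background):
    """Return (t, df) for Welch's unequal-variance t-test.

    Zero-valued bins are dropped from both samples first, mirroring the
    filtering step described in the caption.
    """
    q = [x for x in query if x != 0]
    b = [x for x in background if x != 0]
    if len(q) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least two non-zero values")
    # Squared standard error of each sample mean (sample variance / n).
    se2_q = statistics.variance(q) / len(q)
    se2_b = statistics.variance(b) / len(b)
    se2 = se2_q + se2_b
    if se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (statistics.mean(q) - statistics.mean(b)) / math.sqrt(se2)
    # Welch–Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (se2_q ** 2 / (len(q) - 1) + se2_b ** 2 / (len(b) - 1))
    return t, df


# Toy interaction frequencies; the zeros are removed before testing.
t, df = welch_t([2, 4, 6, 0], [1, 2, 3, 0])
print(round(t, 4), round(df, 4))
```

Welch's test does not assume equal variances or equal sample sizes, which suits this setting because the query and background samples end up with different numbers of bins after zero-valued bins are removed.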
Extended Data Fig. 5. Characterization of ΔH3.VΔH4.V cells.
a, RNA-seq fragment counts on H3.V and H4.V CDS in wild-type and ΔH3.VΔH4.V cells, normalized by million fragments mapped to protein-coding genes. Note that the first and last codon of the H3.V open reading frame were not deleted. As a result, a small number of H3.V reads are detected even in the ΔH3.V cells. b, Cell-cycle analysis based on flow cytometry, of wild-type and ΔH3.VΔH4.V cells. One of three replicates is shown. c, Growth curve (mean ± s.d.) of wild-type and ΔH3.VΔH4.V cells (n = 3 biologically independent replicates). d, RNA-seq of ΔH3.VΔH4.V cells (first and second time points). The mean ± s.d. fold change in expression compared to wild type (n = 3 biologically independent experiments) is shown, for the significantly regulated genes (based on a Benjamini–Hochberg adjusted P value from a two-sided Wald test with false discovery rate < 0.1) from different gene groups. ESAGs, expression-site-associated genes.
Extended Data Fig. 6. Analysis of ΔH3.VΔH4.V cells.
a, Order of cell analyses. b, scRNA-seq analysis of ΔH3.VΔH4.V cells at the second time point (n = 338). Each row represents data from one cell. For details, see Fig. 3a, Extended Data Fig. 7.
Extended Data Fig. 7. scRNA-seq quality control, VSG gene normalization and quantification.
a, Representative Bioanalyzer profiles (Agilent) of cDNA from 0 cells (n = 6) and 1 cell (n = 18, supplemented with ERCC spike-in control). b, Histogram representing the total number of genes expressed per single cell (wild type and ΔH3.VΔH4.V; n = 452). Cells with fewer than 500 genes (grey bars) were excluded from the analysis. c, Diagram representing quantification of expression of VSG genes, and the normalization procedure. The reads obtained in each single-cell library were mapped to the genome, keeping only the uniquely mapping reads (mapq > 0). Next, the number of reads mapping to each VSG gene was quantified. To account for differences in length and ‘uniqueness’ among the different VSG genes, the same procedure was performed with an in silico set of reads. The read counts to each VSG gene in each scRNA-seq assay were normalized for ‘uniqueness’ and gene length by dividing them by the counts obtained with the in silico dataset. Finally, for each cell the normalized read counts for each VSG gene were expressed as a percentage of the total number of normalized counts to VSG genes.
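The normalization described in panel c can be sketched in a few lines. This is a minimal illustration, assuming read counts per VSG gene are already available as dictionaries; the function name `vsg_percentages`, the handling of genes with no in silico reads, and the toy counts are assumptions made for the example, not details given in the caption.

```python
def vsg_percentages(cell_counts, in_silico_counts):
    """Express each VSG gene as a percentage of a cell's normalized VSG reads.

    cell_counts: uniquely mapping reads per VSG gene in one single-cell library.
    in_silico_counts: uniquely mapping reads per VSG gene from the in silico
        read set, which captures gene length and sequence 'uniqueness'.
    """
    normalized = {}
    for gene, count in cell_counts.items():
        reference = in_silico_counts.get(gene, 0)
        # Assumption: genes with no uniquely mapping in silico reads cannot
        # be quantified and are skipped.
        if reference > 0:
            normalized[gene] = count / reference
    total = sum(normalized.values())
    if total == 0:
        return {gene: 0.0 for gene in normalized}
    return {gene: 100.0 * value / total for gene, value in normalized.items()}


# Toy counts: VSG-2 attracts three times as many in silico reads as VSG-11,
# so its raw count is scaled down accordingly before taking percentages.
cell = {"VSG-2": 90, "VSG-11": 10}
in_silico = {"VSG-2": 300, "VSG-11": 100}
print(vsg_percentages(cell, in_silico))
```

Dividing by the in silico counts corrects for both gene length and mappability in one step, since a longer or more unique gene yields proportionally more uniquely mapping simulated reads.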
Extended Data Fig. 8. Mutually exclusive expression of VSG genes is not lost in ΔH3.V and ΔH4.V single-knockout cells.
a, Immunofluorescence imaging in wild-type, ΔH3.V, ΔH4.V and ΔH3.VΔH4.V cells (n = 1). Representative images of 26–28 stacks (0.1976-μm voxel size, maximum projection) are shown. Scale bar, 10 μm. b, Gating strategy used for all analyses. c, Flow cytometry analysis of VSG-2 expression in ΔH3.V, ΔH4.V and ΔH3.VΔH4.V cells. Wild-type cells were used as a VSG-2 positive control, and cells expressing VSG-13 were used as a negative control. ΔH3.V, n = 3; ΔH4.V, n = 3; ΔH3.VΔH4.V, n = 7 (measured at different time points). For each assay, 50,000 events were gated. d, Heterogeneity in expression of VSG genes, based on RNA-seq. The contributions of the dominant VSG-2 and two additional VSG genes found to be upregulated in ΔH3.VΔH4.V cells relative to the total VSG mRNAs are depicted. For each condition, mean mRNA levels and s.d. are derived from n = 3 biologically independent RNA-seq experiments.
Extended Data Fig. 9. DNA accessibility across BESs in ΔH3.VΔH4.V cells.
Uniquely mapping ATAC-seq reads across all BESs are shown. The 0-nt position corresponds to the promoter. Uniquely mapping gDNA-seq reads are shown to illustrate differences in mappability. For ATAC-seq, n = 2 biologically independent experiments were performed, using sample material from 10 million cells in one experiment and from 20 million cells in the other.



References

    1. Deitsch KW, Lukehart SA, Stringer JR. Common strategies for antigenic variation by bacterial, fungal and protozoan pathogens. Nat. Rev. Microbiol. 2009;7:493–503. doi: 10.1038/nrmicro2145.
    2. Hager GL, McNally JG, Misteli T. Transcription dynamics. Mol. Cell. 2009;35:741–753. doi: 10.1016/j.molcel.2009.09.005.
    3. Misteli T, Soutoglou E. The emerging role of nuclear architecture in DNA repair and genome maintenance. Nat. Rev. Mol. Cell Biol. 2009;10:243–254. doi: 10.1038/nrm2651.
    4. Lajoie BR, Dekker J, Kaplan N. The hitchhiker’s guide to Hi-C analysis: practical guidelines. Methods. 2015;72:65–75. doi: 10.1016/j.ymeth.2014.10.031.
    5. Otto TD, et al. Long read assemblies of geographically dispersed Plasmodium falciparum isolates reveal highly structured subtelomeres. Wellcome Open Res. 2018;3:52. doi: 10.12688/wellcomeopenres.14571.1.
