Nat Commun. 2020 Feb 4;11(1):690.
doi: 10.1038/s41467-020-14495-7.

Bridging non-overlapping reads illuminates high-order epistasis between distal protein sites in a GPCR

Justin I Yoo et al. Nat Commun. 2020.

Abstract

Epistasis emerges when the effects of an amino acid depend on the identities of interacting residues. This phenomenon shapes fitness landscapes, which have the power to reveal evolutionary paths and inform evolution of desired functions. However, there is a need for easily implemented, high-throughput methods to capture epistasis, particularly at distal sites. Here, we combine deep mutational scanning (DMS) with a straightforward data processing step to bridge reads in distal sites within genes (BRIDGE). We use BRIDGE, which matches non-overlapping reads to their cognate templates, to uncover prevalent epistasis within the binding pocket of a human G protein-coupled receptor (GPCR), yielding variants with 4-fold greater affinity to a target ligand. The greatest functional improvements in our screen result from distal substitutions and from substitutions that are deleterious alone. Our results corroborate findings of mutational tolerance in GPCRs, even in conserved motifs, but reveal inherent constraints restricting tolerated substitutions due to epistasis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Diagram of Illumina sequencing mechanism underlying BRIDGE.
(a) New methods are needed to engineer key residues in active sites (e.g., ligand-binding domains) while accounting for epistatic interactions. (b) BRIDGE can be used to extensively quantify the enrichment of variants containing mutations at distal sites. To match nonoverlapping paired-end reads, we leverage the Illumina sequencing mechanism, wherein reads generated from either end of the same DNA strand produce fluorescent signals at the same (x, y) coordinate on the sequencing chip surface. By matching nonoverlapping read pairs using their surface coordinates, BRIDGE directly accounts for epistasis between distal residues in experiments reliant on next-generation sequencing (NGS). (c) High sequencing accuracy is obtained by designing NGS experiments such that variable regions are focused towards the 5′-end of each read, where error rates are low.
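The coordinate-matching idea in panel b can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes FASTQ headers in the Casava 1.8+ layout (instrument:run:flowcell:lane:tile:x:y, followed by a space and the read descriptor). The function names `cluster_key` and `bridge_reads`, and the toy reads, are hypothetical.

```python
def cluster_key(header):
    """Return (lane, tile, x, y) parsed from a FASTQ header line.

    Assumes the Casava 1.8+ layout:
    @instrument:run:flowcell:lane:tile:x:y <read>:<filtered>:<control>:<index>
    """
    name = header.lstrip("@").split()[0]
    fields = name.split(":")
    lane, tile, x, y = fields[3:7]
    return (int(lane), int(tile), int(x), int(y))


def bridge_reads(reads_a, reads_b):
    """Pair sequences from two read sets that share a flow-cell cluster.

    reads_a and reads_b are iterables of (header, sequence) tuples.
    Reads without a partner at the same coordinates are dropped.
    """
    by_key = {cluster_key(header): seq for header, seq in reads_a}
    pairs = []
    for header, seq in reads_b:
        key = cluster_key(header)
        if key in by_key:
            pairs.append((by_key[key], seq))
    return pairs


# Toy reads (invented for illustration only).
reads_a = [
    ("@M01:7:000-ABC:1:1101:15000:1300 1:N:0:1", "ACGT"),
    ("@M01:7:000-ABC:1:1101:16000:1400 1:N:0:1", "TTTT"),
]
reads_b = [
    ("@M01:7:000-ABC:1:1101:16000:1400 2:N:0:1", "GGGG"),
    ("@M01:7:000-ABC:1:1101:15000:1300 2:N:0:1", "CCCC"),
    ("@M01:7:000-ABC:1:1102:20000:2000 2:N:0:1", "AAAA"),
]
pairs = bridge_reads(reads_a, reads_b)
```

Keying on lane and tile as well as (x, y) is a design choice made here, since the x and y values are only unique within a tile.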
Fig. 2. Diagram of FACS-based GPCR directed evolution and screen.
(a) A library of the human adenosine A2a receptor (A2aR) was generated using degenerate NNS codons to perform saturation mutagenesis at five sites within the ligand-binding pocket. Yeast are transformed with the A2aR library and incubated with fluorescent ligand to couple ligand-binding affinity to cellular fluorescence intensity. (b) The yeast library is screened using fluorescence-activated cell sorting (FACS) to enrich cells producing A2aR variants with high fluorescence intensities. This process is repeated to further enrich receptors with improved ligand-binding affinities while depleting the population of variants with low affinities.
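The theoretical size of an NNS library over five sites follows from simple counting, sketched below. N stands for any of the four bases and S for C or G. The standard genetic code is built from the conventional TCAG-ordered amino acid string. The variable names are illustrative, and the totals are theoretical diversities, not library sizes reported in the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNS: N = A/C/G/T at positions 1 and 2, S = C/G at position 3.
nns_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]
stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]
encoded_amino_acids = {CODON_TABLE[c] for c in nns_codons} - {"*"}

n_sites = 5  # five randomized sites, as in the caption
dna_diversity = len(nns_codons) ** n_sites
protein_diversity = len(encoded_amino_acids) ** n_sites

print(len(nns_codons), "codons per site,",
      len(encoded_amino_acids), "amino acids, stops:", stop_codons)
print("DNA variants:", dna_diversity)
print("Stop-free protein variants:", protein_diversity)
```

NNS keeps all 20 amino acids while reducing the three stop codons to one (TAG), which is a common reason to prefer it over fully degenerate NNN codons.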
Fig. 3. BRIDGE reveals highly enriched A2aR variants due to epistasis between distal sites.
(a) A histogram of the log10(count) of unique variants sharing a given enrichment rate demonstrates stringent, progressive enrichment of a few highly enriched mutants from the naive library to the post sort (PS) 1 (gray) and PS4 (black) libraries. A small number of variants are highly enriched in PS4 (e.g., GRIA), while variants with wild-type (WT)-like behavior are not expected to be highly enriched. A positive control (Q89A) and the most abundant variant in PS4 (Q89S) also demonstrate relatively high enrichment, as expected. (b) The top 20 variants containing at least two distal, mutated residues (XXXXX) exhibit far greater enrichment rates than the top 20 variants containing proximal triple (TQXXX), proximal double (e.g., XXWLH), or single (e.g., XQWLH) substitutions. (c) Substitutions that are deleterious in isolation are common in the 20 most highly enriched A2aR variants containing mutations in distal sites. Residues that are beneficial or deleterious as single substitutions are colored according to their log2-fold enrichment, where the color spectrum is centered around the wild-type enrichment rate. For example, variant SLNIG is enriched 791-fold, but the single T88S (SQWLH) substitution is deleterious. Wild-type residues have a white background. *Residue abundance between 7 and 11 in the naive library, **not detected in PS4.
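The caption does not spell out how the enrichment rate is computed. The sketch below therefore assumes one common definition: the ratio of a variant's read frequency after sorting to its frequency in the naive library. The wild-type string TQWLH is inferred from the caption's notation (T88S written as SQWLH). The function names and the read counts are hypothetical, not data from the study.

```python
import math


def fold_enrichment(count_post, total_post, count_naive, total_naive):
    """Frequency after sorting divided by frequency in the naive library.

    Assumed definition; the source caption does not give the formula.
    """
    if count_naive == 0:
        raise ValueError("variant not observed in the naive library")
    return (count_post / total_post) / (count_naive / total_naive)


def mutated_positions(variant, wild_type="TQWLH"):
    """Return 0-based indices where the five-residue variant differs from WT."""
    if len(variant) != len(wild_type):
        raise ValueError("variant and wild type must have equal length")
    return [i for i, (v, w) in enumerate(zip(variant, wild_type)) if v != w]


# Invented read counts, chosen so the arithmetic is exact.
fold = fold_enrichment(count_post=8, total_post=1024,
                       count_naive=1, total_naive=1024)
log2_fold = math.log2(fold)

print(fold, log2_fold)
print(mutated_positions("SQWLH"), mutated_positions("SLNIG"))
```

On a log2 scale, values above zero indicate enrichment relative to the naive library and values below zero indicate depletion.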
Fig. 4. Radioligand binding confirms improved ligand-binding affinities.
(a, b) Compared with wild-type A2aR, the highly enriched variants Q89A/S, GRIA, and SLNIG bind [3H]-NECA with fourfold greater affinity (Kd). (c) Functional yields (Bmax) of the highly enriched variants are 1.2- to 1.7-fold that of the wild-type receptor, suggesting that improvements in ligand-binding affinity, rather than functional yield, dictate variant enrichment. Data represent the mean of three biological replicates, and error bars represent their standard deviation. Source Data are provided as a Source Data file.
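The relationship between Kd and Bmax can be made concrete with the standard one-site saturation binding model. The caption does not state which model was fitted, so this is an assumption used for illustration only. Here B is specific binding and [L] is the free radioligand concentration; the superscripts WT and var are labels introduced for this example.

```latex
B = \frac{B_{\max}\,[L]}{K_d + [L]}
\qquad\Longrightarrow\qquad
\frac{B}{B_{\max}} = \frac{[L]}{K_d + [L]}

% Worked example: fourfold greater affinity means K_d^{var} = K_d^{WT}/4.
% At a ligand concentration [L] = K_d^{WT}:
\left.\frac{B}{B_{\max}}\right|_{\mathrm{WT}}
  = \frac{K_d^{WT}}{K_d^{WT} + K_d^{WT}} = 0.5,
\qquad
\left.\frac{B}{B_{\max}}\right|_{\mathrm{var}}
  = \frac{K_d^{WT}}{K_d^{WT}/4 + K_d^{WT}} = 0.8
```

Under this model, a lower Kd raises fractional occupancy at a fixed ligand concentration, whereas a change in Bmax only rescales the total signal.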
Fig. 5. Network analysis of enriched variants reveals rare paths to highly enriched variants.
(a) A network diagram comprising post sort (PS) 4 variants with enrichment rate ≥1. Each node corresponds to a unique variant, and the node’s radius scales with the variant’s enrichment rate. Each edge represents a difference of one amino acid, and the color of each edge matches that of the variant with lower enrichment, indicating the direction of a Darwinian trajectory from less fit to more fit. As anticipated, variants with the greatest fitness are rare and often accessible by only a few paths (edges), reflecting a rugged sequence–function landscape indicative of epistasis. (b) Isolated networks of variants SLNIG and TSWIH underscore the scarcity of paths originating from disconnected functional variants, which suggests that specific combinations of residues, rather than a few critical residues, lead to pronounced improvements in function. This observation is indicative of epistasis, where the context (i.e., background sequence) of a protein dictates an individual residue’s effect on the protein’s phenotype. (c) In contrast, TSWLH is accessible by numerous paths, reflecting a broad sequence–function landscape. Parent variants leading to TSWLH are highly interconnected, as several related proteins benefit similarly from the same substitution.
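The network construction described in panel a can be sketched as follows. Nodes are variants at or above an enrichment threshold, and an edge joins two variants that differ at exactly one position, oriented from the less enriched to the more enriched variant. This is a minimal sketch, not the authors' code. The function names are hypothetical and the enrichment values are invented placeholders, not measurements from the study.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_network(enrichment, min_enrichment=1.0):
    """Return (nodes, edges) for variants meeting the enrichment threshold.

    Each edge is a (lower, higher) tuple of variants one substitution apart.
    """
    nodes = sorted(v for v, e in enrichment.items() if e >= min_enrichment)
    edges = []
    for a, b in combinations(nodes, 2):
        if hamming(a, b) == 1:
            lower, higher = sorted((a, b), key=lambda v: enrichment[v])
            edges.append((lower, higher))
    return nodes, edges


# Placeholder enrichment values (invented for illustration only).
toy_enrichment = {
    "TQWLH": 1.0,
    "SQWLH": 0.5,   # below threshold, excluded from the network
    "TAWLH": 20.0,
    "TSWLH": 30.0,
    "SLNIG": 100.0,
}
nodes, edges = build_network(toy_enrichment)

# Count incoming paths per variant; zero marks a disconnected variant.
incoming = {v: sum(1 for _, higher in edges if higher == v) for v in nodes}
print(edges)
print(incoming)
```

In this toy set, the variant that differs from every other node at all five positions has no incoming edges, which is the kind of isolation the caption associates with epistasis.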
