PLoS One. 2016 Mar 17;11(3):e0151740.
doi: 10.1371/journal.pone.0151740. eCollection 2016.

SiteOut: An Online Tool to Design Binding Site-Free DNA Sequences

Javier Estrada et al. PLoS One. 2016.

Abstract

DNA-binding proteins control many fundamental biological processes such as transcription, recombination and replication. A major goal is to decipher the role that DNA sequence plays in orchestrating the binding and activity of such regulatory proteins. To address this goal, it is useful to rationally design DNA sequences with desired numbers, affinities and arrangements of protein binding sites. However, removing binding sites from DNA is computationally non-trivial since one risks creating new sites in the process of deleting or moving others. Here we present an online binding site removal tool, SiteOut, that enables users to design arbitrary DNA sequences that entirely lack binding sites for factors of interest. SiteOut can also be used to delete sites from a specific sequence, or to introduce site-free spacers between functional sequences without creating new sites at the junctions. In combination with commercial DNA synthesis services, SiteOut provides a powerful and flexible platform for synthetic projects that interrogate regulatory DNA. Here we describe the algorithm and illustrate the ways in which SiteOut can be used; it is publicly available at https://depace.med.harvard.edu/siteout/.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. SiteOut efficiently removes binding sites from a spacer between annotated enhancers.
Predicted binding sites for 11 transcription factors that regulate expression of the gene Krüppel (Kr) in Drosophila melanogaster, plotted across a 2 kb stretch of its regulatory region. Nucleotide coordinates for the proximal and distal enhancers are 2R:25224832..25226417 and 2R:25224052..25225758, respectively. Bar heights represent binding affinities. The endogenous sequence between the two marked enhancers (labeled ‘spacer’) contains 131 binding sites for the transcription factors of interest. Sequences from orthogonal sources such as phage lambda have been used as “non-functional” spacers in many studies, but lambda DNA still contains many transcription factor binding sites. A computationally generated random sequence with the same GC content as D. melanogaster intergenic sequence also contains a large number of binding sites. SiteOut creates a synthetic spacer that contains no binding sites, while keeping the flanking enhancers intact and the GC content constant.
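The random-spacer comparison in this caption can be illustrated with a minimal sketch. It assumes exact motif strings stand in for scored binding-site predictions; the function names, the GC fraction of 0.4, and the motifs are hypothetical placeholders, not values from the paper.

```python
import random

def random_sequence(length, gc_fraction, seed=None):
    """Build a random DNA sequence with an exact G+C count (hypothetical helper)."""
    rng = random.Random(seed)
    n_gc = round(length * gc_fraction)
    # Draw the G/C and A/T bases separately, then shuffle their positions.
    bases = [rng.choice("GC") for _ in range(n_gc)]
    bases += [rng.choice("AT") for _ in range(length - n_gc)]
    rng.shuffle(bases)
    return "".join(bases)

def count_sites(seq, motifs):
    """Count exact, possibly overlapping, motif occurrences.

    Assumption: exact string matching is a simplification of the scored
    binding-site predictions described in the caption.
    """
    total = 0
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            total += 1
            start = seq.find(motif, start + 1)
    return total

# Placeholder inputs: 0.4 is not the measured intergenic GC content.
spacer = random_sequence(1000, 0.4, seed=1)
print(count_sites(spacer, ["GAATTC", "TTAA"]))
```

Short motifs occur by chance in any random sequence of this length, which is why matching GC content alone does not yield a site-free spacer.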
Fig 2. Overview of the SiteOut algorithm.
(A) Schematic of the Monte Carlo algorithm. The search is driven towards sequences with fewer sites, but because the process is non-deterministic it can explore more of sequence space and escape local minima in the number of binding sites (N). (B) Flowchart of the Monte Carlo algorithm highlighting the sequence acceptance/rejection process. Sm stands for the sequence at step m (m = i, i + 1, …), Nold and Nnew for the number of binding sites in the original and mutated sequence, respectively, and Pa for the acceptance probability. (C) An example of binding site identification and deletion. Initially, two sites are identified (red and green) and removed by mutating two random nucleotides (pink). This creates a new site (blue) but reduces the total number of sites in the sequence, giving an acceptance probability (Pa) of 0.73. (D) Removing sites at the junction between a functional sequence and a spacer by mutating only nucleotides in the spacer. (E) Performance plot for SiteOut running on Harvard Medical School’s cluster. Error bars reflect variation between jobs run on different nodes. Each run designed 300 bp random sequences with a P value of 0.003. With 140 position weight matrices (PWMs), the 12-hour wall time limit is always reached.
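The accept/reject loop in panels A to C can be sketched as follows. This is a minimal illustration under stated assumptions, not SiteOut's implementation: exact motif strings replace PWM scoring, and a Metropolis-style rule replaces the tool's own acceptance probability. The caption's example gives Pa = 0.73 for a move that lowers the site count, so the real rule clearly differs from the one used here. The names find_sites, mutate_sites, remove_sites and temperature are hypothetical.

```python
import math
import random

def find_sites(seq, motifs):
    """Return (start, end) spans of exact motif matches (stand-in for PWM scoring)."""
    spans = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            spans.append((start, start + len(motif)))
            start = seq.find(motif, start + 1)
    return spans

def mutate_sites(seq, spans, rng):
    """Change one randomly chosen nucleotide inside each detected site."""
    bases = list(seq)
    for start, end in spans:
        pos = rng.randrange(start, end)
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)

def remove_sites(seq, motifs, temperature=0.5, max_steps=10000, seed=0):
    """Monte Carlo search for a variant of seq with no motif matches."""
    rng = random.Random(seed)
    spans = find_sites(seq, motifs)
    for _ in range(max_steps):
        n_old = len(spans)
        if n_old == 0:
            break
        candidate = mutate_sites(seq, spans, rng)
        new_spans = find_sites(candidate, motifs)
        n_new = len(new_spans)
        # Assumed Metropolis-style rule: always accept when the count does not
        # rise; otherwise accept with a probability that decays with the rise.
        if n_new <= n_old:
            p_accept = 1.0
        else:
            p_accept = math.exp((n_old - n_new) / temperature)
        if rng.random() < p_accept:
            seq, spans = candidate, new_spans
    return seq

start_seq = "ACGTGAATTCACGTGGATCCTTGAATTC"
clean = remove_sites(start_seq, ["GAATTC", "GGATCC"])
print(clean, find_sites(clean, ["GAATTC", "GGATCC"]))
```

Occasionally accepting a sequence with more sites is what lets the search leave local minima. Unlike the tool described here, this sketch does not hold GC content constant and does not protect flanking functional sequence from mutation.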
Fig 3. Examples of ways in which SiteOut can be applied.
(A) Designing a synthetic enhancer. Binding motifs 1 and 2 are alternated in a specific pattern to create a synthetic enhancer (bottom). The design.txt file shown on a grey background is the input given to SiteOut in this case. (B) Designing a synthetic gene locus. Three enhancers (top) are linked by 250 and 150 bp binding site-free sequences, and the whole construct is flanked by 300 and 200 bp site-free sequences. The design.txt file shown on a grey background is the input given to SiteOut in the Spacer Designer option. (C) Removing binding sites in a hierarchical order, for cases where removal of particular sites must be prioritized. In this example we want to remove all red sites, most of the green and as many as possible of the blue. To do so, run SiteOut on one binding site type at a time, in reverse order of priority (blue first, then green, then red). Each step may create new sites of a different type, but sites of the higher-priority types are deleted in the subsequent steps. (D) Generating large binding site-free sequences by merging smaller ones. A 10 kb sequence (bottom) can be generated by merging 2 kb pieces made in parallel (top) and refining the resulting sequence (middle) to remove binding sites (black vertical bars) created at the junctions.
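The junction problem in panel D can be shown with a short sketch: two pieces that are each site-free can still form a site where they meet. As a simplifying assumption, exact motif strings stand in for PWM-based site calls, and the function names are hypothetical.

```python
def find_sites(seq, motifs):
    """Return (start, end) spans of exact motif matches (stand-in for PWM scoring)."""
    spans = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            spans.append((start, start + len(motif)))
            start = seq.find(motif, start + 1)
    return spans

def junction_sites(pieces, motifs):
    """Merge pieces and report the sites that span a boundary between two pieces."""
    merged = "".join(pieces)
    boundaries = []
    offset = 0
    for piece in pieces[:-1]:
        offset += len(piece)
        boundaries.append(offset)
    spanning = [
        (start, end)
        for start, end in find_sites(merged, motifs)
        if any(start < b < end for b in boundaries)
    ]
    return merged, spanning

# Neither piece contains GAATTC, but the merged sequence does.
merged, spanning = junction_sites(["ACGTGAA", "TTCACGT"], ["GAATTC"])
print(merged, spanning)  # ACGTGAATTCACGT [(4, 10)]
```

Only the spans reported here need refinement after merging, which is why building the pieces in parallel and then cleaning the junctions is cheaper than searching the full-length sequence at once.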
