F1000Res. 2018 Jul 18:7:1105.
doi: 10.12688/f1000research.14541.1. eCollection 2018.

drawProteins: a Bioconductor/R package for reproducible and programmatic generation of protein schematics

Paul Brennan. F1000Res. 2018.

Abstract

Protein schematics are valuable for research, teaching and knowledge communication. However, the tools used to automate their production are challenging to use. The drawProteins package enables the automated generation of protein schematics in a way that integrates with the Bioconductor/R suite of tools for bioinformatics and statistical analysis. Given UniProt accession numbers, the package uses the UniProt API to retrieve the features of each protein from the UniProt database. The features are assembled into a data frame and visualized using adaptations of the ggplot2 package. Visualizations can be customized in many ways, including adding protein feature information from other data frames, altering colors and protein names, and adding extra layers using other ggplot2 functions. All of this can be done within a script, which makes the workflow reproducible and shareable.

Keywords: Bioconductor; R package; protein; schematic; visualization.
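The step in which features are assembled into a data frame can be illustrated with a small, language-neutral sketch. The Python below is not drawProteins code and does not call the UniProt API. It flattens a hand-written record, whose layout (an accession, an entry name and a list of features with begin and end positions) is an assumption about the response shape, into one row per feature. The function name `features_to_rows`, the column names and the sample coordinates are hypothetical and serve only as illustration.

```python
from typing import Any

# Hand-written sample record. The field layout is an ASSUMPTION about what a
# features response looks like, and the coordinates are illustrative only.
SAMPLE_ENTRY: dict[str, Any] = {
    "accession": "Q04206",
    "entryName": "TF65_HUMAN",
    "features": [
        {"type": "CHAIN", "description": "Transcription factor p65",
         "begin": "1", "end": "551"},
        {"type": "DOMAIN", "description": "RHD", "begin": "19", "end": "306"},
        {"type": "MOD_RES", "description": "Phosphoserine",
         "begin": "276", "end": "276"},
    ],
}


def features_to_rows(entry: dict[str, Any], order: int = 1) -> list[dict[str, Any]]:
    """Flatten one protein record into a list of rows, one row per feature.

    `order` is a hypothetical column giving the vertical position of the
    protein when several proteins are drawn in one schematic.
    """
    rows = []
    for feature in entry.get("features", []):
        begin = int(feature["begin"])  # positions arrive as strings in the sample
        end = int(feature["end"])
        rows.append({
            "accession": entry["accession"],
            "entry_name": entry["entryName"],
            "type": feature["type"],
            "description": feature.get("description", ""),
            "begin": begin,
            "end": end,
            "length": end - begin + 1,  # number of residues, inclusive
            "order": order,
        })
    return rows


rows = features_to_rows(SAMPLE_ENTRY)
# Selecting one feature type mirrors drawing one layer (e.g. domains) at a time.
domains = [row for row in rows if row["type"] == "DOMAIN"]
for row in rows:
    print(row["type"], row["begin"], row["end"], row["description"])
```

Keeping every feature as a flat row with numeric begin and end positions is what lets a plotting layer draw chains, domains and phosphorylation sites to scale, since each layer only needs to filter rows by type.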

Conflict of interest statement

No competing interests were disclosed.

Figures

Figure 1. Protein domain schematic of RelA/p65.
The default output gives a grey background and labels the domain. RHD = Rel Homology Domain.
Figure 2. Protein domain schematic of RelA/p65.
The background can be customized using theme functions from ggplot2. RHD = Rel Homology Domain.
Figure 3. More detailed protein domain schematic of RelA/p65.
Drawing the domains, regions and motifs generates a more detailed protein schematic. RHD = Rel Homology Domain; TAD = Transactivation Domain. Yellow circles denote phosphorylation sites.
Figure 4. Protein domain schematic of human NF-kappaB proteins.
The five members of the NF-kappaB transcription factor family can be illustrated by drawing the domains, regions and motifs as detailed in the UniProt database. The lengths of the chains, domains and motifs are proportional to the number of amino acids. RHD = Rel Homology Domain; TAD = Transactivation Domain. Yellow circles denote phosphorylation sites.
Figure 5. Protein domain schematic of human MAP kinases.
Using bioMart with the Gene Ontology term for "MAP kinase activity", it is possible to draw multiple human MAP kinases using data from UniProt. Yellow circles denote phosphorylation sites.
Figure 6. Customizing the protein schematic of the NF-kappaB family.
Using arguments in the draw_chains() and draw_phospho() functions, it is possible to customize colors and labels.

Grants and funding

PB has been supported by funding from Bloodwise, UK.