Bioinformatics. 2014 Dec 1;30(23):3293-301.
doi: 10.1093/bioinformatics/btu534. Epub 2014 Aug 18.

Comparative Assembly Hubs: Web-Accessible Browsers for Comparative Genomics

Free PMC article

Ngan Nguyen et al. Bioinformatics.

Abstract

Motivation: Researchers now have access to large volumes of genome sequences for comparative analysis, some generated by the plethora of public sequencing projects and, increasingly, by individual efforts. It is neither possible nor necessarily desirable for the public genome browsers to curate all these data. Instead, a wealth of powerful tools is emerging to empower users to create their own visualizations and browsers.

Results: We introduce a pipeline to easily generate collections of Web-accessible UCSC Genome Browsers interrelated by an alignment. It is intended to democratize our comparative genomic browser resources, serving the broad and growing community of evolutionary genomicists and facilitating easy public sharing via the Internet. Using the alignment, all annotations and the alignment itself can be efficiently viewed with reference to any genome in the collection, symmetrically. A new, intelligently scaled alignment display makes it simple to view all changes between the genomes at all levels of resolution, from substitutions to complex structural rearrangements, including duplications. To demonstrate this work, we create a comparative assembly hub containing 57 Escherichia coli and 9 Shigella genomes and show examples that highlight their unique biology.

Availability and implementation: The source code is available as open source at https://github.com/glennhickey/progressiveCactus. The E.coli and Shigella genome hub is now a public hub listed on the UCSC browser public hubs Web page.

Figures

Fig. 1.
An example E.coli comparative assembly hub with E.coli K12 MG1655 as the reference browser. The top browser screenshot (a) shows an ∼900 kb region with a known large inversion (light red) in the closely related strain K12 W3110. The inversion is flanked by the homologous, oppositely oriented ribosomal RNA operons rrnD and rrnE [Hayashi et al. (2006); Hill and Harnish (1981)] and is the result of recombination between them. (b–c) Zoom-in of the left and right boundaries of the K12 W3110 inversion, respectively, showing operon rrnE of K12 W3110 (‘K12_W3110 RNA’ track, in green, which is the K12 W3110 ncRNA annotation track lifted over to K12 MG1655) aligned to operon rrnD of K12 MG1655 (‘K12_MG1655 RNA’ track, also in green) on the left, and operon rrnD of K12 W3110 aligned to operon rrnE of K12 MG1655 on the right. On further zooming in (d), SNPs and query insertions become visible. The text on the screenshots was adjusted for better readability.
Fig. 2.
A browser screenshot showing the pdc-adhB-cat tandem repeat region of E.coli KO11FL 162099 (Turner et al. 2012) displayed along the genome of E.coli KO11FL 52593. The colored horizontal bars on top of each snake track indicate duplications in KO11FL 52593 [two copies of each gene pflA (green), pflB-L (green) and pflB-S (orange)]. There is a large deletion in the parent strain W 162011, as this strain does not contain the pdc-adhB-cat insert. Following the snake track of KO11FL 162099, there are 20 copies of (pflA, pflB-L, cat, adhB, pdc and pflB-S). As KO11FL 52593 has two copies of pflA, pflB-L and pflB-S, the display arbitrarily picks one copy of each, to which the corresponding orthologous KO11FL 162099 genes are mapped. The text on the screenshot was adjusted for better readability.
Fig. 3.
An example portion of a comparative assembly hub configuration Web page. Each browser in the hub has its own equivalent configuration page. Using the grid layout (rows represent the genomes, columns represent the track types), alignments and annotations can be selected regardless of which genome they were originally described on. The inset phylogenetic tree is generated automatically by the comparative assembly hub pipeline. The track controls above the grid allow quick overall configuration. Fine-grained track controls (not shown) are provided at the bottom of the page.
Fig. 4.
The E.coli/Shigella core genome browser, showing the highly conserved ordering relationships between blocks of the E.coli core genome and the less conserved ordering in Shigella. Most E.coli genomes look like the first snake track (E24377), with no high-level rearrangements (only one is shown, for lack of space). In contrast, Shigella has a fragmented core genome with respect to E.coli (second snake track, SbCDC 308394; again, only one is shown for lack of space).
Fig. 5.
Browser querying time with and without LODs.
