Genome Res. 2002 Jun;12(6):996-1006.
doi: 10.1101/gr.229102.

The Human Genome Browser at UCSC

Free PMC article

W James Kent et al. Genome Res.

Abstract

As vertebrate genome sequences near completion and research refocuses on their analysis, the issue of effective genome annotation display becomes critical. A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu. This browser displays assembly contigs and gaps, mRNA and expressed sequence tag alignments, multiple gene predictions, cross-species homologies, single nucleotide polymorphisms, sequence-tagged sites, radiation hybrid data, transposon repeats, and more as a stack of coregistered tracks. Text and sequence-based searches provide quick and precise access to any region of specific interest. Secondary links from individual features lead to sequence details and supplementary off-site databases. One-half of the annotation tracks are computed at the University of California, Santa Cruz from publicly available sequence data; collaborators worldwide provide the rest. Users can stably add their own custom tracks to the browser for educational or research purposes. The conceptual and technical framework of the browser, its underlying MySQL database, and overall use are described. The web site currently serves over 50,000 pages per day to over 3000 different users.

Figures

Figure 1
Part of the HOXA cluster as viewed in the University of California, Santa Cruz (UCSC) genome browser. The shortcut bar in blue provides quick access to BLAT searches, the DNA sequence, the annotations as text tables, earlier or later assemblies of the genome, the corresponding NCBI and Ensembl views, and the user's guide. The controls directly beneath position the browser over a specific region in the genome. The large white picture in the middle displays various annotations. At the bottom are controls for fine-tuning the display and for the individual tracks. Only the first 15 of 31 available tracks are shown here. This region contains three known genes that are all transcribed on the reverse strand as indicated by the arrowheads in the introns. Note the alternative splicing of HOXA1 in the Human RNA track. The Spliced EST track indicates that there is active transcription of a region between HOXA1 and HOXA2. Expressed sequence tag evidence for the presence of additional nonannotated genes in well-studied regions like this often can be found using the UCSC browser. The Mouse Blat track indicates a high level of conservation between mouse and human in this region. Both the Mouse Blat and the Exofish ecores are based on translated alignments, but in highly conserved regions such as this it is not unusual for even translated alignments to paint conserved noncoding regions. The noncoding regions have diverged considerably more between human and pufferfish than between human and mouse.
Figure 2
All of chromosome 17. Generally, people work at smaller scales than this, but the browser is capable of displaying all of the annotations on a chromosome in a reasonable time. The centromere is depicted in red in the chromosome band track. The coverage track shows finished regions in black and draft regions in various shades of gray depending on the depth of coverage. There are two large gene deserts in chromosome bands q22 and q24.3. Tracks based on mRNAs, ESTs, and homology with Tetraodon all are quite sparse in these regions, though there is still quite a bit of mouse homology.
Figure 3
Chromosome 17 band q21.32. This region spans several million bases and is covered by a mix of finished and draft clones. The large blocks in the gap track indicate gaps between clones, while the small ticks indicate gaps within draft clones. Where there is evidence for the relative order and orientation of the contigs on either side of a gap, a white line is drawn through the gap. Most of the contigs in this region are ordered. At this scale, it is possible to resolve most individual genes but not necessarily individual exons.
Figure 4
One million bases in the middle of 17q21.32. This is a scale frequently used when trying to positionally clone a gene. Many of the genes in this region are already known, but the EST, mouse, and fish homology evidence suggest the presence of additional genes as well, particularly between ITGB3 and NPEPPS.
Figure 5
A known gene and an unknown gene or two. ITGB3, the integrin β chain, β 3 precursor, is on the left. To the right is a relatively small gene, C17001176, predicted by the Fgenesh++ program, which is supported by mouse and fish homology. Between ITGB3 and C17001176 is a region quite likely to contain another gene, judging by the EST and mouse homology evidence.
Figure 6
Details page on the known gene VLDLR.
Figure 7
Binning scheme for optimizing database accesses for genomic annotations that cover a particular region of the genome. This diagram shows bins of three different sizes. Features are put in the smallest bin in which they fit. A feature covering the range indicated by line A would go in bin 1. Similarly, line B goes in bin 4 and line C in bin 20. When the browser needs to access features in a region, it must look in bins of all different sizes. To access all the features that overlap or are enclosed by line A, the browser looks in bins 1, 2, 3, 7, 8, 9, 10, and 11. For B, the browser looks in bins 1, 4, 14, 15, 16, and 17. For C, the browser looks in bins 1, 5, and 20.
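
The bin lookups described in the Figure 7 caption can be sketched in a few lines of Python. This is a toy illustration only, not the browser's actual code: the three-level layout (bin 1 spanning everything, bins 2-5 as quarters, bins 6-21 as sixteenths) is inferred from the bin numbers in the caption, and the bin width of 100 and the coordinates chosen for lines A, B, and C are hypothetical values picked so that the results match the caption.

```python
# Toy three-level binning layout inferred from the Figure 7 caption.
# SMALLEST, FANOUT, and the coordinates for A, B, C are assumptions,
# not values taken from the browser itself.
SMALLEST = 100   # hypothetical width of the smallest bin
FANOUT = 4       # each bin is split into 4 bins at the next finer level

# (first bin id, bin width) for each level, finest level first.
LEVELS = [
    (6, SMALLEST),                 # bins 6..21
    (2, SMALLEST * FANOUT),        # bins 2..5
    (1, SMALLEST * FANOUT ** 2),   # bin 1
]


def bin_for_feature(start, end):
    """Return the smallest bin that fully contains the range [start, end)."""
    for first_id, width in LEVELS:
        if start // width == (end - 1) // width:
            return first_id + start // width
    raise ValueError("range does not fit in the largest bin")


def bins_to_search(start, end):
    """Return every bin, at every level, that overlaps [start, end)."""
    ids = []
    for first_id, width in LEVELS:
        lo, hi = start // width, (end - 1) // width
        ids.extend(range(first_id + lo, first_id + hi + 1))
    return sorted(ids)


# Hypothetical coordinates standing in for lines A, B, and C.
A, B, C = (150, 550), (850, 1150), (1420, 1480)

for name, (start, end) in (("A", A), ("B", B), ("C", C)):
    print(name, bin_for_feature(start, end), bins_to_search(start, end))
```

Storing each feature in exactly one bin keeps inserts cheap, while a region query only has to check a short list of bin ids (one run per level) instead of scanning every feature on the chromosome.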

