Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W225-30.
doi: 10.1093/nar/gkl121.

sgTarget: A Target Selection Resource for Structural Genomics

Free PMC article

Ana P C Rodrigues et al. Nucleic Acids Res. 2006.

Abstract

sgTarget (http://www.ysbl.york.ac.uk/sgTarget) is a web-based resource to aid the selection and prioritization of candidate proteins for structure determination. The system annotates user-submitted gene or protein sequences, identifying sequence families with no homologues of known structure, and characterizing each protein according to a range of physicochemical properties that may affect its expression, solubility and likelihood of crystallizing. Summaries of these analyses are available for individual sequences as well as for whole datasets. This type of analysis enables structural biologists to iteratively select targets from their genomic sequences of interest according to their research needs. All sequence datasets submitted to sgTarget are available for users to select and rank using their choice of criteria. sgTarget was developed to support individual laboratories collaborating in structural and functional genomics projects and should be valuable to structural biologists wishing to employ the wealth of available genome sequences in their structural quests.

Figures

Figure 1
Target page with Select function activated. The menu area (on the left) allows users to choose one or more sequence datasets to target. The work area (on the right) allows users to specify selection criteria. In this example, the Mycoplasma genitalium genome has been chosen for targeting. The selection criteria specify that genes must have a GC content and CAI that are optimal for E. coli, and that proteins must have no homologues with known structure, be likely to be stable and viable in E. coli for at least 2 h, and have at most one transmembrane region and no fibrous or disordered regions (sgTarget's default selection criteria). When users click the OK button they are presented with the Rank function, and asked to choose how the target list should be prioritized and displayed (shown in Figure 2).
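The kind of filtering described in this caption can be illustrated with a minimal Python sketch. It is not sgTarget's implementation: the `Candidate` record, its field names, and the `gc_range` window are all hypothetical placeholders, and the CAI, stability and half-life criteria are left out because the caption gives no thresholds for them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative, not sgTarget's schema.
    identifier: str
    gene_sequence: str
    has_structural_homologue: bool
    transmembrane_regions: int
    has_fibrous_region: bool
    has_disordered_region: bool


def gc_content(dna: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def passes_selection(c: Candidate, gc_range=(0.3, 0.7), max_tm_regions=1) -> bool:
    """Apply a subset of the criteria named in the caption.

    gc_range is a placeholder window, not the threshold sgTarget uses.
    """
    low, high = gc_range
    return (
        low <= gc_content(c.gene_sequence) <= high
        and not c.has_structural_homologue
        and c.transmembrane_regions <= max_tm_regions
        and not c.has_fibrous_region
        and not c.has_disordered_region
    )


# Made-up candidates, one per failure mode plus one that passes.
candidates = [
    Candidate("t1", "ATGGCCGCTA", False, 0, False, False),
    Candidate("t2", "ATGGCCGCTA", True, 0, False, False),   # has a structural homologue
    Candidate("t3", "ATATATATAT", False, 0, False, False),  # GC content outside the window
    Candidate("t4", "ATGGCCGCTA", False, 2, False, False),  # too many transmembrane regions
]
selected = [c.identifier for c in candidates if passes_selection(c)]
```

Expressing each criterion as a separate boolean term keeps the filter easy to relax or tighten between selection rounds, which matches the iterative use the abstract describes.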
Figure 2
Target page with Rank function activated. The menu area (on the left) shows a summary of the results returned by the Select function. The work area (on the right) allows users to specify which data to display for the selected targets, and how to rank those targets by specifying the priority of each annotation. Users can choose to view the prioritized target list as a Web page (by clicking the HTML button) or, alternatively, as a tabbed text file (by clicking the TEXT button). In this case, 49 targets were selected with the criteria specified in Figure 1. The target list is to be ranked by decreasing coverage by the NRDB database (i.e. proteins with more of their length annotated as similar to a protein in the NRDB database have higher priority), and a number of protein physicochemical properties are to be displayed along with the default attributes (off the screen in this screenshot); see Figure 3 for the resulting page.
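Ranking by several prioritized annotations and exporting the result as tabbed text can be sketched as follows. This is an illustrative example only: the dictionary keys (`id`, `coverage`, `length`) and both function names are assumptions made for the sketch, not names used by sgTarget.

```python
import csv
import io


def rank_targets(targets, priorities):
    """Sort targets by several annotations.

    priorities is a list of (field, descending) pairs, highest priority first.
    """
    ranked = list(targets)
    # Python's sort is stable, so sorting by the lowest-priority key first
    # and the highest-priority key last yields a multi-key ordering.
    for field, descending in reversed(priorities):
        ranked.sort(key=lambda t: t[field], reverse=descending)
    return ranked


def to_tabbed_text(targets, columns):
    """Render the chosen columns as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for t in targets:
        writer.writerow([t[c] for c in columns])
    return buf.getvalue()


# Made-up targets; coverage is the fraction of the sequence aligned.
targets = [
    {"id": "a", "coverage": 0.2, "length": 100},
    {"id": "b", "coverage": 0.9, "length": 300},
    {"id": "c", "coverage": 0.9, "length": 200},
]
# Highest priority: decreasing coverage; ties broken by increasing length.
ranked = rank_targets(targets, [("coverage", True), ("length", False)])
report = to_tabbed_text(ranked, ["id", "coverage"])
```

Keeping the ranking separate from the rendering means the same ordered list could feed either a Web page or a text export.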
Figure 3
Target page showing a target list. The selected targets are ranked according to the order and priority specified for the different annotations, and a table of prioritized targets is built using the annotations that were chosen for display. In this case, a list of 49 targets (selected from the M. genitalium genome with the criteria specified in Figure 1) was ranked by decreasing coverage by NRDB database proteins. As specified in Figure 2, the table shows each target's identifier (in sgTarget), accession number, name, molecular weight, length, GRAVY score, isoelectric point, coverage by NRDB database proteins (including the span of the alignments on the target and the top taxonomic group that encompasses all reported alignments) and function annotation (the top InterPro hit and its GO high-level molecular function).
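Of the properties listed in this caption, the GRAVY score is the simplest to reproduce: by its conventional definition it is the mean Kyte-Doolittle hydropathy value over all residues. The sketch below follows that definition; whether sgTarget computes the score in exactly this way is an assumption, and the function name `gravy` is illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Return the mean hydropathy of a protein sequence.

    Raises ValueError for an empty sequence or a non-standard residue.
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    try:
        total = sum(KYTE_DOOLITTLE[aa] for aa in protein)
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]}") from None
    return total / len(protein)
```

Positive scores indicate a more hydrophobic protein overall and negative scores a more hydrophilic one, which is why the value is of interest when judging likely solubility.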
Figure 4
P. falciparum annotation wheel, with an emphasis on structural annotation. Annotations are displayed anti-clockwise as follows: a total of 1055 proteins have structural annotations, 691 high-confidence and 364 low-confidence (PDB SEQRES, release 05/2002); of the remaining proteins, 3714 are likely to be intractable: 1475 have transmembrane regions, a further 2131 have disordered regions and the other 108 have fibrous regions; of the remainder of the proteome, 187 proteins have function annotations, although only 97 of these are classified by GO; most other proteins are found in other organisms (295), except for 16 ORFan proteins.
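The counts in this caption can be cross-checked with a short tally. The numbers are taken from the caption; the segment labels are paraphrased, and treating the segments as disjoint (so that their sum is the number of proteins shown on the wheel) is an assumption based on the caption's "remaining" and "remainder" wording.

```python
# Segment sizes as given in the Figure 4 caption.
wheel = {
    "structural, high-confidence": 691,
    "structural, low-confidence": 364,
    "transmembrane regions": 1475,
    "disordered regions": 2131,
    "fibrous regions": 108,
    "function annotation": 187,  # 97 of these are classified by GO
    "found in other organisms": 295,
    "ORFan": 16,
}

structural = wheel["structural, high-confidence"] + wheel["structural, low-confidence"]
intractable = (
    wheel["transmembrane regions"]
    + wheel["disordered regions"]
    + wheel["fibrous regions"]
)
total = sum(wheel.values())

# Share of each segment, assuming the segments do not overlap.
shares = {name: count / total for name, count in wheel.items()}

print(f"structural: {structural}, intractable: {intractable}, total: {total}")
```

The two subtotals reproduce the 1055 and 3714 quoted in the caption.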
