TESE: generating specific protein structure test set ensembles

Bioinformatics. 2008 Nov 15;24(22):2632-3. doi: 10.1093/bioinformatics/btn488. Epub 2008 Sep 16.

Abstract

TESE is a web server for generating test sets of protein sequences and structures that fulfil a number of different criteria. At least three use cases can be envisaged: (i) benchmarking novel methods; (ii) building test sets tailored to special needs; and (iii) extending available datasets. The CATH structure classification is used to control structural and sequence redundancy, and a variety of structural quality parameters can be used to interactively select protein subsets with specific characteristics, e.g. all X-ray structures of alpha-helical repeat proteins with more than 120 residues and resolution <2.0 Å. The output includes FASTA-formatted sequences, PDB files and a clickable HTML index file containing images of the selected proteins. Multiple subsets for cross-validation are also supported.
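The kind of selection described above can be illustrated with a minimal Python sketch. It is not the TESE implementation or its interface: the `Entry` record, its field names, the `select` and `split_folds` helpers and the sample identifiers are all hypothetical, chosen only to show how criteria such as experimental method, chain length and resolution might be combined, and how a selected set might then be divided into subsets for cross-validation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one protein structure (not a TESE data type)."""
    entry_id: str                 # placeholder identifier, not a real PDB code
    method: str                   # experimental method, e.g. "X-ray" or "NMR"
    resolution: Optional[float]   # in angstroms; None when not applicable
    length: int                   # number of residues
    category: str                 # assumed free-text structural label


def select(entries: List[Entry], category: str, method: str = "X-ray",
           min_length: int = 120, max_resolution: float = 2.0) -> List[Entry]:
    """Keep entries matching the category and method, with more than
    min_length residues and resolution strictly below max_resolution."""
    return [
        e for e in entries
        if e.category == category
        and e.method == method
        and e.length > min_length
        and e.resolution is not None
        and e.resolution < max_resolution
    ]


def split_folds(entries: List[Entry], k: int) -> List[List[Entry]]:
    """Deal entries round-robin into k subsets after sorting by identifier,
    so the split is deterministic."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(entries, key=lambda e: e.entry_id)
    return [ordered[i::k] for i in range(k)]


# Made-up sample data, for illustration only.
SAMPLE = [
    Entry("e001", "X-ray", 1.8, 150, "alpha-helical repeat"),
    Entry("e002", "X-ray", 2.5, 200, "alpha-helical repeat"),  # resolution too low
    Entry("e003", "NMR", None, 130, "alpha-helical repeat"),   # wrong method
    Entry("e004", "X-ray", 1.5, 100, "alpha-helical repeat"),  # too short
    Entry("e005", "X-ray", 1.9, 300, "beta barrel"),           # wrong category
    Entry("e006", "X-ray", 1.2, 121, "alpha-helical repeat"),
    Entry("e007", "X-ray", 1.99, 400, "alpha-helical repeat"),
]

selected = select(SAMPLE, category="alpha-helical repeat")
folds = split_folds(selected, k=2)
```

The round-robin split is only one simple choice. It ignores redundancy between subsets, which the abstract says the server controls through the CATH classification.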

Availability: The TESE server is available for non-commercial use at http://protein.bio.unipd.it/tese/.

MeSH terms

  • Computational Biology*
  • Databases, Protein*
  • Internet
  • Models, Molecular
  • Protein Conformation*
  • Proteins / analysis*
  • Proteins / chemistry*

Substances

  • Proteins