The PAS fold. A redefinition of the PAS domain based upon structural prediction

Eur J Biochem. 2004 Mar;271(6):1198-208. doi: 10.1111/j.1432-1033.2004.04023.x.

Abstract

In the postgenomic era it is essential that protein sequences are annotated correctly in order to help in the assignment of their putative functions. Over 1300 proteins in current protein sequence databases are predicted to contain a PAS domain based upon amino acid sequence alignments. One problem with the current annotation of the PAS domain is that this domain exhibits limited similarity at the amino acid sequence level. It is therefore essential, when working with proteins of low sequence similarity, to apply profile hidden Markov model searches for PAS domain-containing proteins, as is done for the PFAM database. Recent 3D X-ray and NMR structures, however, indicate that PAS domains have a conserved 3D fold, as shown here by structural alignment of six representative 3D structures from the PDB database. Large-scale modelling of the PAS sequences from the PFAM database against the 3D structures of these six structural prototypes was performed. All 3D models generated (>5700) were evaluated using PROSAII. We conclude from our large-scale modelling studies that the PAS and PAC motifs (which are separately defined in the PFAM database) are directly linked and that these two motifs together form the PAS fold. The existing subdivision into PAS and PAC motifs, as used by the PFAM and SMART databases, appears to be caused by major differences in sequence in the region connecting these two motifs. This region, as has been shown by Gardner and coworkers for human PAS kinase (Amezcua, C.A., Harper, S.M., Rutter, J. & Gardner, K.H. (2002) Structure 10, 1349-1361, [1]), is very flexible and adopts different conformations depending on the bound ligand. Some PAS sequences present in the PFAM database did not produce a good structural model, even after realignment using a structure-based alignment method, suggesting that these representatives are unlikely to have a fold resembling any of the structural prototypes of the PAS domain superfamily.
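The abstract's point that profile-based searches are needed when pairwise sequence similarity is low can be illustrated with a small sketch. The Python below is not the authors' method and is not a profile hidden Markov model: it is a simplified position-specific scoring profile (no insert or delete states, uniform background frequencies), and the alignment, sequences, and function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped toy alignment.

    Simplifying assumptions: a uniform background frequency, a flat
    pseudocount, and no handling of gaps (unlike a real profile HMM).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Sum the column-specific log-odds scores for an ungapped sequence."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match the profile length")
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Hypothetical five-residue "family" alignment (not real PAS sequences).
toy_alignment = ["GAVLK", "GSVIK", "GAIVR", "GTVLK"]
profile = build_profile(toy_alignment)

# A sequence that matches no single family member exactly but reuses
# residues seen in each column, versus an unrelated sequence.
family_like = score(profile, "GSILR")
unrelated = score(profile, "WWWWW")
print(f"family-like: {family_like:.2f}, unrelated: {unrelated:.2f}")
```

Because each column is scored against what the whole family tolerates at that position, a divergent member can score well even when it differs from every individual sequence in the alignment. Real profile HMM tools additionally model insertions and deletions, which this sketch omits.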

MeSH terms

  • Amino Acid Sequence
  • Arabidopsis Proteins / chemistry
  • Arabidopsis Proteins / genetics
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Caenorhabditis elegans Proteins / chemistry
  • Caenorhabditis elegans Proteins / genetics
  • Databases, Protein
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Folding*
  • Protein Structure, Secondary
  • Protein Structure, Tertiary*
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Alignment
  • Structural Homology, Protein

Substances

  • Arabidopsis Proteins
  • Bacterial Proteins
  • Caenorhabditis elegans Proteins
  • Proteins