Statistical analyses of the vibrational circular dichroism of selected proteins and relationship to secondary structures

Biochemistry. 1991 May 21;30(20):5089-103. doi: 10.1021/bi00234a036.

Abstract

The vibrational circular dichroism (VCD) spectra of 20 proteins dissolved in D2O are presented in the amide I' region. These data are decomposed into a linear combination of orthogonal subspectra generated by the principal component method of factor analysis, and the results for 13 of the proteins are compared to their secondary structures as determined from X-ray crystallography. Factor analysis of the VCD yields six statistically significant subspectra that can be used to reproduce the spectra. Their coefficients can then be used to characterize a given protein. Comparison of cluster analyses of these VCD coefficients and of the secondary structure fractional coefficients from X-ray crystallography showed that proteins clustered in the VCD analysis were also clustered in the X-ray analysis. The relative fractions of alpha-helix and beta-sheet in the protein dominate the clustering in both data sets. Qualitative characterization of the secondary structure of a given protein is obtained from its clustering on the basis of spectral characteristics. A strong linear correlation was found between the coefficient of the second subspectrum and the alpha-helical fraction for the proteins studied. The second coefficient also correlated with the beta-sheet fraction, and the first coefficient correlated weakly with the fraction for "other". Subsequent multiple-parameter regression analyses of the VCD factor analysis coefficients, constrained to include only significant dependencies, yielded reliable determination of the alpha-helix fraction and somewhat less confident determination of the beta-sheet, bend, and "other" components. Predictive capability for proteins not in the regression was good. Varimax rotation of the coefficients transformed the subspectra and gave simple correlations to secondary structure components but had less reliability and more restrictions than the multiple regression on the original coefficients. The partial least-squares analysis method was also used to predict fractional secondary structures for the training set proteins but resulted in somewhat higher average error, particularly for beta-sheet, than the multiple regression. The turn fraction was effectively undetermined in both the regression and partial least-squares analyses. These statistical analyses represent the first determination of a quantitative relationship between VCD spectra and secondary structure in proteins.
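
The two-step procedure described in the abstract (decompose the spectra into orthogonal subspectra, then regress a secondary structure fraction on the resulting coefficients) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the decomposition can be approximated by a singular value decomposition of the uncentered spectra matrix, uses ordinary least squares with an intercept in place of the constrained multiple regression, and runs on synthetic Gaussian bands rather than measured VCD data. The function names (factor_decompose, fit_fraction, predict_fraction, demo), the band positions, and the noise level are all hypothetical. The demo keeps two factors because the synthetic data contain only two components, whereas the abstract reports six significant subspectra for the real spectra.

```python
import numpy as np


def factor_decompose(spectra, n_factors):
    """Return (subspectra, coefficients) from an SVD of the spectra matrix.

    spectra: array of shape (n_proteins, n_points), one spectrum per row.
    Assumption: no mean-centering or scaling is applied before the SVD.
    """
    _, _, vt = np.linalg.svd(spectra, full_matrices=False)
    subspectra = vt[:n_factors]            # orthonormal rows
    coefficients = spectra @ subspectra.T  # one row of loadings per protein
    return subspectra, coefficients


def fit_fraction(coefficients, fractions):
    """Ordinary least squares: fraction ~ intercept + coefficients @ weights."""
    design = np.column_stack([np.ones(len(coefficients)), coefficients])
    weights, *_ = np.linalg.lstsq(design, fractions, rcond=None)
    return weights


def predict_fraction(coefficients, weights):
    """Apply fitted weights (intercept first) to new coefficients."""
    design = np.column_stack([np.ones(len(coefficients)), coefficients])
    return design @ weights


def demo(seed=0):
    """Synthetic illustration only: two made-up Gaussian bands plus noise."""
    rng = np.random.default_rng(seed)
    wavenumber = np.linspace(1600.0, 1700.0, 101)
    helix_band = np.exp(-((wavenumber - 1650.0) / 8.0) ** 2)  # hypothetical shape
    sheet_band = np.exp(-((wavenumber - 1630.0) / 8.0) ** 2)  # hypothetical shape

    # 20 synthetic "proteins" with random helix and sheet fractions.
    helix = rng.uniform(0.0, 0.8, size=20)
    sheet = rng.uniform(0.0, 1.0, size=20) * (1.0 - helix)
    spectra = np.outer(helix, helix_band) + np.outer(sheet, sheet_band)
    spectra += rng.normal(scale=0.01, size=spectra.shape)

    # Decompose all 20 spectra, fit on 13, predict the remaining 7.
    subspectra, coeffs = factor_decompose(spectra, n_factors=2)
    weights = fit_fraction(coeffs[:13], helix[:13])
    predicted = predict_fraction(coeffs[13:], weights)
    return subspectra, helix[13:], predicted


if __name__ == "__main__":
    _, truth, predicted = demo()
    print("max abs error in helix fraction:", np.max(np.abs(predicted - truth)))
```

The 13/7 split in the demo mirrors the abstract's 13 proteins with X-ray structures out of 20 measured, but the held-out rows here are synthetic and say nothing about the accuracy reported in the paper.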

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Circular Dichroism
  • Cluster Analysis
  • Enzymes / chemistry
  • Mathematics
  • Models, Theoretical
  • Protein Conformation*
  • Proteins / chemistry*
  • Regression Analysis
  • Vibration
  • X-Ray Diffraction

Substances

  • Enzymes
  • Proteins