Protein sectors: statistical coupling analysis versus conservation

PLoS Comput Biol. 2015 Feb 27;11(2):e1004091. doi: 10.1371/journal.pcbi.1004091. eCollection 2015 Feb.

Abstract

Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that has been used to identify groups of coevolving residues termed "sectors". The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that they involve almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominant factor in SCA and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation.
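The matrix construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary approximation (each alignment column is reduced to "carries the most common residue or not"), a uniform background frequency q = 0.05, conservation measured as a relative entropy, and weights given by the derivative of that conservation with respect to the residue frequency. The function names, the clipping parameter eps, and the toy alignment are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def binarize(alignment):
    """x[s][i] = 1 if sequence s carries the most common residue of column i."""
    n_cols = len(alignment[0])
    consensus = [Counter(seq[i] for seq in alignment).most_common(1)[0][0]
                 for i in range(n_cols)]
    return [[1 if seq[i] == consensus[i] else 0 for i in range(n_cols)]
            for seq in alignment]


def conservation(f, q=0.05):
    """Relative entropy of a binary variable with frequency f against background q (assumed)."""
    return f * math.log(f / q) + (1 - f) * math.log((1 - f) / (1 - q))


def sca_matrix(alignment, q=0.05, eps=0.01):
    """Conservation-weighted covariance matrix and per-column conservation."""
    x = binarize(alignment)
    n_seq, n_cols = len(x), len(x[0])
    mean = [sum(row[i] for row in x) / n_seq for i in range(n_cols)]
    # Clip frequencies away from 0 and 1 so the logarithms stay finite (assumption).
    f = [min(max(m, eps), 1 - eps) for m in mean]
    cons = [conservation(fi, q) for fi in f]
    # Weight of each column: derivative of the conservation with respect to f.
    phi = [math.log(fi * (1 - q) / ((1 - fi) * q)) for fi in f]
    matrix = [[0.0] * n_cols for _ in range(n_cols)]
    for i in range(n_cols):
        for j in range(n_cols):
            joint = sum(row[i] * row[j] for row in x) / n_seq
            cov = joint - mean[i] * mean[j]
            matrix[i][j] = abs(phi[i] * phi[j] * cov)
    return matrix, cons


def top_eigenvector(matrix, n_iter=500):
    """Leading eigenvector by power iteration (the spectral-analysis step)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # all-zero matrix: nothing to iterate on
            break
        v = [c / norm for c in w]
    return v


# Toy alignment, made up for illustration only.
alignment = ["ACDEF", "ACDEF", "ACDQF", "ACNQF", "GCNQY", "ACDEW"]
matrix, cons = sca_matrix(alignment)
vector = top_eigenvector(matrix)
for i, (d, c) in enumerate(zip(cons, vector)):
    print(f"column {i}: conservation = {d:.3f}, top-mode weight = {c:.3f}")
```

Printing the conservation next to the weight of each column in the leading eigenvector allows the two to be compared position by position, which is the kind of comparison the abstract refers to. A strictly invariant column has zero covariance with every other column, so in this sketch it receives zero weight regardless of how conserved it is.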

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Conserved Sequence
  • PDZ Domains
  • Protein Interaction Domains and Motifs / physiology*
  • Proteins / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Tetrahydrofolate Dehydrogenase

Substances

  • Proteins
  • Tetrahydrofolate Dehydrogenase

Grant support

TT was supported by a Charles L. Brown Membership at the Institute for Advanced Study. LJC was supported by an Engineering and Physical Sciences Research Council Fellowship (EP/H028064/2). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.