Systematic detection of internal symmetry in proteins using CE-Symm

J Mol Biol. 2014 May 29;426(11):2255-68. doi: 10.1016/j.jmb.2014.03.010. Epub 2014 Mar 26.


Symmetry is an important feature of protein tertiary and quaternary structures that has been associated with protein folding, function, evolution, and stability. Its emergence and ensuing prevalence have been attributed to gene duplications, fusion events, and subsequent evolutionary drift in sequence. This process maintains structural similarity and is further supported by this study. To investigate how internal symmetry evolved, how symmetry and function are related, and how frequent internal symmetry is overall, we developed an algorithm, CE-Symm, to detect pseudo-symmetry within the tertiary structure of protein chains. Using a large, manually curated benchmark of 1007 protein domains, we show that CE-Symm performs significantly better than previous approaches. We use CE-Symm to build a census of symmetry among domain superfamilies in SCOP and note that 18% of all superfamilies are pseudo-symmetric. Our results indicate that more domains are pseudo-symmetric than previously estimated. We establish a number of recurring types of symmetry-function relationships and describe several characteristic cases in detail. Using the Enzyme Commission classification, we found symmetry to be enriched in some enzyme classes but depleted in others. CE-Symm thus provides a methodology for a more complete and detailed study of the role of symmetry in tertiary protein structure [availability: CE-Symm can be run from the Web at Source code and software binaries are also available under the GNU Lesser General Public License (version 2.1) at An interactive census of domains identified as symmetric by CE-Symm is available from].

Keywords: protein evolution; protein function; pseudo-symmetry; structural biology; symmetry detection.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computational Biology / methods
  • Databases, Protein
  • Humans
  • Models, Molecular
  • Protein Folding
  • Protein Structure, Tertiary*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Software*