De novo Protein Structure Prediction by Coupling Contact With Distance Profile

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):395-406. doi: 10.1109/TCBB.2020.3000758. Epub 2022 Feb 3.

Abstract

De novo protein structure prediction is a challenging problem that requires both an accurate energy function and an efficient conformation sampling method. In this study, a de novo structure prediction method named CoDiFold is proposed. In CoDiFold, contacts and distance profiles are integrated into the Rosetta low-resolution energy function to improve the accuracy of the energy function. As a result, the correlation between energy and root mean square deviation (RMSD) is improved. In addition, a population-based multi-mutation strategy is designed to balance exploration and exploitation in conformation space sampling. On a test set of 43 proteins, the average RMSD of the models generated by the proposed protocol is 49.24 and 45.21 percent lower than that of the Rosetta and QUARK de novo protocols, respectively. The results also demonstrate that the structures predicted by CoDiFold are comparable to those of state-of-the-art methods for the 10 FM targets of CASP13. The source code and executable versions are freely available at http://github.com/iobio-zjut/CoDiFold.
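The evaluation quantities named in the abstract (RMSD, the energy-RMSD correlation, and a percent decrease in average RMSD) can be illustrated with a minimal Python sketch. This is not the CoDiFold code: the function names are hypothetical, the coordinates are assumed to be already superimposed, and Pearson correlation is assumed as the correlation measure, since the abstract does not say which one is used.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the two structures are already optimally superimposed;
    no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def percent_decrease(baseline, value):
    """Relative decrease of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline
```

A higher positive correlation between the energies and RMSDs of a set of decoys indicates that low energy is a better proxy for closeness to the native structure. `percent_decrease(baseline_avg_rmsd, new_avg_rmsd)` is one way to express the kind of relative improvement reported in the abstract.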

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Models, Molecular
  • Protein Conformation
  • Proteins* / genetics
  • Software*

Substances

  • Proteins