The protein threading problem with sequence amino acid interaction preferences is NP-complete

R H Lathrop

doi:10.1093/protein/7.9.1059

Protein Eng. 1994 Sep;7(9):1059-68. doi: 10.1093/protein/7.9.1059.

Author

R H Lathrop¹

Affiliation

¹ Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge 02139.

PMID: 7831276
DOI: 10.1093/protein/7.9.1059

Abstract

In recent protein structure prediction research there has been a great deal of interest in using amino acid interaction preferences (e.g. contact potentials or potentials of mean force) to align ('thread') a protein sequence to a known structural motif. An important open question is whether a polynomial time algorithm for finding the globally optimal threading is possible. We identify the two critical conditions governing this question: (i) variable-length gaps are admitted into the alignment, and (ii) interactions between amino acids from the sequence are admitted into the score function. We prove that if both these conditions are allowed then the protein threading decision problem (does there exist a threading with a score ≤ K?) is NP-complete (in the strong sense, i.e. is not merely a number problem) and the related problem of finding the globally optimal protein threading is NP-hard. Therefore, no polynomial time algorithm is possible (unless P = NP). This result augments existing proofs that the direct protein folding problem is NP-complete by providing the corresponding proof for the 'inverse' protein folding problem. It provides a theoretical basis for understanding algorithms currently in use and indicates that computational strategies from other NP-complete problems may be useful for predictive algorithms.
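The decision problem stated in the abstract can be illustrated with a small brute-force sketch. This is not the paper's construction or any published threading algorithm. It is a toy model under stated assumptions: the structural motif is reduced to a list of fixed-length core segments (`core_lengths`), gaps between segments may have any non-negative length, and the score is a sum of pairwise terms over a hypothetical list of structural contacts. All names here (`enumerate_threadings`, `threading_score`, `exists_threading`, `hp_score`) are invented for illustration.

```python
from itertools import combinations_with_replacement


def enumerate_threadings(seq_len, core_lengths):
    """Yield one tuple of start indices per threading.

    Assumption: core segments keep their order, do not overlap, and may be
    separated by gaps of any length >= 0 (condition (i) in the abstract).
    """
    m = len(core_lengths)
    slack = seq_len - sum(core_lengths)  # residues left over for gaps
    if slack < 0:
        return  # sequence too short to hold every core segment
    # Start index of each segment when all gaps are zero.
    prefix = [sum(core_lengths[:i]) for i in range(m)]
    # Non-decreasing offsets shift each segment to the right.
    for offsets in combinations_with_replacement(range(slack + 1), m):
        yield tuple(p + g for p, g in zip(prefix, offsets))


def threading_score(seq, starts, contacts, pair_score):
    """Sum pairwise terms over contacts (condition (ii) in the abstract).

    Each contact is ((segment_a, offset_a), (segment_b, offset_b)).
    """
    total = 0
    for (a, i), (b, j) in contacts:
        total += pair_score(seq[starts[a] + i], seq[starts[b] + j])
    return total


def exists_threading(seq, core_lengths, contacts, pair_score, K):
    """Decision problem: is there a threading with score <= K?"""
    return any(
        threading_score(seq, starts, contacts, pair_score) <= K
        for starts in enumerate_threadings(len(seq), core_lengths)
    )


def hp_score(x, y):
    """Toy pair potential (an assumption): reward an H-H contact with -1."""
    return -1 if x == "H" and y == "H" else 0


if __name__ == "__main__":
    # Two one-residue core segments that are in contact with each other.
    contacts = [((0, 0), (1, 0))]
    print(exists_threading("HPPH", [1, 1], contacts, hp_score, K=-1))
```

In this toy model the number of candidate threadings is the binomial coefficient C(slack + m, m) for m core segments, so exhaustive enumeration quickly becomes impractical as the sequence and motif grow. The pairwise terms are what prevent the score from decomposing segment by segment, which is the situation the NP-completeness result addresses.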

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms
Amino Acid Sequence
Amino Acids / chemistry
Molecular Structure
Protein Engineering* / methods
Protein Engineering* / statistics & numerical data
Protein Folding*
Proteins / chemistry*
Proteins / genetics

Substances

Amino Acids
Proteins

Grants and funding

RR02275-05/RR/NCRR NIH HHS/United States