How significant is a protein structure similarity with TM-score = 0.5?
- PMID: 20164152
- PMCID: PMC2913670
- DOI: 10.1093/bioinformatics/btq066
Abstract
Motivation: Protein structure similarity is often measured by root mean squared deviation, global distance test score and template modeling score (TM-score). However, the scores themselves do not indicate how significant the structural similarity is, and there is no established quantitative relation between the scores and conventional fold classifications. This article aims to answer two questions: (i) What is the statistical significance of a TM-score? (ii) What is the probability that two proteins have the same fold given a specific TM-score?
Results: We first made an all-to-all gapless structural match on 6684 non-homologous single-domain proteins in the PDB and found that the TM-scores follow an extreme value distribution. The data allow us to assign each TM-score a P-value that measures the chance of two randomly selected proteins obtaining an equal or higher TM-score. For a TM-score of 0.5, for instance, the P-value is 5.5 × 10^-7, which means we would need to consider at least 1.8 million random protein pairs to obtain a TM-score of no less than 0.5. Second, we examine the posterior probability that two proteins share the same fold, using three datasets: SCOP, CATH and the consensus of SCOP and CATH. The posterior probabilities from the different datasets show a similar rapid phase transition around TM-score = 0.5. This finding indicates that TM-score can be used as an approximate but quantitative criterion for protein topology classification, i.e. protein pairs with a TM-score >0.5 are mostly in the same fold while those with a TM-score <0.5 are mainly not in the same fold.
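The relation between a TM-score, its P-value and the number of random pairs quoted above can be illustrated with a minimal Python sketch. It assumes the extreme value distribution is of the Gumbel (type I) form. The abstract does not state the fitted parameters, so the location and scale below are assumptions chosen so that the curve approximately reproduces the quoted P-value at TM-score = 0.5; the function names are hypothetical.

```python
import math

# Assumed Gumbel (type I extreme value) parameters. They are NOT given in the
# abstract; they are picked so that the P-value at TM-score = 0.5 lands near
# the quoted 5.5e-7.
ASSUMED_LOCATION = 0.1512
ASSUMED_SCALE = 0.0242


def tm_score_p_value(tm_score, location=ASSUMED_LOCATION, scale=ASSUMED_SCALE):
    """Chance that a random protein pair reaches at least `tm_score`,
    under the assumed Gumbel model."""
    z = (tm_score - location) / scale
    # Survival function 1 - exp(-exp(-z)); expm1 keeps precision for tiny values.
    return -math.expm1(-math.exp(-z))


def random_pairs_needed(p_value):
    """Expected number of random pairs to examine before one reaches the score."""
    return 1.0 / p_value


if __name__ == "__main__":
    p = tm_score_p_value(0.5)
    print(f"P-value at TM-score 0.5 (assumed model): {p:.2e}")
    # Using the P-value quoted in the abstract directly:
    print(f"Random pairs needed: {random_pairs_needed(5.5e-7):,.0f}")
```

The second function is simply the reciprocal of the P-value, which is where the figure of roughly 1.8 million pairs comes from. The first function should be read as an illustration of how an extreme value fit turns a score into a P-value, not as the authors' fitted model.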
