Prediction and analysis of cell-penetrating peptides using pseudo-amino acid composition and random forest models

Amino Acids. 2015 Jul;47(7):1485-93. doi: 10.1007/s00726-015-1974-5. Epub 2015 Apr 18.

Abstract

Cell-penetrating peptides, a group of short peptides, can traverse cell membranes to enter cells and thereby facilitate the uptake of various molecular cargoes. They therefore have the potential to become powerful drug delivery systems. Correctly identifying peptides as cell-penetrating or non-cell-penetrating would accelerate this application. In this study, we determined which features are important for a peptide to be cell-penetrating or non-cell-penetrating and built a predictive model based on the key features extracted from this analysis. The investigated peptides were retrieved from a previous study, and each was encoded as a numeric vector according to six amino acid properties (amino acid frequency, codon diversity, electrostatic charge, molecular volume, polarity, and secondary structure) using the pseudo-amino acid composition method. The minimum redundancy maximum relevance and incremental feature selection methods were then employed to analyze these features, and some were found to be key determinants of cell penetration. In parallel, an optimal random forest prediction model was built. We hope that our findings will provide new resources for the study of cell-penetrating peptides.
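
The encoding step described above can be illustrated with a minimal sketch of a pseudo-amino acid composition in the commonly used form of 20 normalized residue frequencies followed by a few sequence-order correlation terms. Everything specific here is an assumption made for illustration and is not taken from the study: the function name `pseaac`, the parameters `lam` (number of correlation lags) and `weight`, and the two property scales (Kyte-Doolittle hydropathy and a simplified integer side-chain charge), which stand in for the six properties listed in the abstract.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Stand-in property scales (assumptions, NOT the tables used in the study).
# Kyte-Doolittle hydropathy index:
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charge: +1 for K/R, -1 for D/E, 0 otherwise.
CHARGE = {aa: 0.0 for aa in AMINO_ACIDS}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def standardize(scale):
    """Rescale a property to zero mean and unit spread over the 20 residues."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


PROPERTIES = [standardize(HYDROPATHY), standardize(CHARGE)]


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector."""
    seq = sequence.upper()
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than the number of lags")
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains a non-standard residue")

    # Conventional amino acid composition: occurrence frequency per residue.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # Sequence-order terms: for each lag, average the squared property
    # difference between residues that are `lag` positions apart.
    thetas = []
    for lag in range(1, lam + 1):
        total = 0.0
        for i in range(n - lag):
            a, b = seq[i], seq[i + lag]
            total += mean((prop[a] - prop[b]) ** 2 for prop in PROPERTIES)
        thetas.append(total / (n - lag))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    # TAT(47-57), a well-known cell-penetrating peptide, as example input.
    vector = pseaac("YGRKKRRQRRR", lam=3)
    print(len(vector), round(sum(vector), 6))
```

Vectors of this kind, one per peptide, would be the input to the feature ranking and random forest steps; the number of lags and the weight are tunable and would need to be chosen for the data at hand.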

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Cell-Penetrating Peptides / chemistry*
  • Codon
  • Decision Trees
  • Models, Chemical
  • Models, Molecular
  • Protein Structure, Secondary

Substances

  • Cell-Penetrating Peptides
  • Codon