Lactylation prediction models based on protein sequence and structural feature fusion

Brief Bioinform. 2024 Jan 22;25(2):bbad539. doi: 10.1093/bib/bbad539.

Abstract

Lysine lactylation (Kla) is a newly discovered posttranslational modification that is involved in important life activities, such as glycolysis-related cell function, macrophage polarization and nervous system regulation, and has received widespread attention due to the Warburg effect in tumor cells. In this work, we first design a natural language processing method to automatically extract the 3D structural features of Kla sites, avoiding potential biases caused by manually designed structural features. Then, we establish two Kla prediction frameworks, the Attention-based feature fusion Kla model (ABFF-Kla) and the Embedding-based feature fusion Kla model (EBFF-Kla), which integrate sequence features and structural features through an attention layer and an embedding layer, respectively. The results indicate that ABFF-Kla and EBFF-Kla, which fuse features from protein sequences and spatial structures, have better predictive performance than models that use only sequence features. Our work provides an approach for the automatic extraction of protein structural features, as well as a flexible framework for Kla prediction. The source code and the training data of ABFF-Kla and EBFF-Kla are publicly deposited at: https://github.com/ispotato/Lactylation_model.
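
To make the idea of attention-based fusion of sequence and structural features concrete, the following is a minimal sketch and not the ABFF-Kla architecture itself, whose details the abstract does not give. It assumes each Kla site is already described by one sequence feature vector and one structural feature vector of equal length; the function name `attention_fuse`, the parameters `W` and `v`, and the random toy inputs are all hypothetical.

```python
import numpy as np

def attention_fuse(seq_feat, struct_feat, W, v):
    """Fuse two feature vectors using softmax attention weights.

    Assumption: W (d x h) and v (h,) stand in for learned parameters;
    here they are supplied by the caller rather than trained.
    """
    feats = np.stack([seq_feat, struct_feat])        # shape (2, d)
    scores = np.tanh(feats @ W) @ v                  # one score per feature source
    scores = scores - scores.max()                   # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the two sources
    fused = weights @ feats                          # weighted sum, shape (d,)
    return fused, weights

# Toy inputs (random, for illustration only).
rng = np.random.default_rng(0)
d, h = 8, 4
seq_feat = rng.normal(size=d)
struct_feat = rng.normal(size=d)
W = rng.normal(size=(d, h))
v = rng.normal(size=h)

fused, weights = attention_fuse(seq_feat, struct_feat, W, v)
```

The two weights sum to one, so the fused vector is a convex combination of the sequence and structural representations. In a trained model the weights would be learned jointly with the classifier.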

Keywords: automatic feature extraction; deep learning; feature fusion; lysine lactylation; residue contact map.
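
The keyword "residue contact map" refers to a matrix recording which residue pairs lie close together in 3D space. A minimal sketch of how such a map can be computed is given below. It assumes one representative coordinate per residue (for example the C-alpha atom) and a distance cutoff of 8 Å, which is a common convention and not necessarily the setting used for ABFF-Kla or EBFF-Kla. The helper names `contact_map` and `site_window` and the toy coordinates are hypothetical.

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Binary contact map from per-residue coordinates in angstroms.

    Assumption: one coordinate per residue and an 8 A cutoff.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise differences via broadcasting, shape (n, n, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)             # shape (n, n)
    return (dist <= cutoff).astype(np.int8)

def site_window(cmap, site, half_width=2):
    """Sub-map around a residue index, clipped at the chain ends."""
    lo = max(0, site - half_width)
    hi = min(cmap.shape[0], site + half_width + 1)
    return cmap[lo:hi, lo:hi]

# Toy chain: four residues on a straight line, 5 A apart.
toy = [[0, 0, 0], [5, 0, 0], [10, 0, 0], [15, 0, 0]]
cmap = contact_map(toy)
window = site_window(cmap, site=0, half_width=1)
```

For the toy chain only neighbouring residues fall within the cutoff, so the map is tridiagonal. A window around a candidate lysine gives a fixed local view of its spatial neighbourhood that a downstream model could consume.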

MeSH terms

  • Amino Acid Sequence
  • Lysine*
  • Natural Language Processing*
  • Protein Domains
  • Protein Processing, Post-Translational

Substances

  • Lysine