LDAformer: predicting lncRNA-disease associations based on topological feature extraction and Transformer encoder

Brief Bioinform. 2022 Nov 19;23(6):bbac370. doi: 10.1093/bib/bbac370.

Abstract

The identification of long noncoding RNA (lncRNA)-disease associations is of great value for disease diagnosis and treatment, and computational methods are now commonly used to predict potential lncRNA-disease associations. However, existing methods do not sufficiently extract key features during data processing, and their learning models are either underpowered or overly complex. There is therefore still room to improve predictive performance on both fronts. In this work, we propose LDAformer, a novel lncRNA-disease association prediction method based on topological feature extraction and a Transformer encoder. We construct a heterogeneous network by integrating the associations between lncRNAs, diseases and microRNAs (miRNAs). Intra-class similarities and inter-class associations are represented together as a lncRNA-disease-miRNA weighted adjacency matrix to unify semantics. Next, we design a topological feature extraction process to obtain the multi-hop topological pathway features latent in the adjacency matrix. Finally, to capture the interdependencies between heterogeneous pathways, a Transformer encoder based on the global self-attention mechanism is employed to predict lncRNA-disease associations. The efficient feature extraction and the intuitive yet powerful learning model together yield strong predictive performance. Computational experiments on two datasets show that our method outperforms state-of-the-art baseline methods. Additionally, case studies further indicate its capability to discover new associations accurately.
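
The adjacency-matrix construction and multi-hop feature extraction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the block layout, and the use of plain matrix powers as "multi-hop" features are all assumptions made for illustration, and the abstract does not specify these details.

```python
import numpy as np


def build_adjacency(sim_l, sim_d, sim_m, a_ld, a_lm, a_dm):
    """Assemble a symmetric lncRNA-disease-miRNA weighted adjacency matrix.

    Diagonal blocks hold intra-class similarities; off-diagonal blocks hold
    inter-class associations. The block order is an assumption.
    """
    return np.block([
        [sim_l,  a_ld,   a_lm],
        [a_ld.T, sim_d,  a_dm],
        [a_lm.T, a_dm.T, sim_m],
    ])


def multi_hop_features(adj, i, j, hops=3):
    """Return one feature row per hop for the node pair (i, j).

    Row k concatenates rows i and j of adj**(k+1), so each row summarizes
    the weighted paths of length k+1 leaving the two nodes.
    (Hypothetical choice; the paper may define pathway features differently.)
    """
    feats = []
    power = np.eye(adj.shape[0])
    for _ in range(hops):
        power = power @ adj
        feats.append(np.concatenate([power[i], power[j]]))
    return np.stack(feats)  # shape: (hops, 2 * n_nodes)


# Toy example: 2 lncRNAs, 2 diseases, 1 miRNA (made-up values).
adj = build_adjacency(
    np.eye(2), np.eye(2), np.eye(1),
    np.array([[1, 0], [0, 1]]),  # lncRNA-disease associations
    np.array([[1], [0]]),        # lncRNA-miRNA associations
    np.array([[0], [1]]),        # disease-miRNA associations
)
tokens = multi_hop_features(adj, i=0, j=2, hops=3)  # lncRNA 0, disease 0
```

In a setup like this, each hop's row could serve as one token in a sequence fed to a Transformer encoder, so that self-attention relates pathways of different lengths; that wiring is likewise an assumption here.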

Keywords: Transformer; global self-attention mechanism; lncRNA-disease association; topological feature extraction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods
  • Humans
  • MicroRNAs* / genetics
  • Neoplasms* / genetics
  • RNA, Long Noncoding* / genetics
  • RNA, Long Noncoding* / metabolism

Substances

  • RNA, Long Noncoding
  • MicroRNAs