Introducing high correlation and high quality instances for few-shot entity linking

Neural Netw. 2025 Jan:181:106783. doi: 10.1016/j.neunet.2024.106783. Epub 2024 Oct 9.

Abstract

Entity linking, the process of connecting textual mentions in documents to canonical entities within a knowledge base, plays an integral role in many natural language processing tasks. A significant challenge in the field is the scarcity of resources, particularly in specialized domains, which accentuates the importance of few-shot entity linking in real-world scenarios. Previous work addresses the lack of in-domain labeled data by generating synthetic data. However, we argue that synthetic data is frequently far from high-quality, and that such low-quality instances introduce noise and diminish the ability of entity linking models to comprehend the semantic consistency between mentions and entities. In this paper, we propose the H2FEL framework, which introduces high-correlation and high-quality instances for few-shot entity linking. We argue that general domains contain abundant high-quality labeled data, some of which is highly correlated with the target domain. Thus, we first design an adversarial instance extraction module to extract such high-correlation instances without depending on additional manually annotated data. To further mitigate the negative effects of low-correlation instances, we train our entity linking model via a variant of curriculum learning. Experimental results on the few-shot entity linking dataset demonstrate the effectiveness of the proposed H2FEL framework, which achieves state-of-the-art performance.
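The abstract does not specify which variant of curriculum learning is used, so the following is only a generic illustration of the idea, not the H2FEL procedure. It assumes each general-domain instance already carries a correlation score with the target domain (hypothetical here; in the paper such a signal would presumably come from the adversarial extraction module) and builds cumulative training pools that begin with the most correlated instances. The names `Instance` and `curriculum_stages` are invented for this sketch.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: (instance id, correlation score with the target domain).
Instance = Tuple[str, float]


def curriculum_stages(instances: Sequence[Instance], num_stages: int) -> List[List[str]]:
    """Return cumulative training pools, most target-correlated instances first.

    Generic curriculum schedule for illustration only; it is an assumption,
    not the variant described in the paper.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    # Rank instances from highest to lowest correlation score.
    ranked = sorted(instances, key=lambda item: item[1], reverse=True)

    stages: List[List[str]] = []
    for stage in range(1, num_stages + 1):
        # Ceiling division: each stage admits a growing prefix of the ranking.
        cutoff = -(-len(ranked) * stage // num_stages)
        stages.append([name for name, _ in ranked[:cutoff]])
    return stages


if __name__ == "__main__":
    scored = [("a", 0.9), ("b", 0.2), ("c", 0.7), ("d", 0.4)]
    for index, pool in enumerate(curriculum_stages(scored, 2), start=1):
        print(f"stage {index}: {pool}")
```

Under this schedule, a model would train on the first pool before later pools add progressively less correlated instances, so low-correlation data enters only after the model has seen the instances closest to the target domain.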

Keywords: Deep neural learning; Entity linking; Few-shot learning; Natural language processing.

MeSH terms

  • Humans
  • Knowledge Bases
  • Natural Language Processing*
  • Neural Networks, Computer
  • Semantics