Semi-Supervised Learning Under General Causal Models

IEEE Trans Neural Netw Learn Syst. 2024 May 23:PP. doi: 10.1109/TNNLS.2024.3392750. Online ahead of print.

Abstract

Semi-supervised learning (SSL) aims to train a machine learning (ML) model using both labeled and unlabeled data. While unlabeled data have been used in various ways to improve prediction accuracy, the reasons why they help are not fully understood. One interesting and promising direction is to understand SSL from a causal perspective. In light of the independent causal mechanisms (ICM) principle, unlabeled data can be helpful when the label causes the features, but not vice versa. However, the causal relations between features and labels can be complex in real-world applications. In this article, we propose an SSL framework that works with general causal models in which the variables have flexible causal relations. More specifically, we explore the causal graph structures and design corresponding causal generative models, which can be learned with the help of unlabeled data. The learned causal generative model can then generate synthetic labeled data for training a more accurate predictive model. We verify the effectiveness of the proposed method through empirical studies on both simulated and real data.
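As a rough illustration of the pipeline the abstract describes, the sketch below assumes the simplest causal direction the ICM discussion refers to: the label causes the features (Y → X), with class-conditional Gaussian features. A generative model is fit with a small labeled set plus many unlabeled points via EM, synthetic labeled data are sampled from it, and a simple predictor is trained on the synthetic data. All specifics here (one-dimensional features, unit variance, the nearest-mean classifier) are illustrative assumptions, not the paper's actual models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth for the causal direction Y -> X:
# Y ~ Bernoulli(0.5), X | Y=k ~ N(mu_k, 1). Unknown to the learner.
true_mu = np.array([-2.0, 2.0])

def sample(n, labeled=True):
    y = rng.integers(0, 2, size=n)
    x = rng.normal(true_mu[y], 1.0)
    return (x, y) if labeled else x

x_lab, y_lab = sample(20)           # few labeled points
x_unl = sample(500, labeled=False)  # many unlabeled points

# Learn the generative model P(X | Y) with EM on a two-component
# Gaussian mixture, warm-started from the labeled class means so the
# components stay aligned with the labels.
mu = np.array([x_lab[y_lab == k].mean() for k in (0, 1)])
for _ in range(50):
    # E-step: soft responsibilities on unlabeled data
    # (equal class priors, unit variance assumed).
    logp = -0.5 * (x_unl[:, None] - mu[None, :]) ** 2
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update means using labeled (hard) and unlabeled (soft) counts.
    for k in (0, 1):
        w = r[:, k]
        mu[k] = (x_lab[y_lab == k].sum() + (w * x_unl).sum()) / (
            (y_lab == k).sum() + w.sum()
        )

# Generate synthetic labeled data from the learned generative model ...
y_syn = rng.integers(0, 2, size=1000)
x_syn = rng.normal(mu[y_syn], 1.0)

# ... and train a predictive model on it (here, a nearest-class-mean rule).
threshold = (x_syn[y_syn == 0].mean() + x_syn[y_syn == 1].mean()) / 2.0
predict = lambda x: (x > threshold).astype(int)

x_test, y_test = sample(2000)
acc = (predict(x_test) == y_test).mean()
print(f"accuracy with generative augmentation: {acc:.3f}")
```

With well-separated classes the EM step recovers the class-conditional means from mostly unlabeled data, so the synthetic sample supports a far better decision boundary than the 20 labeled points alone would; the paper's framework extends this idea to flexible causal graphs rather than the fixed Y → X structure assumed here.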