Harnessing Side Information for Classification Under Label Noise

IEEE Trans Neural Netw Learn Syst. 2020 Sep;31(9):3178-3192. doi: 10.1109/TNNLS.2019.2938782. Epub 2019 Sep 25.

Abstract

Practical data sets often contain label noise caused by human factors or measurement errors, which means that a fraction of the training examples may be mislabeled. Such noisy labels mislead classifier training and can severely degrade classification performance. Existing approaches to this problem are usually developed through various surrogate loss functions under the framework of empirical risk minimization. However, they are suitable only for binary classification and also require strong prior knowledge. Therefore, this article treats the example features as side information and formulates the noisy label removal problem as a matrix recovery problem. We denote our proposed method as "label noise handling via side information" (LNSI). Specifically, the observed label matrix is decomposed as the sum of two parts: the first part reveals the true labels and is obtained by applying a low-rank mapping to the side information, and the second part captures the incorrect labels and is modeled by a row-sparse matrix. The merits of this formulation lie in three aspects: 1) the strong recovery ability of this strategy has been demonstrated by extensive theoretical work on side information; 2) multi-class situations can be handled directly with the aid of the learned projection matrix; and 3) only very weak assumptions are required for model design, making LNSI applicable to a wide range of practical problems. Moreover, we theoretically derive the generalization bound of LNSI and show that the expected classification error of LNSI is upper bounded. The experimental results on a variety of data sets, including UCI benchmark data sets and practical data sets, confirm the superiority of LNSI over state-of-the-art approaches to label noise handling.
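
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the abstract does not state the objective or the solver, so the sketch assumes a least-squares fit Y ≈ XW + E with a nuclear-norm penalty on the projection W (to encourage a low-rank mapping) and an l2,1 penalty on E (to encourage row sparsity), solved by simple alternating proximal updates. The function names and the weights `lam_w` and `lam_e` are hypothetical.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def row_shrink(M, tau):
    """Row-wise soft thresholding: proximal operator of tau * l2,1 norm."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale

def objective(X, Y, W, E, lam_w, lam_e):
    """Assumed objective: 0.5*||Y - XW - E||_F^2 + lam_w*||W||_* + lam_e*||E||_{2,1}."""
    fit = 0.5 * np.linalg.norm(Y - X @ W - E, "fro") ** 2
    nuc = np.linalg.svd(W, compute_uv=False).sum()
    l21 = np.linalg.norm(E, axis=1).sum()
    return fit + lam_w * nuc + lam_e * l21

def decompose_labels(X, Y, lam_w=0.1, lam_e=0.5, n_iter=200):
    """Split the observed label matrix Y (n x c) into X @ W and a row-sparse E.

    X is the n x d feature (side information) matrix. Rows of E with a large
    norm mark examples whose observed label is suspected to be wrong.
    """
    n, d = X.shape
    c = Y.shape[1]
    W = np.zeros((d, c))
    E = np.zeros((n, c))
    # Step size 1/L, where L is the Lipschitz constant of the gradient in W.
    step = 1.0 / max(np.linalg.norm(X, 2) ** 2, 1e-12)
    for _ in range(n_iter):
        # Proximal gradient step on W with E held fixed.
        grad = X.T @ (X @ W + E - Y)
        W = svt(W - step * grad, step * lam_w)
        # Exact minimization over E with W held fixed.
        E = row_shrink(Y - X @ W, lam_e)
    return W, E
```

Under these assumptions, a new example `x` could be assigned the class `np.argmax(x @ W)`, which is one way to read the abstract's remark that multi-class problems are handled through the learned projection matrix. Training rows where `np.linalg.norm(E, axis=1)` is nonzero are the ones this sketch flags as likely mislabeled.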

Publication types

  • Research Support, Non-U.S. Gov't