LEARNING GENERAL TRANSFORMATIONS OF DATA FOR OUT-OF-SAMPLE EXTENSIONS

IEEE Int Workshop Mach Learn Signal Process. 2020 Sep;2020. doi: 10.1109/mlsp49062.2020.9231660. Epub 2020 Oct 20.

Abstract

While generative models such as GANs have been successful at mapping from noise to specific distributions of data, or more generally from one distribution of data to another, they cannot isolate the transformation that is occurring and apply it to a new distribution not seen in training. As a result, they memorize the domain of the transformation and cannot generalize it out of sample. To address this, we propose a new neural network called a Neuron Transformation Network (NTNet) that isolates the signal representing the transformation itself from the other signals representing internal distribution variation. This signal can then be removed from a new dataset distributed differently from the one used in training. We demonstrate the effectiveness of our NTNet on more than a dozen synthetic and biomedical single-cell RNA sequencing datasets, where the NTNet learns the data transformation performed by genetic and drug perturbations on one sample of cells and successfully applies it to another sample of cells to predict treatment outcome.
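The out-of-sample setting described in the abstract can be illustrated with a toy sketch. This is not the NTNet: it is a deliberately simple stand-in that assumes the transformation is a constant per-feature shift, estimates that shift from one pre/post pair of samples, and applies it to a second sample drawn from a different distribution. The function names `fit_shift` and `apply_shift` and all of the synthetic data are hypothetical and exist only for illustration.

```python
import random
from statistics import fmean

def fit_shift(pre, post):
    """Estimate a per-feature shift as mean(post) - mean(pre).

    Toy stand-in for "isolating the transformation"; assumes the
    transformation is a constant offset, which real perturbations need not be.
    """
    n_features = len(pre[0])
    return [
        fmean(row[j] for row in post) - fmean(row[j] for row in pre)
        for j in range(n_features)
    ]

def apply_shift(sample, shift):
    """Apply the estimated shift to every row of a (possibly unseen) sample."""
    return [[v + s for v, s in zip(row, shift)] for row in sample]

random.seed(0)

# Synthetic "training" sample: cells before and after a hypothetical perturbation.
true_shift = [2.0, 0.0, -1.0]
pre = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
post = [[v + s for v, s in zip(row, true_shift)] for row in pre]

# A second sample distributed differently from the training sample.
new_sample = [[random.gauss(5.0, 0.5) for _ in range(3)] for _ in range(200)]

shift = fit_shift(pre, post)                 # learned on the first sample only
predicted = apply_shift(new_sample, shift)   # applied out of sample
```

Because the shift is stored separately from the data it was estimated on, it carries over to `new_sample` even though that sample lies far from the training distribution. A model that only maps the training `pre` distribution onto the training `post` distribution offers no such guarantee, which is the limitation the abstract attributes to standard generative models.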