DRCNN: Dynamic Routing Convolutional Neural Network for Multi-View 3D Object Recognition

Kai Sun; Jiangshe Zhang; Junmin Liu; Ruixuan Yu; Zengjie Song

doi:10.1109/TIP.2020.3039378

IEEE Trans Image Process. 2021;30:868-877. doi: 10.1109/TIP.2020.3039378. Epub 2020 Dec 4.

PMID: 33237859

Abstract

3D object recognition is one of the most important tasks in 3D data processing and has been studied extensively in recent years. Among the deep-learning methods proposed for it, view-based approaches form a typical class. However, the view pooling layer that these methods commonly use to fuse multi-view features causes a loss of visual information. To alleviate this problem, we construct a novel layer, the Dynamic Routing Layer (DRL), by modifying the dynamic routing algorithm of the capsule network so that it fuses the features of each view more effectively. Concretely, DRL converts the features by rearrangement and affine transformation, then uses the modified dynamic routing algorithm to adaptively choose among the converted features, instead of ignoring all but the most active feature as the view pooling layer does. We also show that the view pooling layer is a special case of DRL. Based on DRL, we further present a Dynamic Routing Convolutional Neural Network (DRCNN) for multi-view 3D object recognition. Experiments on three 3D benchmark datasets show that DRCNN outperforms many state-of-the-art methods, which demonstrates the efficacy of our method.
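
The contrast between view pooling and routing-based fusion can be illustrated with a minimal NumPy sketch. It is not the authors' DRL: the abstract does not specify the rearrangement step or how the routing algorithm is modified, so the code below assumes a plain routing-by-agreement loop with a softmax over views and a single fused output vector. The names view_pool, squash, and routing_fuse, the weight shapes, and the iteration count are all hypothetical.

```python
import numpy as np


def view_pool(features):
    """View pooling: element-wise max over views; features is (n_views, dim)."""
    return features.max(axis=0)


def squash(v, eps=1e-8):
    """Capsule-style squashing: keeps direction, maps the norm into [0, 1)."""
    norm_sq = np.sum(v * v)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)


def routing_fuse(features, weights, biases, n_iters=3):
    """Fuse view features with a simplified routing-by-agreement loop.

    Assumed shapes (illustrative, not from the paper):
      features: (n_views, dim), weights: (n_views, out_dim, dim),
      biases: (n_views, out_dim).
    Returns the fused vector and the final per-view coupling coefficients.
    """
    # Per-view affine transformation of the view features.
    u_hat = np.einsum("vod,vd->vo", weights, features) + biases
    logits = np.zeros(features.shape[0])
    for _ in range(n_iters):
        # Softmax over views gives the coupling coefficients (an assumption).
        c = np.exp(logits - logits.max())
        c /= c.sum()
        fused = squash(c @ u_hat)
        # Views that agree with the fused vector receive larger logits.
        logits = logits + u_hat @ fused
    return fused, c
```

Calling routing_fuse on an array of per-view features returns a fused vector together with one weight per view, so every view contributes to the result in proportion to its agreement, whereas view_pool keeps only the largest activation in each dimension.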