Memristor crossbars can be very useful for realizing edge-intelligence hardware, because the neural networks implemented by memristor crossbars can save significantly more computing energy and layout area than the conventional CMOS (complementary metal-oxide-semiconductor) digital circuits. One of the important operations used in neural networks is convolution. For performing the convolution by memristor crossbars, the full image should be partitioned into several sub-images. By doing so, each sub-image convolution can be mapped to small-size unit crossbars, of which the size should be defined as 128 × 128 or 256 × 256 to avoid the line resistance problem caused from large-size crossbars. In this paper, various convolution schemes with 3D, 2D, and 1D kernels are analyzed and compared in terms of neural network's performance and overlapping overhead. The neural network's simulation indicates that the 2D + 1D kernels can perform the sub-image convolution using a much smaller number of unit crossbars with less rate loss than the 3D kernels. When the CIFAR-10 dataset is tested, the mapping of sub-image convolution of 2D + 1D kernels to crossbars shows that the number of unit crossbars can be reduced almost by 90% and 95%, respectively, for 128 × 128 and 256 × 256 crossbars, compared with the 3D kernels. On the contrary, the rate loss of 2D + 1D kernels can be less than 2%. To improve the neural network's performance more, the 2D + 1D kernels can be combined with 3D kernels in one neural network. When the normalized ratio of 2D + 1D layers is around 0.5, the neural network's performance indicates very little rate loss compared to when the normalized ratio of 2D + 1D layers is zero. However, the number of unit crossbars for the normalized ratio = 0.5 can be reduced by half compared with that for the normalized ratio = 0.
Keywords: area-efficient mapping; convolutional neural networks; memristor crossbars; sub-image partitioning.