Research on calibration of a binocular stereo-vision imaging system based on the artificial neural network

J Opt Soc Am A Opt Image Sci Vis. 2023 Feb 1;40(2):337-354. doi: 10.1364/JOSAA.469332.

Abstract

Camera calibration is a key problem for 3D reconstruction in computer vision. Existing calibration methods, such as traditional calibration, active calibration, and self-calibration, all need to solve for the internal and external parameters of the imaging system to establish the image-object mapping relationship. The artificial neural network, which is based on a connectionist architecture, provides a novel approach to calibrating vision systems with nonlinear mappings. It can learn the image-object mapping relationship from a set of sample points without explicitly modeling the many uncertain intermediate factors. This paper examines that learning ability, using a binocular stereo-vision mapping model as the learning model to explore how well artificial neural networks capture the image-object mapping. The paper constructs sample libraries from the pixel and world coordinates of checkerboard corners, builds the artificial neural network, and verifies the network's learning performance through predictions on training and test samples. Furthermore, using a laser scanning binocular vision device constructed in the authors' laboratory together with the well-trained network, the 3D point cloud of a physical target is reconstructed. The experimental results show that the artificial neural network can learn the image-object mapping relationship well, more effectively avoid the impact of lens distortion, and achieve more accurate nonlinear mapping at the edge of the image. When the X and Y coordinates are in the range of 100 mm and the Z coordinates are in the range of 1000 mm, the absolute error rarely exceeds 2.5 mm, and the relative error is on the order of 10⁻³; for distance measurement at 1000 mm, the standard deviation does not exceed 1.5 mm. Network parameter selection experiments show that, for image-object mapping, using a three-layer network and increasing the number of hidden-layer nodes can improve the training time more significantly.
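The idea of learning the image-object mapping directly from sample points can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the synthetic two-camera model in `project` (focal length, 200 mm baseline, a single radial distortion coefficient), the working volume in `make_samples`, the 20-node tanh hidden layer, and the Adam-style update in `train`. Synthetic points stand in for the checkerboard corners, and a three-layer network maps the four pixel coordinates (uL, vL, uR, vR) to world coordinates (X, Y, Z) without ever being given the camera parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(world, cam_x, f=800.0, cx=320.0, cy=240.0, k1=-0.2):
    """Hypothetical pinhole camera at (cam_x, 0, 0) mm with one radial distortion term."""
    x = (world[:, 0] - cam_x) / world[:, 2]
    y = world[:, 1] / world[:, 2]
    d = 1.0 + k1 * (x**2 + y**2)          # radial distortion factor
    return np.column_stack([f * x * d + cx, f * y * d + cy])

def make_samples(n):
    """Random world points (mm) and their stereo pixel coordinates (uL, vL, uR, vR)."""
    world = np.column_stack([rng.uniform(-100, 100, n),
                             rng.uniform(-100, 100, n),
                             rng.uniform(800, 1200, n)])
    pixels = np.hstack([project(world, -100.0), project(world, 100.0)])
    return pixels, world

def forward(params, x):
    """Input layer -> tanh hidden layer -> linear output layer."""
    w1, b1, w2, b2 = params
    h = np.tanh(x @ w1 + b1)
    return h, h @ w2 + b2

def train(x, y, hidden=20, epochs=3000, lr=0.01):
    """Full-batch training on mean squared error with Adam-style updates."""
    params = [rng.normal(0, 0.5, (x.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.5, (hidden, y.shape[1])), np.zeros(y.shape[1])]
    m = [np.zeros_like(p) for p in params]
    v = [np.zeros_like(p) for p in params]
    losses = []
    for t in range(1, epochs + 1):
        h, pred = forward(params, x)
        err = pred - y
        losses.append(float(np.mean(err**2)))
        # Backpropagate the mean squared error through both layers.
        g_pred = 2.0 * err / err.size
        g_h = (g_pred @ params[2].T) * (1.0 - h**2)
        grads = [x.T @ g_h, g_h.sum(axis=0), h.T @ g_pred, g_pred.sum(axis=0)]
        for p, g, mi, vi in zip(params, grads, m, v):
            mi[:] = 0.9 * mi + 0.1 * g
            vi[:] = 0.999 * vi + 0.001 * g**2
            p -= lr * (mi / (1 - 0.9**t)) / (np.sqrt(vi / (1 - 0.999**t)) + 1e-8)
    return params, losses

# Build training and test sample libraries, then standardize with training statistics.
train_px, train_w = make_samples(500)
test_px, test_w = make_samples(200)
px_mu, px_sd = train_px.mean(axis=0), train_px.std(axis=0)
w_mu, w_sd = train_w.mean(axis=0), train_w.std(axis=0)

params, losses = train((train_px - px_mu) / px_sd, (train_w - w_mu) / w_sd)

# Predict world coordinates for unseen pixel pairs and report errors in mm.
_, pred_n = forward(params, (test_px - px_mu) / px_sd)
pred_w = pred_n * w_sd + w_mu
abs_err = np.abs(pred_w - test_w)
print("mean absolute error per axis (mm):", abs_err.mean(axis=0))
print("mean relative Z error:", np.mean(abs_err[:, 2] / test_w[:, 2]))
```

The errors printed by this sketch depend on the assumed geometry and training settings and are not comparable to the figures reported in the abstract. Varying `hidden` and `epochs` gives a rough feel for the kind of network parameter selection experiment the abstract describes.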