Deep Recognition of Vanishing-Point-Constrained Building Planes in Urban Street Views

IEEE Trans Image Process. 2020 Apr 15. doi: 10.1109/TIP.2020.2986894. Online ahead of print.

Abstract

This paper presents a new approach to recognizing vanishing-point-constrained building planes from a single street-view image. We first design a novel convolutional neural network (CNN) architecture that generates a geometric segmentation of per-pixel orientations from the input image. The network combines two-stream features of general visual cues and surface normals in gated convolution layers, and employs a deeply supervised loss that encapsulates multi-scale convolutional features. Our experiments on a new benchmark with fine-grained plane segmentations of real-world street views show that our network outperforms state-of-the-art methods for both semantic and geometric segmentation. The pixel-wise segmentation, however, exhibits coarse boundaries and discontinuities. We therefore propose to rectify it into perspectively projected quads based on the spatial proximity between the segmentation masks and exterior line segments detected through image processing. We demonstrate how the results can be used to perspectively overlay images and icons on building planes in input photos and to provide visual cues for various applications.
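
To illustrate the overlay application mentioned above, the following is a minimal sketch of how an icon's rectangle could be mapped onto a recovered building quad with a planar homography. It is not the authors' implementation: the direct linear transform (DLT) used here is a standard textbook technique swapped in for illustration, and the function names `homography_from_quad` and `project`, as well as the corner coordinates in the usage note, are hypothetical.

```python
import numpy as np


def homography_from_quad(src, dst):
    """Estimate the 3x3 homography H with H @ [x, y, 1] ~ [u, v, 1]
    from four (x, y) -> (u, v) point pairs using the DLT.

    Assumption: the four points are in general position (no three collinear).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the null vector of A: the last right singular vector.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Normalize the scale; assumes H[2, 2] is non-zero for this configuration.
    return H / H[2, 2]


def project(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    # Divide by the third homogeneous coordinate to return to pixels.
    return mapped[:, :2] / mapped[:, 2:3]
```

As a usage example, the corners of a 100 x 100 icon, `[(0, 0), (100, 0), (100, 100), (0, 100)]`, could be mapped to a hypothetical facade quad such as `[(50, 40), (200, 60), (190, 220), (60, 180)]`; calling `project(H, icon_corners)` should then reproduce the quad's corners up to floating-point error, and the same `H` can be used to warp every icon pixel onto the building plane.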