Remote sensing image description based on word embedding and end-to-end deep learning

Sci Rep. 2021 Feb 4;11(1):3162. doi: 10.1038/s41598-021-82704-4.

Abstract

This study proposes an end-to-end image description generation model, based on word embeddings, that classifies and identifies Populus euphratica and Tamarix in complex remote sensing images by describing them in precise, concise natural-language sentences. First, to address category ambiguity over large-scale regions in remote sensing images, a co-occurrence matrix and global vectors for word representation (GloVe) are used to generate word vector features for the objects to be identified. Second, a new multi-level end-to-end model is employed to describe the image content in more detail and to improve the description of P. euphratica and Tamarix. Experimental results show that the sentences generated by this method describe P. euphratica and Tamarix in remote sensing images better than those produced by conventional deep learning methods.
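
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind a word co-occurrence matrix as used by global vectors for word representation (GloVe), not the authors' pipeline. The two captions, the window size, and the function names `cooccurrence_counts` and `glove_weight` are illustrative assumptions; the 1/distance weighting and the defaults `x_max=100`, `alpha=0.75` follow the commonly cited GloVe formulation.

```python
from collections import defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count symmetric, distance-weighted word co-occurrences.

    Hypothetical helper: returns a dict mapping (word, context) pairs
    to accumulated weights, i.e. a sparse co-occurrence matrix.
    """
    counts = defaultdict(float)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Visit context words to the right, then mirror each count
            # so that the matrix is symmetric.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                weight = 1.0 / (j - i)  # nearer words contribute more
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe-style weighting f(x) that caps the influence of frequent pairs."""
    return (x / x_max) ** alpha if x < x_max else 1.0


# Made-up captions, used only to exercise the functions.
captions = [
    "populus euphratica grows near the river",
    "tamarix grows near the river",
]
counts = cooccurrence_counts(captions, window=2)
```

In a GloVe-style setup, word vectors are then fitted so that their dot products approximate the logarithm of these counts, with `glove_weight` scaling each pair's contribution to the loss. How the resulting vectors feed into the multi-level description model is not specified in the abstract.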