DB-EAC and LSTR: DBnet based seal text detection and Lightweight Seal Text Recognition

PLoS One. 2024 May 16;19(5):e0301862. doi: 10.1371/journal.pone.0301862. eCollection 2024.

Abstract

Recognition of the key text in Chinese seals can speed up document approval and improve the office efficiency of enterprises and government administrative departments. However, image blurring and occlusion keep the accuracy of Chinese seal recognition low, and real-world datasets are very limited. To address these problems, we improve the differentiable binarization detection network (DBNet) to build DB-ECA, a model for text-region detection, and propose LSTR (Lightweight Seal Text Recognition) for text recognition. An efficient channel attention module is added to the differentiable binarization network to resolve the feature-pyramid conflict, and the convolutional backbone is restructured to delay downsampling and thereby reduce the loss of semantic features. LSTR uses a lightweight CNN better suited to generalizing from small samples, and dynamically fuses positional and visual information through a self-attention-based inference layer that predicts the label distribution of the feature sequence in parallel. The inference layer not only compensates for the weak discriminative power of shallow CNN features but also helps CTC (Connectionist Temporal Classification) align feature regions accurately with their target characters. In experiments on our homemade dataset, DB-ECA outperforms five commonly used detection models, achieving the best precision, recall, and F-measure of 90.29, 85.17, and 87.65, respectively. Compared with five recognition models from the last three years, LSTR achieves the highest accuracy of 91.29%, with the additional advantages of a small parameter count and fast inference. The experimental results demonstrate the novelty and effectiveness of our models.
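For illustration, the efficient channel attention (ECA) mechanism that DB-ECA adds to DBNet can be sketched as below. This is a minimal NumPy rendering of the standard ECA design (global average pooling, a 1-D convolution across channels, and a sigmoid gate), not the authors' exact implementation; the kernel size and the fixed averaging weights are placeholders, since a trained model would learn the convolution weights.

```python
import numpy as np

def eca(feature_map, kernel_size=3):
    """Efficient Channel Attention sketch: pool each channel to a scalar,
    run a 1-D convolution across the channel descriptors, and use a
    sigmoid gate to rescale each channel of the input.
    feature_map: array of shape (C, H, W)."""
    c = feature_map.shape[0]
    pooled = feature_map.mean(axis=(1, 2))          # (C,) per-channel descriptor
    pad = kernel_size // 2
    padded = np.pad(pooled, pad, mode="edge")
    # Placeholder kernel; in a real model these weights are learned.
    weights = np.full(kernel_size, 1.0 / kernel_size)
    conv = np.array([padded[i:i + kernel_size] @ weights for i in range(c)])
    gate = 1.0 / (1.0 + np.exp(-conv))              # per-channel weight in (0, 1)
    return feature_map * gate[:, None, None]
```

Because the attention is computed with a 1-D convolution over channels rather than fully connected layers, it adds only `kernel_size` parameters per module, which is consistent with the paper's emphasis on lightweight design.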
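The role CTC plays in aligning feature regions with target characters can be illustrated with a minimal greedy decode: collapse consecutive duplicate predictions, then drop the blank symbol. The blank index below is an illustrative convention, not a detail taken from the paper.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding sketch: merge consecutive duplicate labels,
    then remove the blank symbol.
    frame_labels: per-frame argmax label indices from the recognizer."""
    decoded = []
    prev = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not blank.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# Example: frames [0,1,1,0,1,2,2,0] decode to the label sequence [1, 1, 2];
# the blank between the two 1s keeps them as distinct characters.
```

This collapsing rule is why per-frame predictions that discriminate cleanly between adjacent characters (as the self-attention inference layer is designed to encourage) lead to more accurate alignments.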

MeSH terms

  • Algorithms*
  • Neural Networks, Computer
  • Pattern Recognition, Automated / methods

Grants and funding

This work was financially supported by the National Natural Science Foundation of China (Grant No. 61962005). The funder had no role in study design, data collection and analysis, the decision to publish, or preparation of the manuscript.