Appl Plant Sci. 2020 Jun 26;8(6):e11351.
doi: 10.1002/aps3.11351. eCollection 2020 Jun.

GinJinn: An object-detection pipeline for automated feature extraction from herbarium specimens

Free PMC article

Tankred Ott et al. Appl Plant Sci. 2020.

Abstract

Premise: The generation of morphological data in evolutionary, taxonomic, and ecological studies of plants using herbarium material has traditionally been a labor-intensive task. Recent progress in machine learning using deep artificial neural networks (deep learning) for image classification and object detection has facilitated the establishment of a pipeline for the automatic recognition and extraction of relevant structures in images of herbarium specimens.

Methods and results: We implemented an extendable pipeline based on state-of-the-art deep-learning object-detection methods to collect leaf images from herbarium specimens of two species of the genus Leucanthemum. Using 183 specimens as the training data set, our pipeline extracted one or more intact leaves in 95% of the 61 test images.

Conclusions: We establish GinJinn as a deep-learning object-detection tool for the automatic recognition and extraction of individual leaves or other structures from herbarium specimens. Our pipeline offers greater flexibility and a lower entrance barrier than previous image-processing approaches based on hand-crafted features.

Keywords: TensorFlow; deep learning; herbarium specimens; object detection; visual recognition.


Figures

FIGURE 1
Flow diagram of the six GinJinn pipeline steps. A project folder is generated using ginjinn new (1) and the configuration file is modified depending on the user’s needs (1.1). The preparation (2), processing (3), training (4), and export (5) steps are executed sequentially with specific GinJinn commands (setup_dataset, setup_model, train, and export, respectively), or alternatively at once with the single ginjinn auto command. When not using ginjinn auto, the user can modify intermediary TensorFlow configuration files (3.1) for additional control over the model parameters and augmentation options. The trained and exported model can be used for inference of bounding boxes on new data using ginjinn detect. GinJinn commands are indicated by the yellow process boxes. Data inputs and outputs are illustrated with solid and dashed arrows, respectively. After bounding box detection, the extracted structures of interest can be supplied to other tools for downstream analyses.
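
The ordering of steps described in this caption can be summarized in a few lines of Python. This is only an illustrative sketch: the subcommand names are taken from the caption, while the helper `pipeline_subcommands` and its `use_auto` parameter are hypothetical names introduced here. It lists subcommand names only and makes no claim about the arguments each GinJinn command expects.

```python
from typing import List

# Subcommand names as given in the Figure 1 caption (steps 2 to 5).
MANUAL_STEPS = ["setup_dataset", "setup_model", "train", "export"]


def pipeline_subcommands(use_auto: bool = False) -> List[str]:
    """Return the ordered GinJinn subcommand names for one full run.

    Hypothetical helper for illustration only; it does not invoke GinJinn.
    With use_auto=True, the four intermediate steps collapse into "auto".
    """
    middle = ["auto"] if use_auto else list(MANUAL_STEPS)
    return ["new", *middle, "detect"]


if __name__ == "__main__":
    print(" -> ".join(pipeline_subcommands()))
    print(" -> ".join(pipeline_subcommands(use_auto=True)))
```

Running the steps one at a time is what leaves room to edit the intermediary TensorFlow configuration files (3.1) before training; the `auto` route skips that opportunity.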
FIGURE 2
(A) Output type ‘ibb’ (image with bounding boxes) showing class‐wise predicted bounding boxes of leaves with a score of 0.5 or higher drawn on the original image of a herbarium specimen. The score can be interpreted as a probability that the content of the bounding box belongs to a certain object class (in this case, a leaf). (B) Output type ‘ebb’ (extracted bounding boxes with a padding of 25 pixels) for selected true positive examples of the detected leaves shown in A. (C) Output type ‘ebb’ for selected false positive examples of the leaves shown in A.
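
The score cutoff and padded extraction described in this caption can be sketched in plain Python. This is a minimal sketch under stated assumptions, not GinJinn's implementation: the `Detection` type, the `crop_detections` function, and the pixel-coordinate box layout are hypothetical, and the image is modeled as a nested list of pixel rows to keep the example free of dependencies. Only the 0.5 score threshold and the 25-pixel padding come from the caption.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """Hypothetical detection record: class label, score, box in pixels."""
    label: str
    score: float
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def crop_detections(
    image: Sequence[Sequence[int]],
    detections: Sequence[Detection],
    min_score: float = 0.5,   # threshold quoted in the caption
    padding: int = 25,        # padding quoted in the caption
) -> List[List[List[int]]]:
    """Return one padded crop per detection scoring at least min_score."""
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for det in detections:
        if det.score < min_score:
            continue  # discard low-confidence boxes
        # Grow the box by `padding` on every side, clamped to the image.
        top = max(det.ymin - padding, 0)
        left = max(det.xmin - padding, 0)
        bottom = min(det.ymax + padding, height)
        right = min(det.xmax + padding, width)
        crops.append([list(row[left:right]) for row in image[top:bottom]])
    return crops


if __name__ == "__main__":
    blank = [[0] * 200 for _ in range(100)]  # 100 rows x 200 columns
    found = [
        Detection("leaf", 0.9, 50, 40, 80, 60),
        Detection("leaf", 0.3, 10, 10, 20, 20),
    ]
    for crop in crop_detections(blank, found):
        print(len(crop), "x", len(crop[0]))
```

Clamping to the image bounds means that boxes near an edge yield smaller crops rather than raising an error. As panel C shows, a box that passes the score threshold can still be a false positive, so crops produced this way would still need checking before downstream analysis.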
