Bioinformatics. 2015 Jun 1;31(11):1872-4.
doi: 10.1093/bioinformatics/btv045. Epub 2015 Jan 24.

ENVIRONMENTS and EOL: Identification of Environment Ontology Terms in Text and the Annotation of the Encyclopedia of Life

Free PMC article

Evangelos Pafilis et al. Bioinformatics.


The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users.
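To make the idea of dictionary-based tagging concrete, here is a minimal sketch in Python. It is not the ENVIRONMENTS implementation: the dictionary entries, the placeholder identifiers (deliberately not real ENVO accessions), the function name `tag`, and the longest-match strategy are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy dictionary: lowercase surface form -> placeholder identifier.
# These identifiers are illustrative only and are NOT real ENVO accessions.
TOY_DICTIONARY = {
    "coral reef": "ENVO:placeholder-coral-reef",
    "reef": "ENVO:placeholder-reef",
    "forest": "ENVO:placeholder-forest",
    "tropical forest": "ENVO:placeholder-tropical-forest",
}


def tag(text, dictionary=TOY_DICTIONARY):
    """Return (start, end, matched_text, identifier) tuples.

    Matching is case-insensitive on whole word tokens and, at each
    position, prefers the longest dictionary entry (an assumed strategy).
    """
    # Character offsets of every word token in the text.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    # Longest dictionary entry, measured in words.
    max_len = max(len(name.split()) for name in dictionary)

    matches = []
    i = 0
    while i < len(tokens):
        hit = None
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            candidate = " ".join(text[s:e].lower() for s, e in tokens[i:i + n])
            if candidate in dictionary:
                hit = (start, end, text[start:end], dictionary[candidate])
                i += n  # skip past the matched tokens
                break
        if hit:
            matches.append(hit)
        else:
            i += 1
    return matches


if __name__ == "__main__":
    for match in tag("Found on a coral reef near tropical forest."):
        print(match)
```

Because the sketch matches exact tokens only, a plural such as "reefs" would not be tagged; a real tagger would need to handle such variants, and this toy version leaves them out.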

Availability and implementation: The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at


Fig. 1.
Top: The “Overview” tab of the EOL taxon pages shows a subset of the ENVO terms obtained through text mining; an extended list of such terms is available in the “Data” tab. Parts of the page have been resized to improve readability. Bottom: The latter list provides links to the EOL text sections where each term was found (highlighted in bold).

Cited by 8 articles

  • Seqenv: linking sequences to environments through text mining.
    Sinclair L, Ijaz UZ, Jensen LJ, Coolen MJL, Gubry-Rangin C, Chroňáková A, Oulas A, Pavloudi C, Schnetzer J, Weimann A, Ijaz A, Eiler A, Quince C, Pafilis E. PeerJ. 2016 Dec 20;4:e2690. doi: 10.7717/peerj.2690. eCollection 2016. PMID: 28028456 Free PMC article.
  • Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges.
    Singhal A, Leaman R, Catlett N, Lemberger T, McEntyre J, Polson S, Xenarios I, Arighi C, Lu Z. Database (Oxford). 2016 Dec 26;2016:baw161. doi: 10.1093/database/baw161. Print 2016. PMID: 28025348 Free PMC article.
  • The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants.
    Hoehndorf R, Alshahrani M, Gkoutos GV, Gosline G, Groom Q, Hamann T, Kattge J, de Oliveira SM, Schmidt M, Sierra S, Smets E, Vos RA, Weiland C. J Biomed Semantics. 2016 Nov 14;7(1):65. doi: 10.1186/s13326-016-0107-8. PMID: 27842607 Free PMC article.
  • The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation.
    Buttigieg PL, Pafilis E, Lewis SE, Schildhauer MP, Walls RL, Mungall CJ. J Biomed Semantics. 2016 Sep 23;7(1):57. doi: 10.1186/s13326-016-0097-6. PMID: 27664130 Free PMC article.
  • Overview of the interactive task in BioCreative V.
    Wang Q, S Abdul S, Almeida L, Ananiadou S, Balderas-Martínez YI, Batista-Navarro R, Campos D, Chilton L, Chou HJ, Contreras G, Cooper L, Dai HJ, Ferrell B, Fluck J, Gama-Castro S, George N, Gkoutos G, Irin AK, Jensen LJ, Jimenez S, Jue TR, Keseler I, Madan S, Matos S, McQuilton P, Milacic M, Mort M, Natarajan J, Pafilis E, Pereira E, Rao S, Rinaldi F, Rothfels K, Salgado D, Silva RM, Singh O, Stefancsik R, Su CH, Subramani S, Tadepally HD, Tsaprouni L, Vasilevsky N, Wang X, Chatr-Aryamontri A, Laulederkind SJ, Matis-Mitchell S, McEntyre J, Orchard S, Pundir S, Rodriguez-Esteban R, Van Auken K, Lu Z, Schaeffer M, Wu CH, Hirschman L, Arighi CN. Database (Oxford). 2016 Sep 1;2016:baw119. doi: 10.1093/database/baw119. Print 2016. PMID: 27589961 Free PMC article.


