Front Plant Sci. 2019 Jun 25;10:813.
doi: 10.3389/fpls.2019.00813. eCollection 2019.

Cyberinfrastructure to Improve Forest Health and Productivity: The Role of Tree Databases in Connecting Genomes, Phenomes, and the Environment

Free PMC article

Jill L Wegrzyn et al. Front Plant Sci. 2019.

Despite tremendous advancements in high-throughput sequencing, the vast majority of tree genomes, and in particular those of forest trees, remain elusive. Although primary databases store genetic resources for just over 2,000 forest tree species, these are largely focused on sequence storage, basic genome assemblies, and functional assignment through existing pipelines. The tree databases reviewed here serve as secondary repositories for community data. They vary in their focal species, the data they curate, and the analytics they provide, but they are united in moving toward the goal of centralizing both data access and analysis. They provide frameworks to view and update annotations for complex genomes, interrogate systems-level expression profiles, curate data for comparative genomics, and perform real-time analysis with genotype and phenotype data. The organism databases of today are no longer simply catalogs or containers of genetic information. These repositories represent integrated cyberinfrastructure that supports cross-site queries and analysis in web-based environments. These resources are striving to integrate across diverse experimental designs, sequence types, and related measures through ontologies, community standards, and web services. Efficient, simple, and robust platforms that enhance the data generated by the research community contribute to improving forest health and productivity.

Keywords: bioinformatics; content management system; database; forest tree; web services.


PlantGenIE, TreeGenes, and Hardwood Genomics Project represent integrated web-based frameworks that rely on a combination of primary repositories, secondary plant comparative databases, and user submissions to provide further value through data curation, integration, and analytics.
