Modeling Crop Genetic Resources Phenotyping Information Systems

Front Plant Sci. 2019 Jun 21:10:728. doi: 10.3389/fpls.2019.00728. eCollection 2019.

Abstract

Documentation of phenotype information is a priority need in biodiversity, crop modeling, breeding, ecology, and evolution research, for association studies, gene discovery, retrospective statistical analysis and data mining, QTL re-mapping, choosing cultivars, and planning crosses. Lack of access to phenotype information is still seen as a limiting factor for the use of plant genetic resources. Phenotype data are complex. Information on the context in which they were collected is indispensable, and the domain is continuously evolving. This study describes comprehensive data and object models supporting web interfaces for multi-site field phenotyping and data acquisition. These models have been developed over the years for Central Crop Databases within the European Cooperative Programme for Plant Genetic Resources and can be used as blueprints for phenotyping information systems. We start from the hypothesis that entity relationship and object models useful for software development can represent domain expertise, much as domain ontologies do, and we encourage a discussion of scientific information systems at the modeling level. Starting from information requirements for statistical analysis, meta-analysis, and knowledge discovery, the models are discussed in light of several standardization and modeling approaches, including crop ontologies. Following an object-oriented modeling approach, we keep data and object models close to each other and to domain concepts. This makes database and software design easier for domain experts to understand and use, and supports a modular use of software artifacts that can be shared across various domains of expertise. Classes and entities represent domain concepts with attributes naturally assigned to them. The focus is on field experiments with randomized plots, as typically used in the evaluation of plant genetic resources and in plant breeding.
Phenotype observations, which can be listed as raw or aggregated data, are linked to explanatory metadata describing experimental treatments and agronomic interventions, observed traits and observation methodology, field plan and plot design, and the experiment site as a geographical entity. Because types are clearly defined, potential links to information systems in other domains (e.g., geographic information systems) can be identified more readily. Work flows are shown as web applications for the generation of field plans, field books, and templates, and for the upload of spreadsheet data and images.
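
The linkage between observations and their explanatory metadata can be illustrated with a minimal object-model sketch in Python. It is not the authors' model: every class, attribute, and function name below (`Site`, `Trait`, `Treatment`, `Plot`, `Experiment`, `Observation`, `aggregate`) is a hypothetical choice made for illustration, and aggregation is assumed here to be a simple mean per accession and trait.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


@dataclass(frozen=True)
class Site:
    """Experiment site as a geographical entity (hypothetical attributes)."""
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Trait:
    """Observed trait together with its observation methodology and unit."""
    name: str
    method: str
    unit: str


@dataclass(frozen=True)
class Treatment:
    """Experimental treatment or agronomic intervention applied to a plot."""
    description: str


@dataclass(frozen=True)
class Plot:
    """One plot in the field plan: its position, accession, and treatment."""
    block: int
    row: int
    column: int
    accession: str
    treatment: Treatment


@dataclass
class Experiment:
    """A field experiment at one site, holding its plot design."""
    title: str
    site: Site
    plots: list[Plot] = field(default_factory=list)


@dataclass(frozen=True)
class Observation:
    """A raw phenotype value, linked to the plot and trait that explain it."""
    plot: Plot
    trait: Trait
    value: float


def aggregate(observations: list[Observation]) -> dict[tuple[str, str], float]:
    """Reduce raw observations to a mean per (accession, trait name).

    Assumption: a plain arithmetic mean stands in for whatever
    aggregation a real system would apply.
    """
    grouped: dict[tuple[str, str], list[float]] = defaultdict(list)
    for obs in observations:
        grouped[(obs.plot.accession, obs.trait.name)].append(obs.value)
    return {key: mean(values) for key, values in grouped.items()}
```

Because each `Observation` carries references to its `Plot` and `Trait`, the context needed for later analysis travels with the raw value, and `aggregate` can derive summary data without losing the link to the accession.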

Keywords: class models; documentation; entity relationship models; phenotyping; plant genetic resources; web applications; work flows.