BMC Bioinformatics. 2010 Dec 21;11 Suppl 12(Suppl 12):S12.
doi: 10.1186/1471-2105-11-S12-S12.

The MOLGENIS Toolkit: Rapid Prototyping of Biosoftware at the Push of a Button

Free PMC article

Morris A Swertz et al. BMC Bioinformatics.

Abstract

Background: Bioinformaticians are under heavy demand to provide biologists with user-friendly, scalable software infrastructures to capture, exchange, and exploit the unprecedented amounts of new *omics data. Here we present MOLGENIS, a generic, open-source software toolkit for quickly producing the bespoke MOLecular GENetics Information Systems needed.

Methods: The MOLGENIS toolkit provides bioinformaticians with a simple language to model biological data structures and user interfaces. At the push of a button, the MOLGENIS generator suite automatically translates these models into a feature-rich, ready-to-use web application including database, user interfaces, exchange formats, and scriptable interfaces. Each generator is a template of SQL, Java, R, or HTML code that would require much effort to write by hand. This 'model-driven' method ensures reuse of best practices and improves quality because the modeling language and generators are shared between all MOLGENIS applications, so that errors are found quickly and improvements are shared easily by regenerating the application. A plug-in mechanism ensures that both the generator suite and the generated product can be customized just as much as hand-written software.
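
As a rough illustration of the model-driven approach described in the Methods, the sketch below reads a small XML model and applies one "generator" to every entity to produce SQL table definitions. It is a minimal Python sketch, not the MOLGENIS implementation: the model layout (`<molgenis>`, `<entity>`, `<field>` with `name` and `type` attributes), the type mapping, and the function name `generate_sql` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical model file: element and attribute names are illustrative only
# and are not taken from the MOLGENIS documentation.
MODEL_XML = """
<molgenis name="demo">
  <entity name="Experiment">
    <field name="id" type="int"/>
    <field name="name" type="string"/>
  </entity>
  <entity name="Sample">
    <field name="id" type="int"/>
    <field name="label" type="string"/>
  </entity>
</molgenis>
"""

# Assumed mapping from model field types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)"}


def generate_sql(model_xml: str) -> str:
    """Apply one template to each entity in the model and return SQL DDL."""
    root = ET.fromstring(model_xml)
    statements = []
    for entity in root.findall("entity"):
        # One column definition per field of the entity.
        columns = ",\n".join(
            f"  {field.get('name')} {SQL_TYPES[field.get('type')]}"
            for field in entity.findall("field")
        )
        statements.append(f"CREATE TABLE {entity.get('name')} (\n{columns}\n);")
    return "\n\n".join(statements)


print(generate_sql(MODEL_XML))
```

Changing the model (for example, adding a field) and rerunning the generator regenerates every table definition, which is the reuse mechanism the Methods paragraph describes.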

Results: In recent years we have successfully evaluated the MOLGENIS toolkit for the rapid prototyping of many types of biomedical applications, including next-generation sequencing, GWAS, QTL, proteomics and biobanking. Writing 500 lines of model XML typically replaces 15,000 lines of hand-written programming code, which allows for quick adaptation if the information system does not yet meet the biologist's needs. Each application generated with MOLGENIS comes with an optimized database back-end, user interfaces for biologists to manage and exploit their data, programming interfaces for bioinformaticians to script analysis tools in R, Java, SOAP, REST/JSON and RDF, a tab-delimited file format to ease upload and exchange of data, and detailed technical documentation. Existing databases can be quickly enhanced with MOLGENIS-generated interfaces using the 'ExtractModel' procedure.
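
The idea behind extracting a model from an existing database can be sketched as schema introspection: list the tables and columns, then write them out as model entities and fields. The Python sketch below does this for an in-memory SQLite database. It is only an assumption-laden illustration of the concept, not the actual 'ExtractModel' procedure: the function name `extract_model` and the output element names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def extract_model(conn: sqlite3.Connection) -> str:
    """Introspect a SQLite schema and return a hypothetical XML model."""
    root = ET.Element("molgenis")
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        entity = ET.SubElement(root, "entity", name=table)
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, col, col_type, _notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            ET.SubElement(entity, "field", name=col, type=col_type.lower())
    return ET.tostring(root, encoding="unicode")


# Usage: build a tiny example database and extract its model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Experiment (id INTEGER PRIMARY KEY, name TEXT)")
model = extract_model(conn)
print(model)
```

The extracted XML could then serve as input to a generator, so an existing database gains generated interfaces without being remodeled by hand.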

Conclusions: The MOLGENIS toolkit provides bioinformaticians with a simple model to quickly generate flexible web platforms for all possible genomic, molecular and phenotypic experiments with a richness of interfaces not provided by other tools. All the software and manuals are available free as LGPLv3 open source at http://www.molgenis.org.

Figures

Figure 1
Model-driven development. Many minor and major changes have to be written in software code before a ‘standard’ software infrastructure accommodates a particular research project. Using ‘model-driven’ development methods, a bioinformatician only needs to model what is needed for the experiment, using a domain-specific language (DSL) optimized for that purpose. Generators quickly produce all the software logic to compose a full software infrastructure that accommodates these needs. When experimental needs change, a bioinformatician can (re)run the same generator with an adapted model file to quickly produce another variant of the software infrastructure. This vastly reduces ‘time-to-research’ and enables bioinformaticians to quickly develop a suite of software infrastructures, with each variant accommodating a specific research task, while staying on track to reuse, integrate and share the best standard features with other labs and bioinformaticians.
Figure 2
Example model. The detailed software needed for an experiment can be described in a domain-specific language (DSL, left). The MOLGENIS generator reads the model and automatically produces the custom software infrastructure specified (right). The screenshot includes example data. See main text for a description of the numbers.
Figure 3
Reusable components. (A) shows finished and semi-finished components that provide reusable features for displaying screens (FormView and MenuView), handling user requests (Form- and MenuController), and reading and writing to the database (DataMapper). (B) shows components of a completed software variant as described in Figure 2. Only the ‘differences’ needed to be added using systematic variation mechanisms (dotted lines) such as inheritance or parameterization.
Figure 4
Example generator. MOLGENIS generators are implemented as templates. This example shows the generator for a database component (A). This template is applied to each <entity> in the model to generate many complete DataMappers that would otherwise need to be written by hand. (B) shows an example of the generated source files, in this case for <entity name="Experiment"> as described in Figure 1. The command $Name(entity) translates to the name of the entity (“Experiment”) and command ${csv($entity.Fields, x)} means that command ‘x’ is applied to each field of the entity and returned as a comma separated string (csv).
Figure 5
Expected output. Overview of a typical MOLGENIS application, in this case customized in EBI style. See main text for a description of the numbers.
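
The template command described in the Figure 4 caption, which applies a command ‘x’ to each field of an entity and returns the results as a comma-separated string, can be sketched in a few lines of Python. This is an illustrative analogue only: the real generators are templates, and the helper names `csv` and `insert_statement`, as well as the generated SQL, are assumptions made for this example.

```python
from typing import Callable, Sequence


def csv(fields: Sequence[str], x: Callable[[str], str]) -> str:
    """Apply command x to each field and join the results with commas."""
    return ", ".join(x(field) for field in fields)


def insert_statement(entity: str, fields: Sequence[str]) -> str:
    """Hypothetical generator fragment for one line of a DataMapper."""
    columns = csv(fields, lambda f: f)         # field names as-is
    placeholders = csv(fields, lambda f: "?")  # one placeholder per field
    return f"INSERT INTO {entity} ({columns}) VALUES ({placeholders})"


print(insert_statement("Experiment", ["id", "name"]))
# prints: INSERT INTO Experiment (id, name) VALUES (?, ?)
```

Because the same helper is applied to every entity in the model, a fix to the template propagates to all generated DataMappers on the next run.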
