LinguaPhylo: A probabilistic model specification language for reproducible phylogenetic analyses

PLoS Comput Biol. 2023 Jul 18;19(7):e1011226. doi: 10.1371/journal.pcbi.1011226. eCollection 2023 Jul.

Abstract

Phylogenetic models have become increasingly complex, and phylogenetic data sets have expanded in both size and richness. However, current inference tools lack a model specification language that can concisely describe a complete phylogenetic analysis while remaining independent of implementation details. We introduce a new lightweight and concise model specification language, 'LPhy', which is designed to be both human- and machine-readable. A graphical user interface accompanies 'LPhy', allowing users to build models, simulate data, and create natural language narratives describing the models. These narratives can serve as the foundation for manuscript method sections. Additionally, we present a command-line interface for converting LPhy-specified models into analysis specification files (in XML format) compatible with the BEAST2 software platform. Collectively, these tools aim to enhance the clarity of descriptions and reporting of probabilistic models in phylogenetic studies, ultimately promoting reproducibility of results.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Language*
  • Models, Statistical
  • Phylogeny
  • Reproducibility of Results
  • Software*
  • User-Computer Interface

Grants and funding

AJD was supported by a James Cook Fellowship (JCF-UOA1901) from the Royal Society of New Zealand (https://www.royalsociety.org.nz). FKM was supported by Marsden grant 16-UOA-277 from the Royal Society of New Zealand and by National Science Foundation (https://www.nsf.gov) grant DEB-2040347. These funders played no role in the study design, data collection, analysis, decision to publish or preparation of the manuscript.