Logicome Profiler: Exhaustive detection of statistically significant logic relationships from comparative omics data

PLoS One. 2020 May 1;15(5):e0232106. doi: 10.1371/journal.pone.0232106. eCollection 2020.

Abstract

Logic relationship analysis is a data mining method that comprehensively detects item triplets that satisfy logic relationships from a binary matrix dataset, such as an ortholog table in comparative genomics. Thanks to recent technological advancements, many binary matrix datasets are now being produced for comparative purposes in genomics, transcriptomics, epigenomics, metagenomics, and many other fields. However, despite the presumed interpretability and importance of logic relationships, existing data mining methods are not based on the framework of statistical hypothesis testing. As a result, their type-1 and type-2 error rates are neither controlled nor estimated. Here, we developed Logicome Profiler, which exhaustively detects statistically significant triplet logic relationships from a binary matrix dataset (Logicome means ome of logics). To test all item triplets in a dataset while avoiding false positives, Logicome Profiler adjusts the significance level using the Bonferroni or Benjamini-Yekutieli method for multiple testing correction. Its application to an ocean metagenomic dataset showed that Logicome Profiler can effectively detect statistically significant triplet logic relationships among environmental microbes and genes, including those among urea transporter, urease, and photosynthesis-related genes. Beyond omics data analysis, Logicome Profiler is applicable to binary matrix datasets in general for finding significant triplet logic relationships. The source code is available at https://github.com/fukunagatsu/LogicomeProfiler.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Mining / methods*
  • Datasets as Topic*
  • Logic
  • Models, Statistical

Grants and funding

This work was supported by the Japan Society for the Promotion of Science (https://www.jsps.go.jp/english/index.html) KAKENHI, Grant Numbers JP17H05605 and JP19K20395 to T.F. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.