Speeding up the core algorithm for the dual calculation of minimal cut sets in large metabolic networks

BMC Bioinformatics. 2020 Nov 9;21(1):510. doi: 10.1186/s12859-020-03837-3.

Abstract

Background: The concept of minimal cut sets (MCS) has become an important mathematical framework for analyzing and (re)designing metabolic networks. However, the calculation of MCS in genome-scale metabolic models is a complex computational problem. The development of duality-based algorithms in recent years has allowed the enumeration of thousands of MCS in genome-scale networks by solving mixed-integer linear problems (MILP). A recent advancement in this field was the introduction of the MCS2 approach. In contrast to the Farkas-lemma-based dual system used in earlier studies, the MCS2 approach employs a more condensed representation of the dual system based on the nullspace of the stoichiometric matrix, which, due to its reduced dimension, holds promise to further enhance MCS computations.
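To make the MCS concept concrete, the following minimal sketch enumerates cut sets for a toy example. It is not the duality-based MILP approach discussed here: it assumes the target behavior is already given as a short list of elementary modes and uses the known characterization of MCS as the minimal hitting sets of those modes, found by brute force. The function name `minimal_cut_sets`, the reaction names, and the three modes are hypothetical.

```python
from itertools import combinations


def minimal_cut_sets(target_modes, reactions, max_size=None):
    """Brute-force the minimal hitting sets of the given target modes."""
    modes = [frozenset(m) for m in target_modes]
    max_size = len(reactions) if max_size is None else max_size
    found = []
    # Candidates are visited in order of increasing size, so any hit that is
    # not a superset of an earlier hit is guaranteed to be minimal.
    for size in range(1, max_size + 1):
        for candidate in combinations(reactions, size):
            cut = frozenset(candidate)
            if any(smaller <= cut for smaller in found):
                continue  # contains a smaller cut set, hence not minimal
            # A cut set must remove at least one reaction from every target mode.
            if all(cut & mode for mode in modes):
                found.append(cut)
    return found


# Hypothetical toy network with three routes that realize the target behavior.
reactions = ["R1", "R2", "R3", "R4"]
target_modes = [{"R1", "R2"}, {"R1", "R3"}, {"R4"}]

mcs = minimal_cut_sets(target_modes, reactions)
print([sorted(c) for c in mcs])  # [['R1', 'R4'], ['R2', 'R3', 'R4']]
```

Brute force of this kind scales exponentially with the number of reactions and requires the elementary modes up front, which is one reason MILP-based formulations are used for genome-scale models.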

Results: In this work, we introduce several new variants and modifications of duality-based MCS algorithms and benchmark their effects on the overall performance. As one major result, we generalize the original MCS2 approach (which was limited to blocking the operation of certain target reactions) to the most general case of MCS computations with arbitrary target and desired regions. Building upon these developments, we introduce a new MILP variant which allows maximal flexibility in the formulation of MCS problems and fully leverages the reduced size of the nullspace-based dual system. With a comprehensive set of benchmarks, we show that the MILP with the nullspace-based dual system outperforms the MILP with the Farkas-lemma-based dual system, speeding up MCS computation by an average factor of approximately 2.5. We furthermore present several simplifications in the formulation of constraints, mainly related to binary variables, which further enhance the performance of MCS-related MILP. However, the benchmarks also reveal that some highly condensed formulations of constraints, especially on reversible reactions, may lead to worse behavior when compared to variants with a larger number of (more explicit) constraints and involved variables.
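The dimension argument behind the nullspace-based representation can be illustrated on a toy matrix. The sketch below does not reproduce the MCS2 dual system or any of the MILP variants; it only computes an exact kernel basis of a hypothetical stoichiometric matrix N (2 metabolites, 4 reactions) to show that the nullspace has fewer dimensions than there are reactions. The helper `nullspace_basis` and the toy network are assumptions made for illustration.

```python
from fractions import Fraction


def nullspace_basis(matrix):
    """Return a basis of {v : matrix * v = 0} via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, so it is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break
    # One basis vector per free column: set it to 1 and solve for the pivots.
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    basis = []
    for free in free_cols:
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_idx, pc in enumerate(pivot_cols):
            v[pc] = -rows[row_idx][free]
        basis.append(v)
    return basis


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->, R4: A ->
N = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]

K = nullspace_basis(N)
print(len(N[0]), "reactions, kernel dimension", len(K))
for v in K:
    print([int(x) for x in v])
```

For this toy matrix the kernel is spanned by two vectors, so any steady-state flux distribution over the four reactions can be described with two coefficients.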

Conclusions: Our results further enhance the algorithmic toolbox for MCS calculations and are of general importance for theoretical developments as well as for practical applications of the MCS framework.

Keywords: Computational strain design; Constraint-based modeling; Duality; Elementary modes; Metabolic engineering; Metabolic networks; Stoichiometric modeling.

MeSH terms

  • Algorithms*
  • Corynebacterium / genetics
  • Escherichia coli / genetics
  • Genome
  • Metabolic Engineering
  • Metabolic Networks and Pathways / genetics*
  • Models, Biological
  • Saccharomyces cerevisiae / genetics