netboxr: Automated discovery of biological process modules by network analysis in R

PLoS One. 2020 Nov 2;15(11):e0234669. doi: 10.1371/journal.pone.0234669. eCollection 2020.


Summary: Large-scale sequencing projects, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), have generated high-throughput sequencing and molecular profiling data sets, but identifying potentially causal changes in cellular processes in an automated fashion remains challenging, in cancer as well as in other diseases. We developed netboxr, a package written in the R programming language that uses the NetBox algorithm to identify candidate cancer-related functional modules. The algorithm takes a data-driven, network-based approach that combines prior knowledge with a network clustering algorithm, which removes the need for, and the limitations of, independently curated, functionally labeled gene sets. The method can combine multiple data types, such as mutations and copy number alterations, leading to more reliable identification of functional modules. We make the tool available in the Bioconductor R ecosystem for applications in cancer research and cell biology.
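
The approach described in the summary can be illustrated with a small, self-contained Python sketch. It is not the netboxr API and not the published NetBox implementation. The toy network, the alteration lists, the function names, and the loose cutoff are all hypothetical. The sketch assumes that unaltered "linker" genes are scored with a simple hypergeometric tail probability, and it uses connected components as a stand-in for the network clustering step.

```python
from math import comb

# Hypothetical toy prior-knowledge network (undirected edge list).
EDGES = [
    ("TP53", "MDM2"), ("MDM2", "CDKN2A"), ("TP53", "ATM"),
    ("EGFR", "GRB2"), ("GRB2", "PIK3CA"), ("PTEN", "PIK3CA"),
    ("RB1", "E2F1"),
]

# Hypothetical alteration lists from two data types.
MUTATED = {"TP53", "PIK3CA", "PTEN"}
COPY_NUMBER_ALTERED = {"CDKN2A", "EGFR", "RB1"}


def build_adjacency(edges):
    """Turn an edge list into a dict mapping each gene to its neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hypergeom_tail(k, n_total, n_altered, degree):
    """P(X >= k) altered neighbours among `degree` draws from `n_total` genes."""
    upper = min(degree, n_altered)
    hits = sum(
        comb(n_altered, i) * comb(n_total - n_altered, degree - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(n_total, degree)


def find_linkers(adj, altered, cutoff):
    """Unaltered genes touching >= 2 altered genes, with p-value <= cutoff."""
    linkers = {}
    for gene, neighbours in adj.items():
        if gene in altered:
            continue
        k = len(neighbours & altered)
        if k < 2:
            continue
        # Population: every other gene in the network.
        p = hypergeom_tail(k, len(adj) - 1, len(altered & adj.keys()), len(neighbours))
        if p <= cutoff:
            linkers[gene] = p
    return linkers


def find_modules(adj, keep):
    """Connected components of the subgraph induced on `keep`, singletons dropped."""
    seen, modules = set(), []
    for start in sorted(keep):
        if start in seen or start not in adj:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend((adj[node] & keep) - component)
        seen |= component
        if len(component) > 1:
            modules.append(sorted(component))
    return sorted(modules)


adjacency = build_adjacency(EDGES)
altered_genes = MUTATED | COPY_NUMBER_ALTERED  # combine both data types
# The cutoff is deliberately loose because the toy network is tiny.
linker_genes = find_linkers(adjacency, altered_genes, cutoff=0.5)
modules = find_modules(adjacency, altered_genes | linker_genes.keys())

for module in modules:
    print(module)
```

In this toy run the altered genes from both data types are pooled before the network is searched, which mirrors the idea of combining mutations and copy number alterations. A real analysis would use a curated interaction network, a stricter multiple-testing-corrected cutoff, and a proper community detection method.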

Availability and implementation: The netboxr package is a free, open-source R package released under the GNU GPL-3 license, available at

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Biomarkers, Tumor / genetics*
  • Gene Regulatory Networks*
  • Genome, Human*
  • Genomics / methods*
  • Humans
  • Metabolic Networks and Pathways
  • Neoplasms / genetics*
  • Programming Languages
  • Software*


Substances

  • Biomarkers, Tumor