Incorporating Scale Uncertainty into Differential Expression Analyses Using ALDEx2

Curr Protoc. 2026 Feb;6(2):e70307. doi: 10.1002/cpz1.70307.

Abstract

Differential abundance or expression analyses are routinely performed on metagenomic, metatranscriptomic, and amplicon sequencing data. In such datasets, analysts usually have no information regarding the true scale (i.e., size) of the microbial community or sample under study, and inter-sample differences in sequencing depth are driven by technical variation rather than biological factors. Recent work has demonstrated that normalizations used in all analysis tools make incorrect assumptions about the biological scale of the system in question, leading to unacceptably high false-discovery rates in the output. To mitigate this, analysts can acknowledge and account for the uncertainty of the overall system scale during normalization by building scale models of the data, a feature that has been integrated into the ALDEx2 R package. Here, we provide reproducible examples that demonstrate how to incorporate scale models into differential expression analyses of RNA-seq data using bulk transcriptome and metatranscriptomic datasets, and that illustrate the consequences of not doing so. We also show how to use the output of ALDEx2 to create high-level exploratory visualizations of the data through principal component analysis.

© 2026 The Author(s). Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Using a simple scale model for differential expression analysis to avoid dual-cutoff P value/significance thresholds
Basic Protocol 2: Implementing a full informed scale model to correct scale-related data asymmetry in differential expression analyses
Basic Protocol 3: Visualizing ALDEx2 outputs using a compositional approach: Principal component analysis
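
The idea of relaxing a fixed normalization into a scale model can be illustrated with a minimal sketch for a single sample. This is not ALDEx2 code and is not claimed to match its internals; the function name `scale_aware_log_ratios`, the 0.5 pseudo-count, and the choice to centre the log2 scale on the centred log-ratio (CLR) assumption with normal noise of standard deviation `gamma` are all assumptions made for illustration.

```python
import math
import random


def scale_aware_log_ratios(counts, gamma=0.5, n_draws=128, seed=0):
    """Return n_draws vectors of log2 abundances for one sample.

    Hypothetical helper, for illustration only. Each draw:
      1. samples proportions from a Dirichlet posterior (counts + 0.5),
      2. computes the log2 scale implied by the CLR (minus the mean
         log2 proportion),
      3. adds normal noise with standard deviation `gamma` to that scale,
      4. returns log2 proportions plus the noisy log2 scale.
    With gamma = 0 the result is the ordinary CLR transform of each draw.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # A Dirichlet draw is a set of Gamma variates divided by their sum.
        g = [rng.gammavariate(c + 0.5, 1.0) for c in counts]
        total = sum(g)
        log_props = [math.log2(x / total) for x in g]

        # Scale assumed by the CLR normalization.
        clr_scale = -sum(log_props) / len(log_props)

        # Scale model: treat the scale as uncertain rather than fixed.
        log_scale = clr_scale + rng.gauss(0.0, gamma)

        draws.append([lp + log_scale for lp in log_props])
    return draws


if __name__ == "__main__":
    sample = [10, 100, 30, 0]  # made-up counts for four features
    fixed = scale_aware_log_ratios(sample, gamma=0.0, n_draws=3)
    relaxed = scale_aware_log_ratios(sample, gamma=0.5, n_draws=3)
    for row in fixed:
        print("gamma=0.0 row mean:", round(sum(row) / len(row), 6))
    for row in relaxed:
        print("gamma=0.5 row mean:", round(sum(row) / len(row), 6))
```

With `gamma=0.0` every draw is centred exactly at zero, which is the implicit claim that the scale is known without error. A positive `gamma` spreads the centre of each draw, so any downstream test sees wider uncertainty in between-sample differences.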

Keywords: ALDEx2; RNA‐seq; differential abundance; differential expression; metagenomics.

MeSH terms

  • Gene Expression Profiling* / methods
  • Humans
  • Metagenomics* / methods
  • Principal Component Analysis
  • RNA-Seq / methods
  • Sequence Analysis, RNA / methods
  • Software*
  • Transcriptome*
  • Uncertainty