A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals

Nat Commun. 2014 Jun 11;5:4114. doi: 10.1038/ncomms5114.

Abstract

A well-known problem in numerical ecology is how to recombine presence-absence matrices without altering row and column totals. A few solutions have been proposed, but all of them present some issues in terms of statistical robustness (that is, their capability to generate different matrix configurations with the same probability) and performance (that is, the computational effort that they require to generate a null matrix). Here we introduce the 'Curveball algorithm', a new procedure that differs from existing methods in that it focuses on matrix information content rather than on matrix structure. We demonstrate that the algorithm can uniformly sample the set of all possible matrix configurations with a computational effort orders of magnitude lower than that required by available methods, making it possible to easily randomize matrices larger than 10^8 cells.
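
The abstract does not spell out the steps of the procedure, so the following Python sketch is only an illustration of the pair-trade idea commonly associated with the Curveball algorithm: each row is treated as a set of column indices holding a presence, and two randomly chosen rows repeatedly exchange the items that only one of them has. The function name `curveball`, the parameters `n_trades` and `seed`, and the default of five trades per row are assumptions made for this example, not values taken from the abstract.

```python
import random


def curveball(matrix, n_trades=None, seed=None):
    """Randomize a 0/1 matrix while keeping row and column totals fixed.

    Illustrative sketch only: names and defaults are assumptions.
    """
    rng = random.Random(seed)
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0

    # Represent each row as the set of column indices holding a presence.
    rows = [{j for j, v in enumerate(r) if v} for r in matrix]

    if n_trades is None:
        n_trades = 5 * n_rows  # assumed default, chosen for illustration

    if n_rows >= 2:
        for _ in range(n_trades):
            # Pick two distinct rows at random.
            a, b = rng.sample(range(n_rows), 2)
            shared = rows[a] & rows[b]
            only_a = rows[a] - shared
            only_b = rows[b] - shared

            # Pool the items held by exactly one of the two rows, shuffle
            # them, and hand each row back as many as it contributed.
            pool = sorted(only_a | only_b)
            rng.shuffle(pool)
            k = len(only_a)
            rows[a] = shared | set(pool[:k])
            rows[b] = shared | set(pool[k:])

    # Convert the sets back into a 0/1 matrix.
    return [[1 if j in s else 0 for j in range(n_cols)] for s in rows]


original = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
randomized = curveball(original, n_trades=50, seed=3)
```

Each trade leaves both row sizes unchanged, and every pooled column still appears in exactly one of the two rows afterwards, so row and column totals are preserved by construction. How many trades are needed for the output to be effectively independent of the input is not addressed by this sketch.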