Next-generation sequencing technologies are revolutionizing microbial ecology by allowing deep phylogenetic coverage of tens to thousands of samples simultaneously. Double Principal Coordinates Analysis (DPCoA) is a multivariate method, developed in community ecology, that integrates a distance matrix describing differences among species (e.g. phylogenetic distances) into the analysis of a species abundance matrix. This ordination technique has recently been used to describe microbial communities while taking phylogenetic relatedness into account. In this work, we extend DPCoA to integrate information from external variables measured on the communities. The constrained Double Principal Coordinates Analysis (cDPCoA) can enforce a priori classifications to retrieve subtle differences and/or remove the effect of confounding factors. We describe the main principles of this new approach and demonstrate its usefulness with application examples based on published 16S rRNA gene data sets.
Keywords: 16S; 18S; DPCoA; QIIME; UniFrac; microbial community analysis.
© 2014 John Wiley & Sons Ltd.
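
As an illustration of the unconstrained DPCoA summarized in the abstract, the following minimal Python sketch shows one way the two inputs (a species abundance matrix and a species distance matrix) can be combined. It is not the authors' implementation and does not cover the constrained extension (cDPCoA). The function name `dpcoa`, the toy data, and the scaling conventions are assumptions made for this example; the sketch further assumes that the species distances are Euclidean and that every species and every community has a non-zero total abundance.

```python
import numpy as np


def dpcoa(abund, dist, tol=1e-10):
    """Simplified DPCoA sketch (hypothetical helper, not a library API).

    abund : (communities x species) array of non-negative abundances
    dist  : (species x species) array of Euclidean distances among species
    Returns species scores, community scores, and eigenvalues.
    """
    abund = np.asarray(abund, dtype=float)
    dist = np.asarray(dist, dtype=float)

    total = abund.sum()
    w_species = abund.sum(axis=0) / total   # overall species weights
    w_comm = abund.sum(axis=1) / total      # community weights
    profiles = abund / abund.sum(axis=1, keepdims=True)  # rows sum to 1

    # Step 1: weighted principal coordinates analysis of the species distances.
    n = dist.shape[0]
    centering = np.eye(n) - np.outer(np.ones(n), w_species)
    gram = centering @ (-0.5 * dist ** 2) @ centering.T
    sw = np.sqrt(w_species)
    vals, vecs = np.linalg.eigh(sw[:, None] * gram * sw[None, :])
    keep = vals > tol                       # keep positive eigenvalues only
    species_coords = (vecs[:, keep] / sw[:, None]) * np.sqrt(vals[keep])

    # Step 2: place each community at the weighted mean of its species.
    comm_coords = profiles @ species_coords

    # Step 3: weighted PCA of the community positions.
    _, sing, vt = np.linalg.svd(np.sqrt(w_comm)[:, None] * comm_coords,
                                full_matrices=False)
    axes = vt.T
    return species_coords @ axes, comm_coords @ axes, sing ** 2


# Toy data (invented for illustration): 4 species placed in a plane so that
# their pairwise distances are Euclidean, and 3 communities.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
abund = np.array([[10, 5, 0, 1],
                  [2, 8, 4, 0],
                  [0, 1, 6, 9]], dtype=float)

species_scores, community_scores, eigenvalues = dpcoa(abund, dist)
```

In this sketch, species and communities end up in the same space, so a community lies close to the species that dominate it, and phylogenetically close species lie close to each other.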