Identification of sample mix-ups and mixtures in microbiome data in Diversity Outbred mice

G3 (Bethesda). 2021 Sep 2;jkab308. doi: 10.1093/g3journal/jkab308. Online ahead of print.

Abstract

In a Diversity Outbred mouse project with genotype data on 500 mice, 297 of which also had microbiome data, we identified three sets of sample mix-ups (two pairs and one trio) and at least 15 microbiome samples that appear to be mixtures of DNA from two mice. The microbiome data consisted of shotgun sequencing reads from fecal DNA, used to characterize the gut microbial communities in these mice. These reads included enough host-derived sequence to identify the individual mouse. We describe a method for identifying sample mix-ups in such microbiome data, as well as a method for evaluating sample mixtures in this context.

Keywords: data cleaning; data diagnostics; multi-parent populations; quantitative trait loci; sample mislabeling.
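
The abstract does not spell out how host-derived reads are matched to genotyped mice, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes that each microbiome sample has been reduced to per-SNP read counts for the two alleles, and that each mouse has a genotype call at the same SNPs. For every sample/mouse pair it computes the proportion of reads that carry an allele the mouse should not have (using only SNPs where the mouse is homozygous), then reports the best-matching mouse. All names here (`discordance`, `best_matches`, `flag_mixups`, the genotype coding, and the toy data) are hypothetical. Evaluating mixtures of two mice is not covered.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encodings (hypothetical, for illustration only):
#   genotype: 0 = homozygous reference, 1 = heterozygous,
#             2 = homozygous alternate, None = missing call
#   read counts: (reads with reference allele, reads with alternate allele)
Genotype = Optional[int]
ReadCounts = Tuple[int, int]


def discordance(counts: List[ReadCounts], genotypes: List[Genotype]) -> float:
    """Proportion of reads showing the allele a homozygous mouse should lack."""
    mismatched = 0
    total = 0
    for (n_ref, n_alt), g in zip(counts, genotypes):
        if g == 0:        # homozygous reference: alternate reads are discordant
            mismatched += n_alt
        elif g == 2:      # homozygous alternate: reference reads are discordant
            mismatched += n_ref
        else:             # heterozygous or missing: uninformative, skip
            continue
        total += n_ref + n_alt
    return mismatched / total if total else float("nan")


def best_matches(
    reads: Dict[str, List[ReadCounts]],
    genos: Dict[str, List[Genotype]],
) -> Dict[str, Tuple[str, float]]:
    """For each microbiome sample, find the mouse with the lowest discordance."""
    result = {}
    for sample_id, counts in reads.items():
        scores = {m: discordance(counts, g) for m, g in genos.items()}
        usable = {m: s for m, s in scores.items() if s == s}  # drop NaN scores
        if usable:
            best = min(usable, key=usable.get)
            result[sample_id] = (best, usable[best])
    return result


def flag_mixups(
    reads: Dict[str, List[ReadCounts]],
    genos: Dict[str, List[Genotype]],
) -> Dict[str, str]:
    """Samples whose best-matching mouse differs from their label."""
    matches = best_matches(reads, genos)
    return {s: best for s, (best, _) in matches.items() if best != s}


if __name__ == "__main__":
    # Toy data: the samples labeled "B" and "C" have been swapped.
    genos = {"A": [0, 2, 0, 2], "B": [2, 0, 0, 2], "C": [0, 0, 2, 1]}
    reads = {
        "A": [(10, 0), (0, 8), (9, 1), (0, 7)],
        "B": [(10, 0), (9, 0), (0, 8), (4, 5)],
        "C": [(0, 10), (9, 0), (8, 0), (0, 6)],
    }
    print(flag_mixups(reads, genos))
```

In practice one would also want a threshold on the discordance of the labeled mouse and of the best match before calling a mix-up, since low host-read coverage makes both estimates noisy; the sketch omits that step.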