Mol Syst Biol. 2011 Mar 15;7:473.
doi: 10.1038/msb.2011.6.

Toward Molecular Trait-Based Ecology Through Integration of Biogeochemical, Geographical and Metagenomic Data

Free PMC article

Jeroen Raes et al. Mol Syst Biol.


Using metagenomic 'parts lists' to infer global patterns on microbial ecology remains a significant challenge. To deduce important ecological indicators such as environmental adaptation, molecular trait dispersal, diversity variation and primary production from the gene pool of an ecosystem, we integrated 25 ocean metagenomes with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the biomolecular repertoire of each sample and the main limiting factor on functional trait dispersal (absence of biogeographic provincialism). Molecular functional richness and diversity show a distinct latitudinal gradient peaking at 20° N and correlate with primary production. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes is an important quantitative readout for molecular trait-based biogeography and ecology.

Conflict of interest statement

The authors declare that they have no conflict of interest.


Figure 1
Correlations between metabolic pathway abundances and environmental conditions deduced from the ocean samples in this study, at various levels of model complexity (see Materials and methods): (A) 'One-to-one' pairwise correlation (P=0.001) between the abundance of photosystem I genes and average monthly water temperature. (B) 'One-to-many' linear model of average monthly water temperature, phosphate concentration and hours of sunlight correlating with carbon fixation gene abundance (R²=0.70). (C) 'Many-to-many' regularized canonical correlation analysis ordination plot showing the correlation between all environmental variables (text labels; see Materials and methods) and pathway modules (colored dots). The distance between two variables on the plot and their distance from the center point indicate the strength of their correlation and their contribution to explaining the global correlation (i.e., their structural correlation in each dimension given on the respective axes: first dimension, vertical; second, horizontal; see Materials and methods). The overall canonical correlation is high (canonical correlation=0.944 in the first dimension), and the first two dimensions explain 62 and 22% of the total environmental and metagenomic variation, respectively, emphasizing the strong correlation between the climatic factors and functional community composition on the first dimension. Module colors indicate their broad functional classes: yellow, amino acid metabolism; orange, central metabolism; red, energy metabolism; dark green, glycan metabolism; cyan, lipid metabolism; purple, metabolism of other molecules; blue, nucleotide metabolism; brown, replication and repair; light green, transcription; pink, translation; gray, transport system. Highlighted modules are described in more detail in the text. mld, mixed layer depth.
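As a rough illustration of the 'one-to-many' model in panel (B), the sketch below fits an ordinary least-squares model with three predictors and reports R². It is not the authors' code: the function names, the synthetic values standing in for temperature, phosphate and sunlight, and the coefficients used to generate the response are all assumptions made for the example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations act on both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_model(predictors, response):
    """Least squares with an intercept; returns (coefficients, r_squared)."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical data: 25 samples of (temperature, phosphate, sunlight hours).
rng = random.Random(0)
predictors = [
    (rng.uniform(5, 30), rng.uniform(0, 2), rng.uniform(8, 16)) for _ in range(25)
]
# Hypothetical response standing in for carbon fixation gene abundance.
response = [
    0.5 + 0.1 * t - 0.3 * p + 0.05 * s + rng.gauss(0, 0.2) for t, p, s in predictors
]
coef, r2 = fit_linear_model(predictors, response)
print("coefficients:", [round(c, 3) for c in coef])
print("R^2:", round(r2, 3))
```

Solving the normal equations directly keeps the example dependency-free; for real data a numerically more stable least-squares routine from a statistics library would be the usual choice.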
Figure 2
The role of environment in the biogeography of functional traits. (A) Coupling of metagenomic distance between samples (measured using KEGG metabolic pathway composition; see Materials and methods) with difference in climatological conditions, identifying climate as a primary determinant of function dispersion. (B) (Partial) Mantel tests (see Materials and methods) showing that this increase is not due to indirect effects, such as the similarity in environmental conditions between geographically close samples.
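The Mantel test mentioned in panel (B) can be sketched as a permutation test on the correlation between two distance matrices. The example below is a minimal, one-sided simple Mantel test, not the partial variant and not the authors' implementation; the function names and the sample values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided simple Mantel test; returns (correlation, p_value)."""
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute sample labels, i.e. rows and columns of dist_a together.
        rng.shuffle(order)
        permuted = [
            dist_a[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)
        ]
        if pearson(permuted, flat_b) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical per-sample values: water temperature and a one-dimensional
# summary of functional composition that loosely tracks temperature.
temperature = [4.0, 9.5, 12.0, 17.5, 21.0, 24.5, 26.0, 29.0]
composition = [0.10, 0.22, 0.19, 0.41, 0.47, 0.62, 0.58, 0.71]
climate_dist = [[abs(a - b) for b in temperature] for a in temperature]
functional_dist = [[abs(a - b) for b in composition] for a in composition]
r, p = mantel_test(functional_dist, climate_dist)
print("Mantel r:", round(r, 3), "p:", p)
```

Permuting rows and columns together preserves the dependence among distances that share a sample, which is why the Mantel test is used here instead of an ordinary correlation test on the flattened matrices.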
Figure 3
Variation of functional richness and diversity, and its coupling to primary production. (A) Global view of primary production for the sampled region (ocean coloring) and functional richness for GOS sampling locations used in this study (3D bars), showing the peak in richness at approximately 20° N (two outliers were removed for visualization purposes, but are present in B). (B) The full, quantitative data plotted against latitude. (C) Functional diversity negatively correlates with primary production (Spearman's ρ=−0.49; P=0.01). Trend lines are Lowess fitted lines with smoothing parameter f=0.7.
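Spearman's ρ, as reported in panel (C), is the Pearson correlation of the ranks of the two variables. The sketch below computes it with average ranks for ties; the function names and the example values are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: primary production and functional diversity per sample.
production = [210.0, 340.0, 295.0, 510.0, 620.0, 450.0, 780.0, 390.0]
diversity = [5.9, 5.7, 5.8, 5.2, 5.4, 5.6, 5.1, 5.5]
rho = spearman_rho(production, diversity)
print("Spearman rho:", round(rho, 3))
```

Because only ranks enter the calculation, the statistic is insensitive to monotone transformations of either variable, which suits abundance-type data with skewed distributions.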


