Nature, 473 (7346), 174-80

Enterotypes of the Human Gut Microbiome

Manimozhiyan Arumugam, Jeroen Raes, Eric Pelletier, Denis Le Paslier, Takuji Yamada, Daniel R Mende, Gabriel R Fernandes, Julien Tap, Thomas Bruls, Jean-Michel Batto, Marcelo Bertalan, Natalia Borruel, Francesc Casellas, Leyden Fernandez, Laurent Gautier, Torben Hansen, Masahira Hattori, Tetsuya Hayashi, Michiel Kleerebezem, Ken Kurokawa, Marion Leclerc, Florence Levenez, Chaysavanh Manichanh, H Bjørn Nielsen, Trine Nielsen, Nicolas Pons, Julie Poulain, Junjie Qin, Thomas Sicheritz-Ponten, Sebastian Tims, David Torrents, Edgardo Ugarte, Erwin G Zoetendal, Jun Wang, Francisco Guarner, Oluf Pedersen, Willem M de Vos, Søren Brunak, Joel Doré, MetaHIT Consortium, María Antolín, François Artiguenave, Hervé M Blottiere, Mathieu Almeida, Christian Brechot, Carlos Cara, Christian Chervaux, Antonella Cultrone, Christine Delorme, Gérard Denariaz, Rozenn Dervyn, Konrad U Foerstner, Carsten Friss, Maarten van de Guchte, Eric Guedon, Florence Haimet, Wolfgang Huber, Johan van Hylckama-Vlieg, Alexandre Jamet, Catherine Juste, Ghalia Kaci, Jan Knol, Omar Lakhdari, Severine Layec, Karine Le Roux, Emmanuelle Maguin, Alexandre Mérieux, Raquel Melo Minardi, Christine M'rini, Jean Muller, Raish Oozeer, Julian Parkhill, Pierre Renault, Maria Rescigno, Nicolas Sanchez, Shinichi Sunagawa, Antonio Torrejon, Keith Turner, Gaetana Vandemeulebrouck, Encarna Varela, Yohanan Winogradsky, Georg Zeller, Jean Weissenbach, S Dusko Ehrlich, Peer Bork

Manimozhiyan Arumugam et al. Nature.

Erratum in

  • Nature. 2011 Jun 30;474(7353):666
  • Nature. 2014 Feb 27;506(7489):516


Our knowledge of species and functional composition of the human gut microbiome is rapidly increasing, but it is still based on very few cohorts and little is known about variation across the world. By combining 22 newly sequenced faecal metagenomes of individuals from four countries with previously published data sets, here we identify three robust clusters (referred to as enterotypes hereafter) that are not nation or continent specific. We also confirmed the enterotypes in two published, larger cohorts, indicating that intestinal microbiota variation is generally stratified, not continuous. This indicates further the existence of a limited number of well-balanced host-microbial symbiotic states that might respond differently to diet and drug intake. The enterotypes are mostly driven by species composition, but abundant molecular functions are not necessarily provided by abundant species, highlighting the importance of a functional analysis to understand microbial communities. Although individual host properties such as body mass index, age, or gender cannot explain the observed enterotypes, data-driven marker genes or functional modules can be identified for each of these host properties. For example, twelve genes significantly correlate with age and three functional modules with the body mass index, hinting at a diagnostic potential of microbial markers.


Fig. 1. Functional and phylogenetic profiles of human gut microbiome
(a) Simulation of the detection of distinct orthologous groups (OGs) when increasing the number of individuals (samples). Complete genomes were classified by habitat information, and the OGs were divided into those that occur in known gut species (red) and those that have not yet been associated with the gut (blue). The former are close to saturation when sampling 35 individuals (excluding infants), whereas functions from non-gut (probably rare and transient) species are not. (b) Genus abundance variation box plot for the 30 most abundant genera as determined by read abundance. Genera are colored by their respective phylum (see inset for color key). Inset: phylum abundance box plot. Genus- and phylum-level abundances were measured using reference-genome-based mapping with 85% and 65% sequence similarity cutoffs, respectively. Unclassified genera under a higher rank are marked by asterisks. (c) Orthologous group (OG) abundance variation box plot for the 30 most abundant OGs as determined by assignment to eggNOG. OGs are colored by their respective functional category (see inset for color key). Inset: abundance box plot of 24 functional categories.
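The saturation analysis in panel (a) can be illustrated with an accumulation curve: samples are added in random order and the number of distinct OGs seen so far is averaged over many orderings. The following is only a minimal sketch under stated assumptions; the function name `accumulation_curve`, the permutation count, and the toy OG sets are hypothetical and do not reproduce the authors' simulation.

```python
import random


def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean number of distinct OGs observed after adding 1..N samples.

    `samples` is a list of sets of OG identifiers, one set per individual.
    Averaging over random orderings smooths out the effect of sample order.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = [0.0] * len(samples)
    for _ in range(n_permutations):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, ogs in enumerate(order):
            seen |= ogs              # union with the OGs of the next sample
            totals[i] += len(seen)   # distinct OGs after i + 1 samples
    return [t / n_permutations for t in totals]


# Hypothetical toy data: four individuals, five distinct OGs overall.
samples = [
    {"COG0001", "COG0002", "COG0003"},
    {"COG0002", "COG0003", "COG0004"},
    {"COG0001", "COG0003", "COG0005"},
    {"COG0002", "COG0003"},
]
curve = accumulation_curve(samples)
print(curve)
```

A curve that flattens as samples are added suggests saturation, as the caption reports for OGs from known gut species; a curve that keeps rising suggests that more individuals would still reveal new functions.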
Fig. 2. Phylogenetic differences between enterotypes
Between-class analysis, which visualizes results from Principal Component Analysis and clustering, of the genus compositions of (a) 33 Sanger metagenomes estimated by mapping the metagenome reads to 1511 reference genome sequences using an 85% similarity threshold, (b) the Danish subset containing 85 metagenomes from a published Illumina dataset and (c) 154 pyrosequencing-based 16S sequences reveals three robust clusters that we call enterotypes. Two principal components are plotted using the ade4 package in R, with each sample represented by a filled circle. The center of gravity for each cluster is marked by a rectangle and the colored ellipse covers 67% of the samples belonging to the cluster. (d) Abundances of the main contributors of each enterotype from the Sanger metagenomes. (e) Co-occurrence networks of the three enterotypes from the Sanger metagenomes. Unclassified genera under a higher rank are marked by asterisks in (b) and (e).
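Clustering samples by genus composition requires a distance between abundance profiles. The caption does not say which distance was used, so the sketch below assumes the Jensen-Shannon distance, a common choice for compositional data. The helper names and the three-genus toy profiles are hypothetical.

```python
import math


def normalize(counts):
    """Convert raw read counts into relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence; ranges from 0 to 1."""
    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-abundance terms.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # midpoint distribution
    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


# Hypothetical read counts for three genera (A, B, C) in three samples.
profiles = [normalize(c) for c in ([80, 10, 10], [75, 15, 10], [10, 80, 10])]

# Pairwise distance matrix that a clustering method could take as input.
distances = [[jensen_shannon_distance(p, q) for q in profiles] for p in profiles]
for row in distances:
    print([round(d, 3) for d in row])
```

In this toy example the first two samples, both dominated by genus A, are closer to each other than either is to the third, which is the kind of structure a clustering step would pick up.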
Fig. 3. Functional differences between enterotypes
(a) Between-class analysis (see Fig. 2) of orthologous group (OG) abundances showing only minor disagreements with enterotypes (transparent circles indicate the differing samples). The blue cloud represents the local density estimated from the coordinates of OGs; positions of selected OGs are highlighted. (b) Four enzymes in the biotin biosynthesis pathway (COG0132, COG0156, COG0161 and COG0502) are overrepresented in enterotype 1. (c) Four enzymes in the thiamine biosynthesis pathway (COG0422, COG0351, COG0352 and COG0611) are overrepresented in enterotype 2. (d) Six enzymes in the heme biosynthesis pathway (COG0007, COG0276, COG0407, COG0408, COG0716 and COG1648) are overrepresented in enterotype 3.
Fig. 4. Correlations with host properties
(a) Pairwise correlation of RNA polymerase facultative sigma24 subunit (COG1595) with age (p=0.03, rho=−0.59). (b) Pairwise correlation of SusD, a family of proteins that bind glycan molecules before they are transported into the cell, and body mass index (p=0.27, rho=−0.29, weak correlation). (c) Multiple OGs (COG0085, COG0086, COG0438 and COG0739; see Supplementary Table 18) significantly correlating with age when combined into a linear model (see Supplementary Methods Section 13 and ref. for details; p=2.75e-05, adjusted R²=0.57). (d) Two modules, ATPase complex and ectoine biosynthesis (M00051), significantly correlating with BMI when combined into a linear model (p=6.786e-06, adjusted R²=0.82).
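The two statistics reported in this caption can be sketched in a few lines. The code assumes that rho denotes Spearman's rank correlation, which the caption does not state, and uses the standard adjusted R² formula for a linear model with a given number of predictors. All data values and function names below are hypothetical illustrations, not the study's measurements.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def adjusted_r2(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors in the linear model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical toy data: OG abundance falling monotonically with age.
age = [20, 30, 40, 50, 60]
abundance = [5.0, 4.0, 3.0, 2.0, 1.0]
print(spearman_rho(age, abundance))

# Hypothetical model fit: R^2 = 0.5 with 2 predictors and 11 samples.
print(adjusted_r2(0.5, n_samples=11, n_predictors=2))
```

Because adjusted R² shrinks as predictors are added relative to the sample size, it is a more conservative summary than raw R² for multi-marker models such as those in panels (c) and (d).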
