The microbes that inhabit particular environments must be able to perform molecular functions that provide them with a competitive advantage to thrive in those environments. As most molecular functions are performed by proteins and are conserved between related proteins, we can expect that organisms successful in a given environmental niche would contain protein families that are specific for functions that are important in that environment. For instance, the human gut is rich in polysaccharides from the diet or secreted by the host, and is dominated by Bacteroides, whose genomes contain highly expanded repertoire of protein families involved in carbohydrate metabolism. To identify other protein families that are specific to this environment, we investigated the distribution of protein families in the currently available human gut genomic and metagenomic data. Using an automated procedure, we identified a group of protein families strongly overrepresented in the human gut. These not only include many families described previously but also, interestingly, a large group of previously unrecognized protein families, which suggests that we still have much to discover about this environment. The identification and analysis of these families could provide us with new information about an environment critical to our health and well being.