Background: The extremely halophilic archaea are present worldwide in saline environments and have important biotechnological applications. Ten complete genomes of haloarchaea are now available, providing an opportunity for comparative analysis.
Methodology/Principal Findings: We report here a comparative analysis of five newly sequenced haloarchaeal genomes together with five previously published ones. Whole-genome trees based on protein sequences provide strong support for deep relationships among the ten organisms. Using a soft clustering approach, we identified 887 protein clusters present in all ten halophiles. Of these core clusters, 112 are not found in any other archaea and therefore constitute the haloarchaeal signature. Four of the halophiles were isolated from water, and four were isolated from soil or sediment. Although there are few habitat-specific clusters, the soil/sediment halophiles tend to have greater capacity for polysaccharide degradation, siderophore synthesis, and cell wall modification. Halorhabdus utahensis and Haloterrigena turkmenica each encode more than forty glycosyl hydrolases and may be capable of breaking down naturally occurring complex carbohydrates. H. utahensis is specialized for growth on carbohydrates and has few amino acid degradation pathways. It uses the non-oxidative pentose phosphate pathway instead of the oxidative pathway, giving it more flexibility in the metabolism of pentoses.
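As an illustration only, the distinction between core clusters and signature clusters can be sketched as set operations over cluster membership. This is not the soft clustering procedure used in the study. It assumes the clustering has already been done and that each cluster is summarized by the set of genomes contributing at least one protein to it. The function names, genome labels, and cluster IDs below are hypothetical.

```python
from typing import Dict, Set


def core_clusters(cluster_genomes: Dict[str, Set[str]],
                  halophiles: Set[str]) -> Set[str]:
    """Return IDs of clusters that contain at least one protein
    from every halophile genome."""
    return {cid for cid, genomes in cluster_genomes.items()
            if halophiles <= genomes}


def signature_clusters(cluster_genomes: Dict[str, Set[str]],
                       halophiles: Set[str],
                       other_archaea: Set[str]) -> Set[str]:
    """Return core clusters that have no member from any
    non-halophilic archaeal genome."""
    core = core_clusters(cluster_genomes, halophiles)
    return {cid for cid in core
            if not (cluster_genomes[cid] & other_archaea)}


# Toy data (hypothetical): three halophiles, two other archaea.
halophiles = {"H1", "H2", "H3"}
other_archaea = {"A1", "A2"}
cluster_genomes = {
    "c1": {"H1", "H2", "H3"},        # core and signature
    "c2": {"H1", "H2", "H3", "A1"},  # core, but shared with another archaeon
    "c3": {"H1", "H2"},              # not core: missing H3
}

core = core_clusters(cluster_genomes, halophiles)
signature = signature_clusters(cluster_genomes, halophiles, other_archaea)
print(sorted(core))       # ['c1', 'c2']
print(sorted(signature))  # ['c1']
```

Because soft clustering allows a protein to belong to more than one cluster, membership is recorded here per cluster as a set of genomes rather than as a single cluster label per protein.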
Conclusions: These new genomes expand our understanding of haloarchaeal catabolic pathways, providing a basis for further experimental analysis, especially with regard to carbohydrate metabolism. Halophilic glycosyl hydrolases for use in biofuel production are more likely to be found in halophiles isolated from soil or sediment.