Function prediction and analysis of mycobacterium tuberculosis hypothetical proteins

Int J Mol Sci. 2012;13(6):7283-7302. doi: 10.3390/ijms13067283. Epub 2012 Jun 13.

Abstract

High-throughput biology technologies have yielded complete genome sequences and functional genomics data for several organisms, including crucial microbial pathogens of humans, animals and plants. However, up to 50% of genes within a genome are often labeled "unknown", "uncharacterized" or "hypothetical", limiting our understanding of virulence and pathogenicity of these organisms. Even though biological functions of proteins encoded by these genes are not known, many of them have been predicted to be involved in key processes in these organisms. In particular, for Mycobacterium tuberculosis, some of these "hypothetical" proteins, for example those belonging to the Pro-Glu or Pro-Pro-Glu (PE/PPE) family, have been suspected to play a crucial role in the intracellular lifestyle of this pathogen, and may contribute to its survival in different environments. We have generated a functional interaction network for Mycobacterium tuberculosis proteins and used this to predict functions for many of its hypothetical proteins. Here we performed functional enrichment analysis of these proteins based on their predicted biological functions to identify annotations that are statistically relevant, and analysed and compared network properties of hypothetical proteins to the known proteins. From the statistically significant annotations and network information, we have tried to derive biologically meaningful annotations related to infection and disease. This quantitative analysis provides an overview of the functional contributions of Mycobacterium tuberculosis "hypothetical" proteins to many basic cellular functions, including its adaptability in the host system and its ability to evade the host immune response.

Keywords: function prediction; genome analysis; hypothetical protein; interaction network; tuberculosis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism*
  • Genome, Bacterial / physiology*
  • Models, Biological*
  • Molecular Sequence Annotation*
  • Mycobacterium tuberculosis / genetics
  • Mycobacterium tuberculosis / metabolism*

Substances

  • Bacterial Proteins