Using OrthoMCL to assign proteins to OrthoMCL-DB groups or to cluster proteomes into new ortholog groups

Curr Protoc Bioinformatics. 2011 Sep:Chapter 6:6.12.1-6.12.19. doi: 10.1002/0471250953.bi0612s35.


OrthoMCL is an algorithm for grouping proteins into ortholog groups based on their sequence similarity. OrthoMCL-DB is a public database that allows users to browse and view ortholog groups that were pre-computed using the OrthoMCL algorithm. Version 4 of this database contained 116,536 ortholog groups clustered from 1,270,853 proteins obtained from 88 eukaryotic genomes, 16 archaean genomes, and 34 bacterial genomes. Future versions of OrthoMCL-DB will include more proteomes as more genomes are sequenced. Here, we describe how you can group your proteins of interest into ortholog clusters using two different means provided by the OrthoMCL system. The OrthoMCL-DB Web site has a tool for uploading and grouping a set of protein sequences, typically representing a proteome. This method maps the uploaded proteins to existing groups in OrthoMCL-DB. Alternatively, if you have proteins from a set of genomes that need to be grouped, you can download, install, and run the stand-alone OrthoMCL software.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Databases, Genetic*
  • Proteins / classification*
  • Proteins / genetics
  • Proteome / classification*
  • Proteome / genetics
  • Software*
  • Species Specificity


  • Proteins
  • Proteome