Tethering preferences of domain families co-occurring in multi-domain proteins

Mol Biosyst. 2013 Jul;9(7):1708-25. doi: 10.1039/c3mb25481j. Epub 2013 Apr 9.


Genomic data from several organisms have revealed a vast repertoire of multi-domain proteins. The role played by each individual domain in a multi-domain protein has a profound influence on the overall function of the protein. The present analysis attempts to better understand the tethering preferences of domain families that occur in multi-domain proteins. The analysis was carried out on an exhaustive dataset of 2 961 898 protein sequences from 930 organisms, of which 741 274 proteins comprise at least two domain families. For every domain family, the number of other domain families with which it co-occurs within a protein in this dataset was enumerated; this count is referred to as the tethering number of the domain family. In the general dataset, the AAA ATPase family and the family of Ser/Thr kinases have the highest tethering numbers, 450 and 444 respectively. Further analysis reveals a significant correlation between the number of members in a family and its tethering number. A positive correlation was also observed between the extent of sequence and functional diversity within a family and its tethering number. Domain families that are present ubiquitously in diverse organisms tend to have large tethering numbers, while organism/kingdom-specific families have low tethering numbers. Thus, the analysis uncovers how domain families recombine and evolve to give rise to multi-domain proteins.
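
The tethering number defined in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `tethering_numbers`, the input layout (a mapping from protein identifier to its list of domain-family labels), and the toy family labels are all assumptions made for illustration. The sketch also assumes that repeated copies of one family within a protein do not make that family its own partner, in line with the abstract's wording "other domain families".

```python
from collections import defaultdict


def tethering_numbers(protein_domains):
    """Return {family: number of distinct other families it co-occurs with}.

    protein_domains: dict mapping a protein ID to a list of domain-family
    labels found in that protein (hypothetical input layout).
    """
    partners = defaultdict(set)
    for families in protein_domains.values():
        unique = set(families)  # collapse repeated domains of one family
        for fam in unique:
            # every other family in the same protein is a tethering partner
            partners[fam].update(unique - {fam})
    return {fam: len(others) for fam, others in partners.items()}


# Toy data: protein IDs and family labels are illustrative placeholders.
toy = {
    "P1": ["AAA", "Pkinase"],
    "P2": ["AAA", "SH3", "AAA"],
    "P3": ["Pkinase", "SH2", "SH3"],
    "P4": ["SH2"],
}

for family, count in sorted(tethering_numbers(toy).items()):
    print(family, count)
```

In this toy set, a family seen only in single-domain proteins would receive a tethering number of 0. Using one set of partners per family keeps the count restricted to distinct families, regardless of how many proteins share the same pairing.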

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Evolution, Molecular
  • Protein Interaction Domains and Motifs*
  • Protein Multimerization*
  • Proteins / chemistry*
  • Proteins / metabolism


Substances

  • Proteins