Taxonomic distribution, repeats, and functions of the S1 domain-containing proteins as members of the OB-fold family

Proteins. 2017 Apr;85(4):602-613. doi: 10.1002/prot.25237. Epub 2017 Feb 3.


Proteins of the nucleic acid-binding protein superfamily perform functions such as processing, transport, storage, stretching, translation, and degradation of RNA. This superfamily is one of the 16 superfamilies whose protein structures contain the OB-fold. Here, we analyzed the superfamily of nucleic acid-binding proteins (more than 200,000 sequences) and found that it consists predominantly of proteins containing the cold shock DNA-binding domain (ca. 131,000 protein sequences). Proteins containing the S1 domain make up 57% of the cold shock DNA-binding domain family. Furthermore, among the proteins available in the UniProt database, the S1 domain was identified mainly in bacterial proteins (ca. 83%) rather than in eukaryotic and archaeal proteins. We also found that the number of S1 domain repeats in S1 domain-containing proteins depends on taxonomic affiliation. All archaeal proteins contain one copy of the S1 domain, whereas the number of repeats in eukaryotic proteins varies between 1 and 15 and correlates with protein size. In bacterial proteins, the number of repeats is no more than 6, regardless of protein size. The large variation in the repeat number of the S1 domain, one of the structural variants of the OB-fold, is a distinctive feature of S1 domain-containing proteins. Proteins from the other families and superfamilies either have a single OB-fold or show only slight variation in repeat number. On the whole, it can be supposed that the repeat number is vital for the multifunctional activity of S1 domain-containing proteins. Proteins 2017; 85:602-613. © 2016 Wiley Periodicals, Inc.
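
The percentages quoted in the abstract can be turned into rough sequence counts. The short Python sketch below does only that arithmetic. It assumes that the 57% figure applies to the ca. 131,000 cold shock DNA-binding domain sequences and that the ca. 83% bacterial share applies to the resulting set of S1 domain-containing proteins. The function and variable names are illustrative, and the results are order-of-magnitude estimates derived from rounded values, not counts reported by the authors.

```python
# Approximate figures quoted in the abstract (all are "ca." values).
COLD_SHOCK_FAMILY = 131_000  # sequences with the cold shock DNA-binding domain
S1_SHARE = 0.57              # share of that family containing the S1 domain
BACTERIAL_SHARE = 0.83       # assumed share of S1 proteins that are bacterial


def approximate_counts(family_size, s1_share, bacterial_share):
    """Return rough sequence counts implied by the rounded percentages."""
    s1_total = round(family_size * s1_share)
    bacterial = round(s1_total * bacterial_share)
    return {
        "s1_total": s1_total,
        "bacterial": bacterial,
        # Eukaryotic and archaeal proteins combined (no split is given).
        "non_bacterial": s1_total - bacterial,
    }


counts = approximate_counts(COLD_SHOCK_FAMILY, S1_SHARE, BACTERIAL_SHARE)
for label, value in counts.items():
    print(f"{label}: ~{value:,}")
```

Under these assumptions, the S1 domain-containing set comes to roughly 75,000 sequences, of which about 62,000 would be bacterial.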

Keywords: OB-fold; S1 domain; nucleic acid-binding proteins; structural repeats; taxonomic distribution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / classification
  • Archaea / genetics
  • Archaea / metabolism
  • Archaeal Proteins / chemistry*
  • Archaeal Proteins / genetics
  • Bacteria / classification
  • Bacteria / genetics
  • Bacteria / metabolism
  • Bacterial Proteins / chemistry*
  • Bacterial Proteins / genetics
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / genetics
  • Databases, Protein
  • Eukaryota / classification
  • Eukaryota / genetics
  • Eukaryota / metabolism
  • Heat-Shock Proteins / chemistry*
  • Heat-Shock Proteins / genetics
  • Phylogeny
  • Protein Domains
  • Protein Interaction Domains and Motifs
  • Protein Structure, Secondary
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / genetics
  • Repetitive Sequences, Amino Acid


Substances

  • Archaeal Proteins
  • Bacterial Proteins
  • DNA-Binding Proteins
  • Heat-Shock Proteins
  • RNA-Binding Proteins