Proteins of the nucleic acid-binding proteins superfamily perform such functions as processing, transport, storage, stretching, translation, and degradation of RNA. It is one of the 16 superfamilies containing the OB-fold in protein structures. Here, we have analyzed the superfamily of nucleic acid-binding proteins (the number of sequences exceeds 200,000) and obtained that this superfamily prevalently consists of proteins containing the cold shock DNA-binding domain (ca. 131,000 protein sequences). Proteins containing the S1 domain compose 57% from the cold shock DNA-binding domain family. Furthermore, we have found that the S1 domain was identified mainly in the bacterial proteins (ca. 83%) compared to the eukaryotic and archaeal proteins, which are available in the UniProt database. We have found that the number of multiple repeats of S1 domain in the S1 domain-containing proteins depends on the taxonomic affiliation. All archaeal proteins contain one copy of the S1 domain, while the number of repeats in the eukaryotic proteins varies between 1 and 15 and correlates with the protein size. In the bacterial proteins, the number of repeats is no more than 6, regardless of the protein size. The large variation of the repeat number of S1 domain as one of the structural variants of the OB-fold is a distinctive feature of S1 domain-containing proteins. Proteins from the other families and superfamilies have either one OB-fold or change slightly the repeat numbers. On the whole, it can be supposed that the repeat number is a vital for multifunctional activity of the S1 domain-containing proteins. Proteins 2017; 85:602-613. © 2016 Wiley Periodicals, Inc.
Keywords: OB-fold; S1 domain; nucleic acid-binding proteins; structural repeats; taxonomic distribution.
© 2017 Wiley Periodicals, Inc.