Background: Gene expression is tightly regulated at both the transcriptional and post-transcriptional levels. RNA-binding proteins mediate post-transcriptional gene regulation and participate in a variety of functions, such as splicing, alternative splicing, nuclear import and export of mRNA, RNA stability and translation. Several well-characterized RNA-binding motifs occur across a genome, including the RNA recognition motif (RRM), the KH domain and zinc fingers. In the present study, we investigated the human genome for RRM-containing gene products, starting from the RRM domains in the Pfam (protein families database) repository.
Results: Pfam records seven families of RRM-containing proteins. We studied these families for their taxonomic representation, sequence features (identity, length, phylogeny) and structural properties (mapping of conservation onto the structures). We then searched the Homo sapiens genome for RRM-containing gene products and identified 928 of them. These were studied for their predicted domain architectures, biological processes, involvement in pathways, disease relevance and disorder content. RRM domains were observed to occur multiple times in a single polypeptide. In addition, 56 other domains co-exist with RRMs and are involved in different regulatory functions. Further, functional enrichment analysis revealed that RRM-containing gene products are mainly involved in biological functions such as mRNA splicing and its regulation.
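The domain-architecture tally described above can be illustrated with a minimal sketch. It assumes that domain assignments are already available as an ordered list of domain names per protein; the protein identifiers, the domain lists and the function name `summarize_rrm_architectures` are hypothetical and are not taken from the study's data or pipeline.

```python
from collections import Counter

# Hypothetical toy input: protein id -> ordered list of predicted domains.
# These are illustrative placeholders, not data from the study.
architectures = {
    "protA": ["RRM", "RRM", "KH"],
    "protB": ["RRM"],
    "protC": ["zf-CCCH", "RRM", "RRM", "RRM"],
    "protD": ["KH"],
}

def summarize_rrm_architectures(archs, target="RRM"):
    """Tally target-domain copy numbers and co-occurring domains.

    Returns two Counters:
      - copy_number: number of proteins carrying n copies of the target domain
      - co_occurring: number of target-containing proteins in which each
        other domain appears at least once
    """
    copy_number = Counter()
    co_occurring = Counter()
    for domains in archs.values():
        n = domains.count(target)
        if n == 0:
            continue  # skip proteins without the target domain
        copy_number[n] += 1
        # Count each partner domain once per protein.
        co_occurring.update(set(domains) - {target})
    return copy_number, co_occurring

copy_number, co_occurring = summarize_rrm_architectures(architectures)
```

Counting each partner domain once per protein keeps the co-occurrence tally comparable across proteins with repeated domains; a per-copy count would be an equally reasonable alternative.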
Conclusions: Our sequence analysis identified RRM-containing gene products in the human genome and provided insights into their domain architectures and biological functions. Because mRNA splicing and gene regulation are central to the cellular machinery, this analysis offers an early overview of the genes that carry out these functions.