Genome-wide survey of DNA-binding proteins in Arabidopsis thaliana: analysis of distribution and functions

Nucleic Acids Res. 2013 Aug;41(15):7212-9. doi: 10.1093/nar/gkt505. Epub 2013 Jun 17.

Abstract

The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. A comprehensive survey of sequenced genomes for DNA-binding proteins (DBPs) helps in understanding their distribution and associated functions in a particular genome. The availability of the fully sequenced genome of Arabidopsis thaliana enables a review of the distribution of DBPs in this model plant genome. We used profiles of both structure-based and sequence-based DNA-binding families, derived from the PDB and PFam databases, to perform the survey. This identified 4471 proteins as DNA-binding in the Arabidopsis genome, distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes with a role in plant DNA repair was studied in particular and noted for functional mapping. The functions observed to be overrepresented in the plant genome include DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.
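
The bookkeeping behind a count such as "proteins identified, distributed across families" can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the profile search has already been reduced to (protein, family) pairs, and the identifiers, the `hits` list and the `summarize` helper are hypothetical placeholders rather than data or code from the study.

```python
from collections import defaultdict

# Hypothetical (protein_id, family_id) hits, e.g. parsed from a profile search.
# These pairs are made-up placeholders, not data from the study.
hits = [
    ("protein_A", "family_1"),
    ("protein_A", "family_2"),  # one protein may match more than one family
    ("protein_B", "family_1"),
    ("protein_C", "family_3"),
    ("protein_B", "family_1"),  # duplicate hit, counted only once
]


def summarize(hits):
    """Return (number of distinct proteins, {family: number of distinct proteins})."""
    members = defaultdict(set)
    for protein, family in hits:
        members[family].add(protein)
    proteins = {protein for protein, _ in hits}
    per_family = {family: len(ps) for family, ps in members.items()}
    return len(proteins), per_family


n_proteins, per_family = summarize(hits)
print(f"{n_proteins} proteins across {len(per_family)} families")

# List families from most to least represented (ties broken by name).
for family, count in sorted(per_family.items(), key=lambda kv: (-kv[1], kv[0])):
    print(family, count)
```

Using sets means each protein is counted once per family and once overall, so a multi-domain protein does not inflate the total.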

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis / metabolism*
  • Arabidopsis Proteins / genetics
  • Arabidopsis Proteins / metabolism*
  • DNA Repair Enzymes / genetics
  • DNA Repair Enzymes / metabolism
  • DNA Repair*
  • DNA, Plant / genetics
  • DNA, Plant / metabolism
  • DNA-Binding Proteins / genetics
  • DNA-Binding Proteins / metabolism*
  • Gene Expression Regulation, Plant
  • Genome, Plant*
  • Molecular Sequence Annotation
  • Multiprotein Complexes / genetics
  • Multiprotein Complexes / metabolism
  • Protein Binding
  • Proteome / analysis

Substances

  • Arabidopsis Proteins
  • DNA, Plant
  • DNA-Binding Proteins
  • Multiprotein Complexes
  • Proteome
  • DNA Repair Enzymes