In Arabidopsis thaliana, 1% of the genome codes for a novel protein family unique to plants

Plant Mol Biol. 2000 Mar;42(4):603-13. doi: 10.1023/a:1006352315928.

Abstract

In the sequences released by the Arabidopsis Genome Initiative (AGI), we discovered a new and unexpectedly large family of orphan genes (127 genes as of 01.08.99), named AtPCMP. The distribution of the AtPCMP genes across the five chromosomes suggests that the genome of Arabidopsis thaliana contains more than 200 genes of this family (1% of the whole genome). The deduced AtPCMP proteins are characterized by a surprising combinatorial organization of sequence motifs. The amino-terminal domain consists of a succession of three conserved motifs that generate considerable diversity. The proteins are classified into three subfamilies according to the length and nature of their carboxy-terminal domain, which comprises 1-6 motifs. All the motifs characterized are highly conserved in both sequence and spacing. A specific signature of this large family is defined. The presence of ESTs in databases and the detection of clones in A. thaliana cDNA libraries indicate that most of the genes of this family are expressed. The absence of similar sequences outside the plant kingdom strongly suggests that this unusually large orphan family is unique to plants. The features, origin, potential function, and evolution of this combinatorial and modular plant protein family are discussed.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Arabidopsis / genetics*
  • Chromosome Mapping
  • DNA, Complementary / chemistry
  • DNA, Complementary / genetics
  • Databases, Factual
  • Gene Expression
  • Genome, Plant*
  • Molecular Sequence Data
  • Multigene Family*
  • Plant Proteins / classification
  • Plant Proteins / genetics*
  • RNA, Plant / genetics
  • RNA, Plant / metabolism
  • Sequence Analysis, DNA
  • Tissue Distribution

Substances

  • DNA, Complementary
  • Plant Proteins
  • RNA, Plant

Associated data

  • GENBANK/AJ006040
  • GENBANK/AJ006041
  • GENBANK/AJ006042
  • GENBANK/AJ006043