An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors

J Mol Biol. 2001 Mar 30;307(3):799-813. doi: 10.1006/jmbi.2001.4520.


We have developed a comprehensive expressed sequence tag database search method and used it to identify new members of the G-protein coupled receptor superfamily. The approach proved especially useful for detecting expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for identifying, in the expressed sequence tag database, members of divergent protein families or protein regions without conserved domain structures. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily: GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes using in silico methods. A cluster of six closely related G-protein coupled receptors was found on human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and might therefore have related ligands. In conclusion, we describe a data mining procedure that proved useful for the identification and initial characterization of new genes and is readily applicable to other gene families.

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Chromosomes, Human, Pair 3 / genetics
  • Cloning, Molecular / methods*
  • Conserved Sequence
  • Databases as Topic
  • Exons / genetics
  • Expressed Sequence Tags*
  • Gene Expression Profiling
  • Heterotrimeric GTP-Binding Proteins / metabolism*
  • Humans
  • Introns / genetics
  • Ligands
  • Mice
  • Molecular Sequence Data
  • Multigene Family / genetics
  • Phylogeny
  • Physical Chromosome Mapping
  • RNA, Messenger / analysis
  • RNA, Messenger / genetics
  • Receptors, Cell Surface / chemistry
  • Receptors, Cell Surface / genetics*
  • Receptors, Cell Surface / metabolism*
  • Receptors, G-Protein-Coupled / chemistry
  • Receptors, G-Protein-Coupled / genetics*
  • Receptors, G-Protein-Coupled / metabolism*
  • Receptors, Purinergic P2 / chemistry
  • Receptors, Purinergic P2 / genetics
  • Receptors, Purinergic P2 / metabolism
  • Receptors, Purinergic P2Y1
  • Sequence Alignment
  • Uridine Diphosphate Glucose / metabolism


Substances

  • Ligands
  • P2RY1 protein, human
  • P2ry1 protein, mouse
  • RNA, Messenger
  • Receptors, Cell Surface
  • Receptors, G-Protein-Coupled
  • Receptors, Purinergic P2
  • Receptors, Purinergic P2Y1
  • SUCNR1 protein, human
  • Heterotrimeric GTP-Binding Proteins
  • Uridine Diphosphate Glucose

Associated data

  • GENBANK/AF237762
  • GENBANK/AF237763
  • GENBANK/AF272948
  • GENBANK/AF295365
  • GENBANK/AF295366
  • GENBANK/AF295367
  • GENBANK/AF295368