Identification of new human cadherin genes using a combination of protein motif search and gene finding methods

J Mol Biol. 2004 Mar 19;337(2):307-17. doi: 10.1016/j.jmb.2004.01.026.

Abstract

We combined protein motif search and gene finding methods to identify genes encoding proteins that contain specific domains. In particular, we focused on finding new human genes of the cadherin superfamily, whose proteins represent a major group of cell-cell adhesion receptors contributing to embryonic neuronal morphogenesis. Models for three cadherin protein motifs were generated from over 100 previously annotated cadherin domains and used to search the complete translated human genome. The genomic regions containing motif "hits" were analyzed with the eukaryotic version of GeneMark.hmm to identify the exon-intron structure of new genes. Three new genes, CDH-J, PCDH-J and FAT-J, were found. The predicted proteins PCDH-J and FAT-J were classified into the protocadherin and FAT-like subfamilies, respectively, based on the number and organization of cadherin domains and the presence of subfamily-specific conserved amino acid residues. FAT-J expression was detected in almost all tissues tested. The exon-intron organization of CDH-J was verified experimentally by PCR with specifically designed primers, and its tissue-specific expression was demonstrated. The described methodology can be applied to discover new genes encoding proteins from families with well-characterized structural and functional domains.
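The search step described above (scanning the translated genome for motif hits, then handing the surrounding genomic regions to a gene finder) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `six_frame_hits` and `hit_regions` are hypothetical, a plain regular expression stands in for the motif models (the abstract does not specify their form), and the `flank` padding is an arbitrary parameter. It is not the authors' implementation and does not call GeneMark.hmm.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame):
    """Translate one reading frame (offset 0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def six_frame_hits(dna, pattern):
    """Scan all six translated frames for a protein-level pattern.

    Returns (strand, frame, start, end) tuples, where start and end are
    0-based, end-exclusive coordinates on the forward strand.
    `pattern` is a hypothetical stand-in for a real motif model.
    """
    n = len(dna)
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq, frame)
            for m in re.finditer(pattern, protein):
                start = frame + 3 * m.start()
                end = frame + 3 * m.end()
                if strand == "-":
                    # Map reverse-strand coordinates back to the forward strand.
                    start, end = n - end, n - start
                hits.append((strand, frame, start, end))
    return hits


def hit_regions(hits, seq_len, flank):
    """Pad each hit by `flank` bases and merge overlapping windows.

    The merged windows are the candidate regions one would pass on to a
    gene finder; `flank` is an illustrative parameter, not a published value.
    """
    windows = sorted(
        (max(0, s - flank), min(seq_len, e + flank)) for _, _, s, e in hits
    )
    merged = []
    for s, e in windows:
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    return [tuple(w) for w in merged]
```

A call such as `hit_regions(six_frame_hits(contig, r"D.NDN"), len(contig), 5000)` would yield padded, non-overlapping windows around each hit, where `contig` and the toy pattern `D.NDN` are placeholders. Scanning translated frames rather than annotated proteins is what lets this kind of search reach genes that have not yet been annotated.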

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Base Sequence
  • Cadherins / chemistry*
  • Cadherins / genetics*
  • Consensus Sequence
  • DNA Primers / genetics
  • Expressed Sequence Tags
  • Genome, Human
  • Humans
  • Molecular Sequence Data
  • Multigene Family
  • Protein Structure, Tertiary
  • Sequence Alignment / methods
  • Sequence Homology, Amino Acid

Substances

  • Cadherins
  • DNA Primers

Associated data

  • GENBANK/AY354497
  • GENBANK/AY354498
  • GENBANK/AY356402