Protein-based analysis of alternative splicing in the human genome

Proc IEEE Comput Soc Bioinform Conf. 2002:1:118-24.


Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge in molecular biology. Homology-based protein sequence analysis methods should make it possible to investigate how transcript diversity affects protein structure and function. To test this, we developed a data mining technique ("DiffHit") that identifies and catalogs genes whose protein isoforms exhibit distinct profiles of conserved protein motifs. In a test set of over 1,300 alternatively spliced genes with solved genomic structure, over 30% exhibited a differential profile of conserved InterPro and/or Blocks protein motifs across distinct isoforms. These results suggest that motif databases such as Blocks and InterPro are potentially useful tools for investigating how alternative transcript structure affects gene function.
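
The core idea of comparing motif profiles across isoforms can be illustrated with a minimal sketch. This is not the DiffHit implementation, which the abstract does not describe in detail. It assumes that motif hits per isoform have already been computed (for example by searching each isoform against InterPro or Blocks) and are available as sets of motif identifiers. All function names, gene names, and motif identifiers below are hypothetical placeholders.

```python
from typing import Dict, Set

# Assumed input shape: isoform name -> set of motif identifiers found in it.
MotifProfile = Dict[str, Set[str]]


def has_differential_profile(isoforms: MotifProfile) -> bool:
    """Return True if at least two isoforms carry different motif sets."""
    distinct_profiles = {frozenset(motifs) for motifs in isoforms.values()}
    return len(distinct_profiles) > 1


def differential_motifs(isoforms: MotifProfile) -> Set[str]:
    """Return motifs present in some, but not all, isoforms of a gene."""
    motif_sets = [set(motifs) for motifs in isoforms.values()]
    if not motif_sets:
        return set()
    found_in_any = set().union(*motif_sets)
    found_in_all = set.intersection(*motif_sets)
    return found_in_any - found_in_all


def fraction_differential(genes: Dict[str, MotifProfile]) -> float:
    """Fraction of genes whose isoforms differ in motif content."""
    if not genes:
        return 0.0
    hits = sum(has_differential_profile(iso) for iso in genes.values())
    return hits / len(genes)


# Made-up toy data for illustration only; not taken from the study.
EXAMPLE_GENES: Dict[str, MotifProfile] = {
    "GENE_A": {"isoform_1": {"motif_1", "motif_2"}, "isoform_2": {"motif_1"}},
    "GENE_B": {"isoform_1": {"motif_3"}, "isoform_2": {"motif_3"}},
}
```

In this toy data, GENE_A loses motif_2 in its second isoform and so counts as differential, while GENE_B does not. A real analysis would also need to decide how to treat partial or weak motif matches, which this sketch ignores.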

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms
  • Alternative Splicing / genetics*
  • Database Management Systems
  • Databases, Protein*
  • Evolution, Molecular
  • Gene Expression Profiling / methods
  • Genome, Human*
  • Humans
  • Information Storage and Retrieval / methods*
  • Protein Isoforms / chemistry
  • Protein Isoforms / genetics
  • Proteome / chemistry
  • Proteome / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Transcription, Genetic / genetics


Substances

  • Protein Isoforms
  • Proteome