Application of a new probabilistic model for recognizing complex patterns in glycans

Bioinformatics. 2004 Aug 4;20 Suppl 1:i6-14. doi: 10.1093/bioinformatics/bth916.

Abstract

Motivation: The study of carbohydrate sugar chains, or glycans, has progressed slowly, mainly because of the difficulty of establishing standard methods for analyzing their structures and biosynthesis. Glycans are generally tree structures, which makes them more complex than linear DNA or protein sequences. Evidence also suggests that glycans may contain patterns that span siblings and extend into more distant regions, so that they are not limited to the edges of the tree structure itself. Existing models have been unable to capture such patterns.

Results: We have applied a new probabilistic model, called the probabilistic sibling-dependent tree Markov model (PSTMM), which can inherently capture such complex patterns in glycans. The ability to capture these patterns is important in itself, and it also implies that PSTMM can perform multiple tree structure alignment efficiently. We show through experiments on actual glycan data that this model is useful for gaining insight into the hidden, complex patterns of glycans, which are crucial for the development and functioning of higher-level organisms. We also show that the model can serve as an innovative approach to multiple tree alignment, which has not been applied to glycan chains before. This extended use of PSTMM may be a major step forward not only for the structural analysis of glycans but also for discovering clues to their function.
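
Illustration: the abstract contrasts relations that follow tree edges (parent to child) with relations that run across siblings. The minimal sketch below, assuming a glycan is stored as an ordered tree, lists both kinds of pairs for a toy structure. The `Residue` class, the two helper functions, the residue labels, and the choice to pair each child with its immediately elder sibling are hypothetical choices made for illustration; they are not taken from the paper and do not implement PSTMM itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Residue:
    """One node of a glycan tree (hypothetical structure for illustration)."""
    name: str
    # Children are kept in a fixed order, eldest sibling first (assumption).
    children: List["Residue"] = field(default_factory=list)


def parent_child_pairs(root: Residue) -> List[Tuple[str, str]]:
    """Return the (parent, child) pairs, i.e. the actual edges of the tree."""
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            pairs.append((node.name, child.name))
            stack.append(child)
    return pairs


def sibling_pairs(root: Residue) -> List[Tuple[str, str]]:
    """Return (elder, younger) pairs of adjacent siblings.

    These pairs are not edges of the tree; they are the kind of
    across-sibling relation the abstract refers to.
    """
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()
        for elder, younger in zip(node.children, node.children[1:]):
            pairs.append((elder.name, younger.name))
        stack.extend(node.children)
    return pairs


# Toy tree with made-up labels: one root, one child, two grandchildren.
toy = Residue("GlcNAc", [Residue("Man", [Residue("Man_a3"), Residue("Man_a6")])])

edges = parent_child_pairs(toy)
siblings = sibling_pairs(toy)
```

In this toy tree, `Man_a3` and `Man_a6` share no edge, yet they appear together in `siblings`; a model restricted to parent-child edges would have no direct term relating them.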

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Carbohydrate Sequence
  • Computer Simulation
  • Data Interpretation, Statistical
  • Models, Chemical*
  • Models, Statistical
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods*
  • Polysaccharides / chemistry*
  • Sequence Analysis / methods*

Substances

  • Polysaccharides