Mean free energy topology for nucleotide sequences of varying composition based on secondary structure calculations

J Theor Biol. 1999 Nov 21;201(2):113-40. doi: 10.1006/jtbi.1999.1018.

Abstract

The mean free energy generated from the secondary structure of RNA sequences of varying length and composition has been studied by way of probability theory. The expected boundaries (the maximal and minimal values) of a given distribution are explored, and a method for estimating error as a function of the number of shuffled sequences is also examined. For typical nucleotide sequences found in biologically active organisms, the mean free energy, free energy distributions and errors appear to be scalable in terms of a fixed set of algorithm-dependent parameters and the nucleotide composition of the particular sequence under evaluation. In addition, a general semi-analytical formula for predicting the mean free energy is proposed which, at least to a first-order approximation, can be used to rapidly predict the mean free energy of RNA of any sequence length and composition. The general methodology appears to be algorithm independent. The results are expected to provide a reference point for certain types of analysis related to the structure of RNA or DNA sequences and to assist in measuring the somewhat related matter of complexity in algorithm development. Some related applications are discussed.
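
The shuffling idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method or formula: `toy_energy` is a hypothetical stand-in for a real secondary-structure algorithm (it simply scores a single fold-in-half hairpin with made-up pair weights), and `shuffled_energy_stats` and its parameters are likewise illustrative names. The sketch only shows how a mean, the observed minimum and maximum, and a standard error of the mean can be estimated from composition-preserving shuffles, assuming the shuffled energies are treated as independent samples.

```python
import math
import random
import statistics

# Hypothetical pair weights in arbitrary units; NOT real thermodynamic parameters.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def toy_energy(seq):
    """Toy stand-in for a folding algorithm: score one fold-in-half hairpin."""
    half = len(seq) // 2
    return sum(PAIR_ENERGY.get((seq[i], seq[-1 - i]), 0.0) for i in range(half))


def shuffled_energy_stats(seq, n_shuffles, energy=toy_energy, seed=0):
    """Estimate energy statistics over shuffles that preserve base composition."""
    if n_shuffles < 2:
        raise ValueError("need at least two shuffles to estimate a spread")
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    bases = list(seq)
    energies = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # permutes in place; composition is unchanged
        energies.append(energy("".join(bases)))
    sd = statistics.stdev(energies)
    return {
        "mean": statistics.fmean(energies),
        "sd": sd,
        # Standard error of the mean, assuming independent samples.
        "sem": sd / math.sqrt(n_shuffles),
        "min": min(energies),
        "max": max(energies),
    }


if __name__ == "__main__":
    sequence = "GGGAAACCCUUU" * 5  # arbitrary example sequence
    for n in (10, 100, 1000):
        stats = shuffled_energy_stats(sequence, n)
        print(n, round(stats["mean"], 3), round(stats["sem"], 3))
```

Under the independence assumption, the standard error should shrink roughly as one over the square root of the number of shuffles. Replacing `toy_energy` with a call to an actual folding program would leave the rest of the sketch unchanged.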

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Base Composition
  • Multivariate Analysis
  • Nucleic Acid Conformation
  • Probability
  • Protein Structure, Secondary*
  • RNA*

Substances

  • RNA