An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures? determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73 % of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64 % accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86 % of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions.
Copyright 1999 Academic Press.