How accurately is ncRNA aligned within whole-genome multiple alignments?

BMC Bioinformatics. 2007 Oct 26;8:417. doi: 10.1186/1471-2105-8-417.

Abstract

Background: Multiple alignment of homologous DNA sequences is of great interest to biologists since it provides a window into evolutionary processes. At present, the accuracy of whole-genome multiple alignments, particularly in noncoding regions, has not been thoroughly evaluated.

Results: We evaluate the alignment accuracy of certain noncoding regions using noncoding RNA alignments from Rfam as a reference. We inspect the MULTIZ 17-vertebrate alignment from the UCSC Genome Browser for all the human sequences in the Rfam seed alignments. In particular, we find 638 instances of chimeric and partial alignments to human noncoding RNA elements, of which at least 225 can be improved by straightforward means. As a byproduct of our procedure, we predict many novel instances of known ncRNA families that are suggested by the alignment.
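To make the partial/chimeric distinction concrete, the following is a minimal sketch, not the authors' procedure. It assumes a working definition that the abstract does not spell out: an ncRNA element is "partial" when the other species' aligned blocks cover only part of it, and "chimeric" when the blocks covering it come from more than one locus in the other species. The `Block` type, the `classify` function, and the `source` labels are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One aligned block for a single non-human species (hypothetical type)."""
    human_start: int  # 0-based, inclusive, in human coordinates
    human_end: int    # 0-based, exclusive, in human coordinates
    source: str       # assumed label for the other species' locus, e.g. "chr5:+"


def classify(elem_start: int, elem_end: int, blocks: List[Block]) -> str:
    """Classify how one species is aligned to a human ncRNA element.

    Returns "unaligned", "chimeric", "partial", or "full" under the
    working definitions assumed in the lead-in above.
    """
    # Keep only the parts of each block that fall inside the element.
    overlaps = []
    for b in blocks:
        lo = max(elem_start, b.human_start)
        hi = min(elem_end, b.human_end)
        if lo < hi:
            overlaps.append((lo, hi, b.source))

    if not overlaps:
        return "unaligned"

    # More than one source locus inside a single element: chimeric.
    if len({src for _, _, src in overlaps}) > 1:
        return "chimeric"

    # Count covered positions, skipping any region already counted.
    covered = 0
    cursor = elem_start
    for lo, hi, _ in sorted(overlaps):
        lo = max(lo, cursor)
        if lo < hi:
            covered += hi - lo
            cursor = hi

    return "full" if covered == elem_end - elem_start else "partial"
```

For example, under these assumptions an element spanning human positions 100 to 200 that is covered by one block from a single locus over 90 to 150 would be classified as partial, while two adjacent blocks from different loci would make it chimeric.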

Conclusion: MULTIZ does a fairly accurate job of aligning these genomes in these difficult regions. However, our experiments indicate that better alignments exist in some regions.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Computer Simulation
  • Databases, Nucleic Acid
  • Decision Support Techniques*
  • Evaluation Studies as Topic
  • Genomics / methods
  • Humans
  • Multigene Family
  • Quality Control
  • RNA, Untranslated / analysis*
  • Sequence Alignment / methods*
  • Sequence Alignment / standards
  • Sequence Analysis, RNA / methods*
  • Sequence Analysis, RNA / standards
  • Software Validation*

Substances

  • RNA, Untranslated