Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

Genome Res. 2008 Dec;18(12):1955-68. doi: 10.1101/gr.085472.108. Epub 2008 Nov 3.

Abstract

To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced approximately 0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide.
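As an illustration of what ungapped N-way transitive conservation could mean in practice, the following minimal Python sketch slides a fixed-length window over each sequence and keeps only those window combinations in which every pair of windows meets an identity threshold. It is not the MUSSA implementation. The function names (`identity`, `windows`, `nway_conserved`), the parameters `w` and `threshold`, the reading of "transitive" as "all pairwise comparisons must pass", and the toy sequences are assumptions made for this example. The brute-force search is only practical for short inputs, and reverse-complement matches are ignored.

```python
from itertools import combinations, product


def identity(a, b):
    """Count identical positions between two equal-length strings (no gaps)."""
    return sum(x == y for x, y in zip(a, b))


def windows(seq, w):
    """Yield (start, window) for every ungapped window of length w in seq."""
    for i in range(len(seq) - w + 1):
        yield i, seq[i:i + w]


def nway_conserved(seqs, w, threshold):
    """Return tuples of start positions, one per sequence, whose windows
    all match each other pairwise with at least `threshold` identities.

    Assumption: "transitive" is interpreted here as requiring every pair
    of windows in the tuple to pass the threshold.
    """
    per_seq = [list(windows(s, w)) for s in seqs]
    hits = []
    # Brute force: try one window from each sequence in every combination.
    for combo in product(*per_seq):
        if all(identity(a[1], b[1]) >= threshold
               for a, b in combinations(combo, 2)):
            hits.append(tuple(start for start, _ in combo))
    return hits


# Toy sequences (hypothetical) sharing the 8-mer ACGTACGT.
toy = ["AAACGTACGTTT", "GGACGTACGTCC", "TACGTACGTA"]
exact_hits = nway_conserved(toy, w=8, threshold=8)
relaxed_hits = nway_conserved(toy, w=8, threshold=7)
```

With an exact-match threshold, only the shared 8-mer is reported. Lowering the threshold to 7 of 8 also admits neighbouring windows that differ at a single position, which shows how the window size and threshold trade sensitivity against specificity.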

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Animals, Genetically Modified
  • Base Sequence / genetics*
  • Caenorhabditis elegans / genetics
  • Conserved Sequence / genetics
  • DNA, Helminth / genetics
  • DNA, Helminth / isolation & purification
  • Genes, Helminth*
  • Genes, Homeobox*
  • Homeodomain Proteins / genetics*
  • Homeodomain Proteins / metabolism
  • Molecular Sequence Data
  • Phylogeny
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Sequence Analysis, DNA
  • Transgenes

Substances

  • DNA, Helminth
  • Homeodomain Proteins

Associated data

  • GENBANK/FJ362353