We demonstrate that interspecific sequence conservation can provide a systematic guide to the identification of functional cis-regulatory elements within a large expanse of genomic DNA. The test was carried out on the otx gene of Strongylocentrotus purpuratus. This gene plays a major role in the gene regulatory network that underlies endomesoderm specification in the embryo. The cis-regulatory organization of the otx gene is expected to be complex, because the gene has three different start sites (X. Li, C.-K. Chuang, C.-A. Mao, L. M. Angerer, and W. H. Klein, 1997, Dev. Biol. 187, 253-266) and is expressed in many different spatial domains of the embryo. BAC recombinants containing the otx gene were isolated from S. purpuratus and Lytechinus variegatus libraries, and the ordered sequence of these BACs was obtained and annotated. Sixty kilobases of DNA flanking the gene, and included in the BAC sequence from both species, were scanned computationally for short conserved sequence elements. For this purpose, we used "FamilyRelations," a software package newly developed in our laboratory. This tool allows detection of sequence similarities above a chosen criterion within sliding windows set at 20-50 bp. Seventeen partially conserved regions, most a few hundred base pairs long, were amplified from the S. purpuratus BAC DNA by PCR, inserted in an expression vector driving a CAT reporter, and tested for cis-regulatory activity by injection into fertilized S. purpuratus eggs. The regulatory activity of these constructs was assessed by whole-mount in situ hybridization (WMISH) using a probe against CAT mRNA. Of the 17 constructs, 11 displayed spatially restricted regulatory activity, and 6 were inactive in this test.
The domains within which the cis-regulatory constructs were expressed are approximately consistent with results from a WMISH study on otx expression in the embryo, in which we used probes specific for the mRNAs generated from each of the three transcription start sites. Four separate cis-regulatory elements that specifically produce endomesodermal expression were identified, as well as ubiquitously active elements and ectoderm-specific elements. We confirm predictions from other work with respect to target sites for specific transcription factors within the elements that express in the endoderm.
(c) 2002 Elsevier Science (USA).
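
The sliding-window comparison described above can be illustrated with a minimal sketch. This is not the FamilyRelations implementation, whose internals are not given here; it is a brute-force, ungapped illustration under stated assumptions. The function names (`window_identity`, `conserved_windows`), the use of fractional identity as the "chosen criterion," and the default values are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def window_identity(a: str, b: str) -> float:
    """Return the fraction of positions at which two equal-length windows match."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    if not a:
        raise ValueError("windows must be non-empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def conserved_windows(
    seq_a: str,
    seq_b: str,
    window: int = 20,        # assumed default; the text cites windows of 20-50 bp
    threshold: float = 0.9,  # assumed similarity criterion (fraction identical)
) -> List[Tuple[int, int, float]]:
    """Compare every window of seq_a with every window of seq_b (no gaps).

    Returns (start_in_a, start_in_b, identity) for each window pair whose
    identity is at or above the threshold. Runtime is quadratic in sequence
    length, so this is only suitable for short illustrative inputs.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            identity = window_identity(win_a, seq_b[j:j + window])
            if identity >= threshold:
                hits.append((i, j, identity))
    return hits


if __name__ == "__main__":
    # Toy sequences: the 4-bp block "ACGT" is shared at different offsets.
    print(conserved_windows("ACGTTT", "GGACGT", window=4, threshold=1.0))
```

In a real interspecific comparison, overlapping hits along the same diagonal would typically be merged into longer conserved regions before candidate elements are chosen for testing; that step is omitted here for brevity.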