Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison

Genome Res. 2006 Jul;16(7):875-84. doi: 10.1101/gr.5022906. Epub 2006 Jun 2.

Abstract

Non-coding DNA comprises approximately 80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
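The abstract defines selective constraint as the fraction of mutations eliminated by selection, estimated by standardizing to substitution rates in putatively unconstrained sequences. A minimal sketch of that calculation follows, assuming the common formulation C = 1 - K_obs / K_neutral. The abstract does not state the exact estimator, so this formula, the function names, and every number in the example are illustrative assumptions, not the paper's method or data.

```python
from typing import Iterable, Tuple


def selective_constraint(k_observed: float, k_neutral: float) -> float:
    """Estimate constraint as C = 1 - K_obs / K_neutral (assumed formulation).

    k_observed: substitutions per site in the sequence class of interest.
    k_neutral:  substitutions per site in putatively unconstrained sequence.
    The result is not clamped: it is negative when k_observed > k_neutral.
    """
    if k_neutral <= 0:
        raise ValueError("k_neutral must be positive")
    return 1.0 - k_observed / k_neutral


def constrained_sites(n_sites: float, constraint: float) -> float:
    """Expected number of selectively constrained sites in a sequence class."""
    return n_sites * constraint


def mean_constraint(classes: Iterable[Tuple[float, float]]) -> float:
    """Length-weighted mean constraint over (n_sites, constraint) pairs."""
    classes = list(classes)
    total_sites = sum(n for n, _ in classes)
    if total_sites <= 0:
        raise ValueError("total number of sites must be positive")
    return sum(n * c for n, c in classes) / total_sites


if __name__ == "__main__":
    # Hypothetical values chosen only to show the arithmetic.
    k_neutral = 0.10
    c_coding = selective_constraint(0.02, k_neutral)
    c_noncoding = selective_constraint(0.05, k_neutral)
    sites = [(20e6, c_coding), (80e6, c_noncoding)]
    print("coding constraint:", round(c_coding, 3))
    print("non-coding constraint:", round(c_noncoding, 3))
    print("genome-wide mean:", round(mean_constraint(sites), 3))
```

Multiplying each class's site count by its constraint gives the expected number of constrained sites, the kind of quantity behind a comparison of functional non-coding and protein-coding DNA. As the abstract notes, adaptively driven substitutions inflate the observed rate, so an estimate of this form would understate the true constraint.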

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • DNA / genetics
  • DNA, Intergenic
  • Drosophila melanogaster / classification*
  • Drosophila melanogaster / genetics*
  • Evolution, Molecular
  • Gene Expression Regulation
  • Genome, Insect*
  • Introns
  • Molecular Sequence Data
  • Point Mutation
  • Selection, Genetic*
  • Sequence Analysis, DNA
  • Sequence Homology, Nucleic Acid
  • Species Specificity

Substances

  • DNA, Intergenic
  • DNA