An unconstrained reference sequence facilitates the detection of selection. In Drosophila, sequence variation in short introns seems to be least influenced by selection and to be dominated by mutation and drift. Here, we test this with genome-wide sequences using an African population (Malawi) of D. melanogaster and data from the related outgroup species D. simulans, D. sechellia, D. erecta and D. yakuba. The distribution of mutations deviates from equilibrium, and the content of A and T (AT) nucleotides shows an excess of variance among introns. We explain this by a complex mutational pattern with two components: a shift in mutational bias towards AT, leading to a slight nonequilibrium in base composition, and context-dependent mutation rates, with G or C (GC) sites mutating most frequently in AT-rich introns. By comparing the corresponding allele frequency spectra of AT-rich vs. GC-rich introns, we can rule out the influence of directional selection or biased gene conversion on the mutational pattern. Compared with neutral equilibrium expectations, polymorphism spectra show an excess of low-frequency variants and a paucity of intermediate-frequency variants, irrespective of the direction of mutation. Combining the information from different outgroups with the polymorphism data and using a generalized linear model, we find evidence for shared ancestral polymorphism between D. melanogaster and both D. simulans and D. sechellia, arguing against a bottleneck in D. melanogaster. Overall, we find that short introns can be used as a neutral reference on a genome-wide level if the spatially and temporally varying mutational pattern is accounted for.
© 2012 The Authors. Journal of Evolutionary Biology © 2012 European Society For Evolutionary Biology.