A statistical analysis of physical map data for eight restriction enzymes covering nearly the entire genome of E. coli is presented. The methods of analysis are based on a top-down modeling approach which requires no knowledge of the statistical properties of the base sequence. For most enzymes, the distribution of mapped sites is found to be fairly homogeneous. Some heterogeneity in the distribution of sites is observed for the enzymes PstI and HindIII. In addition, BamHI sites are found to be more evenly dispersed than would be expected under random placement, and we speculate on a possible mechanism. A consistent departure from a uniform distribution, observed for each of the eight enzymes, is found to be due to a lack of closely spaced sites. We conclude from our analysis that this departure can be accounted for by deficiencies in the physical map data rather than by non-random placement of actual restriction sites. Estimates of the numbers of sites missing from the map are given, based both on the map data themselves and on the site frequencies in a sample of sequenced E. coli DNA. We conclude that 5 to 15% of the mapped sites represent multiple sites in the DNA sequence.