Motivation: Pairwise experimental perturbation is increasingly used to probe gene and protein function because these studies offer powerful insight into the activity and regulation of biological systems. Symmetric two-dimensional datasets, such as pairwise genetic interactions, are amenable to an optimally designed measurement procedure: because cases and conditions are equivalent, fewer experimental measurements may be required to extract the underlying structure.
Results: We show that optimal experimental design can improve efficiency when data are collected iteratively. We develop a method built on a statistical clustering model for symmetric data and on Fisher information uncertainty estimates, and we also provide simple heuristic approaches that have comparable performance. Using yeast epistatic miniarrays as an example, we show that correct assignment of the major subnetworks could be achieved with <50% of the measurements in the complete dataset. Optimization is likely to become critical as pairwise functional studies extend to more complex mammalian systems, where all-by-all experiments are currently intractable.