Background: Understanding the mechanisms of transcriptional regulation remains an active area of molecular biology. Recently, in vitro protein-binding microarray experiments have greatly improved the understanding of transcription factor-DNA interactions. We present MIL3D, a method that predicts in vitro transcription factor binding by multiple-instance learning on structural properties of DNA.
Results: Evaluation on in vitro data for twenty mouse transcription factors shows that, for nineteen of the twenty, our method outperforms both a method based on single-instance learning with DNA structural properties and the widely used k-mer counting method. Our analysis shows that the MIL3D approach can exploit subtle structural similarities when a strong sequence consensus is not available.
Conclusion: Combining multiple-instance learning with structural properties of DNA is a promising approach for studying biological regulatory networks.
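The multiple-instance setting named above can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that each probe sequence is treated as a bag, each overlapping window of the probe is an instance, and a bag is scored by its best instance (the standard multiple-instance assumption). The names make_bag, gc_fraction and bag_score are hypothetical, and GC fraction is only a placeholder for the DNA structural properties the method actually uses.

```python
from typing import Callable, List, Sequence


def make_bag(probe: str, width: int) -> List[str]:
    """Split one probe sequence into overlapping windows (the instances of a bag)."""
    if width <= 0 or width > len(probe):
        raise ValueError("width must be between 1 and len(probe)")
    return [probe[i:i + width] for i in range(len(probe) - width + 1)]


def gc_fraction(window: str) -> float:
    """Placeholder instance score (assumption): stands in for a structural profile."""
    return sum(base in "GC" for base in window) / len(window)


def bag_score(bag: Sequence[str], instance_score: Callable[[str], float]) -> float:
    """Standard MIL assumption: a bag scores as high as its best instance."""
    return max(instance_score(window) for window in bag)


# Toy probe, not real microarray data.
bag = make_bag("ATGCGCAT", 4)
score = bag_score(bag, gc_fraction)
```

In this sketch a single high-scoring window is enough to make the whole probe score highly, which is the property that distinguishes multiple-instance learning from scoring the probe as one instance.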