Background: Big data is becoming ubiquitous in biology and poses significant challenges in data analysis and interpretation. RNAi screening has become a workhorse of functional genomics, and has been applied, for example, to identify host factors involved in infection for a panel of different viruses. However, the data resulting from such screens are difficult to analyze: hit lists often show low overlap, even between screens targeting the same virus. This makes it a major challenge to select interesting candidates for further detailed, mechanistic experimental characterization.
Results: To address this problem, we propose an integrative bioinformatics pipeline for network-based meta-analysis of viral high-throughput RNAi screens. First, we collate a human protein interaction network from various public repositories and subject it to unsupervised clustering to determine functional modules. Modules that are significantly enriched in host dependency factors (HDFs) and/or host restriction factors (HRFs) are then filtered based on network topology and semantic similarity measures. Finally, modules passing all these criteria are interpreted for their biological significance using enrichment analysis, and interesting candidate genes can be selected from them.
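The module enrichment step can be illustrated with a minimal sketch. The abstract does not state which statistical test or multiple-testing correction the pipeline uses, so the one-sided hypergeometric test and the Bonferroni correction below are assumptions, and all function names, gene identifiers, and the `alpha` threshold are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N genes, K hits, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def enriched_modules(modules, hits, background, alpha=0.05):
    """Return {module name: adjusted p-value} for modules enriched in hits.

    Assumption: one-sided hypergeometric test with Bonferroni correction;
    the source does not specify the test actually used.
    """
    background = set(background)
    hits = set(hits) & background
    n_tests = len(modules)
    results = {}
    for name, genes in modules.items():
        genes = set(genes) & background
        overlap = len(genes & hits)
        p = hypergeom_sf(overlap, len(background), len(hits), len(genes))
        p_adj = min(1.0, p * n_tests)  # Bonferroni adjustment
        if p_adj <= alpha:
            results[name] = p_adj
    return results


# Toy data (hypothetical): 100 background genes, 10 of them screen hits.
background = [f"g{i}" for i in range(100)]
hits = [f"g{i}" for i in range(10)]
modules = {
    "A": ["g0", "g1", "g2", "g3", "g10"],       # 4 of 5 members are hits
    "B": ["g50", "g51", "g52", "g53", "g54"],   # no hits
}
enriched = enriched_modules(modules, hits, background)
```

In this toy example only module "A" passes the threshold. In the pipeline described above, modules selected this way would still need to pass the topology and semantic similarity filters before interpretation.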
Conclusions: We apply our approach to seven screens targeting three different viruses, and compare the results with other published meta-analyses of viral RNAi screens. We recover key hit genes and identify additional candidates from the screens. Although we demonstrate the approach using viral RNAi data, the method is generally applicable for identifying underlying mechanisms from hit lists derived from high-throughput experimental data, and for selecting a small number of the most promising genes for further mechanistic studies.
Keywords: Network analysis; RNAi screening; Virus-host interactions.