Background: Ontologies are useful in many branches of biomedical research. For instance, in the vaccine domain, the community-based Vaccine Ontology (VO) has been widely used to promote vaccine data standardization, integration, and computer-assisted reasoning. However, a major challenge in the VO has been to construct ontologies of vaccine functions, given incomplete vaccine knowledge and inconsistencies in how this knowledge is manually curated.
Results: In this study, we show that network-based analysis of vaccine-related networks can identify underlying structural information consistent with that captured by the VO, and can reveal commonalities between vaccine adverse events and diseases that yield new hypotheses about pathomechanisms involving the vaccine and the disease status. First, a vaccine-vaccine network was inferred by applying a bipartite network projection strategy to the vaccine-disease network extracted from the Semantic MEDLINE database. In total, 76 vaccines and 573 relationships were identified to construct the vaccine network. The shortest paths between all pairs of vaccines were calculated within the vaccine network, and the correlation between the shortest-path lengths of vaccine pairs and their semantic similarities in the VO was then investigated. Second, a vaccine-gene network was constructed. In this network, 4 genes were identified as hubs interacting with at least 3 vaccines, and 4 vaccines were identified as hubs associated with at least 3 genes. These findings are consistent with existing knowledge and provide new hypotheses about the fundamental interaction mechanisms involving vaccines, diseases, and genes.
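As an illustration of the first step described above, the following minimal sketch projects a vaccine-disease bipartite network onto its vaccines and then computes shortest-path lengths between them. It assumes an unweighted projection in which two vaccines are linked whenever they share at least one disease; the exact projection rule used in the study is not stated here, and the vaccine and disease names, along with the helper functions `project_vaccines` and `shortest_path_lengths`, are hypothetical placeholders rather than data or code from Semantic MEDLINE.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical vaccine-disease associations (illustrative only, not study data).
vaccine_disease = [
    ("vaccine_A", "disease_1"),
    ("vaccine_B", "disease_1"),
    ("vaccine_B", "disease_2"),
    ("vaccine_C", "disease_2"),
    ("vaccine_D", "disease_3"),
]


def project_vaccines(edges):
    """Assumed projection rule: link two vaccines that share a disease."""
    vaccines_by_disease = defaultdict(set)
    for vaccine, disease in edges:
        vaccines_by_disease[disease].add(vaccine)

    # Start with every vaccine present, so isolated vaccines are kept as nodes.
    graph = {vaccine: set() for vaccine, _ in edges}
    for vaccines in vaccines_by_disease.values():
        for u, v in combinations(sorted(vaccines), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop counts from source to each reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


vaccine_graph = project_vaccines(vaccine_disease)

# All-pairs shortest-path lengths; unreachable pairs are simply absent.
all_pairs = {v: shortest_path_lengths(vaccine_graph, v) for v in vaccine_graph}

for vaccine in sorted(all_pairs):
    print(vaccine, dict(sorted(all_pairs[vaccine].items())))
```

In this toy network, vaccine_A and vaccine_C do not share a disease but are two hops apart through vaccine_B, while vaccine_D has no path to the others. Path lengths of this kind could then be compared against pairwise semantic similarities from an ontology, for example with a rank correlation.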
Conclusions: In this study, we demonstrated that a combinatorial analysis using a literature knowledgebase, semantic technology, and an ontology can reveal important, previously unidentified knowledge critical to biomedical research and public health, and can generate testable hypotheses for future experimental verification. As the associations from Semantic MEDLINE remain incomplete, we expect to extend this work by (1) integrating additional association databases to complement Semantic MEDLINE knowledge, (2) extending the neighbor genes of vaccine-associated genes, and (3) assigning confidence weights to different types of associations or associations from different sources.