Ovarian cancer is one of the most common gynecologic malignancies and is the fifth leading cause of cancer mortality in women in the United States. Understanding the biology and molecular pathogenesis of ovarian epithelial tumors is key to developing improved prognostic indicators and effective therapies. The selection of ovarian serous carcinomas as one of the three cancer types for extensive genomic and proteomic characterization by The Cancer Genome Atlas (TCGA) project offers an important opportunity to extend our knowledge of ovarian cancer. The TCGA data portal includes molecular characterization, high-throughput sequencing, and clinical data. Models are needed to determine which genes act as "key drivers" of ovarian carcinogenesis and which are innocent "passengers." Standard statistical approaches often fail to differentiate between driver and passenger genes, because correlation between sets of genes, or between genes and endpoints, does not by itself establish causality. In contrast to basic correlation analyses, biological network models offer the ability to resolve causality by elucidating the directional linkages between genetics, molecular characterizations of the system, and clinical measures. This article describes the use of a novel, supercomputer-driven approach named REFS to learn network models directly from the TCGA ovarian cancer data set and to simulate these models to identify the "key drivers" of ovarian carcinogenesis. These models can be validated by out-of-sample testing and may provide a powerful new tool for ovarian cancer research.
Copyright 2009 Elsevier Inc. All rights reserved.