Prediction and analysis of retinoblastoma related genes through gene ontology and KEGG

Biomed Res Int. 2013:2013:304029. doi: 10.1155/2013/304029. Epub 2013 Aug 13.

Abstract

One of the most important and challenging problems in biomedicine is how to predict cancer-related genes. Retinoblastoma (RB) is the most common primary intraocular malignancy and usually occurs in childhood. Early detection of RB could reduce morbidity and increase the probability of disease-free survival. Therefore, it is of great importance to identify RB genes. In this study, we developed a computational method to predict RB-related genes based on Dagging, with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). A total of 119 RB genes were compiled from two previous RB-related studies, while 5,500 non-RB genes were randomly selected from Ensembl genes. Ten datasets were constructed based on all these RB and non-RB genes. Each gene was encoded with a 13,126-dimensional vector including 12,887 Gene Ontology enrichment scores and 239 KEGG enrichment scores. Finally, an optimal feature set including 1,061 GO terms and 8 KEGG pathways was obtained. Analysis showed that these features were closely related to RB. It is anticipated that the method can be applied to predict other cancer-related genes as well.
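The incremental feature selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a feature ranking is already available (here a hand-written list standing in for mRMR output), uses a simple nearest-centroid classifier as a stand-in for Dagging, and scores each candidate subset with the Matthews correlation coefficient under leave-one-out validation, both of which are assumptions. The toy data, function names, and feature indices are all hypothetical.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = RB, 0 = non-RB)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def predict_nearest_centroid(train_X, train_y, x, cols):
    """Stand-in classifier (NOT Dagging): assign x to the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(r[c] for r in rows) / len(rows) for c in cols]
        dist = sum((x[c] - m) ** 2 for c, m in zip(cols, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_mcc(X, y, cols):
    """Leave-one-out MCC using only the feature columns in `cols`."""
    preds = []
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        preds.append(predict_nearest_centroid(train_X, train_y, X[i], cols))
    return mcc(y, preds)


def incremental_feature_selection(X, y, ranked):
    """Evaluate the top-1, top-2, ... ranked features; keep the best-scoring prefix."""
    best_k, best_score, scores = 0, -2.0, []
    for k in range(1, len(ranked) + 1):
        score = loo_mcc(X, y, ranked[:k])
        scores.append(score)
        if score > best_score:  # strict '>' prefers the smaller subset on ties
            best_k, best_score = k, score
    return ranked[:best_k], best_score, scores


# Hypothetical toy data: feature 0 separates the classes, features 1-2 are noise.
labels = [1] * 6 + [0] * 12
X = [[5.0 * y + 0.1 * (i % 3), float((i * 7) % 5), float((i * 3) % 4)]
     for i, y in enumerate(labels)]
ranked = [0, 1, 2]  # assumed ranking, standing in for mRMR output

selected, best_score, scores = incremental_feature_selection(X, labels, ranked)
print("selected features:", selected)
print("best MCC:", best_score)
```

In the study's setting, `ranked` would hold the 13,126 GO and KEGG enrichment features in ranked order, and the selected prefix would correspond to the reported optimal set of 1,069 features (1,061 GO terms plus 8 KEGG pathways).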

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Data Mining / methods
  • Databases, Genetic*
  • Gene Ontology*
  • Genes, Neoplasm / genetics*
  • Genetic Markers / genetics*
  • Humans
  • Models, Genetic*
  • Neoplasm Proteins / genetics*
  • Retinoblastoma / genetics*

Substances

  • Genetic Markers
  • Neoplasm Proteins