Screening of genes related to lung cancer caused by smoking with RNA-Seq

Eur Rev Med Pharmacol Sci. 2014;18(1):117-25.


Aim: To use RNA-seq data to study smoking-related lung cancer and its mechanisms at the molecular level.

Materials and methods: We downloaded gene expression profile data in SRA (Sequence Read Archive) format from the Gene Expression Omnibus database. The dataset comprised two samples: one lung cancer tissue sample from a smoker (GSM718710) and one from a non-smoker (GSM718709). We analyzed differential gene expression with the software packages TopHat and Cufflinks, and performed Gene Ontology (GO) functional clustering of the differentially expressed genes using BLASTX. We then used the KEGG Orthology Based Annotation System (KOBAS) for pathway annotation and KEGG pathway enrichment analysis. Finally, we searched for probable alternative splicing events in the selected differentially expressed genes and identified closely linked genes.
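The two numerical steps in this workflow, comparing expression between the two samples and testing pathways for enrichment, can be illustrated with a minimal sketch. The abstract does not state the thresholds or statistical settings used, so everything below is an assumption for illustration: the function names are hypothetical, the pseudocount of 1 is arbitrary, and the hypergeometric test is shown only because it is a common choice for over-representation analysis, not because the study is known to have used it.

```python
from math import comb, log2


def log2_fold_change(fpkm_a, fpkm_b, pseudocount=1.0):
    """log2 ratio of expression in sample B over sample A.

    The pseudocount (an assumption, not a value from the study)
    avoids taking the log of zero for unexpressed genes.
    """
    return log2((fpkm_b + pseudocount) / (fpkm_a + pseudocount))


def hypergeom_enrichment_p(total_genes, pathway_genes, de_genes, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    X is the number of pathway members among `de_genes` genes drawn
    without replacement from `total_genes`, of which `pathway_genes`
    belong to the pathway.
    """
    upper = min(pathway_genes, de_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, de_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, de_genes)


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not taken from the study.
    print(log2_fold_change(fpkm_a=1.0, fpkm_b=3.0))
    print(hypergeom_enrichment_p(total_genes=10, pathway_genes=4,
                                 de_genes=3, overlap=2))
```

With only one sample per condition, a fold change alone says nothing about biological variability, so any cutoff applied to it should be read as a screening filter rather than a significance test.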

Results: We identified 1603 differentially expressed genes, most of which were involved in cellular processes. We also found that possible alternative splicing of the FCGBP gene might have an important impact on lung cancer.

Conclusions: The findings of this study may help to better understand the relationship between smoking and lung cancer pathogenesis.

MeSH terms

  • Alternative Splicing
  • Cell Adhesion Molecules / genetics*
  • Databases, Genetic
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Lung / metabolism*
  • Lung Neoplasms / genetics*
  • Sequence Analysis, RNA
  • Smoking / genetics*

Substances


  • Cell Adhesion Molecules
  • FCGBP protein, human