An evaluation of public genomic references for mapping RNA-Seq data from Chinese hamster ovary cells

Biotechnol Bioeng. 2015 Nov;112(11):2412-6. doi: 10.1002/bit.25649. Epub 2015 Jun 30.


While RNA-Seq is increasingly used as the method of choice for transcriptome analysis of mammalian cell culture processes, no universal genomic reference for mapping RNA-Seq reads from CHO cells has been reported. In previous publications, de novo transcriptomes assembled from the RNA-Seq reads themselves were subsequently used for mapping. Potential caveats with this approach include the incomplete coverage and the non-universal nature of the de novo assemblies, which make it difficult to compare results across studies. To facilitate future RNA-Seq studies in CHO cells, we performed a comprehensive evaluation of four public genomic references for CHO cells hosted by the NCBI Reference Sequence Database (RefSeq): two annotated genomes, released in 2012 and 2014, and their accompanying transcriptomes. Each genome showed significantly higher mapped rates than its accompanying transcriptome. Furthermore, higher mapped rates in deep intra-genic regions, especially within exons, were observed for the more recent genome release (2014) than for the older one (2012), indicating that the 2014 genome was the best reference among the four. Sequential addition of the human and mouse genomes increased the total mapped rate from 73.5%, using the 2014 Chinese hamster genome alone, to 87.3% and 89.7%, respectively. Thus, the sequential combination of the 2014 RefSeq Chinese hamster genome, the Ensembl human genome (h38), and the Ensembl mouse genome (m38) was suggested as the most effective strategy for mapping RNA-Seq data from CHO cells.
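
The sequential strategy described above can be illustrated with a small sketch: reads that fail to map to the first reference are passed to the next one, and the cumulative mapped rate is recorded after each stage. This is a toy illustration under stated assumptions, not the authors' pipeline. The `toy_align` function (exact substring matching) is a hypothetical stand-in for a real aligner, and the reference names, sequences, and reads are invented for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def toy_align(read: str, reference: str) -> bool:
    """Hypothetical stand-in for a real aligner: exact substring match only."""
    return read in reference


def sequential_map(
    reads: Sequence[str],
    references: List[Tuple[str, str]],
    align: Callable[[str, str], bool] = toy_align,
) -> Tuple[Dict[str, float], List[str]]:
    """Map reads against references in order.

    Only reads left unmapped by earlier references are tried against
    later ones. Returns the cumulative mapped rate (in percent) after
    each reference, plus the reads that never mapped.
    """
    total = len(reads)
    unmapped = list(reads)
    mapped_so_far = 0
    cumulative: Dict[str, float] = {}
    for name, sequence in references:
        still_unmapped = [r for r in unmapped if not align(r, sequence)]
        mapped_so_far += len(unmapped) - len(still_unmapped)
        cumulative[name] = 100.0 * mapped_so_far / total if total else 0.0
        unmapped = still_unmapped
    return cumulative, unmapped


# Invented toy data: the order mirrors hamster first, then human, then mouse.
toy_references = [
    ("hamster", "ACGTACGTTTGA"),
    ("human", "GGGCCCAAATTT"),
    ("mouse", "TTTTCCCCGGGG"),
]
toy_reads = ["ACGT", "TTGA", "GGGC", "CCGG", "NNNN"]

rates, leftover = sequential_map(toy_reads, toy_references)
for name, rate in rates.items():
    print(f"after {name}: {rate:.1f}% of reads mapped")
print("never mapped:", leftover)
```

Because each stage only sees the reads left over from the previous one, the reported rate can never decrease as references are added, which matches the cumulative figures quoted in the abstract. In practice the alignment step would be carried out by a dedicated RNA-Seq aligner rather than substring matching.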

Keywords: CHO cells; RNA-Seq; cell culture; genome; mapping algorithms; transcriptome.

MeSH terms

  • Animals
  • CHO Cells
  • Computational Biology / methods*
  • Cricetulus
  • Female
  • Gene Expression Profiling / methods*