Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish

Nucleic Acids Res. 2015 Apr 20;43(7):e48. doi: 10.1093/nar/gkv035. Epub 2015 Jan 27.


Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression; however, it is difficult to assess the relative influence of these features. Zebrafish embryos can be rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed them for enrichment of sequence features correlated with levels of protein expression. We then tested the enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, the releasing factor recognition sequence, and specific introns and 3' untranslated regions each increased protein expression by between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and in plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models.
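The idea of codon selection guided by highly expressed genes can be illustrated with a minimal sketch. This is not the CodonZ algorithm, which the abstract does not describe; it assumes the simplest possible heuristic, namely choosing for each amino acid the codon seen most often in a reference set of coding sequences. The function names and the two short reference sequences are hypothetical and are not zebrafish data.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), built from the usual
# TCAG-ordered amino-acid string; "*" marks stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def count_codons(coding_sequences):
    """Count in-frame codons across a set of coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        # Ignore any trailing bases that do not form a full codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    return counts


def preferred_codons(counts):
    """Pick the most frequent synonymous codon for each amino acid.

    Assumption: ties (including unobserved amino acids) are broken
    arbitrarily but deterministically by codon spelling.
    """
    preferred = {}
    for aa in set(AMINO):
        synonymous = [c for c, a in CODON_TABLE.items() if a == aa]
        preferred[aa] = max(synonymous, key=lambda c: (counts[c], c))
    return preferred


def recode(protein, preferred):
    """Back-translate a protein sequence using the preferred codons."""
    return "".join(preferred[aa] for aa in protein.upper())


def translate(cds):
    """Translate an in-frame coding sequence (used here as a sanity check)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Hypothetical toy reference set standing in for highly expressed genes.
reference_set = ["ATGCTGCTGAAGTAA", "ATGCTGAAGAAATGA"]
preferred = preferred_codons(count_codons(reference_set))
recoded = recode("MLKK", preferred)
print(recoded)
```

In this toy set AAG is observed more often than AAA, so both lysines are encoded as AAG. A practical tool would also weigh the non-coding features named in the abstract, which this sketch ignores.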

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Animals
  • Animals, Genetically Modified
  • Blotting, Western
  • Codon
  • Computational Biology
  • Gene Expression Profiling*
  • Microinjections
  • Molecular Sequence Data
  • Protein Biosynthesis
  • Zebrafish / genetics*
  • Zebrafish Proteins / genetics*

Substances

  • Codon
  • Zebrafish Proteins

Associated data

  • GENBANK/KM458762
  • GENBANK/KM458763
  • GENBANK/KM458764
  • GENBANK/KM458765
  • GENBANK/KM458766
  • GENBANK/KM458767