G3 (Bethesda), 7 (5), 1405-1416

A New Reference Genome Assembly for the Microcrustacean Daphnia pulex



Zhiqiang Ye et al. G3 (Bethesda).


Comparing genomes of closely related genotypes from populations with distinct demographic histories can help reveal the impact of effective population size on genome evolution. For this purpose, we present a high-quality genome assembly of Daphnia pulex (PA42), and compare it with the first sequenced genome of this species (TCO), which was derived from an isolate from a population with >90% reduction in nucleotide diversity. PA42 has numerous similarities to TCO at the gene level, with an average amino acid sequence identity of 98.8% and >60% of orthologous proteins identical. Nonetheless, there is a highly elevated number of genes in the TCO genome annotation, with ∼7000 excess genes appearing to be false positives. This view is supported by the high GC content, lack of introns, and short length of these suspicious gene annotations. Consistent with the view that reduced effective population size can facilitate the accumulation of slightly deleterious genomic features, we observe more proliferation of transposable elements (TEs) and a higher frequency of gained introns in the TCO genome.

Keywords: effective population size; gene number; genome annotation; intron; mobile elements.


Figure 1
Distributions of the features of protein-coding sequences for the 11,694 one-to-one orthologs in TCO and PA42 (>300 bp alignment length). (A) Amino acid sequence identity. (B) Pairwise divergence at silent sites, Ks. (C) Pairwise divergence at replacement sites, Ka. (D) The ratio Ka/Ks.
Figure 2
Comparison of the features of 1:1 TCO–PA42 orthologs and TCO-specific genes. TCO-specific genes have no obvious orthologs in PA42 or the Reference Species (other metazoans). (A) Distributions of intron numbers. (B) Length distributions for all coding sequences (excluding introns). (C) Comparisons of average GC contents. (D) Comparisons of average sequence coverages. Error bars indicate SEs. Asterisk denotes significance at the P < 0.05 level.
Figure 3
The frequency distribution of KS for paralogs within the PA42 genome vs. those within TCO. Only genes in orthologous clusters containing both TCO and PA42 genes were used. Sliding-window analyses were used to remove the low quality regions in the alignments, with a cutoff of 0.4 identity for each 15-bp window. The KS value for each paralog is the average value when comparing the paralog with others in an orthologous cluster of the genome. The vertical dashed line at 0.057 denotes the average silent-site divergence for pairs of TCO-PA42 orthologs, so that paralogous pairs to the left of this benchmark are younger than the average ortholog divergence between these two clones. KS, pairwise divergence at silent sites.
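The sliding-window filter mentioned in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the caption gives only the window size (15 bp) and the identity cutoff (0.4), so the step size of one column, the treatment of gaps as mismatches, and the function names below are all assumptions made for illustration.

```python
def window_identity(a: str, b: str) -> float:
    """Fraction of columns where both sequences carry the same non-gap base.

    Assumption: a gap ('-') in either sequence counts as a mismatch.
    """
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def filter_alignment(seq_a: str, seq_b: str, window: int = 15, cutoff: float = 0.4):
    """Drop every column covered by a window whose identity is below the cutoff.

    The window slides one column at a time (an assumption; the caption does
    not state the step). Returns the two filtered sequences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(seq_a)
    if n == 0:
        return "", ""

    drop = [False] * n
    # Alignments shorter than one window are scored as a single window.
    last_start = max(n - window, 0)
    for start in range(last_start + 1):
        end = min(start + window, n)
        if window_identity(seq_a[start:end], seq_b[start:end]) < cutoff:
            for i in range(start, end):
                drop[i] = True

    kept_a = "".join(c for c, d in zip(seq_a, drop) if not d)
    kept_b = "".join(c for c, d in zip(seq_b, drop) if not d)
    return kept_a, kept_b


if __name__ == "__main__":
    # Toy alignment: a well-aligned first half followed by a divergent half.
    a = "ACGTACGTACGTACG" + "AAAAAAAAAAAAAAA"
    b = "ACGTACGTACGTACG" + "CCCCCCCCCCCCCCC"
    print(filter_alignment(a, b))
```

Because overlapping windows are used, columns near the edge of a poorly aligned stretch are also removed, which makes the filter conservative. A non-overlapping scheme would retain more columns.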
Figure 4
Distributions of the synonymous (Ks) and nonsynonymous (Ka) differences per site per gene in TCO and PA42, using D. obtusa as outgroup. Ks and Ka were first calculated in TCO vs. D. obtusa and PA42 vs. D. obtusa, with the plotted difference providing a measure of the increase in the subtending branch length for TCO above that for the PA42 lineage.
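The per-gene difference plotted in Figure 4 amounts to subtracting the PA42-vs-outgroup distance from the TCO-vs-outgroup distance. The short Python sketch below shows that arithmetic only. The gene identifiers and distance values are hypothetical placeholders, not data from the study, and the function name is an assumption.

```python
def branch_excess(tco_vs_outgroup: dict, pa42_vs_outgroup: dict) -> dict:
    """Per-gene excess of the TCO branch over the PA42 branch.

    Each input maps a gene identifier to its distance (Ks or Ka) against the
    outgroup (D. obtusa in the caption). Only genes present in both inputs
    are compared. A positive value means a longer TCO branch for that gene.
    """
    shared = tco_vs_outgroup.keys() & pa42_vs_outgroup.keys()
    return {g: tco_vs_outgroup[g] - pa42_vs_outgroup[g] for g in sorted(shared)}


if __name__ == "__main__":
    # Hypothetical Ks values, for illustration only.
    ks_tco = {"geneA": 0.25, "geneB": 0.5, "geneC": 0.375}
    ks_pa42 = {"geneA": 0.125, "geneB": 0.5}
    print(branch_excess(ks_tco, ks_pa42))
```

The same function applies to Ka by passing the replacement-site distances instead.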
Figure 5
Dynamic evolution of gene families. The gene family expansions and contractions were predicted by CAFÉ 3.0. The species tree required by CAFÉ 3.0 was constructed by 1:1:1 single-copy gene families using the Maximum Likelihood method in MEGA6 (Tamura et al. 2013). The RelTime-ML program implemented in the MEGA6 package was used to estimate divergence time among species; calibration time was obtained from the TimeTree database.




