Nat Genet, 50 (2), 270-277

The Sea Lamprey Germline Genome Provides Insights Into Programmed Genome Rearrangement and Vertebrate Evolution


Jeramiah J Smith et al. Nat Genet.

Erratum in


The sea lamprey (Petromyzon marinus) serves as a comparative model for reconstructing vertebrate evolution. To enable more informed analyses, we developed a new assembly of the lamprey germline genome that integrates several complementary data sets. Analysis of this highly contiguous (chromosome-scale) assembly shows that both chromosomal and whole-genome duplications have played significant roles in the evolution of ancestral vertebrate and lamprey genomes, including chromosomes that carry the six lamprey HOX clusters. The assembly also contains several hundred genes that are reproducibly eliminated from somatic cells during early development in lamprey. Comparative analyses show that gnathostome (mouse) homologs of these genes are frequently marked by polycomb repressive complexes (PRCs) in embryonic stem cells, suggesting overlaps in the regulatory logic of somatic DNA elimination and bivalent states that are regulated by early embryonic PRCs. This new assembly will enhance diverse studies that are informed by lampreys' unique biology and evolutionary/comparative perspective.

Conflict of interest statement


EEE is on the scientific advisory board (SAB) of DNAnexus, Inc.


Figure 1. Distribution of k-mer copy numbers in germline shotgun sequencing data
a) The spectrum of error-corrected 25-mers reveals a modal count of 68 and a second hump at half of this value, corresponding to allelic k-mers. k-mer multiplicity is defined as the number of times a k-mer was observed in the sequence dataset. b) Less than 40% of the lamprey genome can be represented by single-copy 25-mers, whereas >75% of the human genome can be represented by single-copy k-mers of this same length. The x-axis is plotted on a log scale to aid in visualization of patterns at lower estimated copy numbers.
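The definition of k-mer multiplicity used in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and toy reads are hypothetical, reverse-complement canonicalization and error correction are omitted, and a real 25-mer analysis of shotgun data would normally use a dedicated k-mer counter.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count how many times each k-mer is observed across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers observed that often."""
    return Counter(counts.values())


# Toy reads (hypothetical); the caption's analysis used k = 25 on real sequence data.
reads = ["ACGTACGT", "CGTACGTA", "TTTTACGT"]
counts = kmer_counts(reads, k=4)
spectrum = kmer_spectrum(counts)
```

In a real spectrum, the modal multiplicity approximates single-copy sequencing depth, so dividing a k-mer's multiplicity by that mode gives a rough copy-number estimate.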
Figure 2. Long-range scaffolding and assessment of long-range contiguity of lamprey super-scaffolds
Data from three independent strategies were used to place contigs on larger chromosomal structures. Data from meiotic maps (blue), Dovetail maps (red) and optical maps (green) complement and extend one another. a) Information used to generate super-scaffold 5. b) Ordering of anchors along super-scaffold 5. c) Information used to generate super-scaffold 21. d) Ordering of anchors along super-scaffold 21. ρ = Pearson correlation coefficient, based on the following numbers of markers. Panel a, top to bottom: n=18, 28, 14, 10, 34, 156, 78 and 162 independent scaffolding anchors. Panel b, top to bottom: n=10, 22, 36, 196 and 79 independent scaffolding anchors.
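As a rough illustration of the ρ statistic reported here, the sketch below computes a Pearson correlation between anchor positions on an assembled super-scaffold and the positions of the same anchors on an independent map. The function name and the example coordinates are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical anchors: position on the super-scaffold (Mb) vs. position on a map (cM).
scaffold_pos = [1.0, 2.0, 3.0, 4.0]
map_pos = [10.0, 20.0, 30.0, 40.0]
rho = pearson(scaffold_pos, map_pos)
```

A value near 1 or -1 indicates that the anchors are collinear between the two data sets, with the sign reflecting relative orientation.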
Figure 3. Alignment of the Pacific lamprey (E. tridentatus) meiotic map to assembled sea lamprey (P. marinus) super-scaffolds
The relative position of homologous sequences is shown for sea lamprey (y-axis) and Pacific lamprey (x-axis). A single homologous site (aligning RAD-seq read, Supplementary Table 1) is marked by a single dot. Chromosomes and linkage groups (LGs) are ordered from longest to shortest within species and individual chromosomes/LGs are highlighted by alternating dark and light shading. Groups of adjacent dots (regions showing conservation of synteny and gene order) appear as diagonal lines.
Figure 4. The distribution of conserved syntenies in chicken and lamprey reveals patterns of ancient large-scale duplication
These patterns are consistent with those from the lamprey somatic genome assembly and reveal both chromosomal/segmental and whole-genome duplications. Lamprey super-scaffolds are oriented along the y-axis and chicken chromosomes are oriented along the x-axis. Circles reflect counts of syntenic orthologs on the corresponding lamprey and chicken chromosomes, with the size of each circle being proportional to the number of orthologs on that pair. The color of each circle represents the degree to which the number of observed orthologs deviates from null expectations under a uniform distribution across an identical number of lamprey and chicken chromosomes with identical numbers of orthology-informative genes. Shaded regions of the plot designate homology groups that correspond to presumptive ancestral chromosomes. Syntenic groups that are linked by lines marked EA are predicted to correspond to a single chromosome in the Euteleostome ancestor, based on conserved synteny with spotted gar (Lepisosteus oculatus). The three largest super-scaffolds are marked with an arrow along the y-axis. The ordering of lamprey super-scaffolds along the y-axis is provided in Supplementary Table 4.
Figure 5. Structure and Evolution of HOX clusters
a) Six Hox clusters can be identified within the sea lamprey genome assembly. Lamprey cluster designations α through ζ follow the convention of Mehta et al. Hox genes are represented as boxes, with the direction of their transcription indicated by the black arrow. Flanking non-Hox genes are depicted as arrowheads, which indicate their direction of transcription. The positions of known micro-RNAs are indicated. The four human Hox loci and the inferred ancestral vertebrate Hox locus are shown for comparison. The white arrow downstream of the lamprey Hox-γ cluster represents PMZ_0048273, an uncharacterized non-Hox gene. b) The evolutionary history was inferred using the Neighbor-Joining method. The optimal tree with the sum of branch length = 9.68 is shown. The percentage of replicate trees in which the associated taxa clustered together (bootstrap test with 100 replicates) is shown next to the branches. c) Tests for enrichment of 2-copy duplicates among all pairs of Hox-bearing chromosomes (super-scaffolds). Colors correspond to the degree to which the counts of shared duplicates on each pair of chromosomes deviate from the expected value given an identical number of chromosomes and paralogs retained on each chromosome (probability estimates were generated using two-tailed χ² tests and a total of n=200 independent pairs of duplicated genes; see Supplementary Table 6). Plus and minus symbols indicate the direction of deviation from expected for chromosome pairs with P<0.01.
Figure 6. Germline Enrichment of Single/Low-Copy DNA Sequences
Comparative sequencing reveals germline enrichment of several single/low-copy intervals. The distribution of coverage ratios reveals a long tail corresponding to segments with higher sequence coverage in sperm relative to blood. This excess is highlighted in red, assuming a symmetrical distribution of enrichment scores for non-eliminated regions and an absence of somatic-specific sequence.
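The symmetry assumption described in this caption can be sketched as follows. This is an illustrative reading only, not the authors' method: the log2 transform, pseudocount, bin width, center at zero, and function names are all assumptions introduced for the example. The idea is to mirror the left half of the coverage-ratio distribution onto the right half and treat whatever remains on the right as candidate germline enrichment.

```python
import math
from collections import Counter


def log2_ratio(sperm_cov, blood_cov, pseudocount=1.0):
    """Log2 ratio of sperm to blood coverage; the pseudocount avoids division by zero."""
    return math.log2((sperm_cov + pseudocount) / (blood_cov + pseudocount))


def tail_excess(ratios, center=0.0, bin_width=0.5):
    """Per-bin surplus of intervals above the center relative to the mirrored bin below it."""
    bins = Counter(math.floor((r - center) / bin_width) for r in ratios)
    excess = {}
    for b, n in bins.items():
        if b >= 0:
            mirrored = bins.get(-b - 1, 0)  # bin at the same distance below the center
            if n > mirrored:
                excess[b] = n - mirrored
    return excess


# Hypothetical log2 enrichment scores for seven genomic intervals.
scores = [-0.2, 0.2, -0.7, 0.7, 1.2, 1.3, 2.1]
excess = tail_excess(scores)
```

Here the two intervals near 1.2 to 1.3 and the one near 2.1 have no counterpart on the left side, so they make up the estimated excess.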
Figure 7. Enrichment analysis provides insight into the function of germline-specific sequences
Homologs of eliminated genes show strong overlap with the binding targets of polycomb repressive complexes in mouse embryonic stem cells (ESCs) and the binding sites of transcription factors in multipotent progenitor lineages and cancer cells (from ChEA 2016). Red cells denote ChIP experiments (x-axis) that identify peaks overlapping orthologs of lamprey genes (y-axis). ChIP enrichment statistics and ordering along the x-axis are provided in Supplementary Table 9. Labels GS1, GS2 and GS3 denote three primary clusters of germline-specific genes, and C1 and C2 denote two primary clusters of ChIP experiments.



