Sci Rep 5, 16413

Initial Characterization of the Large Genome of the Salamander Ambystoma mexicanum Using Shotgun and Laser Capture Chromosome Sequencing



Melissa C Keinath et al. Sci Rep.


Vertebrates exhibit substantial diversity in genome size, and some of the largest genomes exist in species that uniquely inform diverse areas of basic and biomedical research. For example, the salamander Ambystoma mexicanum (the Mexican axolotl) is a model organism for studies of regeneration, development and genome evolution, yet its genome is ~10× larger than the human genome. As part of a hierarchical approach toward improving genome resources for the species, we generated 600 Gb of shotgun sequence data and developed methods for sequencing individual laser-captured chromosomes. Based on these data, we estimate that the A. mexicanum genome is ~32 Gb. Notably, as much as 19 Gb of the A. mexicanum genome can potentially be considered single copy, which presumably reflects the evolutionary diversification of mobile elements that accumulated during an ancient episode of genome expansion. Chromosome-targeted sequencing permitted the development of assemblies within the constraints of modern computational platforms, allowed us to place 2062 genes on the two smallest A. mexicanum chromosomes and resolved key events in the history of vertebrate genome evolution. Our analyses show that the capture and sequencing of individual chromosomes is likely to provide valuable information for the systematic sequencing, assembly and scaffolding of large genomes.


Figure 1. Distribution of 31-mer frequencies among >0.6 terabases of quality filtered sequence data generated from a single female A. mexicanum.
(A) The observed distribution is humped with a peak at k-mer multiplicities of 13 and 14 (estimated mean of 13.50), presumably corresponding to k-mers that were sampled from the single-copy fraction of the genome. The k-mer multiplicity corresponding to 3 standard deviations above the mean of the single-copy distribution (33.67) is marked by an arrow. (B) Decomposition of the observed distribution assuming symmetrical single-copy (diploid: 2N) and allelic (1N) k-mer distributions. The sum of all bins at a given multiplicity in Panel (B) is equal to the observed value presented in Panel (A). (C) Low-copy k-mers account for the majority of Ambystoma shotgun sequence data and k-mers present at increasing copy number represent decreasing fractions of the shotgun dataset, suggesting that the diversity of repetitive sequences scales inversely with copy number. The region of the plot highlighted in grey represents copy number ranges that could plausibly exist at a copy number of ~1 per chromosome. The X-axis is plotted on a log scale to aid in visualization of patterns at lower estimated copy numbers.
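The k-mer frequency distribution described in this caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the caption does not name the software used). The function names `kmer_spectrum` and `implied_sd` are hypothetical, the toy reads are invented, and the sketch ignores reverse complements and sequencing errors. The second function only rearranges the two values reported in the caption (mean of 13.50, mean plus 3 standard deviations of 33.67) to recover the implied standard deviation.

```python
from collections import Counter

def kmer_spectrum(reads, k=31):
    """Return {multiplicity: number of distinct k-mers seen that many times}.

    Simplification: k-mers are counted on the forward strand only.
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    # Histogram of k-mer multiplicities (the quantity plotted in panel A).
    return Counter(counts.values())

def implied_sd(mean, three_sd_cutoff):
    """Standard deviation implied by a reported mean and mean + 3*SD cutoff."""
    return (three_sd_cutoff - mean) / 3

# Invented toy reads; k is reduced to 4 so the example stays readable.
toy_reads = ["ACGTACGT", "CGTACGTA", "TTTTACGT"]
spectrum = kmer_spectrum(toy_reads, k=4)

# Values taken from the caption: mean 13.50, 3 SD above the mean 33.67.
sd = implied_sd(13.50, 33.67)
print(dict(spectrum), round(sd, 2))
```

With the caption's values, the implied standard deviation of the single-copy distribution is about 6.72.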
Figure 2. Estimation of sequence coverage and repeat content by alignment to assembled BAC clones.
(A) The observed distributions are humped with peak depths of coverage between 19 and 20, consistent with estimates from analysis of k-mer frequencies. (B) Low-coverage bases account for ~40% of Ambystoma BAC sequence data and bases present at increasing copy numbers represent decreasing fractions of the BAC sequences, further suggesting that the diversity of repetitive sequences scales inversely with copy number.
Figure 3. Distribution of repetitive elements in the axolotl genome.
Chromosomes were hybridized with Cy3-dUTP labelled COT DNA (Panel (A)) and stained with DAPI (Panel (B)). This fraction of COT DNA contains the rapidly annealing (repetitive) portion of the genome and comprises ~45% of input DNA. Hybridization patterns show that repetitive DNA is heavily clustered at the centromeres and broadly distributed across all chromosomal arms.
Figure 4. Mapping of reads generated by laser capture sequencing.
Read mapping was used to assess the sensitivity and specificity of laser capture and amplification libraries. (A) The proportion of Ambystoma markers with nearly identical reads recovered from chromosome-targeted sequencing. Markers from target vs. off target linkage groups are presented separately. (B) The distribution of markers sampled from chromosome 13 (LGs 15 and 17) via targeted sequencing. Dots represent markers with mapped reads from each experimental series. Red, blue, green and purple dots denote markers that were sampled by reads (near perfect matches) from libraries 3, 5, 6 and 12, respectively. (C) The distribution of markers sampled from chromosome 14 (LG 14) via targeted sequencing. Red and blue dots denote markers that were sampled by reads from libraries 7 and 9, respectively.
Figure 5. Estimation of coverage by alignment to assembled contigs from AM13 and AM14.
The observed distributions are humped with peak depths of coverage between 19 and 20, consistent with estimates from alignments to BAC clones and analysis of k-mer frequencies. MQ30 = data are filtered to include only alignments with a map quality >= 30, MQ50 = data are filtered to include only alignments with a map quality >= 50.
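The map-quality filtering described in this caption can be sketched as follows. This is an illustrative assumption about how such a depth distribution might be computed, not the authors' method. The function name `depth_histogram`, the tuple layout `(start, end, mapq)` with 0-based half-open coordinates, and the toy alignments are all hypothetical.

```python
from collections import Counter

def depth_histogram(alignments, contig_length, min_mapq=30):
    """Return {depth: number of bases at that depth} for one contig.

    alignments: iterable of (start, end, mapq), 0-based, end exclusive.
    Only alignments with mapq >= min_mapq are counted, matching the
    ">= 30" and ">= 50" thresholds given in the caption.
    """
    depth = [0] * contig_length
    for start, end, mapq in alignments:
        if mapq < min_mapq:
            continue
        # Clip the alignment to the contig before counting.
        for pos in range(max(start, 0), min(end, contig_length)):
            depth[pos] += 1
    return Counter(depth)

# Invented toy alignments on a 10-base contig.
toy = [(0, 5, 60), (2, 8, 40), (4, 10, 10)]
hist_mq30 = depth_histogram(toy, 10, min_mapq=30)
hist_mq50 = depth_histogram(toy, 10, min_mapq=50)
print(dict(hist_mq30), dict(hist_mq50))
```

Raising the threshold discards more alignments, so the MQ50 histogram can only shift toward lower depths relative to MQ30.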
Figure 6. Conserved synteny between assembled A. mexicanum chromosomes and the chicken genome.
(A) Tests for enrichment of AM13 (LG15/17 targeted) and AM14 (LG14 targeted) presumptive gene orthologs across all assembled chicken chromosomes. “Enrichment” is defined as the observed number of orthologs divided by the total number of genes that have been annotated to the chromosome. (B) The distribution of AM14 orthologs along chicken chromosome 5 reveals a discontinuous distribution consistent with the interpretation that chicken chromosome 5 was shaped by an ancestral fusion event, and a subsequent pericentric inversion.
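The definition of "enrichment" given in Panel (A) is a simple ratio, sketched below in Python. The function name `enrichment`, the chromosome labels and all gene counts are invented for illustration and are not values from the study.

```python
def enrichment(observed_orthologs, annotated_genes):
    """Observed orthologs divided by annotated genes, per chromosome.

    Chromosomes with no observed orthologs score 0.0; chromosomes with
    no annotated genes are skipped to avoid division by zero.
    """
    return {
        chrom: observed_orthologs.get(chrom, 0) / total
        for chrom, total in annotated_genes.items()
        if total > 0
    }

# Hypothetical counts: annotated genes per chicken chromosome, and the
# number of those genes with a presumptive ortholog on one axolotl chromosome.
toy_annotated = {"chr1": 2000, "chr5": 800, "chr24": 100}
toy_observed = {"chr1": 20, "chr5": 200}

scores = enrichment(toy_observed, toy_annotated)
top = max(scores, key=scores.get)
print(scores, top)
```

Dividing by the number of annotated genes keeps large chromosomes from appearing enriched simply because they carry more genes.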
Figure 7. Summary of major repetitive element classes identified within assembled chromosomes.
Percentages are shown separately for the two chromosomal assemblies. LINEs (Long Interspersed Nuclear Elements), LTRs (Long Terminal Repeats), Penelope and SINEs (Short Interspersed Nuclear Elements) are retroelement subclasses. Hobo-Activator and Tourist/Harbinger elements are DNA transposon subclasses. L1, L2 and RTE/Bov-B elements are LINE subclasses. Gypsy and Retroviral elements are LTR subclasses.
Figure 8. Diversity and abundance of repetitive elements in assembled scaffolds from AM13 and AM14.
(A) Divergence between identified repeats and their RepeatMasker consensus sequence, using only information from A. mexicanum (model repeat). (B) The cumulative contribution (by length) of these same repeat classes. In both panels, patterns are shown for several classes. Known elements comprise LINEs, LTRs, DNA elements and other classes that are present at lower abundances (see Fig. 7). The class “All” consists of both known and unknown repeat classes.


