Nat Genet. 2018 Nov;50(11):1574-1583.
doi: 10.1038/s41588-018-0223-8. Epub 2018 Oct 1.

Sixteen Diverse Laboratory Mouse Reference Genomes Define Strain-Specific Haplotypes and Novel Functional Loci

Free PMC article

Jingtao Lilue et al. Nat Genet.


We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.


Figure 1. Genome annotation and content of strain-specific haplotypes
(a) Summary of the strain-specific gene sets showing the number of genes broken down by GENCODE biotype. (b) Heterozygous SNP density for a 50 Mbp interval on chromosome 11 in 200 Kbp windows for 17 inbred mouse strains, based on sequencing read alignments to the C57BL/6J (GRCm38) reference genome (top). Labels indicate genes overlapping the densest regions. SNPs visualized in CAST/EiJ and WSB/EiJ for 71.006-71.170 Mbp on GRCm38 (bottom), including Derl2 and Mis12 (upper panel) and Nlrp1b (lower panel). Grey indicates that the strain base agrees with the reference, other colours indicate SNP differences, and height corresponds to sequencing depth. (c) Total amount of sequence and number of protein-coding genes in regions enriched for heterozygous SNPs (relative to the GRCm38 reference genome) per strain. (d) Top PantherDB categories of coding genes in regions enriched for heterozygous SNPs, based on protein class (left). Intersection of genes in the defence/immunity category for the wild-derived and classical inbred strains (right). (e) Box plot of sequence divergence (%) for LTRs, LINEs and SINEs within and outside heterozygous-dense regions. Sequence divergence is relative to a consensus sequence for the transposable element type (n = number of repeats in GRCm38; *** indicates p<0.001 using Welch’s two-sample t-test). Box plots show the 25th and 75th percentiles and the median value.
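The windowed SNP density in panel (b) can be illustrated with a minimal sketch. It assumes the heterozygous SNP positions for one strain are already available as a list of coordinates on a single chromosome. The function name `snp_density_by_window`, the example positions, and the 50-100 Mbp interval bounds are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
from collections import Counter

WINDOW_BP = 200_000  # 200 Kbp windows, as stated in the caption


def snp_density_by_window(positions, start, end, window=WINDOW_BP):
    """Count SNPs in consecutive fixed-size windows covering [start, end)."""
    n_windows = -(-(end - start) // window)  # ceiling division
    # Map each in-range position to its window index and tally the hits.
    counts = Counter(
        (pos - start) // window for pos in positions if start <= pos < end
    )
    return [counts.get(i, 0) for i in range(n_windows)]


# Hypothetical heterozygous SNP coordinates (bp) on one chromosome.
example_positions = [71_006_500, 71_010_200, 71_150_000, 71_250_000, 95_000_000]

# Hypothetical 50 Mbp interval; the caption does not give its bounds.
density = snp_density_by_window(example_positions, 50_000_000, 100_000_000)
```

Each entry of `density` is the SNP count for one 200 Kbp window, so windows with unusually high counts correspond to the dense regions that the figure labels with gene names.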
Figure 2. Strain-specific alleles for olfactory and immunity loci
(a) Olfactory receptor genes on chromosome 11 of CAST/EiJ. Gene gain/loss and similarity are relative to C57BL/6J. Novel members are named after their most similar homologues. (b) Gene order across the Raet1/H60 locus in the Collaborative Cross parental strains (A/J, NOD/ShiLtJ and 129S1/SvImJ share the same haplotype at this locus, represented by NOD/ShiLtJ). Strain names in black/red indicate Aspergillus fumigatus resistant/susceptible strains. The dashed box indicates unconfirmed gene order. (c) Novel protein-coding alleles of the Nlrp1 gene family in the wild-derived strains and two classical inbred strains. Colours represent the phylogenetic relationships (top, amino acid neighbour-joining tree of the NBD domain) and the relative gene order across strains (bottom). (d) A regional dot plot of the Nlrp1 locus in PWK/PhJ compared to the C57BL/6J GRCm38 reference (colour-coded as in panel (c)). Grey blocks indicate repeats and transposable elements.
Figure 3. Efcab3-like locus, evolutionary history, and knockout phenotyping
(a) Comparative Augustus identified an unannotated 188 exon gene (Efcab3-like, red tracks). RNA-Seq splicing from two tissues (B=Brain, L=Liver, blue tracks) and five strains is displayed. Manual annotation extended this gene to 188 exons (lower red track). (b) Evolutionary history of Efcab3-like in vertebrates, including genome structure and surrounding genes. The mRNA structure of each gene is shown with white lines on the blue blocks. Novel coding sequence discovered in this study is shown in yellow. Notably, Efcab13 and Efcab3 are fragments of the novel gene Efcab3-like. A recombination event occurred in the common ancestor of the subfamily Homininae, which disrupted Efcab3-like in gorilla, chimpanzee and human. (c) Schematic representation of 22 brain regions plotted in the sagittal plane for Efcab3-like mutant male mice (16 weeks of age, n=3), coloured according to p-values (two-tailed equal variance t-test, left). Corresponding brain regions are labelled with a number that is described below the panel (Supplementary Table 15). White colouring indicates a p-value > 0.05 and grey indicates that the brain region could not be confidently tested due to missing data. Histograms show the neuroanatomical features as percentage increase or decrease of the assessed brain regions in Efcab3-like mutant mice compared to matched controls (right). (d) Representative sagittal brain images of matched controls (left) and an Efcab3-like mutant (right), showing a larger cerebellum, enlarged lateral ventricle and increased size of the pontine nuclei (n=3, see Supplementary Figure 15).

Comment in

  • Lab Anim (NY). 2019 Jan;48(1):25



