DNA Res. 2017 Jun 1;24(3):279-287.
doi: 10.1093/dnares/dsw064.

Comparative analyses of the major royal jelly protein gene cluster in three Apis species with long amplicon sequencing

Sophie Helbing et al. DNA Res. 2017.

Abstract

The western honeybee, Apis mellifera, is a prominent model organism in the field of sociogenomics, and a recent upgrade substantially improved the annotations of its reference genome. Nevertheless, genome assemblies based on short sequencing reads suffer from problems in regions that contain, for example, multi-copy genes. We used single-molecule nanopore-based sequencing with extensive read lengths to reconstruct the organization of the major royal jelly protein (mrjp) region in three species of the genus Apis. Long-amplicon sequencing provides evidence for lineage-specific evolutionary fates of Apis mrjps. Whereas the most basal species, A. florea, seems to encode ten mrjps, different patterns of gene loss and retention were observed for A. mellifera and A. dorsata. Furthermore, we show that a previously reported pseudogene in A. mellifera, mrjp2-like, is an assembly artefact arising from short-read sequencing.

Keywords: Apis dorsata; Apis florea; Apis mellifera; MinION™; gene duplication.

Figures

Figure 1
Graphical presentation of the data analysis pipeline. 2D raw reads (1) were size-selected (minimum read length of 6.5 kb) (2) and mapped against the three reference genomes to assign the reads by species and genomic target (3). Only reads that passed our quality filters (similarity fraction: 0.6 [0.5 for adAmp3, 6 and 7]; length fraction: 0.7 [0.5 for adAmp3, 6 and 7]) were included in further analyses (4). For each amplicon, sixteen reads (the minimum number of reads that mapped to any amplicon, observed for amAmp6) were selected and aligned to each other, independent of a reference sequence, to build the nanopore-derived consensus sequence (5). Finally, the consensus sequence and the reference sequence were aligned (6). To correct the genomic reference sequences of the mrjp gene cluster of A. mellifera, A. florea and A. dorsata, assembly gaps (N) and local mis-assemblies were identified from this consensus/reference alignment. Assembly gaps (N) in the reference sequence were replaced with the consensus sequence, and mis-assembled segments were either discarded (when present only in the reference sequence) or included (when present only in the consensus sequence).
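Steps (2) to (5) of the pipeline amount to a threshold filter followed by a fixed-size selection per amplicon. The following is a minimal Python sketch of that logic, not the authors' implementation. The MappedRead record, its field names and the function names are hypothetical. The caption does not say how the sixteen reads were chosen or whether the cut-offs are inclusive, so the sketch assumes inclusive thresholds and takes the first sixteen passing reads in input order.

```python
from dataclasses import dataclass

MIN_READ_LENGTH = 6500    # 6.5 kb size selection, step (2)
READS_PER_AMPLICON = 16   # reads used per consensus, step (5)

# Amplicons that the caption lists with relaxed cut-offs of 0.5.
RELAXED_AMPLICONS = {"adAmp3", "adAmp6", "adAmp7"}


@dataclass
class MappedRead:
    """Hypothetical record for one read after mapping, step (3)."""
    read_id: str
    amplicon: str
    length: int                 # read length in bases
    similarity_fraction: float  # fraction of matching positions
    length_fraction: float      # fraction of the read that aligned


def thresholds(amplicon):
    """Return (similarity, length) fraction cut-offs for an amplicon."""
    if amplicon in RELAXED_AMPLICONS:
        return 0.5, 0.5
    return 0.6, 0.7


def passes_filters(read):
    """Apply size selection and the quality filters of step (4)."""
    min_similarity, min_length_fraction = thresholds(read.amplicon)
    return (read.length >= MIN_READ_LENGTH
            and read.similarity_fraction >= min_similarity
            and read.length_fraction >= min_length_fraction)


def select_reads(reads, per_amplicon=READS_PER_AMPLICON):
    """Group passing reads by amplicon, keeping at most per_amplicon each.

    Assumption: reads are taken in input order; the source does not
    state the selection rule.
    """
    selected = {}
    for read in reads:
        if passes_filters(read):
            bucket = selected.setdefault(read.amplicon, [])
            if len(bucket) < per_amplicon:
                bucket.append(read)
    return selected
```

Calling select_reads on a list of MappedRead objects returns a dictionary keyed by amplicon name; each value would then be passed to a multiple-alignment and consensus step, which is outside the scope of this sketch.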
Figure 2
Error patterns of sequencing reads. (A) Average (mean ± SE) number of insertions, deletions and substitutions per aligned base for individual reads (n = 21) and for consensus sequences based on three reads (n = 4), five reads (n = 4), seven reads (n = 4), 11 reads (n = 3) and 21 reads (n = 2). Ambiguous refers to unknown nucleotides (N) in the consensus sequence. (B) Substitution matrix. (C) Distribution of deletions according to the underlying local sequence characteristics: homopolymer stretches, dinucleotide repeats, other repeat types or no pattern. 86% of deletions occur in homopolymer stretches. (D) Dependency of the number of deleted nucleotides on the length of the homopolymer stretch.
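The quantities in panels (A) and (C) can be illustrated with a small Python sketch that works on the two rows of a gapped pairwise alignment. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the denominator is taken to be the number of alignment columns, and a homopolymer stretch is assumed to be a run of at least two identical reference bases (the caption gives no threshold).

```python
def error_rates(ref_aln, read_aln):
    """Insertion, deletion, substitution and ambiguous calls per column.

    Both arguments are rows of one pairwise alignment; '-' marks a gap.
    Assumption: rates are normalised by the number of alignment columns.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("alignment rows must have equal length")
    counts = {"insertion": 0, "deletion": 0,
              "substitution": 0, "ambiguous": 0}
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue
        if ref_base == "-":
            counts["insertion"] += 1      # base present only in the read
        elif read_base == "-":
            counts["deletion"] += 1       # base missing from the read
        elif read_base == "N":
            counts["ambiguous"] += 1      # unknown nucleotide
        elif ref_base != read_base:
            counts["substitution"] += 1
    columns = len(ref_aln)
    return {kind: n / columns for kind, n in counts.items()}


def homopolymer_run_lengths(seq):
    """For each position, the length of the same-base run containing it."""
    lengths = [0] * len(seq)
    start = 0
    for i in range(1, len(seq) + 1):
        if i == len(seq) or seq[i] != seq[start]:
            for j in range(start, i):
                lengths[j] = i - start
            start = i
    return lengths


def fraction_deletions_in_homopolymers(ref_aln, read_aln, min_run=2):
    """Share of deleted reference bases that lie in a homopolymer run.

    Assumption: min_run=2 defines a homopolymer stretch.
    """
    runs = homopolymer_run_lengths(ref_aln.replace("-", ""))
    ref_pos = 0
    deleted = in_homopolymer = 0
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-":
            continue                      # insertion: no reference base
        if read_base == "-":
            deleted += 1
            if runs[ref_pos] >= min_run:
                in_homopolymer += 1
        ref_pos += 1
    return in_homopolymer / deleted if deleted else 0.0
```

For the toy alignment rows "ACGTTTTA-C" (reference) and "ACGTT--AGG" (read), the sketch counts one insertion, two deletions and one substitution over ten columns, and both deletions fall inside the four-base T run.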
Figure 3
Schematic organization of the major royal jelly protein gene cluster across three species of the genus Apis. To emphasize the structural organization of mrjps, exons are illustrated in black. Grey arrows refer to putative mrjps based on draft genome sequences. Broken black arrows illustrate pseudogenized genes. The location of the respective amplicons (adAmp 1–7; afAmp 1–6; amAmp 1–6) within the cluster is also illustrated, with dashed lines referring to fragments that deviate from the expected product sizes. ad, A. dorsata; am, A. mellifera; af, A. florea. Gene names reflect the corrections made after the phylogenetic analysis.
Figure 4
Phylogeny of Apis major royal jelly proteins (MRJP). The maximum-likelihood tree was reconstructed from aligned amino acid sequences using the Jones–Taylor–Thornton model, including bootstrapping (500 replications). A discrete Gamma distribution was used to model evolutionary rate differences among sites (five categories; +G, parameter = 2.1115). All positions containing gaps and missing data were eliminated. There were a total of 365 positions in the final dataset. Model selection was performed using MEGA version 5 (Tamura et al., ref. 25). Sequences for afMRJP10 (old = afMRJP2), afMRJP2 (old = afMRJPψ) and afMRJP9 were derived from the draft genome (scaffold 1824). ad, A. dorsata; am, A. mellifera; af, A. florea.
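The elimination of all positions containing gaps and missing data corresponds to removing every alignment column in which at least one sequence lacks a residue. A minimal Python sketch of that column filter follows. It is not the MEGA implementation; the function name is hypothetical, and treating '-' as a gap and '?' as missing data is an assumption.

```python
def complete_deletion(alignment, missing="-?"):
    """Drop every column in which any sequence has a gap or missing symbol.

    alignment maps sequence names to aligned strings of equal length.
    Assumption: '-' marks a gap and '?' marks missing data.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("aligned sequences must have equal length")
    # zip(*rows) iterates over the alignment column by column.
    kept = [column for column in zip(*rows)
            if not any(symbol in missing for symbol in column)]
    return {name: "".join(column[i] for column in kept)
            for i, name in enumerate(names)}
```

Applied to three toy sequences "MT-KLW", "MTRK?W" and "MSRKLW", the filter removes the third and fifth columns and keeps four positions per sequence. The length of the filtered rows is the figure that the caption reports as the number of positions in the final dataset.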
