Comparative Study
Genome Biol Evol. 2015 Mar 11;7(4):1192-205.
doi: 10.1093/gbe/evv050.

De novo assembly and annotation of the Asian tiger mosquito (Aedes albopictus) repeatome with dnaPipeTE from raw genomic reads and comparative analysis with the yellow fever mosquito (Aedes aegypti)

Clément Goubert et al. Genome Biol Evol.

Abstract

Repetitive DNA, including transposable elements (TEs), is found throughout eukaryotic genomes. Annotating and assembling the "repeatome" during genome-wide analysis often poses a challenge. To address this problem, we present dnaPipeTE, a new bioinformatics pipeline that uses a sample of raw genomic reads. It produces precise estimates of repeated DNA content and TE consensus sequences, as well as the relative ages of TE families. We show that dnaPipeTE performs well using very low coverage sequencing in different genomes, losing accuracy only with old TE families. We applied this pipeline to the genome of the Asian tiger mosquito Aedes albopictus, an invasive species of human health interest, for which the genome size is estimated to be over 1 Gbp. Using dnaPipeTE, we showed that this species harbors a large (50% of the genome) and potentially active repeatome with an overall TE class and order composition similar to that of Aedes aegypti, the yellow fever mosquito. However, intraorder dynamics show clear distinctions between the two species, with differences at the TE family level. Our pipeline's ability to manage the repeatome annotation problem will make it helpful for new or ongoing assembly projects, and our results will benefit future genomic studies of A. albopictus.

Keywords: Aedes albopictus; TE analysis; Trinity; bioinformatic pipeline; repeated DNA; transposable elements.

Figures

Fig. 1.
Overview of the dnaPipeTE pipeline. First, genomic reads in FASTQ format are sampled. Then, assembly of repeats is performed using two or more iterations of Trinity. For each iteration, the previously assembled reads are added to the next sample to improve the repeat assembly. In the next step, assembled contigs are annotated using RepeatMasker. Finally, reads from the “BLAST sample” are blasted against all the contigs to estimate the relative abundance of each assembled repeat and to compute the TE landscape. In a second BLAST, the same sample is successively blasted against the annotated contigs joined to the Repbase library, then against the unannotated contigs, in order to retrieve copies that would not have been assembled and to obtain a more global repeat content estimation. See text for additional details.
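The read-sampling step described in this caption can be illustrated with a minimal sketch. It is not dnaPipeTE's own code: the function name `sample_fastq_records` is hypothetical, reservoir sampling is one possible way to draw a uniform random sample in a single pass, and the input is assumed to be plain four-line FASTQ records (no wrapped sequence lines).

```python
import random
from itertools import islice


def sample_fastq_records(lines, n_reads, seed=0):
    """Draw a uniform random sample of n_reads FASTQ records in one pass.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    Hypothetical helper for illustration; not part of dnaPipeTE.
    """
    rng = random.Random(seed)
    it = iter(lines)
    reservoir = []
    seen = 0
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated last record)
            break
        seen += 1
        if len(reservoir) < n_reads:
            reservoir.append(record)
        else:
            # Keep the new record with probability n_reads / seen.
            j = rng.randrange(seen)
            if j < n_reads:
                reservoir[j] = record
    return reservoir
```

In practice `lines` could be an open file handle; independent samples for successive assembly iterations and for the BLAST step could be drawn by changing `seed`.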
Fig. 2.
Classification procedure of RepeatMasker annotation for the dnaPipeTE contigs. According to the alignment overlap with the query (a/Q) and with the subject (a/S), each dnaPipeTE contig is assigned to one of three categories. “Hit” is the weakest annotation, while partial and full-length indicate that the dnaPipeTE contig is annotated along more than 80% of its length.
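A minimal sketch of such a three-way classification follows. The caption only states that partial and full-length annotations cover more than 80% of the contig (a/Q); separating "full-length" from "partial" by applying the same 80% cutoff to the subject overlap (a/S) is an assumption made here for illustration, and the function name is hypothetical.

```python
def classify_annotation(aln_len, query_len, subject_len, threshold=0.8):
    """Assign a contig annotation to 'hit', 'partial', or 'full-length'.

    aln_len     -- length of the alignment (a)
    query_len   -- length of the dnaPipeTE contig (Q)
    subject_len -- length of the reference repeat (S)

    ASSUMPTION: 'full-length' additionally requires a/S > threshold;
    the caption does not spell out this rule.
    """
    q_overlap = aln_len / query_len    # a/Q
    s_overlap = aln_len / subject_len  # a/S
    if q_overlap > threshold and s_overlap > threshold:
        return "full-length"
    if q_overlap > threshold:
        return "partial"
    return "hit"
```

For example, a 1,000 bp contig aligned over 900 bp to a 5,000 bp reference element would be labeled "partial" under these assumptions.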
Fig. 3.
Relative genome proportions of the main repeat classes (pie charts) and TE landscapes (bar plots) from RepeatMasker on assembled genome (left) and dnaPipeTE (right, BLASTN with 0.25× genome coverage) for Drosophila melanogaster strain w1118. RepeatMasker analysis data were downloaded from http://repeatmasker.org and retranscribed according to the name used for annotation in dnaPipeTE.
Fig. 4.
Relative genome proportions of the main repeat classes found in Aedes albopictus using dnaPipeTE, from a nucleotide BLAST of 1,414,634 reads (0.1×) against the repeat assemblies performed with a total of 2,829,268 reads (0.2×).
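The relation between read counts and coverage quoted in this caption can be checked with a short sketch. Coverage is taken as total sampled bases divided by genome size. The read length (100 bp) and the genome size used below are placeholders chosen only so that the arithmetic reproduces the quoted figures; neither value is given in the caption.

```python
def sequencing_depth(n_reads, read_length_bp, genome_size_bp):
    """Average depth (x) = total sampled bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_depth(depth, read_length_bp, genome_size_bp):
    """Number of reads needed to reach a target average depth."""
    return round(depth * genome_size_bp / read_length_bp)


# Read counts quoted in the caption.
BLAST_SAMPLE_READS = 1_414_634    # reported as 0.1x
ASSEMBLY_TOTAL_READS = 2_829_268  # reported as 0.2x

# ASSUMPTIONS for illustration only; not taken from the source.
READ_LENGTH_BP = 100
GENOME_SIZE_BP = 1_414_634_000

blast_depth = sequencing_depth(BLAST_SAMPLE_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
assembly_reads = reads_for_depth(0.2, READ_LENGTH_BP, GENOME_SIZE_BP)
```

Independently of the placeholders, the assembly total is exactly twice the BLAST sample, which matches the 0.2× versus 0.1× coverage stated in the caption.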
Fig. 5.
TE age distribution comparisons between Aedes albopictus (left) and Aedes aegypti (right). For each species, the nucleotide divergence reported by BLASTN between a repeat read and the contig it matches in the dnaPipeTE assembly is shown.
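A divergence distribution of this kind can be sketched from BLASTN tabular output (`-outfmt 6`), where the third column is the percent identity. The sketch below takes divergence as 100 minus percent identity and counts hits per divergence bin. It assumes the input has already been reduced to one hit per read, which may differ from how dnaPipeTE builds its landscapes; the function name is hypothetical.

```python
from collections import Counter


def divergence_histogram(blast_tab_lines, bin_width=1.0):
    """Count BLASTN hits per divergence bin.

    Expects tab-separated lines in BLAST tabular format (outfmt 6):
    qseqid, sseqid, pident, length, ...
    ASSUMPTION: one line per read (best hit only).
    """
    counts = Counter()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        pident = float(fields[2])        # percent identity
        divergence = 100.0 - pident      # percent divergence from the contig
        bin_start = int(divergence // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))
```

Low-divergence bins correspond to reads that closely match their contig, which is usually read as a sign of recent copies, while high-divergence bins suggest older ones.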
Fig. 6.
Comparison of the relative genome proportions of shared TE families between Aedes albopictus and Aedes aegypti in terms of genome percentage (log10 scale). Each dot represents a shared TE family, defined by a more similar BLAT hit between the TE family reference contig of each species. Names on the graphs correspond to the main TE annotation (from A. aegypti) discussed in the text.
Fig. 7.
Linear regression of genome size over TE content in mosquitoes (r² = 0.827, P < 0.01). Except for Aedes albopictus, data come from complete sequenced genomes cited in the text.
