Mol Biol Evol. 2016 Oct;33(10):2759-64.
doi: 10.1093/molbev/msw137. Epub 2016 Aug 2.

PoPoolationTE2: Comparative Population Genomics of Transposable Elements Using Pool-Seq

Robert Kofler et al. Mol Biol Evol. 2016 Oct.

Abstract

The evolutionary dynamics of transposable elements (TEs) are still poorly understood. One reason is that TE abundance needs to be studied at the population level, but sequencing individuals on a population scale is still too expensive to characterize TE abundance in multiple populations. Although sequencing pools of individuals dramatically reduces sequencing costs, a comparison of TE abundance between pooled samples has been difficult, if not impossible, due to various biases. Here, we introduce a novel bioinformatic tool, PoPoolationTE2, which is specifically tailored for the comparison of TE abundance among pooled population samples or different tissues. Using computer simulations, we demonstrate that PoPoolationTE2 not only faithfully recovers TE insertion frequencies and positions but, by homogenizing the power to identify TEs across samples, it provides an unbiased comparison of TE abundance between pooled population samples. We anticipate that PoPoolationTE2 will greatly facilitate the analysis of TE insertion patterns in a broad range of applications.

Keywords: Pool-Seq; bioinformatics; comparative genomics; comparative population genomics; next generation sequencing; transposable elements.

Figures

Fig. 1.
Overview of PoPoolationTE2. (A) A TE insertion (black arrow) results in discordant paired ends (yellow), with one read mapping to a reference chromosome (X) and the other to a TE (copia). One group of such discordantly mapped reads is located to the left of the insertion (forward signature) and one to the right (reverse signature). (B) The absence of a TE insertion results in proper pairs spanning the putative insertion site (green). (C) Mapped paired-end reads may be used to generate a base coverage track (gray) and a physical coverage track (green). Base coverage counts the positions covered by the reads themselves, whereas physical coverage counts the region between the two reads of a pair. (D) TE insertions result in paired ends that support a TE insertion (yellow). This can be translated into an additional type of physical coverage (yellow track). The median distance of proper pairs is used to estimate the distance between such discordant pairs. (E) Increasing the inner distance between paired ends compared with panel D results in more reads supporting a TE insertion (copia) and a higher physical coverage. If paired ends overlap, the physical coverage of the individual paired ends is summed, contributing to the total height of the physical coverage track. Physical coverage supporting the presence (yellow) and absence (green) of a TE may overlap (central region). (F) Combining the information of all paired ends for each genomic position results in a physical coverage track. (G) To homogenize the power to identify TEs, the physical coverage is randomly sampled to equal levels for each genomic position. (H) The position of TE insertion signatures is determined using a sliding-window approach (black lines on top), and the window with the maximal physical coverage supporting a TE (the red line indicates the window with the highest copia coverage) is used for further analysis. (I) The population frequency of a TE signature is estimated as the ratio of the average physical coverage supporting the TE to the total physical coverage in the window (copia = 72/(72+18) = 0.8). (J) Matching pairs of TE signatures (forward and reverse) of the same TE family within a given distance are joined, yielding a final set of TE insertions. Final population frequency and position estimates are obtained by averaging the estimates for the forward and reverse signatures. (K) Accuracy of the population frequency estimates for 1,000 TEs in a simulated pooled population. PoPoolationTE2 has a slight upward bias for intermediate-frequency TEs and a slight downward bias for high-frequency TEs. (L) Accuracy of insertion position estimates for 1,000 TEs in a simulated pooled population.
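
The steps in panels C and F to I can be sketched in a few lines of Python. This is only an illustration of the idea, not PoPoolationTE2's own code: the function names, the half-open interval representation of the region between two reads, and the choice of sampling without replacement are all assumptions made for this example.

```python
import random


def physical_coverage(inner_regions, length):
    """Build a physical coverage track (panels C and F).

    inner_regions: hypothetical list of half-open (start, end) intervals,
    one per read pair, covering the region between the two reads.
    Overlapping pairs are summed position by position.
    """
    track = [0] * length
    for start, end in inner_regions:
        for pos in range(max(start, 0), min(end, length)):
            track[pos] += 1
    return track


def subsample_position(presence, absence, target, rng):
    """Randomly sample coverage at one position down to `target` (panel G).

    Returns (presence, absence) after sampling without replacement, or
    None if the position has less than `target` total coverage.
    Sampling without replacement is an assumption of this sketch.
    """
    if presence + absence < target:
        return None
    units = [1] * presence + [0] * absence
    kept_presence = sum(rng.sample(units, target))
    return kept_presence, target - kept_presence


def best_window(presence_track, window):
    """Start of the window with maximal TE-supporting coverage (panel H)."""
    starts = range(len(presence_track) - window + 1)
    return max(starts, key=lambda s: sum(presence_track[s:s + window]))


def signature_frequency(presence_track, absence_track, start, window):
    """Population frequency of one TE signature (panel I).

    Both averages are taken over the same window, so the ratio of
    averages equals the ratio of sums.
    """
    presence = sum(presence_track[start:start + window])
    absence = sum(absence_track[start:start + window])
    return presence / (presence + absence)


if __name__ == "__main__":
    # Toy tracks reproducing the caption's example: 72 / (72 + 18).
    presence_track = [0, 72, 72, 72, 0]
    absence_track = [90, 18, 18, 18, 90]
    start = best_window(presence_track, window=3)
    print(start, signature_frequency(presence_track, absence_track, start, 3))
```

Joining forward and reverse signatures (panel J) would then average the two frequency and position estimates obtained this way for each matched pair.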
