T-lex: a program for fast and accurate assessment of transposable element presence using next-generation sequencing data

Nucleic Acids Res. 2011 Mar;39(6):e36. doi: 10.1093/nar/gkq1291. Epub 2010 Dec 21.

Abstract

Transposable elements (TEs) are repetitive DNA sequences that are ubiquitous, extremely abundant and dynamic components of practically all genomes. Much effort has gone into the annotation of TE copies in reference genomes. Falling sequencing costs and newly available next-generation sequencing (NGS) data from multiple strains within a species offer an unprecedented opportunity to study the population genomics of TEs in a range of organisms. Here, we present a computational pipeline (T-lex) that uses NGS data to detect the presence or absence of annotated TE copies. T-lex can use data from a large number of strains and returns estimates of the population frequencies of individual TE insertions in a reasonable time. We experimentally validated the accuracy of T-lex in detecting the presence or absence of 768 previously identified TE copies in two resequenced Drosophila melanogaster strains. Approximately 95% of the TE insertions were detected with 100% sensitivity and 97% specificity. We show that even at low levels of coverage T-lex produces accurate results for TE copies that it can identify reliably, but that the rate of 'no data' calls increases as the coverage falls below 15×. T-lex is a broadly applicable and flexible tool that can be applied to any genome for which a reference genome, annotation of individual TE copies and NGS data are available.
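
As an illustration of the quantities reported above, the following minimal Python sketch shows one way per-strain presence/absence calls could be turned into a population frequency estimate, and how sensitivity and specificity could be computed against experimentally validated calls. It is not the T-lex implementation: the function names, the call labels ("present", "absent", "no_data") and the choice to exclude 'no data' calls from all denominators are assumptions made for this example.

```python
from collections import Counter

# Hypothetical call labels; the labels T-lex itself emits are not given in the abstract.
PRESENT, ABSENT, NO_DATA = "present", "absent", "no_data"


def insertion_frequency(calls):
    """Estimate the population frequency of one TE insertion.

    `calls` maps strain name -> PRESENT, ABSENT or NO_DATA.
    Assumption: strains with NO_DATA are excluded from the denominator.
    Returns None when no strain has an informative call.
    """
    counts = Counter(calls.values())
    informative = counts[PRESENT] + counts[ABSENT]
    if informative == 0:
        return None
    return counts[PRESENT] / informative


def sensitivity_specificity(predicted, validated):
    """Compare predicted calls with experimentally validated calls.

    Both arguments are equal-length sequences of call labels.
    "Present" is treated as the positive class. Pairs where either
    call is NO_DATA are skipped (an assumption of this sketch).
    Returns (sensitivity, specificity); an entry is None when its
    denominator is zero.
    """
    tp = fp = tn = fn = 0
    for pred, truth in zip(predicted, validated):
        if NO_DATA in (pred, truth):
            continue
        if truth == PRESENT:
            if pred == PRESENT:
                tp += 1
            else:
                fn += 1
        else:
            if pred == PRESENT:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity
```

For example, an insertion called present in two strains, absent in one and 'no data' in a fourth would get an estimated frequency of 2/3 under these assumptions, because the uninformative strain is dropped rather than counted as absent.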

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Animals
  • DNA Transposable Elements*
  • Drosophila melanogaster / genetics
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA / methods*
  • Software*

Substances

  • DNA Transposable Elements