Transposable elements (TEs) are dynamic components of genomes that often vary in copy number among members of the same species. With the advent of next-generation sequencing, TE insertion-site polymorphism can be examined at an unprecedented level of detail when combined with easy-to-use bioinformatics software. Here we report a new tool, RelocaTE, that rapidly identifies specific TE insertions that are either polymorphic or shared between a reference and unassembled next-generation sequencing reads. Furthermore, a novel companion tool, CharacTErizer, exploits the depth of coverage to classify genotypes of nonreference insertions as homozygous, heterozygous or, when analyzing an active TE family, as rare somatic insertion or excision events. CharacTErizer does this by comparing the number of RelocaTE-aligned reads with the number of reads that map to the same genomic position without the TE. Although RelocaTE and CharacTErizer can be used for any TE, they were developed to analyze the very active mPing element, which is undergoing massive amplification in specific strains of Oryza sativa (rice). Three individuals of one of these strains, A123, were resequenced and analyzed for mPing insertion-site polymorphisms. The majority of mPing insertions found (~97%) are not present in the reference, and two siblings from a self-cross of this strain were found to share only ~90% of their insertions. Private insertions are primarily heterozygous but include both homozygous and predicted somatic insertions. The reliability of the predicted genotypes was validated by polymerase chain reaction.
Keywords: NGS; bioinformatics; genomics; rice; transposable element.
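The read-count comparison that the abstract attributes to CharacTErizer can be illustrated with a minimal sketch. It is not the tool's actual implementation: the function name, the category labels, the depth cutoff, and the fraction threshold are all hypothetical and were chosen only to show how counts of insertion-supporting reads and empty-site reads might be turned into a genotype call.

```python
def classify_insertion(te_reads: int, empty_site_reads: int,
                       min_depth: int = 5, rare_fraction: float = 0.2) -> str:
    """Classify a nonreference TE insertion from two read counts.

    te_reads: reads supporting the insertion (e.g. RelocaTE-aligned reads).
    empty_site_reads: reads mapping across the same position without the TE.
    min_depth and rare_fraction are illustrative assumptions, not values
    taken from the paper.
    """
    total = te_reads + empty_site_reads
    if total < min_depth:
        # Too few reads at this site to make any call.
        return "insufficient_data"

    if empty_site_reads == 0:
        # Every read at the site carries the TE.
        return "homozygous"

    te_fraction = te_reads / total
    if te_fraction < rare_fraction:
        # A few TE reads among many empty-site reads: candidate rare
        # somatic insertion.
        return "somatic_insertion"
    if te_fraction > 1 - rare_fraction:
        # A few empty-site reads among many TE reads: candidate rare
        # somatic excision from an otherwise homozygous insertion.
        return "somatic_excision"

    # Both read classes are well represented.
    return "heterozygous"


if __name__ == "__main__":
    for counts in [(20, 0), (10, 10), (1, 19), (19, 1), (1, 1)]:
        print(counts, classify_insertion(*counts))
```

In practice, suitable thresholds would depend on sequencing depth and mapping quality, so any call made this way would still need independent confirmation, as was done here by polymerase chain reaction.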