Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes
- PMID: 20333191
- PMCID: PMC2817418
- DOI: 10.1093/gbe/evp023
Abstract
Eukaryotic genomes contain large amounts of repetitive DNA, most of which is derived from transposable elements (TEs). Progress has been made in developing computational tools for ab initio identification of repeat families, but there is an urgent need for tools that automate the annotation of TEs in genome sequences. Here we introduce REPCLASS, a tool that automates the classification of TE sequences. Using control repeat libraries, we show that the program can accurately classify virtually any known TE type. Combining REPCLASS with ab initio repeat finding in the genomes of Caenorhabditis elegans and Drosophila melanogaster allowed us to recover the contrasting TE landscapes characteristic of these species. Unexpectedly, REPCLASS also uncovered several novel TE families in both genomes, augmenting the TE repertoire of these model species. When applied to the genomes of distant Caenorhabditis and Drosophila species, the approach revealed a remarkable conservation of TE composition profile within each genus, despite substantial interspecific variation in genome size and in the number of TEs and TE families. Lastly, we applied REPCLASS to analyze 10 fungal genomes from a wide taxonomic range, most of which had not previously been analyzed for TE content. The results showed that TE diversity varies widely across the fungal "kingdom" and appears to correlate positively with genome size, in particular for DNA transposons. Together, these data validate REPCLASS as a powerful tool to explore the repetitive DNA landscapes of eukaryotes and to shed light on the evolutionary forces shaping TE diversity and genome architecture.
Keywords: genome annotation; repeat classification; repetitive elements; transposable elements; transposons.
