Deep sequencing technologies are completely revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies entered in the "postgenomic era": the burst in sequenced samples observed in nuclear genomics is expected also in mitochondria, a trend that can already be detected checking complete mtDNA sequences database submission rate. Tools for the analysis of these data are available, but they fail in throughput or in easiness of use. We present here a new pipeline based on previous algorithms, inherited from the "nuclear genomic toolbox," combined with a newly developed algorithm capable of efficiently and easily classify new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. Thanks to the analysis of all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree, taking into account each haplogroup pattern conservation. The combination of a highly efficient aligner, coupled with our algorithm and massive usage of asynchronous parallel processing, allowed us to build a high-throughput pipeline for the analysis of mtDNA sequences that can be quickly updated to follow the ever-changing nomenclature. HaploFind is freely accessible at the following Web address: https://haplofind.unibo.it.
Keywords: NGS; Web service; haplogroup; haplogroup assignment; mtDNA.
© 2013 WILEY PERIODICALS, INC.