Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics

Appl Plant Sci. 2018 Mar 31;6(3):e1032. doi: 10.1002/aps3.1032. eCollection 2018 Mar.

Abstract

Premise of the study: Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants.

Methods: We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees.
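As a rough illustration of two of the evaluation criteria named above, the following minimal sketch computes an assembly-success fraction and a simple per-locus variability score. The abstract does not define either metric, so both definitions here are assumptions: assembly success is taken as the fraction of target loci recovered for a sample, and variability as the proportion of alignment columns with more than one base. The function names and the toy data are hypothetical.

```python
from typing import List, Set

# Characters treated as missing data rather than as bases (an assumption).
MISSING = set("-?N")


def proportion_variable_sites(alignment: List[str]) -> float:
    """Fraction of alignment columns containing more than one distinct base.

    Gaps and ambiguous/missing characters are ignored when comparing bases.
    """
    if not alignment:
        return 0.0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    if length == 0:
        return 0.0
    variable = 0
    for column in zip(*alignment):
        bases = {base.upper() for base in column} - MISSING
        if len(bases) > 1:
            variable += 1
    return variable / length


def assembly_success(recovered: Set[str], targets: Set[str]) -> float:
    """Fraction of target loci that were assembled for one sample."""
    if not targets:
        return 0.0
    return len(recovered & targets) / len(targets)


if __name__ == "__main__":
    # Toy aligned locus for three samples (hypothetical data).
    toy_alignment = ["ACGT", "ACGA", "AC-T"]
    print(proportion_variable_sites(toy_alignment))

    # Hypothetical locus identifiers.
    print(assembly_success({"loc1", "loc2", "other"}, {"loc1", "loc2", "loc3", "loc4"}))
```

Averaging the variability score over all loci in a locus set would give one number per set, which is one way the sets could be compared under these assumptions.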

Results: The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success than the taxon-specific set. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence-capture data sets.

Discussion: General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

Keywords: Buddleja; Lamiales; PPR genes; Scrophulariaceae; hybrid enrichment; single‐copy nuclear genes.

Associated data

  • Dryad/10.5061/dryad.v6q0p