Synteny-based mapping-by-sequencing enabled by targeted enrichment

Plant J. 2012 Aug;71(3):517-26. doi: 10.1111/j.1365-313X.2012.04993.x. Epub 2012 May 14.

Abstract

Mapping-by-sequencing, as implemented in SHOREmap ('SHOREmapping'), is greatly accelerating the identification of causal mutations. The original SHOREmap approach based on resequencing of bulked segregants required a highly accurate and complete reference sequence. However, current whole-genome or transcriptome assemblies from next-generation sequencing data of non-model organisms do not produce chromosome-length scaffolds. We have therefore developed a method that exploits synteny with a related genome for genetic mapping. We first demonstrate how mapping-by-sequencing can be performed using a reduced number of markers, and how the associated decrease in the number of markers can be compensated for by enrichment of marker sequences. As proof of concept, we apply this method to Arabidopsis thaliana gene models ordered by synteny with the genome sequence of the distant relative Brassica rapa, whose genome has several large-scale rearrangements relative to A. thaliana. Our approach provides an alternative method for high-resolution genetic mapping in species that lack finished genome reference sequences or for which only RNA-seq assemblies are available. Finally, for improved identification of causal mutations by fine-mapping, we introduce a new likelihood ratio test statistic, transforming local allele frequency estimations into a confidence interval similar to conventional mapping intervals.
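The abstract does not give the form of the likelihood ratio test statistic, so the following is only a minimal sketch of the general idea and not the SHOREmap implementation. It assumes a recessive mutation mapped in a bulked mutant pool, so that the mutant-parent allele frequency approaches 1 at the causal locus. The function names, the binomial read-count model, the expected frequency p0 = 0.99, the window size and the chi-square cutoff of 3.84 (1 degree of freedom, 95%) are all illustrative assumptions. Read counts from adjacent markers are pooled in sliding windows, and the windows in which the hypothesis "frequency = p0" is not rejected are merged into a candidate interval.

```python
import math


def log_binom_lik(k, n, p):
    """Binomial log-likelihood of k successes in n trials (constant term omitted)."""
    eps = 1e-9  # clamp p to avoid log(0) when the estimate is exactly 0 or 1
    p = min(max(p, eps), 1 - eps)
    return k * math.log(p) + (n - k) * math.log(1 - p)


def lrt_statistic(k, n, p0):
    """Likelihood ratio statistic: 2 * (logL at the MLE k/n - logL at p0)."""
    if n == 0:
        return 0.0
    p_hat = k / n
    return 2.0 * (log_binom_lik(k, n, p_hat) - log_binom_lik(k, n, p0))


def candidate_interval(markers, p0=0.99, window=3, threshold=3.84):
    """Return (start, end) spanning all windows compatible with frequency p0.

    markers: list of (position, mutant_allele_reads, total_reads), sorted by
    position along the (possibly synteny-derived) marker order.
    Returns None if every window rejects p0.
    """
    kept = []
    for i in range(len(markers) - window + 1):
        chunk = markers[i:i + window]
        k = sum(m[1] for m in chunk)  # pooled mutant-allele reads
        n = sum(m[2] for m in chunk)  # pooled total reads
        if lrt_statistic(k, n, p0) <= threshold:
            kept.append((chunk[0][0], chunk[-1][0]))
    if not kept:
        return None
    return (min(s for s, _ in kept), max(e for _, e in kept))


# Hypothetical read counts: frequency near 0.5 at the flanks, near 1.0 in the middle.
markers = [
    (100, 10, 20), (200, 11, 20), (300, 15, 20),
    (400, 20, 20), (500, 20, 20), (600, 19, 20),
    (700, 14, 20), (800, 10, 20), (900, 9, 20),
]
interval = candidate_interval(markers)
print(interval)
```

With these made-up counts, only the window covering positions 400 to 600 is compatible with p0, so the sketch reports that span as the candidate interval. In practice the positions would be coordinates of the enriched markers, ordered by synteny where no chromosome-length scaffold exists.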

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis Proteins
  • Brassica rapa / genetics*
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis
  • DNA, Plant / chemistry
  • DNA, Plant / genetics
  • Flowers / genetics
  • Gene Frequency
  • Gene Library
  • Genetic Linkage
  • Genome, Plant / genetics*
  • High-Throughput Nucleotide Sequencing / methods
  • MADS Domain Proteins
  • Mutation
  • Sequence Analysis, DNA / methods
  • Synteny / genetics*
  • Transcriptome

Substances

  • AGL20 protein, Arabidopsis
  • Arabidopsis Proteins
  • DNA, Plant
  • MADS Domain Proteins