The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny

Sci Rep. 2016 Jan 20:6:19427. doi: 10.1038/srep19427.

Abstract

Globe artichoke (Cynara cardunculus var. scolymus) is an out-crossing, perennial, multi-use crop species that is grown worldwide and belongs to the Compositae, one of the most successful angiosperm families. We describe the first genome sequence of globe artichoke. The assembly, comprising 13,588 scaffolds covering 725 Mb of the 1,084 Mb genome, was generated from ~133-fold Illumina sequencing coverage and encodes 26,889 predicted genes. Re-sequencing (30×) of the globe artichoke and cultivated cardoon (C. cardunculus var. altilis) parental genotypes, together with low-coverage (0.5 to 1×) genotyping-by-sequencing of 163 F1 individuals, anchored 73% of the assembled genome in 2,178 genetic bins ordered along 17 chromosomal pseudomolecules. This was achieved using a novel pipeline, SOILoCo (Scaffold Ordering by Imputation with Low Coverage), which detects heterozygous regions and assigns parental haplotypes of unknown phase at low sequencing read depth. SOILoCo provides a powerful tool for de novo genome analysis of outcrossing species. Our data will enable genome-scale analyses of evolutionary processes among crops, weeds, and wild species within and beyond the Compositae, and will facilitate the identification of economically important genes from related species.
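
The figures quoted above imply a few derived quantities that the abstract does not state directly. The following is a minimal sketch of that arithmetic only, not part of the SOILoCo pipeline. The function and variable names are hypothetical, and the derived values are approximate because the inputs are the rounded numbers given in the abstract.

```python
# Back-of-the-envelope summary of the assembly statistics in the abstract.
# All inputs are the rounded figures from the abstract; names are illustrative.

GENOME_MB = 1084.0        # estimated genome size (Mb)
ASSEMBLY_MB = 725.0       # total length covered by the scaffolds (Mb)
N_SCAFFOLDS = 13588       # number of scaffolds in the assembly
ANCHORED_FRACTION = 0.73  # share of the assembly anchored to pseudomolecules
N_BINS = 2178             # genetic bins ordered along the pseudomolecules


def summarize_assembly():
    """Return derived statistics as a dict of floats."""
    anchored_mb = ANCHORED_FRACTION * ASSEMBLY_MB
    return {
        # Fraction of the estimated genome represented in the assembly.
        "assembled_fraction": ASSEMBLY_MB / GENOME_MB,
        # Assembled sequence placed on chromosomal pseudomolecules (Mb).
        "anchored_mb": anchored_mb,
        # Anchored sequence as a fraction of the whole estimated genome.
        "anchored_genome_fraction": anchored_mb / GENOME_MB,
        # Mean scaffold length (kb); a mean, not an N50.
        "mean_scaffold_kb": ASSEMBLY_MB * 1000.0 / N_SCAFFOLDS,
        # Mean anchored sequence per genetic bin (kb).
        "mean_bin_kb": anchored_mb * 1000.0 / N_BINS,
    }


if __name__ == "__main__":
    for name, value in summarize_assembly().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the assembly represents about 67% of the estimated genome, and the anchored portion is roughly 529 Mb, or a little under half of the estimated genome.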

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breeding*
  • Chromosome Mapping
  • Computational Biology / methods
  • Cynara scolymus / genetics*
  • DNA, Satellite
  • Genome, Plant*
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing*
  • MicroRNAs / genetics
  • Microsatellite Repeats
  • Molecular Sequence Annotation
  • Multigene Family
  • Repetitive Sequences, Nucleic Acid

Substances

  • DNA, Satellite
  • MicroRNAs