Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations

Cell. 2021 Jun 24;184(13):3542-3558.e16. doi: 10.1016/j.cell.2021.04.046. Epub 2021 May 28.

Abstract

Structural variations (SVs) and gene copy number variations (gCNVs) have contributed to crop evolution, domestication, and improvement. Here, we assembled 31 high-quality genomes of genetically diverse rice accessions. Combining these with two existing assemblies, we developed pan-genome-scale genomic resources including a graph-based genome, providing access to rice genomic variations. Specifically, we discovered 171,072 SVs and 25,549 gCNVs and used an Oryza glaberrima assembly to infer the derived states of SVs in the Oryza sativa population. Our analyses of SV formation mechanisms, impacts on gene expression, and distributions among subpopulations illustrate the utility of these resources for understanding how SVs and gCNVs shaped rice environmental adaptation and domestication. Our graph-based genome enabled genome-wide association study (GWAS)-based identification of phenotype-associated genetic variations undetectable when using only SNPs and a single reference assembly. Our work provides rich population-scale resources paired with easy-to-access tools to facilitate rice breeding as well as plant functional genomics and evolutionary biology research.

Keywords: Gene copy number variation; Graph-based genome; High-quality assembly; Pan-genome; Rice; Structural variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Physiological / genetics
  • Agriculture
  • Domestication
  • Ecotype*
  • Gene Expression Profiling
  • Gene Expression Regulation, Plant
  • Genes, Plant
  • Genetic Variation*
  • Genome, Plant*
  • Genomic Structural Variation
  • Molecular Sequence Annotation
  • Oryza / genetics*
  • Phenotype