Front Genet. 2017 Dec 5;8:180.
doi: 10.3389/fgene.2017.00180. eCollection 2017.

Draft Sequencing of the Heterozygous Diploid Genome of Satsuma (Citrus unshiu Marc.) Using a Hybrid Assembly Approach

Free PMC article

Tokurou Shimizu et al. Front Genet.

Abstract

Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma ("Miyagawa Wase") was conducted with the PLATANUS assembler by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and PacBio long-read sequences. The assembled sequence, with a total size of 359.7 Mb and an N50 length of 386,404 bp, consisted of 20,876 scaffolds. Pseudomolecules of Satsuma constructed by aligning the scaffolds to three genetic maps showed genome-wide synteny to the genomes of Clementine, pummelo, and sweet orange. Gene prediction by modeling with MAKER-P proposed 29,024 genes and 37,970 mRNAs; additionally, gene prediction analysis found candidates for novel genes in several biosynthesis pathways for gibberellin and violaxanthin catabolism. BUSCO scores for the assembled scaffolds and predicted transcripts, together with a separate analysis by BAC end sequence mapping, indicated that the consistency of the assembled genome was close to that of the haploid Clementine, pummelo, and sweet orange genomes. The numbers of repeat elements and long terminal repeat retrotransposons were comparable to those of the seven citrus genomes, suggesting no significant assembly failure in the repeat regions. A resequencing application using the assembled sequence confirmed that both kunenbo-A and Satsuma are offspring of Kishu, and that Satsuma is a back-crossed offspring of Kishu. These results illustrate the performance of the hybrid assembly approach and its ability to construct an accurate heterozygous diploid genome.
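
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of scaffold lengths: sort the lengths in descending order and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (not data from the paper).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Under this definition, an N50 of 386,404 bp means that scaffolds at least that long account for at least half of the 359.7 Mb assembly.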

Keywords: Satsuma; carotenoid biosynthesis; citrus; draft genome assembly; gene prediction; genome synteny; gibberellic acid biosynthesis; parentage analysis.

Figures

Figure 1
Pseudomolecule construction of Satsuma by aligning the scaffolds to the three genetic maps. Chr 1 to 9 represent the pseudomolecules constructed by merging three genetic maps of Satsuma offspring. Numbers in parentheses indicate the length of each constructed pseudomolecule. The central rounded rectangle is a schematic diagram of the merged pseudomolecule. P1 (green), P2 (orange), and P3 (blue) on each side correspond to the genetic maps of population 1 (161 SSR, 512 SNP, 957 cM), population 2 (349 SSR, 476 SNP, 1,017 cM), and population 3 (278 SSR, 919 cM), respectively. Each line denotes a DNA marker that was mapped to the scaffold and used for scaffold assembly.
Figure 2
2-D genome synteny plot between the Satsuma pseudomolecules and three reference citrus genomes. Satsuma: 9 pseudomolecules (this study); Clementine: scaffolds 1 to 9 from Wu et al. (2014); pummelo and sweet orange: chromosomes 1 to 9 from Wang et al. (2017). Numbers on each axis correspond to the chromosome or scaffold numbers.
Figure 3
The bipartite spider plot of the ratio of orthologous genes of Satsuma to the seven citrus genomes and Arabidopsis for their protein coding sequence and translated sequence. Individual symbols indicate the ratio of primary genes that were orthologs to seven reference genomes for their nucleotide sequence (CDS; right) by BLASTN or protein sequence (Protein; left) by BLASTP. Half circle lines on the left and right sides represent the ratio at the central (%). Each line shows the ratio at a given threshold E-value for the homology evaluation.
Figure 4
Distribution of the deduced gene ontology (GO) of the Satsuma primary genes. Each panel represents the number of GO slim annotations for molecular function, biological process, and cellular component for the predicted coding genes of the Satsuma mandarin. They were retrieved by similarity to the curated cDNA of Arabidopsis with the threshold of E-value ≤ 1 × 10⁻²⁰.
Figure 5
Genes involved in the biosynthesis and deactivation of bioactive gibberellic acid in Satsuma. GGPP, geranylgeranyl diphosphate; GAxx, gibberellic acid xx; CPS, ent-copalyl diphosphate synthase; KS, ent-kaurene synthase; KO, ent-kaurene oxidase; KAO, ent-kaurenoic acid oxidase; GA13ox, gibberellin 13-oxidase, putative; GA20ox, gibberellin 20-oxidase; GA2ox, gibberellin 2-oxidase; GA3ox, gibberellin 3-oxidase. Numbers in parentheses represent the number of each detected gene.
Figure 6
Genes involved in the biosynthesis of carotenoids and abscisic acid in Satsuma. GGPP, geranylgeranyl diphosphate; PSY, phytoene synthase; PDS, phytoene desaturase; Z-ISO, 15-cis-zeta-carotene isomerase; ZDS, zeta-carotene desaturase; CRTISO, carotenoid isomerase/prolycopene isomerase; LCYE, lycopene epsilon-cyclase; LCYB, lycopene beta-cyclase; CYP97C, carotene epsilon-monooxygenase; CYP97A, beta-carotene 3-hydroxylase; CHYB, beta-carotene 3-hydroxylase; ZEP, zeaxanthin epoxidase; VDE, violaxanthin de-epoxidase; CCD, carotenoid cleavage dioxygenase; NCED, 9-cis-epoxycarotenoid dioxygenase; NXS, neoxanthin synthase, putative; ABA, xanthoxin dehydrogenase; AAO, abscisic-aldehyde oxidase. Numbers in parentheses represent the number of each detected gene.
Figure 7
Genome-wide parentage analysis of Satsuma and Clementine trios. (A) The pedigree of Satsuma as an offspring of Kishu (seed parent) and kunenbo-A (pollen parent). Kunenbo-A was also considered to be an offspring of Kishu, with an unidentified variety as the seed parent (Shimizu et al., 2016b). (B) The pedigree of Clementine as the offspring of Willowleaf mandarin (seed parent) and sweet orange (pollen parent). The numbers under the variety names in rounded rectangles are the numbers of heterozygous SNPs for each variety, and the percentages in parentheses are the site coverage ratios of the detected variant sites to all variant sites. The numbers in the rectangles represent the number of SNPs that were inconsistent among the trio by parentage analysis at each cross marked with a circle, and the numbers in parentheses are the ratios of the inconsistent SNP sites to the valid SNPs.
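
The ratio of inconsistent SNP sites to valid SNPs described in this caption can be illustrated with a minimal sketch of a trio consistency check. It assumes biallelic SNP genotypes stored as two-allele tuples and treats a site as valid only when all three genotypes are called; these conventions, the function names, and the toy genotypes are illustrative assumptions and do not reproduce the pipeline used in the study.

```python
def is_mendelian_consistent(child, seed_parent, pollen_parent):
    """True if the child's two alleles can be split so that one comes
    from the seed parent and the other from the pollen parent."""
    a, b = child
    return (a in seed_parent and b in pollen_parent) or (
        b in seed_parent and a in pollen_parent
    )


def inconsistency_ratio(sites):
    """Ratio of Mendelian-inconsistent sites to valid sites.

    Each site is a (child, seed_parent, pollen_parent) tuple of genotypes;
    a site is skipped as invalid if any genotype is missing (None).
    """
    valid = [s for s in sites if all(g is not None for g in s)]
    if not valid:
        return 0.0
    inconsistent = sum(not is_mendelian_consistent(*s) for s in valid)
    return inconsistent / len(valid)


# Hypothetical toy genotypes (not data from the paper).
toy_sites = [
    (("A", "G"), ("A", "A"), ("G", "G")),  # consistent
    (("A", "A"), ("G", "G"), ("A", "G")),  # inconsistent: seed parent has no A
    (("C", "T"), None, ("C", "T")),        # invalid: missing genotype
]
print(inconsistency_ratio(toy_sites))
```

A low ratio for a proposed trio supports the pedigree, while a high ratio argues against at least one of the proposed parents.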


