Vanillin, the principal aromatic compound in vanilla, is derived primarily from the mature pods of Vanilla planifolia Andrews. Although the vanillin biosynthetic pathway has been progressively elucidated, the key enzymes and transcription factors (TFs) that govern it remain incompletely characterized and call for combined transcriptomic and metabolomic analysis. In this study, V. planifolia (higher vanillin producer) and V. imperialis (lower vanillin producer) were selected for comparison. Time-series metabolomic analysis identified 160-220 days after pollination (DAP) as the critical phase for vanillin biosynthesis. Combined time-series transcriptome analysis identified 984 differentially expressed genes (DEGs) upregulated during this key period, 2058 genes with temporal expression patterns, and 4326 module genes from weighted gene co-expression network analysis (WGCNA). Together, these gene sets pointed to six major classes of TFs: No Apical Meristem (NAC), MYB, WRKY, FLOWERING PROMOTING FACTOR 1-like (FPFL), DOF, and PLATZ. These TFs show strong regulatory relationships with the expression of key enzyme genes, including P450s, COMT, and 4CL. Members of the NAC TF family emerged as central regulators in this network, with NAC-2 (HPP92_014056) and NAC-3 (HPP92_012558) identified as key hub genes in the vanillin biosynthetic gene co-expression network. These findings provide a theoretical foundation and candidate target genes for enhancing vanillin production through genetic and metabolic engineering, and may support sustainable development of the vanilla industry and related applications.
Keywords: Vanilla planifolia Andrews; combined analysis; transcriptome and metabolome; vanillin.