Background: Jatropha curcas L. is an important non-edible oilseed crop with a promising future in biodiesel production. However, little is known about the molecular biology of oil biosynthesis in this plant when compared with other established oilseed crops, resulting in the absence of agronomically improved varieties of Jatropha. To extensively discover the potentially novel genes and pathways associated with the oil biosynthesis in J. curcas, new strategy other than homology alignment is on the demand.
Results: In this study, we proposed a multi-step computational framework that integrates transcriptome and gene interactome data to predict functional pathways in non-model organisms in an extended process, and applied it to study oil biosynthesis pathway in J. curcas. Using homologous mapping against Arabidopsis and transcriptome profile analysis, we first constructed protein-protein interaction (PPI) and co-expression networks in J. curcas. Then, using the homologs of Arabidopsis oil-biosynthesis-related genes as seeds, we respectively applied two algorithm models, random walk with restart (RWR) in PPI network and negative binomial distribution (NBD) in co-expression network, to further extend oil-biosynthesis-related pathways and genes in J. curcas. At last, using k-nearest neighbors (KNN) algorithm, the predicted genes were further classified into different sub-pathways according to their possible functional roles.
Conclusions: Our method exhibited a highly efficient way of mining the extended oil biosynthesis pathway of J. curcas. Overall, 27 novel oil-biosynthesis-related gene candidates were predicted and further assigned to 5 sub-pathways. These findings can help better understanding of the oil biosynthesis pathway of J. curcas, as well as paving the way for the following J. curcas breeding application.
Keywords: Extended mining; Gene interactome; Jatropha curcas; Oil biosynthesis; Transcriptome.
© 2021. The Author(s).