Motivation: Reconstructing gene regulatory networks (GRNs) from gene expression profiles remains a major challenge in systems biology. Random forest-based methods have proved efficient for evaluating the importance of gene regulations. Nevertheless, the accuracy of traditional methods can be further improved. With time-series gene expression data, exploiting the inherent time information and high-order time lags are promising strategies to improve the power and accuracy of GRN inference.
Results: In this study, we propose a scalable, flexible approach called BiXGBoost to reconstruct GRNs. BiXGBoost is a bidirectional method that considers both the candidate regulatory genes and the target genes of a specific gene. Moreover, BiXGBoost uses time information efficiently and integrates XGBoost to evaluate feature importance. Randomization and regularization are also applied in BiXGBoost to address over-fitting. Results on the DREAM4 and Escherichia coli datasets show that BiXGBoost performs well on networks of different scales.
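To make the idea of using time lags concrete, the following minimal sketch shows one way a time-lagged regression problem could be set up for a single target gene. It is an illustration under stated assumptions, not the BiXGBoost implementation: the function name `lagged_design`, the `max_lag` parameter, and the (time points x genes) matrix layout are all hypothetical, and the bidirectional step and the XGBoost model itself are not shown.

```python
import numpy as np

def lagged_design(expr, target, max_lag=2):
    """Build a time-lagged regression problem for one target gene.

    Hypothetical helper for illustration only (not part of BiXGBoost).
    expr: array of shape (T, G); rows are time points, columns are genes.
    Returns (X, y, regulators), where X has shape
    (T - max_lag, (G - 1) * max_lag) and y has shape (T - max_lag,).
    """
    expr = np.asarray(expr, dtype=float)
    T, G = expr.shape
    if not 1 <= max_lag < T:
        raise ValueError("max_lag must be in [1, T - 1]")
    # Every gene except the target is treated as a candidate regulator.
    regulators = [g for g in range(G) if g != target]
    # Block k holds the regulators' expression k time steps before the target's.
    blocks = [expr[max_lag - k : T - k][:, regulators]
              for k in range(1, max_lag + 1)]
    X = np.hstack(blocks)
    # Target expression at times max_lag, ..., T - 1.
    y = expr[max_lag:, target]
    return X, y, regulators
```

The resulting `X` and `y` could then be passed to a tree-based regressor such as XGBoost, whose per-feature importance scores would be read as evidence for candidate regulatory links at each lag.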
Availability and implementation: Our Python implementation of BiXGBoost is available at https://github.com/zrq0123/BiXGBoost.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.