ZeoSyn: A Comprehensive Zeolite Synthesis Dataset Enabling Machine-Learning Rationalization of Hydrothermal Parameters

ACS Cent Sci. 2024 Mar 6;10(3):729-743. doi: 10.1021/acscentsci.3c01615. eCollection 2024 Mar 27.


Zeolites, nanoporous aluminosilicates with well-defined porous structures, are versatile materials with applications in catalysis, gas separation, and ion exchange. Hydrothermal synthesis is widely used for zeolite production, offering control over composition, crystallinity, and pore size. However, the intricate interplay of synthesis parameters necessitates a comprehensive understanding of synthesis-structure relationships to optimize the synthesis process. Hitherto, public zeolite synthesis databases only contain a subset of parameters and are small in scale, comprising up to a few thousand synthesis routes. We present ZeoSyn, a dataset of 23,961 zeolite hydrothermal synthesis routes, encompassing 233 zeolite topologies and 921 organic structure-directing agents (OSDAs). Each synthesis route comprises comprehensive synthesis parameters: 1) gel composition, 2) reaction conditions, 3) OSDAs, and 4) zeolite products. Using ZeoSyn, we develop a machine learning classifier to predict the resultant zeolite given a synthesis route with >70% accuracy. We employ SHapley Additive exPlanations (SHAP) to uncover key synthesis parameters for >200 zeolite frameworks. We introduce an aggregation approach to extend SHAP to all building units. We demonstrate applications of this approach to phase-selective and intergrowth synthesis. This comprehensive analysis illuminates the synthesis parameters pivotal in driving zeolite crystallization, offering the potential to guide the synthesis of desired zeolites. The dataset is available at https://github.com/eltonpan/zeosyn_dataset.
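The idea of extending per-framework attributions to building units can be illustrated with a minimal sketch. The abstract does not specify how the aggregation is performed, so the rule below (a plain mean over the frameworks that contain a given building unit) is an assumption, not the authors' method. The function name `aggregate_by_building_unit`, the framework and unit labels, the feature names, and all numbers are hypothetical placeholders. They are not taken from ZeoSyn.

```python
from collections import defaultdict


def aggregate_by_building_unit(framework_attr, framework_units):
    """Average per-framework feature attributions over building units.

    framework_attr:  {framework: {feature: attribution}} (e.g. SHAP-style values)
    framework_units: {framework: [building units present in that framework]}

    Assumption: a building unit's attribution for a feature is the mean over
    all frameworks containing that unit. A feature missing from a framework
    contributes zero to that mean.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for framework, attrs in framework_attr.items():
        for unit in framework_units.get(framework, []):
            counts[unit] += 1
            for feature, value in attrs.items():
                sums[unit][feature] += value
    return {
        unit: {feature: total / counts[unit] for feature, total in feats.items()}
        for unit, feats in sums.items()
    }


# Hypothetical toy inputs: three frameworks, two synthesis features.
framework_attr = {
    "FW_A": {"Si/Al": 0.4, "temperature": 0.1},
    "FW_B": {"Si/Al": 0.2, "temperature": 0.3},
    "FW_C": {"Si/Al": -0.1, "temperature": 0.5},
}
# Hypothetical framework-to-building-unit membership.
framework_units = {
    "FW_A": ["unit_x"],
    "FW_B": ["unit_x", "unit_y"],
    "FW_C": ["unit_y"],
}

unit_attr = aggregate_by_building_unit(framework_attr, framework_units)
```

Here `unit_attr["unit_x"]` averages the attributions of `FW_A` and `FW_B`, the two frameworks that share that unit. Other aggregation rules, such as weighting each framework by its number of recorded synthesis routes, would fit the same interface.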