MIT and Partners Use 23k+ Recipes and Diffusion Models to Create Zeolites with Si/Al = 19
The study introduces DiffSyn, a generative diffusion model trained on 23,961 zeolite synthesis recipes spanning over 50 years, which outperforms regression and other generative baselines, accurately predicts synthesis routes, and experimentally validates a novel UFI zeolite with a record Si/Al ratio of 19.
Training Data and Dataset
The core dataset, ZeoSyn, contains 23,961 hydrothermal synthesis routes for 233 zeolite topologies and 921 organic structure‑directing agents (OSDAs), collected from more than 50 years of literature.
Model Architecture and Chemical Guidance
DiffSyn adopts a generative diffusion framework with a dual‑encoder architecture: separate encoders (Enczeo and EncOSDA) process zeolite structures and OSDAs, while a fusion encoder learns joint representations. Chemical guidance is injected via a classifier‑free conditioning strategy that steers the reverse diffusion process toward chemically plausible routes.
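The article names classifier-free conditioning without giving the formula. As a rough sketch, the usual classifier-free guidance rule evaluates the denoiser twice, once with the condition and once without, and blends the two noise predictions. The function name `guided_noise`, the `scale` parameter, and the numbers below are illustrative assumptions, not details taken from the DiffSyn paper.

```python
from typing import List


def guided_noise(eps_cond: List[float], eps_uncond: List[float], scale: float) -> List[float]:
    """Blend conditional and unconditional noise predictions.

    scale = 0 keeps only the unconditional prediction,
    scale = 1 recovers the plain conditional prediction,
    scale > 1 pushes further toward the condition.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Made-up noise predictions for a three-parameter recipe vector.
eps_c = [0.2, -0.4, 1.0]   # denoiser output given zeolite + OSDA (hypothetical)
eps_u = [0.0, -0.1, 0.5]   # denoiser output with the condition dropped (hypothetical)
guided = guided_noise(eps_c, eps_u, scale=2.0)
print(guided)
```

A larger scale generally trades sample diversity for closer adherence to the condition, so the value would normally be tuned on held-out data.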
During training, Gaussian noise is progressively added to the composition (Xcomp) and condition (Xcond) vectors (forward diffusion). At inference time, the reverse process starts from pure noise and iteratively denoises it with a U‑Net conditioned on the target zeolite and OSDA, producing a distribution over synthesis parameters rather than a single deterministic output.
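The forward and reverse processes described above can be sketched with the standard DDPM update rules. This is a toy illustration under stated assumptions, not the DiffSyn implementation: the linear noise schedule, the step count, the `TARGET` vector, and the `oracle_noise` function (an exact noise predictor for a single known point, standing in for the trained U‑Net) are all hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]


def make_schedule(num_steps: int, beta_start: float = 1e-4,
                  beta_end: float = 0.02) -> Tuple[Vector, Vector]:
    """Linear beta schedule and the cumulative products alpha_bar_t."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def forward_noise(x0: Vector, t: int, alpha_bars: Vector,
                  rng: random.Random) -> Tuple[Vector, Vector]:
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise used."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal, noise = math.sqrt(alpha_bars[t]), math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + noise * e for x, e in zip(x0, eps)], eps


def reverse_sample(predict_noise: Callable[[Vector, int], Vector], dim: int,
                   betas: Vector, alpha_bars: Vector, rng: random.Random) -> Vector:
    """Start from Gaussian noise and apply the DDPM reverse update step by step."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * e) / math.sqrt(1.0 - betas[t])
                for xi, e in zip(x, eps_hat)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # one common variance choice
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x


# Made-up, normalized synthesis parameters used as the only "training" point.
TARGET = [0.3, -1.2, 0.8]
betas, alpha_bars = make_schedule(num_steps=1000)


def oracle_noise(x: Vector, t: int) -> Vector:
    """Exact noise predictor when the data distribution is the single point TARGET."""
    signal, noise = math.sqrt(alpha_bars[t]), math.sqrt(1.0 - alpha_bars[t])
    return [(xi - signal * x0) / noise for xi, x0 in zip(x, TARGET)]


sampled = reverse_sample(oracle_noise, len(TARGET), betas, alpha_bars, random.Random(0))
print(sampled)
```

With a learned denoiser in place of the oracle, repeated calls with different random seeds would yield different recipes, which is what makes the output a distribution rather than a point estimate.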
Experimental Evaluation
DiffSyn was benchmarked against three baseline families: regression models (AMD, BNN), classical generative models (GMM), and deep generative models (GAN, NF, VAE). Evaluated with Wasserstein distance and coverage metrics (COV‑F1 and coverage precision, COV‑P), DiffSyn achieved the lowest Wasserstein distance and the highest COV‑P, improving over the best deep baseline (VAE) by more than 25%.
Across the 12 synthesis parameters, DiffSyn attained the lowest mean absolute error on 10, beating every baseline on those parameters.
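The article does not define these metrics, so the following is only a sketch under common definitions: the one-dimensional Wasserstein-1 distance between two equal-sized samples, and coverage precision, recall, and F1 computed with a distance threshold. The threshold, the Euclidean distance, and the sample values are assumptions for illustration and may differ from the paper's exact setup.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def wasserstein_1d(a: List[float], b: List[float]) -> float:
    """Wasserstein-1 distance between two equal-sized 1-D samples.

    For equal sample sizes this reduces to the mean absolute
    difference between the sorted values.
    """
    if not a or len(a) != len(b):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def _fraction_covered(points: List[Point], others: List[Point], threshold: float) -> float:
    """Fraction of `points` lying within `threshold` of at least one of `others`."""
    hits = sum(1 for p in points
               if any(math.dist(p, o) <= threshold for o in others))
    return hits / len(points)


def coverage_scores(generated: List[Point], reference: List[Point],
                    threshold: float) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) coverage under the assumed definition."""
    precision = _fraction_covered(generated, reference, threshold)
    recall = _fraction_covered(reference, generated, threshold)
    total = precision + recall
    f1 = 0.0 if total == 0 else 2 * precision * recall / total
    return precision, recall, f1


# Made-up two-parameter recipes (already normalized).
generated = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
reference = [(0.1, 0.0), (1.0, 0.9)]
precision, recall, f1 = coverage_scores(generated, reference, threshold=0.2)
print(precision, recall, f1)
print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Under this reading, precision penalizes generated recipes that resemble no known route, while recall penalizes known routes the model never reproduces.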
Case Studies on Unseen Zeolite–OSDA Systems
For MWW, DiffSyn generated OH⁻/T, K⁺/T, H₂O/T, SDA/T, temperature, and time values that closely matched literature reports, demonstrating accurate extrapolation to a system it had not seen.
For BEC, the model reproduced the Si/Ge, F⁻/T, and temperature/time conditions, correctly capturing the role of Ge and F⁻ in stabilizing the double‑four‑ring (d4r) unit.
For FAU and LTA without OSDAs, DiffSyn precisely predicted the phase‑boundary region, delineating competitive crystal‑phase formation spaces.
Optimal Route Generation
With TMAda as the OSDA for CHA zeolite, DiffSyn generated Pareto‑optimal routes that combined shorter crystallization times with lower precursor costs than the 20 lowest‑cost literature routes.
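One common way to compare routes on two objectives such as crystallization time and precursor cost is to keep only the non-dominated (Pareto-optimal) ones: a route is dropped if another route is at least as good on both objectives and strictly better on one. The sketch below assumes that reading; the `Route` fields and every number in it are invented for illustration and are not data from the study.

```python
from typing import List, NamedTuple


class Route(NamedTuple):
    name: str
    time_h: float   # crystallization time in hours (hypothetical values)
    cost: float     # precursor cost in arbitrary units (hypothetical values)


def dominates(a: Route, b: Route) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a.time_h <= b.time_h and a.cost <= b.cost
    strictly_better = a.time_h < b.time_h or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(routes: List[Route]) -> List[Route]:
    """Keep the routes that no other route dominates (both objectives minimized)."""
    return [r for r in routes if not any(dominates(o, r) for o in routes)]


candidates = [
    Route("A", time_h=24, cost=10.0),
    Route("B", time_h=48, cost=6.0),
    Route("C", time_h=48, cost=12.0),  # dominated by A
    Route("D", time_h=96, cost=5.0),
    Route("E", time_h=30, cost=10.0),  # dominated by A
]
front = pareto_front(candidates)
print([r.name for r in front])
```

The quadratic scan is fine for a few hundred candidate routes; larger sets would usually be sorted on one objective first.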
Experimental Validation of UFI Zeolite
DiffSyn suggested an OSDA (K222) not present in the training set for the UFI framework. Four UFI samples were synthesized; X‑ray diffraction matched simulated patterns, and ICP measurements confirmed a Si/Al ratio of 19.0, the highest reported for UFI zeolites.
The study emphasizes that human expert intervention, combined with AI‑generated routes, yields the best experimental outcomes.
Broader Impact
The work illustrates how generative AI can bridge the “what to synthesize” (high‑throughput screening) and “how to synthesize” (recipe planning) gaps in materials science, providing a scalable, data‑driven pathway to discover and realize new functional materials.
Reference: DiffSyn: a generative diffusion approach to materials synthesis planning, Nature Computational Science (2025).
HyperAI Super Neural
Deconstructing the sophistication and universality of technology, covering cutting-edge AI for Science case studies.
