How a Tsinghua Big Data Program Turned a Chemistry PhD into an AI‑Powered Process Engineer

This article recounts a Tsinghua University PhD student's journey through a multidisciplinary big‑data training program, detailing the acquisition of AI and data‑science skills, the creation of novel algorithms like MicroFlowSAM and ImageRAG, and their successful application to chemical engineering research and industry projects.

Data Party THU

To leverage Tsinghua University's multidisciplinary strengths, the Graduate School, the Big Data Research Center, and related departments launched the "Tsinghua Big Data Capability Improvement Project". The program integrates big‑data thinking, cross‑disciplinary learning, and hands‑on practice into a hybrid online‑offline curriculum, markedly boosting students' data‑analysis and innovative application abilities.

Entering the Process Systems Engineering (PSE) Institute for a PhD, the author recognized that traditional chemical engineering research was being reshaped by AI. Lacking solid computer‑science foundations, the author used the big‑data project to fill gaps in data handling, algorithmic understanding, and practical AI deployment.

Learning: Courses such as "Big Data Analysis (B)" introduced the full workflow of data cleaning, feature engineering, model building, and evaluation, while the "Deep Learning" course covered core architectures (CNN, RNN, Transformer) and their mathematical foundations. A team project applied these skills to bubble and droplet recognition in micro‑chemical scenarios, laying groundwork for later research papers.

Research: Building on the acquired knowledge, the author developed the MicroFlowSAM algorithm, which achieves high‑precision segmentation of high‑speed liquid‑droplet videos without labeled data or model training. The method earned a second‑place award at the 2024 Chinese Process Systems Engineering Conference (CPSE) and is being prepared for SCI journal publication. For catalytic cracking, the author also introduced an active‑learning sampling strategy and incorporated mechanistic model gradients into neural‑network training. This physics‑informed modeling approach was presented at the international ESCAPE35 conference.
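The idea of feeding mechanistic model gradients into training can be illustrated with a deliberately small sketch. It is not the author's catalytic cracking model: a three-parameter quadratic stands in for the neural network, and the "mechanistic" values and slopes come from a made-up function f(x) = 2x² + x. The function name `fit_with_gradient_penalty` and the weight `lam` are hypothetical. The loss adds a penalty on the mismatch between the surrogate's slope and the mechanistic slope to the usual data-fit term.

```python
def fit_with_gradient_penalty(xs, ys, slopes, lam=1.0, lr=0.1, steps=2000):
    """Fit y = a*x^2 + b*x + c by gradient descent on
    mean((y_hat - y)^2) + lam * mean((dy_hat/dx - slope)^2).

    The quadratic is a stand-in for a neural network (assumption);
    `slopes` play the role of gradients from a mechanistic model.
    """
    a = b = c = 0.0
    n = len(xs)
    for _ in range(steps):
        da = db = dc = 0.0
        for x, y, g in zip(xs, ys, slopes):
            r = a * x * x + b * x + c - y   # data residual
            s = 2 * a * x + b - g           # slope residual
            da += (2 * r * x * x + lam * 2 * s * 2 * x) / n
            db += (2 * r * x + lam * 2 * s) / n
            dc += (2 * r) / n
        a -= lr * da
        b -= lr * db
        c -= lr * dc
    return a, b, c


# Toy "mechanistic model": f(x) = 2x^2 + x, so df/dx = 4x + 1.
xs = [0.0, 1.0]
ys = [0.0, 3.0]
slopes = [1.0, 5.0]

a, b, c = fit_with_gradient_penalty(xs, ys, slopes)
print(round(a, 4), round(b, 4), round(c, 4))
```

Two data points alone cannot pin down a quadratic, but with the slope information at the same two points the fit recovers the underlying curve. That is the general appeal of the approach: gradient information from a mechanistic model can compensate for scarce measurements.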

Industrial Practice: During a summer internship at Procter & Gamble, the author led the development of ImageRAG, a dynamic reference‑guided image generation system. A large language model converts business requirements into precise prompts, and a vision‑language model acts as an automated evaluator. Together they form a generate‑evaluate‑correct loop that cuts scientific image creation from weeks to minutes. A front‑end visual interface makes the system usable by designers.
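A generate‑evaluate‑correct loop of this kind might be structured roughly as follows. This is a sketch under assumptions, not the ImageRAG implementation: the three callables (`write_prompt`, `generate_image`, `evaluate`), the `Evaluation` type, the score threshold, and the round limit are all hypothetical, and the stubs at the bottom stand in for the language model, the image generator, and the vision‑language evaluator.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Evaluation:
    score: float    # 0.0 to 1.0, higher is better
    feedback: str   # text the next prompt revision can use


def generate_evaluate_correct(
    requirement: str,
    write_prompt: Callable[[str, str], str],
    generate_image: Callable[[str], str],
    evaluate: Callable[[str, str], Evaluation],
    threshold: float = 0.8,
    max_rounds: int = 5,
) -> Tuple[str, List[float]]:
    """Return the best image found and the score of every round."""
    feedback = ""
    scores: List[float] = []
    best_image, best_score = "", float("-inf")
    for _ in range(max_rounds):
        prompt = write_prompt(requirement, feedback)   # LLM step
        image = generate_image(prompt)                 # generation step
        result = evaluate(requirement, image)          # VLM step
        scores.append(result.score)
        if result.score > best_score:
            best_image, best_score = image, result.score
        if result.score >= threshold:
            break                                      # good enough, stop
        feedback = result.feedback                     # correct next round
    return best_image, scores


# Stubs so the loop can run without any model behind it.
def stub_write_prompt(requirement: str, feedback: str) -> str:
    return requirement if not feedback else f"{requirement}; fix: {feedback}"


def stub_generate_image(prompt: str) -> str:
    return f"<image for: {prompt}>"


def stub_evaluate(requirement: str, image: str) -> Evaluation:
    if "fix:" in image:
        return Evaluation(0.9, "")
    return Evaluation(0.5, "labels are too small")


image, scores = generate_evaluate_correct(
    "flow diagram of a reactor",
    stub_write_prompt, stub_generate_image, stub_evaluate,
)
print(scores)
```

Keeping the best image seen so far, rather than the last one, guards against a late round that scores worse than an earlier one. The round limit bounds cost when the evaluator is never satisfied.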

Subsequently, the author contributed to a synthetic‑ammonia plant project, building a comprehensive time‑series prediction framework that spans data cleaning, variable selection, feature engineering, and model validation. This framework delivers high‑accuracy real‑time forecasts of key process indicators, enabling proactive adjustments that improve efficiency and lower operational risk.
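The source does not describe the framework's internals, so the following is only a minimal sketch of how such a pipeline could be laid out for a single process variable: forward‑fill cleaning, lag‑feature construction, and walk‑forward validation. All function names are hypothetical, the readings are invented, variable selection is left out, and a persistence baseline (predict the most recent value) stands in for the real forecasting model.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def forward_fill(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing readings (None) with the last valid reading.
    Gaps before the first valid reading are dropped."""
    filled: List[float] = []
    last: Optional[float] = None
    for v in values:
        if v is None:
            if last is None:
                continue
            v = last
        filled.append(v)
        last = v
    return filled


def make_lag_features(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Each row holds the previous n_lags readings; the target is the next one."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))
        y.append(series[t])
    return X, y


def walk_forward_mae(
    X: List[List[float]],
    y: List[float],
    predict: Callable[[List[List[float]], List[float], List[float]], float],
    min_train: int,
) -> float:
    """Score each step using only data that precedes it (no look-ahead)."""
    errors = []
    for t in range(min_train, len(y)):
        y_hat = predict(X[:t], y[:t], X[t])
        errors.append(abs(y_hat - y[t]))
    return sum(errors) / len(errors)


def persistence(train_X, train_y, x_now):
    """Baseline: the next value equals the most recent reading."""
    return x_now[-1]


raw = [1.0, None, 2.0, 3.0, None, 4.0, 5.0]   # invented sensor readings
filled = forward_fill(raw)
X, y = make_lag_features(filled, n_lags=2)
mae = walk_forward_mae(X, y, persistence, min_train=2)
print(filled, round(mae, 3))
```

Walk‑forward validation matters for real‑time forecasting because a random train/test split would let the model see future readings. Any candidate model can be passed in place of `persistence` and compared against it on the same splits.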

Looking ahead, the author plans to deepen research in intelligent Process Systems Engineering (iPSE), exploring data‑driven autonomous optimization and decision‑making for chemical processes, aiming to bridge the gap between mechanistic models and big‑data analytics and to contribute to the digital transformation of China's chemical industry.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
