
Our Winning Strategies for the 2025 China University Big Data Stock Prediction Contest

In the 2025 China University Big Data Challenge, our interdisciplinary team combined GRU-based time-series modeling, XGBoost feature fusion, volatility-based risk clustering, and a custom asymmetric loss to finish fourth. The experience also showed the limits of relying solely on historical price data for the CSI 300 stocks.

Data Party THU

Team Name: Chef Team

Team Members: Su Zhihan (Tsinghua University), Liu Shuangshi (Tsinghua University), Xie Linfeng (Tsinghua University)

Ranking: 4th place nationwide

The 2025 China University Computer Competition – Big Data Challenge required participants to forecast the prices of the 300 constituent stocks of the CSI 300 index, which covers the Shanghai and Shenzhen stock exchanges. The dataset was high quality, its features were well defined, and the task objective was clear, which gave us a solid foundation for modeling.

Our approach combined several innovations:

We introduced a GRU‑based deep time‑series model to capture temporal patterns and used its predictions as enhanced features for traditional tree models such as XGBoost.

Stocks were grouped into low, medium, and high risk categories based on historical volatility, and separate models were trained for each cluster to improve generalization across market conditions.
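As a rough illustration of the grouping step, the sketch below assigns each stock to a risk bucket using the standard deviation of its daily returns. The tercile cut points, the `price_history` layout, and the function names are assumptions for illustration; the article does not state the thresholds or volatility measure the team actually used.

```python
import statistics

def daily_returns(prices):
    """Simple returns r_t = p_t / p_{t-1} - 1 for one price series."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]

def risk_buckets(price_history):
    """Map each ticker to 'low', 'medium', or 'high' by return volatility.

    price_history: dict of ticker -> list of closing prices (hypothetical layout).
    Cut points are the terciles of the cross-sectional volatility (an assumption).
    """
    volatility = {
        ticker: statistics.pstdev(daily_returns(prices))
        for ticker, prices in price_history.items()
    }
    # n=3 yields two cut points that split the stocks into three groups.
    low_cut, high_cut = statistics.quantiles(list(volatility.values()), n=3)

    buckets = {}
    for ticker, vol in volatility.items():
        if vol <= low_cut:
            buckets[ticker] = "low"
        elif vol <= high_cut:
            buckets[ticker] = "medium"
        else:
            buckets[ticker] = "high"
    return buckets
```

A separate model would then be fit on the rows of each bucket. To avoid look-ahead bias, the volatility should be computed from the training window only.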

A custom asymmetric loss function was designed to give higher weight to extreme price movements, increasing sensitivity to large fluctuations.
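The article does not give the form of the loss, so the following is only one plausible shape: a weighted squared error in which samples with a large true move get a higher weight, and a further penalty applies when the prediction undershoots that move. The threshold, both weights, and all function names are hypothetical. The per-sample gradient and Hessian are returned because that is the form gradient-boosting libraries with custom objectives typically expect.

```python
import numpy as np

def sample_weights(preds, y_true, threshold=0.05,
                   extreme_weight=3.0, undershoot_weight=2.0):
    """Per-sample weights; all default values are illustrative assumptions."""
    preds = np.asarray(preds, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    # A move is "extreme" when the true return exceeds the threshold in size.
    extreme = np.abs(y_true) > threshold
    # Undershoot: the prediction is less extreme than the truth, measured
    # in the direction of the true move.
    undershoot = np.sign(y_true) * (y_true - preds) > 0
    weights = np.ones_like(y_true)
    weights[extreme] = extreme_weight
    weights[extreme & undershoot] *= undershoot_weight
    return weights

def asymmetric_loss(preds, y_true, **kwargs):
    """Mean of w * (pred - y)^2."""
    w = sample_weights(preds, y_true, **kwargs)
    residual = np.asarray(preds, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(w * residual ** 2))

def asymmetric_grad_hess(preds, y_true, **kwargs):
    """Gradient and Hessian of w * (pred - y)^2 with respect to pred.

    The weights are treated as constants, which ignores the jump where a
    prediction crosses from undershoot to overshoot.
    """
    w = sample_weights(preds, y_true, **kwargs)
    residual = np.asarray(preds, dtype=float) - np.asarray(y_true, dtype=float)
    return 2.0 * w * residual, 2.0 * w
```

With the defaults, an extreme move that the model undershoots counts six times as much as an ordinary sample, so the optimizer is pushed harder toward capturing large fluctuations.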

Finally, we built a two‑stage ensemble that merged the GRU model with XGBoost, leveraging the strengths of deep learning and gradient‑boosted trees.
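The two-stage idea can be sketched without the actual models. In the code below a small ridge regression stands in for both the GRU (stage one) and XGBoost (stage two), purely to keep the example self-contained; `X_seq`, `X_tab`, and every function name are hypothetical. What it illustrates is the data flow: stage-one predictions are generated walk-forward, so each row's added feature comes from a model that saw only earlier rows, and stage two is trained on the original features plus that prediction.

```python
import numpy as np

def fit_linear(X, y, ridge=1e-6):
    """Ridge least squares with an intercept; a stand-in for a real model."""
    A = np.column_stack([np.ones(len(X)), X])
    coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def walk_forward_predictions(X, y, fit, n_folds=4):
    """Out-of-sample predictions in time order.

    Rows are split into n_folds consecutive blocks. Each block is predicted
    by a model trained on all earlier rows; the first block has no history
    and is left as NaN.
    """
    bounds = np.linspace(0, len(X), n_folds + 1, dtype=int)
    oof = np.full(len(X), np.nan)
    for start, end in zip(bounds[1:-1], bounds[2:]):
        model = fit(X[:start], y[:start])
        oof[start:end] = model(X[start:end])
    return oof

def fit_two_stage(X_seq, X_tab, y, n_folds=4):
    """Stage 1 on X_seq; stage 2 on X_tab plus the stage-1 prediction."""
    oof = walk_forward_predictions(X_seq, y, fit_linear, n_folds)
    usable = ~np.isnan(oof)
    stage2 = fit_linear(np.column_stack([X_tab[usable], oof[usable]]), y[usable])
    # Refit stage 1 on all rows so it can score new data at inference time.
    stage1 = fit_linear(X_seq, y)

    def predict(X_seq_new, X_tab_new):
        stage1_feature = stage1(X_seq_new)
        return stage2(np.column_stack([X_tab_new, stage1_feature]))

    return predict
```

Training stage two on in-sample stage-one predictions would let it lean on a feature that looks more accurate than it will be on new data, which is why the out-of-sample construction matters in a setup like this.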

This pipeline secured a fourth-place finish in the competition, which suggests that feeding deep-model predictions into tree models and modeling each risk group separately were both effective.

Beyond the ranking, the competition deepened our understanding of end‑to‑end data‑science workflows, from data preprocessing and feature engineering to model selection and hyper‑parameter tuning. We also recognized that relying solely on historical price data is insufficient; incorporating fundamentals, financial reports, sentiment, and market‑wide indicators would likely boost model robustness.

Future work includes expanding the feature set, experimenting with more diverse neural architectures, refining ensemble strategies, and further optimizing the asymmetric loss to better capture extreme market events.

Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
