From Data Chaos to Predictive Insight: My Solo Journey in the 2025 Big Data Competition

An individual participant recounts their journey in the 2025 China University Computer Competition Big Data Challenge, detailing data cleaning, feature engineering, model building on 300‑stock historical prices, and insights gained from solo competition experience, highlighting challenges, lessons, and future directions in financial AI.

Data Party THU

The 2025 China University Computer Competition featured a Big Data Challenge centered on financial data. Participants used historical price data for the constituents of the CSI 300 (Shanghai-Shenzhen 300) index to build machine learning models that predict, for the next trading day, the largest and smallest price movements among ten selected stocks.
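To make the prediction target concrete, here is a minimal sketch of how next-day movements could be computed and ranked. The article does not give the competition's exact target definition, so the close-to-close return, the function names `next_day_returns` and `extreme_movers`, and the stock codes and prices are all assumptions made for illustration.

```python
def next_day_returns(closes_by_stock, day):
    """Close-to-close return from `day` to `day + 1` for every stock."""
    return {
        code: closes[day + 1] / closes[day] - 1.0
        for code, closes in closes_by_stock.items()
    }


def extreme_movers(returns, k=1):
    """Return (top-k gainers, top-k losers), each ordered most extreme first."""
    ranked = sorted(returns, key=returns.get, reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Hypothetical closing prices for three made-up stock codes.
closes = {
    "AAA": [10.0, 10.5, 10.4],
    "BBB": [20.0, 19.0, 19.5],
    "CCC": [5.0, 5.1, 5.1],
}
rets = next_day_returns(closes, day=0)
gainers, losers = extreme_movers(rets, k=1)
```

A model trained for this kind of task would predict these returns (or their ranking) from data available on the previous day, and the same ranking step would then be applied to the predictions.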

As a first-time competitor, I joined the team "抹香鲸cmr2", led by Zhou Xiheng of China University of Petroleum (East China). The team ranked seventh nationally. During preparation, I designed a data processing pipeline from scratch, learning how to clean high-frequency financial time-series data and carry out extensive feature engineering.

The key features I constructed were price change rates, volume variations, volatility measures, and momentum indicators, which together formed a multi-dimensional feature set. I experimented with a range of supervised and unsupervised learning models, and during tuning I focused on model stability and robustness. I worked to avoid over-fitting while improving the models' ability to identify extreme price swings.
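The four feature families above can be illustrated with a short sketch. This is not the pipeline used in the competition. The function `build_features`, the 5-day window, the specific formulas, and the sample numbers are all assumptions, and the code uses only the standard library to stay self-contained.

```python
from statistics import pstdev


def pct_changes(values):
    """Day-over-day relative change; the first day has no predecessor."""
    return [b / a - 1.0 for a, b in zip(values, values[1:])]


def build_features(closes, volumes, window=5):
    """Features for the latest day, using only data up to that day."""
    if len(closes) <= window or len(volumes) <= window:
        raise ValueError("need more than `window` observations")
    returns = pct_changes(closes)
    # Average volume over the `window` days before the latest day.
    avg_volume = sum(volumes[-window - 1:-1]) / window
    return {
        "return_1d": returns[-1],                            # price change rate
        "volume_ratio": volumes[-1] / avg_volume,            # volume variation
        "volatility": pstdev(returns[-window:]),             # spread of recent returns
        "momentum": closes[-1] / closes[-window - 1] - 1.0,  # return over the window
    }


# Hypothetical daily closes and volumes for one stock.
closes = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.8]
volumes = [1000, 1200, 900, 1100, 1000, 1300, 1500]
features = build_features(closes, volumes, window=5)
```

Every feature here looks only backward from the latest day. That property matters for the over-fitting concern, because a feature that leaks next-day information would inflate validation scores without helping on unseen data.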

Competing solo made the work more isolating and more difficult. Every step, from data exploration to model deployment, required independent decisions and repeated validation. Without teammates to consult, I relied heavily on academic literature, open-source projects, and community forums to broaden my approach. The ranking feedback from each submission was a crucial signal for refining the model.

The experience highlighted the inherent complexity and uncertainty of financial forecasting. A model that relies solely on historical numerical data has natural limits, and that uncertainty pushed me to keep reflecting on where the model's boundaries lie and how to improve it. I learned to extract effective signals from limited information and deepened my understanding of model evaluation and risk control.

Overall, the competition significantly enhanced my data processing, modeling practice, and problem‑abstraction abilities. It also sparked a lasting interest in quantitative analysis and the application of artificial intelligence in finance. I plan to pursue more robust and interpretable prediction methods and look forward to engaging with higher‑level competitive platforms and peers.

Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
