SkullGreymon Team’s Progress and Technical Insights in the Tencent Social Ads Algorithm Competition

The SkullGreymon team, winners of the biggest improvement award in the Tencent Social Ads university algorithm competition, share their journey from a late start in the preliminaries to significant performance gains in the finals, detailing memory‑saving feature extraction techniques, pandas and numpy usage, and their XGBoost modeling approach.

Tencent Advertising Technology

Jun 17, 2017

We are the SkullGreymon team. The name comes from a zombie dragon monster, chosen not because we are gloomy but because we are three good citizens living in a youth dormitory. We entered the competition late, starting only in the final week of the preliminaries. A risky submission seemed to have secured our place in the semifinals, but our final ranking dropped to 173 and we almost missed the next round. That low starting point is also what later made the "biggest improvement" award possible.

During the semifinals, our largest single gain came from a mysterious "trick" mentioned in the competition group chat. Once we worked out what the trick was and implemented it as a feature, our score improved by about two per mille (0.002). That matches the gain other teams reported, and it became the main driver of our progress.

Another advantage was our ability to handle the larger semifinal dataset efficiently. While many contestants struggled with memory limits, we quickly reran our preliminary-round features and models on the new data, which kept us ahead in the early stages of the semifinals. Our feature extraction relied entirely on pandas and numpy, and we adopted several memory‑saving strategies:

1. We partitioned the data by day and retained only the apps, creative IDs, position IDs, etc., that appeared on that day before merging other tables, drastically reducing computation.

2. Before merging two tables, we kept only the necessary columns (e.g., for position‑ID statistics we removed unrelated columns such as connectionType or creativeID).

3. All extracted features were stored as scipy csr_matrix objects, which convert to pandas DataFrames quickly; the final features were saved to disk with numpy.savez for fast loading.
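The three strategies above can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the toy tables, the column names other than those mentioned in the list, the click-count statistic, and the helper names `features_for_day`, `save_csr`, and `load_csr` are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical toy tables; real competition tables are far larger.
clicks = pd.DataFrame({
    "day": [17, 17, 18, 18, 18],
    "creativeID": [1, 2, 2, 3, 3],
    "positionID": [10, 11, 10, 12, 12],
    "connectionType": [1, 2, 1, 0, 1],
})
ads = pd.DataFrame({
    "creativeID": [1, 2, 3, 4, 5],
    "appID": [100, 101, 101, 102, 103],
    "advertiserID": [7, 7, 8, 9, 9],
})


def features_for_day(clicks, ads, day):
    # Strategy 1: process one day at a time and shrink the lookup
    # table to the creative IDs that actually appear on that day.
    day_clicks = clicks.loc[clicks["day"] == day]
    day_ads = ads.loc[ads["creativeID"].isin(day_clicks["creativeID"].unique())]

    # Strategy 2: keep only the columns this statistic needs before merging
    # (connectionType and advertiserID are dropped here).
    left = day_clicks[["creativeID", "positionID"]]
    right = day_ads[["creativeID", "appID"]]
    merged = left.merge(right, on="creativeID", how="left")

    # Illustrative statistic: number of clicks per positionID within the day.
    pos_count = merged.groupby("positionID")["creativeID"].transform("count")
    feats = np.column_stack([merged["appID"].to_numpy(), pos_count.to_numpy()])

    # Strategy 3: store the features as a scipy CSR matrix.
    return sparse.csr_matrix(feats.astype(np.float32))


def save_csr(path, m):
    # Persist the CSR components with numpy.savez for fast reloading.
    np.savez(path, data=m.data, indices=m.indices, indptr=m.indptr, shape=m.shape)


def load_csr(path):
    with np.load(path) as f:
        return sparse.csr_matrix(
            (f["data"], f["indices"], f["indptr"]), shape=tuple(f["shape"])
        )


day18 = features_for_day(clicks, ads, 18)
path = os.path.join(tempfile.mkdtemp(), "day18.npz")
save_csr(path, day18)
restored = load_csr(path)

# A CSR matrix can be wrapped in a sparse-backed DataFrame when needed.
frame = pd.DataFrame.sparse.from_spmatrix(restored, columns=["appID", "pos_count"])
```

Filtering the lookup table before the merge keeps the join small, and saving only the three CSR arrays plus the shape avoids writing out a dense matrix.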

For modeling, both the preliminaries and semifinals used a single XGBoost model. We briefly experimented with Factorization Machines and other models, but they did not outperform XGBoost. Based on past CTR competition experience, models like FFM can be effective, so we plan to explore them further and eventually try stacking multiple models using data from October 28‑29 as training samples.

As the competition grew more intense, two days without progress felt like our Waterloo. We always have more ideas than time to test them, and racing against the clock to implement and evaluate each strategy has been a valuable experience in itself.

We wish all participants success in discovering high‑impact strategies and achieving excellent results.

For more details, visit the official competition website: http://algo.tpai.qq.com. Follow the official algorithm WeChat account TSA-Contest for additional resources and gifts.
