Competition Solution Overview: Data Analysis, Rule‑Based and Neural Network Models for Advertising Prediction
The article details a contestant's end‑to‑end approach for an advertising competition, covering data analysis, rule‑based preprocessing, a three‑layer neural network architecture, model‑rule ensemble weighting, self‑correction strategies for the B phase, and final model‑only solutions that achieved top scores.
With the competition halfway through, the final B round is underway, and the contestant "Snail from Little People Kingdom" shares a solution emphasizing strong self‑correction and efficient plan modification.
The author introduces themselves, their team ID, and outlines thoughts across the four competition stages, acknowledging personal shortcomings and inviting discussion.
During initial data analysis, they found the dataset unlike those of previous contests, making training-set extraction difficult; they selected ads whose status was effective and remained unchanged the next day, yielding about 20,000 ads, many with zero exposure.
Realizing a simple rule model could not surpass basic benchmarks, they adopted a strategy where their own rule served as the primary model and a machine‑learning model acted as a supplement.
Rule explanation: Ads created before the test period are "old"; those created within it are "new." For old ads, they applied exponential smoothing to the past ten days' exposures to produce predictions. For new ads, they used the previous day's median exposure for the same account and for the same product, weighted 0.7 and 0.3 respectively, achieving an online score around 87.5.
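The two rules above can be sketched in a few lines. This is a minimal illustration, not the author's actual code: the smoothing factor `alpha` is not given in the article and is an assumption, as is the assignment of the 0.7 weight to the account-level median rather than the product-level one.

```python
import statistics

def smooth_forecast(daily_exposures, alpha=0.3):
    """Old-ad rule: exponential smoothing over the past ten days of exposure.
    alpha is an assumption; the article does not state the smoothing factor."""
    s = daily_exposures[0]
    for x in daily_exposures[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

def new_ad_forecast(account_exposures, product_exposures,
                    w_account=0.7, w_product=0.3):
    """New-ad rule: weighted medians of the previous day's exposures for the
    same account / same product. Which median gets 0.7 is assumed."""
    return (w_account * statistics.median(account_exposures)
            + w_product * statistics.median(product_exposures))
```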
Model explanation: Only new ads are predicted, so the training set is trimmed to ads whose creation time is close to the prediction window. The neural network has three layers: the first layer consists of parallel CNN and LSTM branches; the second layer combines parallel max‑pooling and attention mechanisms; the third layer is a fully connected layer that compresses the output to a single value. The model output is linearly combined with the rule output (weights 0.7 and 0.3), reaching an online score of about 87.8.
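The three-layer architecture described above can be approximated as follows. This is a sketch under assumptions: the article names only the layer types, so every dimension, the kernel size, and the additive attention form are guesses, and the final 0.7/0.3 rule–model weight order is also assumed.

```python
import torch
import torch.nn as nn

class ThreeLayerNet(nn.Module):
    """Sketch of the described network; all sizes are assumptions."""
    def __init__(self, feat_dim=16, hidden=32):
        super().__init__()
        # Layer 1: parallel CNN and LSTM branches over a day-level sequence.
        self.conv = nn.Conv1d(feat_dim, hidden, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Layer 2: scores for the attention-pooling branch (max-pool has no weights).
        self.attn = nn.Linear(2 * hidden, 1)
        # Layer 3: fully connected layer compressing to a single value.
        self.fc = nn.Linear(4 * hidden, 1)

    def forward(self, x):  # x: (batch, seq_len, feat_dim)
        c = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        l, _ = self.lstm(x)
        h = torch.cat([c, l], dim=-1)            # (batch, seq, 2*hidden)
        max_pool = h.max(dim=1).values           # max-pooling branch
        w = torch.softmax(self.attn(h), dim=1)   # attention branch
        att_pool = (w * h).sum(dim=1)
        return self.fc(torch.cat([max_pool, att_pool], dim=-1)).squeeze(-1)
```

The final prediction would then be a linear blend such as `0.7 * rule_pred + 0.3 * model_pred`, per the weights quoted above.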
In the B phase, the initial score dropped to around 83, prompting analysis of Test A and Test B datasets. They discovered many old ads in Test B lacked corresponding operation data, indicating a new ad type not captured by the original old/new split.
They re‑classified ads into three categories: old ads with historical data, old ads without historical data, and new ads. The first and third types follow the original A‑phase approach, while old ads without history are assigned a constant score of 4, improving the online rank to seventh place.
During the semi‑final A phase, they abandoned the rule‑based approach entirely, relying solely on models. Three models were used: LightGBM (88.85), a neural network (88.30), and LightGBM + Logistic Regression (88.89). An ensemble of LightGBM and the neural network achieved 88.97.
Neural network details: Features are divided into three groups: low-cardinality discrete features (embedded), high-cardinality discrete features (converted to frequencies), and continuous features. These are concatenated and fed into a network comprising multiple fully connected layers, a Bi-Interaction module (from NFM, Neural Factorization Machines), and a Cross Network (from Deep & Cross Network). The combined representation passes through a final fully connected layer to produce predictions.
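The two named modules have standard closed forms. The following is a minimal NumPy sketch of Bi-Interaction pooling (NFM) and a single Cross Network layer (Deep & Cross), shown for illustration rather than as the author's implementation:

```python
import numpy as np

def bi_interaction(embeddings):
    """Bi-Interaction pooling (NFM): sum of element-wise products over all
    field pairs, computed with the (square-of-sum minus sum-of-squares) trick.
    embeddings: (num_fields, embed_dim) array."""
    summed = embeddings.sum(axis=0)
    squared = (embeddings ** 2).sum(axis=0)
    return 0.5 * (summed ** 2 - squared)

def cross_layer(x0, x, w, b):
    """One Cross Network layer: x_{l+1} = x0 * (x . w) + b + x."""
    return x0 * (x @ w) + b + x
```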
The author concludes by wishing all participants enjoyment and success in the competition.
| Metric | Test A | Test B |
| --- | --- | --- |
| Total ad count | 1,954 | 3,750 |
| Ads present in exposure data | 1,361 | 1,384 |
| Ads created before March 20 | 1,606 | 3,116 |
| Ads present in operation data | 1,596 | 1,616 |
Tencent Advertising Technology
Official hub of Tencent Advertising Technology, sharing the team's latest cutting-edge achievements and advertising technology applications.