AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games
AuctionNet is a new benchmark that recreates a large‑scale, realistic online advertising auction environment, using traffic data generated by a latent diffusion model. It provides an 80 GB dataset of 5 × 10⁸ logs from 48 bidding agents, baseline evaluations in which Online LP performs best, and open‑source tools for research on decision‑making in large‑scale games. It also supported thousands of fair evaluations of submissions to a NeurIPS 2024 competition.
Decision intelligence in large‑scale games is a crucial research direction in artificial intelligence, yet progress is hampered by the lack of realistic environments and datasets at that scale. To address this, we propose AuctionNet, a benchmark derived from the online advertising industry that provides a massive ad‑auction environment, a pre‑generated dataset, and baseline algorithm evaluations.
AuctionNet’s environment simulates real‑world ad auctions using a deep generative model to produce traffic data, reducing the gap between simulation and reality while protecting sensitive information. The dataset contains logs of 48 competing bidding agents, totaling 5 × 10⁸ records (≈80 GB).
AuctionNet appeared in the NeurIPS 2024 Datasets and Benchmarks Track (Spotlight) and supported nearly 10,000 fair evaluations for 1,500 teams.
We model the problem as a Partially Observable Stochastic Game (POSG). Agents observe budgets, traffic features, and advertiser attributes, and each bid is the product of a coefficient and the traffic value. The auction follows a Generalized Second‑Price (GSP) mechanism and can optionally support multiple ad slots.
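To make the bidding rule and the GSP mechanism concrete, here is a minimal Python sketch. It is not AuctionNet's implementation: the function names, the `reserve` parameter, and the simplification that slots differ only in rank (no quality scores or per‑slot exposure rates) are assumptions made for illustration.

```python
from typing import List, Tuple


def compute_bids(coefficients: List[float], values: List[float]) -> List[float]:
    """Each agent bids its coefficient times the value of the traffic."""
    return [alpha * value for alpha, value in zip(coefficients, values)]


def gsp_auction(
    bids: List[float], num_slots: int = 1, reserve: float = 0.0
) -> List[Tuple[int, float]]:
    """Return (winner_index, price) for each filled slot, best slot first.

    Simplified GSP: the k-th highest bidder wins slot k and pays the
    (k+1)-th highest bid, or the reserve price if no lower bid exists.
    """
    # Rank agents by bid, highest first.
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    results = []
    for slot in range(min(num_slots, len(ranked))):
        winner = ranked[slot]
        if bids[winner] < reserve:
            break  # remaining bids are even lower, so stop filling slots
        has_next = slot + 1 < len(ranked)
        next_bid = bids[ranked[slot + 1]] if has_next else reserve
        results.append((winner, max(next_bid, reserve)))
    return results


# Hypothetical example: three agents bidding on one impression.
bids = compute_bids([2.0, 1.0, 0.5], [1.0, 1.5, 5.0])  # [2.0, 1.5, 2.5]
two_slot_result = gsp_auction(bids, num_slots=2)
```

In this example agent 2 wins the first slot and pays agent 0's bid, and agent 0 wins the second slot and pays agent 1's bid, so no winner's payment depends on its own bid.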
The environment consists of three modules: a traffic‑generation module based on Latent Diffusion Models (LDM), a bidding module that hosts diverse autonomous bidding agents, and an auction module that implements GSP and other customizable rules.
We validate the generated data against 100,000 real ad logs, using PCA visualizations and distribution analyses of user demographics and consumption behavior; the generated and real data show strong similarity.
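One simple way to quantify such a distribution comparison for a categorical feature is the total variation distance between the real and generated frequencies. The sketch below is an illustration only: the source does not state which metric was used, and the age‑bucket feature and sample values are made up.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def empirical_distribution(samples: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Relative frequency of each category in the samples."""
    counts = Counter(samples)
    total = len(samples)
    return {category: count / total for category, count in counts.items()}


def total_variation(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Total variation distance: 0 for identical distributions, 1 for disjoint."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Made-up age buckets standing in for a demographic feature.
real_ages = ["18-24"] * 2 + ["25-34"] * 5 + ["35-44"] * 3
generated_ages = ["18-24"] * 3 + ["25-34"] * 5 + ["35-44"] * 2

distance = total_variation(
    empirical_distribution(real_ages),
    empirical_distribution(generated_ages),
)
```

A distance near zero indicates that the generated feature closely matches the real one; the same check can be repeated per feature.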
Baseline algorithms (PID Controller, Online LP, IQL, BC, Decision Transformer) are evaluated on both basic and CPA‑constrained tasks; Online LP achieves the best performance.
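As a rough illustration of the simplest baseline family, the following Python sketch shows a PID‑style controller that adjusts the bid coefficient so that spend tracks a target. It is a generic sketch, not the benchmark's PID baseline: the class name, gains, and multiplicative update rule are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PIDPacer:
    """Adjusts a bid coefficient so that spend tracks a per-step target."""

    kp: float = 0.5  # proportional gain (hypothetical value)
    ki: float = 0.1  # integral gain (hypothetical value)
    kd: float = 0.0  # derivative gain (hypothetical value)
    coefficient: float = 1.0
    min_coefficient: float = 0.0
    _integral: float = 0.0
    _prev_error: float = 0.0

    def update(self, target_spend: float, actual_spend: float) -> float:
        # Relative error is positive when the agent is underspending.
        error = (target_spend - actual_spend) / max(target_spend, 1e-9)
        self._integral += error
        derivative = error - self._prev_error
        self._prev_error = error
        adjustment = (
            self.kp * error + self.ki * self._integral + self.kd * derivative
        )
        # Multiplicative update: underspending raises the coefficient,
        # overspending lowers it, clipped at min_coefficient.
        self.coefficient = max(
            self.min_coefficient, self.coefficient * (1.0 + adjustment)
        )
        return self.coefficient


pacer = PIDPacer()
raised = pacer.update(target_spend=100.0, actual_spend=50.0)
```

The returned coefficient would then multiply the traffic value to form the next bid, so spending less than planned makes the agent bid more aggressively in the following step.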
AuctionNet was deployed in the NeurIPS 2024 “Auto‑Bidding in Large‑Scale Auctions” competition, providing accurate, fair evaluations for thousands of submissions.
The benchmark code is open‑sourced to accelerate research in large‑scale game decision making, reinforcement learning, generative modeling, and operations research.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.