How Alibaba’s Uni‑Marketing Boosted Brand Conversions with AI‑Driven Audience Selection
This article details Alibaba's Uni‑Marketing case study where a brand‑targeted audience selection algorithm, built on big‑data and AI techniques, improved the O→IPL deepening rate by 47% during the New‑Year Festival, outlining the technical pipeline, models, evaluation metrics, challenges, and future directions.
1. Background
Uni‑Marketing is a brand‑wide digital marketing strategy built on Alibaba’s ecosystem, aiming at full‑chain, full‑media, full‑data, and full‑channel brand big‑data marketing. Traditional marketing struggles to quantify its effects; Alibaba’s closed‑loop data makes those effects measurable.
The “Seal” project for the New‑Year Festival used big‑data and algorithms to analyze brand A’s target audience, building an audience‑selection model that improved the O→IPL deepening rate by 47% compared with rule‑based selection.
2. Terminology
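Brand‑consumer relationship stages: Opportunity (O), Awareness (A), Interest (I), Purchase (P), Loyalty (L).
Audience deepening rate (e.g., O→IPL): the conversion rate from an earlier relationship stage to a deeper one, here from Opportunity to Interest, Purchase, or Loyalty.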
Brand Data Bank: Management and enrichment of brand consumer data assets across three dimensions: integration, analysis, activation.
Brand Strategy Center: Provides market overview, segmentation, competition analysis, consumer insights, audience expansion and selection, supporting new product launch, category growth, and brand upgrade scenarios.
3. Project Goal
Generate a specific‑scale target audience for brand A’s New‑Year Festival campaign, then convert the identified opportunity or awareness audience into interest and purchase audiences.
4. Industry Solutions
Typical programmatic audience targeting methods include:
Tag diffusion
Tag‑based collaborative filtering
Social‑relationship diffusion
Clustering‑based diffusion (e.g., BIRCH, CURE)
Target‑audience classification (PU learning)
5. Technical Solution
Two schemes were considered: “seed audience clustering + cluster diffusion” and “multi‑direction diffusion + audience classification”. The latter was implemented.
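Overall workflow (figure not reproduced): multi‑direction diffusion, audience aggregation, target‑audience selection, the New‑Year‑goods model, and model fusion, as described in 5.1 to 5.4.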
5.1 Multi‑direction diffusion
Six directions were explored, each extracting effective features and applying white‑box rule filtering or black‑box model prediction.
5.1.1 Interest‑preference direction
TGI (Target Group Index) and TA (target audience) concentration metrics are used to select brand‑related features and to set thresholds for white‑box diffusion.
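A minimal sketch of such a white‑box filter. It assumes the standard TGI definition (the feature's share in the target audience divided by its share in the overall population, times 100) and assumes that TA concentration means the share of a feature's users who belong to the target audience. The function names and thresholds are hypothetical and do not come from the original project.

```python
def tgi(target_with_feature, target_total, population_with_feature, population_total):
    """Target Group Index: 100 means the feature is as common in the
    target audience as in the overall population."""
    target_share = target_with_feature / target_total
    population_share = population_with_feature / population_total
    return 100.0 * target_share / population_share


def ta_concentration(target_with_feature, population_with_feature):
    """Assumed definition: fraction of the feature's users who are in the target audience."""
    return target_with_feature / population_with_feature


def select_features(stats, target_total, population_total,
                    min_tgi=150.0, min_concentration=0.01):
    """stats maps feature -> (count in target audience, count in population).
    Both thresholds are illustrative values, not the project's settings."""
    selected = []
    for feature, (target_count, population_count) in stats.items():
        if population_count == 0:
            continue  # feature never observed, nothing to score
        score = tgi(target_count, target_total, population_count, population_total)
        concentration = ta_concentration(target_count, population_count)
        if score >= min_tgi and concentration >= min_concentration:
            selected.append(feature)
    return selected
```

Requiring both thresholds keeps features that are over‑represented in the target audience while dropping ones carried by too few target users to matter.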
5.1.2 Related‑category direction
1) Main‑category analysis based on product count and sales.
2) Related‑brand analysis using a brand‑user matrix and Jaccard similarity.
3) Related‑category analysis via association‑rule mining and confidence filtering.
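The two similarity measures named in this direction can be sketched as follows. Jaccard similarity and rule confidence use their textbook definitions; the data layout (a set of user IDs per brand, a set of categories per user basket) and the function names are assumptions made for illustration.

```python
def jaccard(users_a, users_b):
    """Jaccard similarity between two brands' user sets."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


def related_brands(brand_users, brand, top_k=3):
    """Rank other brands by user overlap with `brand`."""
    scores = [
        (other, jaccard(brand_users[brand], users))
        for other, users in brand_users.items()
        if other != brand
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


def rule_confidence(baskets, antecedent, consequent):
    """confidence(antecedent -> consequent) =
    count(baskets with both) / count(baskets with antecedent)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)
```

A related category would then be kept only if its confidence against the brand's main category exceeds a chosen cutoff.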
5.1.3 Competitor‑audience direction
Identify top‑10 competing brands, analyze audience flow, and train a conversion model using features such as AIPL state, refunds, ratings, etc.
5.1.4 Search‑audience direction
Search‑keyword analysis weighs each keyword's competition level against the brand's advantage on that keyword; the original formulas (shown as images in the source) are built from keyword‑driven transaction entropy (E) and revenue (R).
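The source formulas are not available in this text, so the sketch below is only one plausible reading, not the authors' formula. It assumes E is the Shannon entropy of a keyword's transactions across brands (higher entropy meaning more evenly spread competition) and uses the brand's share of keyword‑driven revenue as a stand‑in for brand advantage. All names are hypothetical.

```python
import math


def transaction_entropy(brand_transactions):
    """Shannon entropy (bits) of a keyword's transactions across brands.
    0.0 means a single brand takes every transaction."""
    total = sum(brand_transactions.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in brand_transactions.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def revenue_share(brand_revenue, brand):
    """Assumed proxy for brand advantage: the brand's share of the
    revenue driven by this keyword."""
    total = sum(brand_revenue.values())
    if total == 0:
        return 0.0
    return brand_revenue.get(brand, 0.0) / total
```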
5.1.5 Lost‑audience direction
Recall users who dropped out of the A or IPL states within a recent period.
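A minimal sketch of this recall rule, assuming a log of state changes is available as (user, previous state, current state, change date) tuples. The record layout, the 90‑day window, and the function name are assumptions, not details from the source.

```python
from datetime import date, timedelta

BRAND_STATES = {"A", "I", "P", "L"}


def lost_users(state_changes, today, window_days=90):
    """Users who moved from an A/I/P/L state to a state outside that set
    within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    return {
        user
        for user, previous, current, changed_on in state_changes
        if previous in BRAND_STATES
        and current not in BRAND_STATES
        and changed_on >= cutoff
    }
```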
5.1.6 Peer‑audience direction
Compute user‑user similarity from vector representations (category vectors, brand vectors, graph embeddings).
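As an illustration, user‑user similarity over such vectors could be computed with cosine similarity, expanding from a seed audience to candidates above a threshold. The source does not say which similarity measure or threshold was used, so both are assumptions here, as are the function names.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def peer_audience(seed_vectors, candidate_vectors, threshold=0.8):
    """Candidates whose best similarity to any seed user reaches the threshold."""
    peers = set()
    for user, vector in candidate_vectors.items():
        best = max(
            (cosine_similarity(vector, seed) for seed in seed_vectors),
            default=0.0,
        )
        if best >= threshold:
            peers.add(user)
    return peers
```

At production scale the pairwise loop would normally be replaced by an approximate nearest‑neighbor index; the brute‑force version only shows the logic.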
5.1.7 Audience aggregation
Deduplicate the six direction audiences to form the input of the audience‑selection model.
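Deduplication can be as simple as a set union. The sketch below additionally records which directions produced each user, which is an assumed convenience for later analysis and not something the source describes.

```python
def aggregate_audiences(direction_audiences):
    """Merge per-direction user sets into one deduplicated mapping:
    user -> set of directions that surfaced the user."""
    merged = {}
    for direction, users in direction_audiences.items():
        for user in users:
            merged.setdefault(user, set()).add(direction)
    return merged
```

The keys of the returned mapping form the deduplicated candidate audience passed to the selection model.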
5.2 Target‑audience selection model
Because no historical campaign data is available, positive samples are users who have purchased the brand (augmented for new or small brands), and negative samples are users randomly drawn from other brands.
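A minimal sketch of this sampling scheme, assuming user IDs are available as sets. The negative‑to‑positive ratio, the fixed seed, and the exclusion of the brand's own buyers from the negative pool are illustrative choices rather than documented project settings.

```python
import random


def build_samples(brand_buyers, other_brand_users, negative_ratio=1.0, seed=0):
    """Return (user, label) pairs: label 1 for brand buyers,
    label 0 for users randomly drawn from other brands."""
    # Exclude the brand's own buyers so no user gets both labels.
    candidates = sorted(set(other_brand_users) - set(brand_buyers))
    n_negative = min(len(candidates), int(len(brand_buyers) * negative_ratio))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    negatives = rng.sample(candidates, n_negative)
    positives = sorted(brand_buyers)
    return [(u, 1) for u in positives] + [(u, 0) for u in negatives]
```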
5.2.1 Evaluation metrics
Traditional metrics (AUC, precision) are insufficient, so two new metrics were introduced. PredictTA TopN Precision measures the proportion of true target users among the top‑N selected audience. NewTA TopN Recall measures the proportion of newly added target users that the top‑N audience captures.
Results show that PredictTA TopN Precision increases as N decreases and is stable across models.
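The two metrics can be sketched directly from their definitions above. How the "true target users" and "newly added target users" sets are obtained is not specified in the source, so they are simply passed in as sets; the function names are hypothetical.

```python
def top_n_users(scores, n):
    """The n highest-scoring users, best first."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [user for user, _ in ranked[:n]]


def predict_ta_topn_precision(scores, true_target, n):
    """Share of the top-n selected users who are true target users."""
    selected = top_n_users(scores, n)
    if not selected:
        return 0.0
    return sum(1 for user in selected if user in true_target) / len(selected)


def new_ta_topn_recall(scores, new_target, n):
    """Share of newly added target users that appear in the top-n selection."""
    if not new_target:
        return 0.0
    selected = set(top_n_users(scores, n))
    return len(selected & new_target) / len(new_target)
```

Evaluating both functions over a range of N values gives the precision and recall curves used to compare models and to pick an audience size.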
5.2.2 Model training
Feature engineering includes numeric discretization, categorical filtering, multi‑value handling, one‑hot encoding, sparse‑feature embedding, and feature selection based on importance.
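Three of these steps (discretization, one‑hot encoding, multi‑value handling) can be sketched with the standard library alone. The bucket boundaries, the multi‑hot treatment of multi‑value features, and all names are assumptions for illustration; the project's actual preprocessing is not described in detail.

```python
import bisect


def discretize(value, boundaries):
    """Map a numeric value to a bucket index given sorted bucket boundaries."""
    return bisect.bisect_right(boundaries, value)


def one_hot(index, size):
    """One-hot vector of length `size` with a 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


def multi_hot(values, vocabulary):
    """Encode a multi-value feature (e.g. several preferred categories)
    as a 0/1 vector over a fixed vocabulary."""
    present = set(values)
    return [1 if token in present else 0 for token in vocabulary]
```

For example, an age of 25 with boundaries [18, 30, 45] falls in bucket 1, which one_hot(1, 4) turns into [0, 1, 0, 0].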
Algorithms tested:
Logistic Regression (baseline, interpretable)
Random Forest (lower precision and AUC)
PS‑SMART (GBDT‑based, best performance, hyper‑parameter tuned)
5.2.3 Model prediction
Score the aggregated audience from 5.1, filter out scores < 0.5, and remove existing IPL users to obtain the final target set.
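The filtering step can be sketched as follows, assuming model scores are held in a user‑to‑score mapping and existing IPL users in a set. Names are hypothetical.

```python
def final_target_set(scores, existing_ipl_users, threshold=0.5):
    """Keep users scoring at least `threshold` who are not already in an
    Interest, Purchase, or Loyalty state for the brand."""
    return {
        user
        for user, score in scores.items()
        if score >= threshold and user not in existing_ipl_users
    }
```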
5.3 New‑Year‑goods audience model
Because the campaign coincides with the Spring Festival, a dedicated model was built using last year’s behavior data around the holiday.
5.3.1 Holiday‑related categories
Extract categories linked to “New‑Year goods” search terms and compute relevance via I2I algorithm.
5.3.2 Holiday‑audience features
Include demographic attributes, brand‑related preferences, main‑category behavior, and holiday‑category behavior.
5.3.3 Modeling
Sample users who were active in the relevant categories during the month before the campaign; positives are those who converted to the P or L state during the festival, and negatives are randomly sampled. Train with PS‑SMART and predict as in 5.2.
5.4 Model fusion
Combine daily and holiday models based on their PredictTA TopN Precision to produce the final audience pool for DMP upload.
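The source does not spell out the fusion rule. One plausible reading, sketched below, splits the audience quota between the two models in proportion to their PredictTA TopN Precision and fills the pool from each model's ranked list without duplicates. The proportional split, the fill order, and every name here are assumptions.

```python
def fuse_audiences(daily_ranked, holiday_ranked,
                   daily_precision, holiday_precision, pool_size):
    """Build a deduplicated pool of up to `pool_size` users from two
    ranked lists, giving each model a quota proportional to its precision."""
    total = daily_precision + holiday_precision
    if total <= 0:
        raise ValueError("at least one model needs a positive precision")
    holiday_quota = round(pool_size * holiday_precision / total)
    # Holiday model first up to its quota, then the daily model,
    # then any remaining holiday users as a fallback.
    ordered = (holiday_ranked[:holiday_quota]
               + daily_ranked
               + holiday_ranked[holiday_quota:])
    pool, seen = [], set()
    for user in ordered:
        if len(pool) >= pool_size:
            break
        if user not in seen:
            seen.add(user)
            pool.append(user)
    return pool
```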
5.5 Campaign tracking
In brand A’s case, the algorithm‑selected holiday audience improved the O→IPL deepening rate by 47% and showed the highest conversion when mixed with the daily model.
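For reference, an improvement figure of this kind is typically a relative uplift of one group's deepening rate over a baseline group's. The sketch below assumes that reading; the source does not state whether the 47% is relative or absolute, and the numbers in the usage note are hypothetical, not campaign data.

```python
def deepening_rate(deepened_users, opportunity_users):
    """Share of opportunity-stage users who moved into I, P, or L."""
    if opportunity_users == 0:
        return 0.0
    return deepened_users / opportunity_users


def relative_uplift(treatment_rate, baseline_rate):
    """Relative improvement of the treatment rate over the baseline rate."""
    if baseline_rate == 0:
        raise ValueError("baseline rate must be positive")
    return (treatment_rate - baseline_rate) / baseline_rate
```

With hypothetical counts of 150 deepened users out of 10,000 for the model audience and 100 out of 10,000 for the rule‑based audience, relative_uplift(deepening_rate(150, 10000), deepening_rate(100, 10000)) is about 0.5, a 50% relative uplift.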
6. Challenges and Mitigations
6.1 Short project timeline
Prioritized alignment of model and business goals; channel optimization was not addressed.
6.2 No historical feedback
Lacked direct negative samples; relied on random sampling, leaving room for improvement.
6.3 Missing historical attribute features
Only recent attributes were available for the holiday model.
6.4 Sparse feature noise
Used TGI and TA concentration to filter high‑quality sparse features.
6.5 Effective evaluation
Designed PredictTA TopN Precision to assess model performance on diffusion audiences.
7. Conclusion and Outlook
The audience‑diffusion + selection pipeline, the daily and holiday models, and the PredictTA TopN Precision metric proved effective. Future work includes feedback‑driven sample optimization, richer historical feature storage, deep‑learning‑based sparse‑feature densification, and multi‑task learning for lifestyle embeddings.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
