
How Alibaba’s Uni‑Marketing Boosted Brand Conversions with AI‑Driven Audience Selection

This article details Alibaba's Uni‑Marketing case study where a brand‑targeted audience selection algorithm, built on big‑data and AI techniques, improved the O→IPL deepening rate by 47% during the New‑Year Festival, outlining the technical pipeline, models, evaluation metrics, challenges, and future directions.

Alibaba Cloud Developer

Oct 10, 2018

1. Background

Uni‑Marketing is a brand‑wide digital marketing strategy built on Alibaba’s ecosystem, aiming at full‑chain, full‑media, full‑data, and full‑channel big‑data marketing for brands. Traditional marketing struggles to quantify its effects, a gap that Alibaba’s closed‑loop data can help close.

The “Seal” project for the New‑Year Festival used big‑data and algorithms to analyze brand A’s target audience, building an audience‑selection model that improved the O→IPL deepening rate by 47% compared with rule‑based selection.

2. Terminology

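Brand‑consumer relationship: five stages, Opportunity (O), Awareness (A), Interest (I), Purchase (P), and Loyalty (L). “IPL” refers to the Interest, Purchase, and Loyalty stages together.

Audience deepening rate: the conversion rate from a shallower relationship stage to a deeper one, for example O→I (opportunity to interest) or O→IPL (opportunity to interest, purchase, or loyalty).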

Brand Data Bank: Management and enrichment of brand consumer data assets across three dimensions: integration, analysis, activation.

Brand Strategy Center: Provides market overview, segmentation, competition analysis, consumer insights, audience expansion and selection, supporting new product launch, category growth, and brand upgrade scenarios.

3. Project Goal

Generate a specific‑scale target audience for brand A’s New‑Year Festival campaign, then convert the identified opportunity or awareness audience into interest and purchase audiences.

4. Industry Solutions

Typical programmatic audience targeting methods include:

Tag diffusion

Tag‑based collaborative filtering

Social‑relationship diffusion

Clustering‑based diffusion (e.g., BIRCH, CURE)

Target‑audience classification (PU learning)

5. Technical Solution

Two schemes were considered: “seed audience clustering + cluster diffusion” and “multi‑direction diffusion + audience classification”. The latter was implemented.

Overall workflow (see image):

5.1 Multi‑direction diffusion

Six directions were explored, each extracting effective features and applying white‑box rule filtering or black‑box model prediction.

5.1.1 Interest‑preference direction

Use the TGI (Target Group Index) and TA (target audience) concentration metrics to select brand‑related features, then set thresholds on those features for white‑box diffusion.
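The source does not give formulas for either metric, so the following is a minimal sketch under stated assumptions. TGI uses the common definition (feature share in the target audience divided by feature share in the overall population, times 100). TA concentration is assumed here to mean the fraction of all users carrying a feature who belong to the target audience. The feature names, counts, and thresholds are hypothetical.

```python
def tgi(ta_with_feature, ta_size, all_with_feature, all_size):
    """Target Group Index: feature share in the TA relative to the
    feature share in the overall population, times 100."""
    ta_share = ta_with_feature / ta_size
    base_share = all_with_feature / all_size
    return 100.0 * ta_share / base_share


def ta_concentration(ta_with_feature, all_with_feature):
    """Assumed definition: fraction of all users carrying the feature
    who are in the target audience."""
    return ta_with_feature / all_with_feature


def select_features(stats, ta_size, all_size, min_tgi=120.0, min_concentration=0.05):
    """Keep features that pass both thresholds.

    stats maps feature name -> (count in TA, count in overall population).
    The default thresholds are illustrative, not values from the article.
    """
    selected = []
    for feature, (ta_count, all_count) in stats.items():
        if all_count == 0:
            continue  # feature never observed; nothing to measure
        passes_tgi = tgi(ta_count, ta_size, all_count, all_size) >= min_tgi
        passes_conc = ta_concentration(ta_count, all_count) >= min_concentration
        if passes_tgi and passes_conc:
            selected.append(feature)
    return sorted(selected)


# Hypothetical counts for two interest tags.
stats = {
    "likes_snacks": (300, 2000),
    "likes_tools": (50, 5000),
}
selected = select_features(stats, ta_size=1000, all_size=100000)
```

Users outside the current audience who carry any selected feature would then form the diffusion audience for this direction.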

5.1.2 Related‑category direction

1) Main‑category analysis based on product count and sales.

2) Related‑brand analysis using a brand‑user matrix and Jaccard similarity (see image).

3) Related‑category analysis via association‑rule mining and confidence filtering (see image).
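As a rough illustration of the two measures named above, the sketch below computes Jaccard similarity between the user sets of two brands and the confidence of a single category association rule. The brand names, user IDs, and baskets are hypothetical, and the article's actual matrix construction and mining procedure are not shown in the source.

```python
def jaccard(users_a, users_b):
    """Jaccard similarity of two user sets: |A ∩ B| / |A ∪ B|."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


def rule_confidence(baskets, antecedent, consequent):
    """confidence(A -> B) = count(baskets with A and B) / count(baskets with A)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical brand -> users mapping (one row of a brand-user matrix per brand).
brand_users = {
    "brand_a": {"u1", "u2", "u3", "u4"},
    "brand_b": {"u3", "u4", "u5"},
    "brand_c": {"u9"},
}
related_brands = sorted(
    ((name, jaccard(brand_users["brand_a"], users))
     for name, users in brand_users.items() if name != "brand_a"),
    key=lambda pair: pair[1],
    reverse=True,
)

# Hypothetical per-user sets of purchased categories.
baskets = [
    {"snacks", "drinks"},
    {"snacks", "drinks", "gifts"},
    {"snacks"},
    {"drinks"},
]
snacks_to_drinks = rule_confidence(baskets, "snacks", "drinks")
```

Rules whose confidence falls below a chosen cutoff would be dropped, which is the filtering step the text describes.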

5.1.3 Competitor‑audience direction

Identify top‑10 competing brands, analyze audience flow, and train a conversion model using features such as AIPL state, refunds, ratings, etc.

5.1.4 Search‑audience direction

Search‑keyword analysis combines competition level and brand advantage; formulas use keyword‑driven transaction entropy (E) and revenue (R) (see images).

5.1.5 Lost‑audience direction

Recall users who left IPL or A states within recent periods.
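One way to express this rule in code is sketched below. It assumes each user has a recorded date of last being in an A, I, P, or L state plus a current state, and that "recent" means a fixed look-back window; the 90-day window, the field layout, and the user IDs are all hypothetical.

```python
from datetime import date, timedelta

ENGAGED_STATES = {"A", "I", "P", "L"}


def lost_users(last_engaged, current_state, today, window_days=90):
    """Return users who were in an A/I/P/L state within the window
    but are no longer in any of those states.

    last_engaged: user -> date the user was last in an A/I/P/L state
    current_state: user -> current state code (missing users default to "O")
    """
    cutoff = today - timedelta(days=window_days)
    return sorted(
        user
        for user, last_date in last_engaged.items()
        if current_state.get(user, "O") not in ENGAGED_STATES
        and last_date >= cutoff
    )


today = date(2018, 1, 15)
last_engaged = {
    "u1": date(2017, 12, 1),  # left recently
    "u2": date(2017, 6, 1),   # left too long ago
    "u3": date(2018, 1, 1),   # still engaged
}
current_state = {"u1": "O", "u2": "O", "u3": "I"}
recall_list = lost_users(last_engaged, current_state, today)
```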

5.1.6 Peer‑audience direction

Compute user‑user similarity from vector representations (category vectors, brand vectors, graph embeddings).
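The source does not say which similarity function was used. Cosine similarity is a common choice for embedding vectors, so the sketch below uses it as an assumption; the two-dimensional vectors and user IDs are hypothetical stand-ins for the category, brand, or graph embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_peers(seed_vector, candidates, k=2):
    """Return the k candidate users most similar to the seed vector."""
    scored = [(user, cosine(seed_vector, vec)) for user, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


seed_vector = [1.0, 0.0]
candidates = {
    "u1": [2.0, 0.0],
    "u2": [0.0, 1.0],
    "u3": [1.0, 1.0],
}
peers = nearest_peers(seed_vector, candidates, k=2)
```

At production scale an exhaustive pairwise scan like this would be too slow, so an approximate nearest-neighbor index would likely be needed.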

5.1.7 Audience aggregation

Deduplicate the six direction audiences to form the input of the audience‑selection model.
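A minimal sketch of the deduplication step follows. Keeping track of which directions surfaced each user is an added convenience, not something the source describes; the direction names and user IDs are hypothetical.

```python
def aggregate(direction_audiences):
    """Merge per-direction audiences into one deduplicated mapping.

    direction_audiences: direction name -> iterable of user ids
    Returns user id -> sorted list of directions that surfaced the user.
    """
    merged = {}
    for direction, users in direction_audiences.items():
        for user in users:
            merged.setdefault(user, set()).add(direction)
    return {user: sorted(directions) for user, directions in merged.items()}


direction_audiences = {
    "interest": ["u1", "u2"],
    "competitor": ["u2", "u3"],
    "lost": ["u3", "u3"],  # duplicates within a direction collapse too
}
candidates = aggregate(direction_audiences)
```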

5.2 Target‑audience selection model

Because no historical campaign data is available, positive samples are users who have purchased the brand (augmented for new or small brands), and negative samples are users drawn at random from other brands.
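The sampling scheme can be sketched as follows. The negative-to-positive ratio, the fixed random seed, and the exclusion of the brand's own buyers from the negative pool are assumptions made for the example; the augmentation for new or small brands is not covered because the source gives no detail on it.

```python
import random


def build_samples(brand_buyers, other_brand_users, neg_ratio=1, seed=0):
    """Build (user, label) pairs: 1 for brand buyers, 0 for random users
    of other brands who have not bought this brand."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    positives = sorted(brand_buyers)
    negative_pool = sorted(set(other_brand_users) - set(brand_buyers))
    n_negatives = min(len(negative_pool), neg_ratio * len(positives))
    negatives = rng.sample(negative_pool, n_negatives)
    return [(u, 1) for u in positives] + [(u, 0) for u in negatives]


brand_buyers = {"u1", "u2"}
other_brand_users = {"u2", "u3", "u4", "u5"}
samples = build_samples(brand_buyers, other_brand_users, neg_ratio=1)
```

Random negatives of this kind can still contain users who would have converted, which is the limitation the article raises under "No historical feedback".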

5.2.1 Evaluation metrics

Traditional metrics (AUC, precision) are insufficient on their own, so a new metric, PredictTA TopN Precision, measures the proportion of true target users among the top‑N selected audience. A companion metric, NewTA TopN Recall, measures the proportion of newly added target users that the top‑N audience captures.

Results show PredictTA TopN Precision increases as TopN decreases and is stable across models.
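Read literally, the two metrics can be computed as in the sketch below. It assumes that the "true target users" and the "newly added target users" are available as sets of user IDs observed after scoring; how the article derives those sets is not stated. Scores and IDs are hypothetical.

```python
def top_n(scores, n):
    """User ids of the n highest-scoring users (ties broken by user id)."""
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [user for user, _ in ranked[:n]]


def predict_ta_topn_precision(scores, true_ta, n):
    """Share of the top-n selected users who are true target users."""
    picked = top_n(scores, n)
    if not picked:
        return 0.0
    return sum(1 for user in picked if user in true_ta) / len(picked)


def new_ta_topn_recall(scores, new_ta, n):
    """Share of newly added target users captured in the top-n selection."""
    if not new_ta:
        return 0.0
    picked = set(top_n(scores, n))
    return len(picked & new_ta) / len(new_ta)


scores = {"u1": 0.9, "u2": 0.8, "u3": 0.7, "u4": 0.2}
true_ta = {"u1", "u3", "u4"}
new_ta = {"u3", "u4"}

precision_at_2 = predict_ta_topn_precision(scores, true_ta, n=2)
precision_at_3 = predict_ta_topn_precision(scores, true_ta, n=3)
recall_at_3 = new_ta_topn_recall(scores, new_ta, n=3)
```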

5.2.2 Model training

Feature engineering includes discretizing numeric features, filtering categorical values, handling multi‑valued features, one‑hot encoding, embedding sparse features, and selecting features by importance.
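Three of these steps (numeric discretization, one-hot encoding, and multi-value handling) can be illustrated with the standard library alone. The bucket boundaries, vocabularies, and user record are hypothetical; embedding and importance-based selection are left out because they depend on the training platform.

```python
from bisect import bisect_right


def discretize(value, boundaries):
    """Map a numeric value to a bucket index given sorted boundaries.
    Bucket i holds values with boundaries[i-1] <= value < boundaries[i]."""
    return bisect_right(boundaries, value)


def one_hot(value, vocabulary):
    """One-hot vector over a fixed vocabulary; unknown values map to all zeros,
    which also acts as a simple filter for values outside the kept vocabulary."""
    return [1 if value == item else 0 for item in vocabulary]


def multi_hot(values, vocabulary):
    """Multi-hot vector for a multi-valued feature such as viewed categories."""
    present = set(values)
    return [1 if item in present else 0 for item in vocabulary]


def encode_user(user, age_boundaries, cities, categories):
    """Concatenate the encoded age bucket, city, and viewed categories."""
    age_bucket = discretize(user["age"], age_boundaries)
    bucket_ids = list(range(len(age_boundaries) + 1))
    return (
        one_hot(age_bucket, bucket_ids)
        + one_hot(user["city"], cities)
        + multi_hot(user["viewed_categories"], categories)
    )


user = {"age": 27, "city": "hangzhou", "viewed_categories": ["gifts", "snacks"]}
vector = encode_user(
    user,
    age_boundaries=[18, 25, 35],
    cities=["beijing", "hangzhou"],
    categories=["snacks", "drinks", "gifts"],
)
```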

Algorithms tested:

Logistic Regression (baseline, interpretable)

Random Forest (lower precision and AUC)

PS‑SMART (GBDT‑based, best performance, hyper‑parameter tuned)

5.2.3 Model prediction

Score the aggregated audience from 5.1, filter out scores < 0.5, and remove existing IPL users to obtain the final target set.
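The filtering step can be sketched as below, assuming the model scores arrive as a mapping from user ID to probability. Ordering the result by descending score is an added convenience; the scores and user IDs are hypothetical.

```python
def final_audience(scores, ipl_users, threshold=0.5):
    """Keep users scoring at or above the threshold who are not already
    in an Interest, Purchase, or Loyalty state; highest scores first."""
    kept = [
        (user, score)
        for user, score in scores.items()
        if score >= threshold and user not in ipl_users
    ]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [user for user, _ in kept]


scores = {"u1": 0.92, "u2": 0.48, "u3": 0.75, "u4": 0.50}
ipl_users = {"u3"}
target_set = final_audience(scores, ipl_users)
```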

5.3 New‑Year‑goods audience model

Because the campaign coincides with the Spring Festival, a dedicated model was built using last year’s behavior data around the holiday.

5.3.1 Holiday‑related categories

Extract categories linked to “New‑Year goods” search terms and compute relevance via I2I algorithm.

5.3.2 Holiday‑audience features

Include demographic attributes, brand‑related preferences, main‑category behavior, and holiday‑category behavior.

5.3.3 Modeling

Sample users who were active in the relevant categories during the month before the campaign; positives are those who converted to a Purchase or Loyalty (PL) state during the festival, and negatives are sampled at random. Train with PS‑SMART and predict as in 5.2.

5.4 Model fusion

Combine daily and holiday models based on their PredictTA TopN Precision to produce the final audience pool for DMP upload.
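The article does not describe the fusion rule beyond saying it is based on each model's PredictTA TopN Precision. One plausible reading, shown here purely as an assumption, is to split the audience quota between the two ranked lists in proportion to their precision and to deduplicate along the way. All names and numbers are hypothetical.

```python
def fuse(daily_ranked, holiday_ranked, daily_precision, holiday_precision, target_size):
    """Fill a pool of target_size users from two ranked lists, giving each
    model a quota proportional to its PredictTA TopN Precision (assumed rule)."""
    total = daily_precision + holiday_precision
    if total > 0:
        holiday_quota = round(target_size * holiday_precision / total)
    else:
        holiday_quota = target_size // 2

    pool, seen = [], set()

    def take(ranked, quota):
        """Append up to quota unseen users from ranked, in rank order."""
        taken = 0
        for user in ranked:
            if taken >= quota or len(pool) >= target_size:
                break
            if user not in seen:
                seen.add(user)
                pool.append(user)
                taken += 1

    take(holiday_ranked, holiday_quota)
    take(daily_ranked, target_size - len(pool))
    take(holiday_ranked, target_size - len(pool))  # backfill if daily list ran short
    return pool


daily_ranked = ["u1", "u2", "u3"]
holiday_ranked = ["u3", "u5", "u6"]
pool = fuse(daily_ranked, holiday_ranked,
            daily_precision=0.25, holiday_precision=0.75, target_size=4)
```

Other rules, such as weighting the two scores and re-ranking a single list, would fit the source's description equally well.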

5.5 Campaign tracking

In brand A’s case, the algorithm‑selected holiday audience improved the O→IPL deepening rate by 47%, and conversion was highest when the holiday audience was mixed with the daily model’s audience.
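For readers who want to reproduce this kind of comparison on their own data, the sketch below computes a deepening rate for two audiences and the relative lift between them. The source does not state whether its 47% figure is a relative or an absolute improvement, so treating it as relative is an assumption. The counts are invented for illustration and are not the campaign's figures.

```python
def deepening_rate(converted, exposed):
    """Share of exposed opportunity users who moved to an I, P, or L state."""
    return converted / exposed if exposed else 0.0


def relative_lift(treatment_rate, baseline_rate):
    """Relative improvement of the treatment rate over the baseline rate."""
    return (treatment_rate - baseline_rate) / baseline_rate


# Illustrative counts only, not data from the article.
rule_based_rate = deepening_rate(converted=200, exposed=10000)
algorithm_rate = deepening_rate(converted=300, exposed=10000)
lift = relative_lift(algorithm_rate, rule_based_rate)
```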

6. Challenges and Mitigations

6.1 Short project timeline

Prioritized alignment of model and business goals; channel optimization was not addressed.

6.2 No historical feedback

Lacked direct negative samples; relied on random sampling, leaving room for improvement.

6.3 Missing historical attribute features

Only recent attributes were available for the holiday model.

6.4 Sparse feature noise

Used TGI and TA concentration to filter high‑quality sparse features.

6.5 Effective evaluation

Designed PredictTA TopN Precision to assess model performance on diffusion audiences.

7. Conclusion and Outlook

The audience‑diffusion + selection pipeline, the daily and holiday models, and the PredictTA TopN Precision metric proved effective. Future work includes feedback‑driven sample optimization, richer historical feature storage, deep‑learning‑based sparse‑feature densification, and multi‑task learning for lifestyle embeddings.
