How to Build a Scalable Tag/Profile System for Marketing Automation
This article shares engineering practices for building a tag/profile system. It covers core concepts, a minimal architecture, technology selection, and the key modules for estimation, selection, delivery, and validation, along with design details and implementation tips for large-scale marketing scenarios.
Introduction
Tag or user profile systems are essential for internet companies, yet most discussions focus on data algorithms or product design rather than engineering implementation. This article presents practical engineering experience from Alibaba's local life tag‑profile system.
Basic Concepts
Tag: an abstract classification of a characteristic of a group or object, e.g., gender or purchase amount.
Profile: a collection of descriptive information about a person or object, used for personalized recommendations.
Group (Audience): a set of people or objects, defined by tag combinations or manually.
Selection: the process of filtering a specific set based on tag attributes.
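The four concepts above can be sketched in a few lines of Python. This is only an illustration, not the data model of the system described here: the `TagCondition` class, the `select` function, and the sample profiles are hypothetical names introduced for the example. A profile is modelled as a dictionary of tag values, and selection returns the group of IDs that satisfy every tag condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a profile is a plain mapping of tag name -> tag value.
Profile = Dict[str, object]


@dataclass(frozen=True)
class TagCondition:
    """One filter on a single tag, e.g. purchase_amount >= 100 (hypothetical)."""
    tag: str
    predicate: Callable[[object], bool]


def select(profiles: Dict[str, Profile], conditions: List[TagCondition]) -> List[str]:
    """Return the sorted IDs whose profile satisfies every condition."""
    group = []
    for entity_id, profile in profiles.items():
        # A missing tag counts as "condition not met".
        if all(c.tag in profile and c.predicate(profile[c.tag]) for c in conditions):
            group.append(entity_id)
    return sorted(group)


# Illustrative data only.
profiles = {
    "u1": {"gender": "F", "purchase_amount": 320.0},
    "u2": {"gender": "M", "purchase_amount": 80.0},
    "u3": {"gender": "F", "purchase_amount": 45.0},
}

high_value_women = select(profiles, [
    TagCondition("gender", lambda v: v == "F"),
    TagCondition("purchase_amount", lambda v: v >= 100),
])
print(high_value_women)  # ['u1']
```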
Building a Minimal Tag/Profile System
The minimal system follows a marketing scenario of issuing coupons, which involves four steps: estimate target audience size, select the audience, deliver coupons, and validate redemption.
Business Requirements: user attribute tables serve as the source of tags, which can be atomic or composite, real-time or offline.
Technical Requirements:
Real‑time estimation interface
Offline selection table (using ODPS)
Audience file generation (stored in OSS)
Validation interface
Technology Selection
Estimation Interface
The main challenge is executing complex SQL within 10 seconds. Example queries:
SELECT count(distinct user_id) FROM table_1 WHERE location = 'Shanghai' AND age > 20;
When adding another condition (e.g., likes tea) that resides in a different table, the query needs a join:
SELECT count(distinct user_id) FROM (
SELECT table_1.user_id
FROM table_1
LEFT JOIN table_2 ON (table_1.user_id = table_2.user_id)
WHERE table_1.location = 'Shanghai' AND table_1.age > 20 AND table_2.is_like_tea = 1
) AS mt1;
For large datasets (more than 100 million rows) and multi-tag joins, an analytical database such as Alibaba ADB, Hologres, or Elasticsearch is recommended.
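The two estimation queries can be tried locally with a small sketch. It uses an in-memory SQLite database purely as a stand-in for an analytical database, and the `estimate` helper and the sample rows are assumptions made for illustration. The second table is joined only when the tea tag is part of the rule. The sketch uses an inner join, which returns the same count as the LEFT JOIN in the query above because the WHERE clause filters on a table_2 column.

```python
import sqlite3

# In-memory SQLite stands in for the analytical database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (user_id INTEGER, location TEXT, age INTEGER);
CREATE TABLE table_2 (user_id INTEGER, is_like_tea INTEGER);
INSERT INTO table_1 VALUES
    (1, 'Shanghai', 25), (2, 'Shanghai', 18),
    (3, 'Beijing', 30), (4, 'Shanghai', 41);
INSERT INTO table_2 VALUES (1, 1), (3, 1), (4, 0);
""")


def estimate(conn, city, min_age, likes_tea=None):
    """Count distinct users matching the rule; join table_2 only if needed."""
    sql = "SELECT count(DISTINCT t1.user_id) FROM table_1 AS t1"
    params = [city, min_age]
    if likes_tea is not None:
        sql += " JOIN table_2 AS t2 ON t1.user_id = t2.user_id"
    sql += " WHERE t1.location = ? AND t1.age > ?"
    if likes_tea is not None:
        sql += " AND t2.is_like_tea = ?"
        params.append(1 if likes_tea else 0)
    # Values are bound as parameters, never concatenated into the SQL text.
    return conn.execute(sql, params).fetchone()[0]


print(estimate(conn, "Shanghai", 20))                  # 2
print(estimate(conn, "Shanghai", 20, likes_tea=True))  # 1
```

Skipping the join when no tag from the second table is used keeps the common single-table estimate cheap, which matters when the response budget is a few seconds.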
Selection Engine
The selection engine generates group tables and files on an offline compute platform. Here Alibaba Cloud MaxCompute (formerly named ODPS) serves as both the offline compute engine and the data warehouse, DataStudio is used for development, and OSS stores the generated files.
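The file-generation step can be illustrated with a short sketch. The file layout (one ID per line, de-duplicated and sorted), the `write_audience_file` helper, and the file name are assumptions, since the article does not specify a format, and a local temporary directory stands in for OSS.

```python
import os
import tempfile
from typing import Iterable


def write_audience_file(user_ids: Iterable[int], path: str) -> int:
    """Write de-duplicated IDs, one per line, and return how many were written."""
    unique_ids = sorted(set(user_ids))
    with open(path, "w", encoding="utf-8") as f:
        for uid in unique_ids:
            f.write(f"{uid}\n")
    return len(unique_ids)


# A local temp directory stands in for object storage (assumption).
out_dir = tempfile.mkdtemp()
audience_path = os.path.join(out_dir, "group_1001.txt")  # hypothetical name
written = write_audience_file([4, 1, 1, 7], audience_path)
print(written)  # 3
```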
Validation Interface
The validation interface must sustain high QPS over large data volumes with millisecond-level latency. Key-value storage such as Redis, HBase, or Alibaba Lindorm is suggested.
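A validation call is essentially a membership lookup: is this user in the group that the coupon targets? The sketch below uses an in-memory dictionary of sets as a stand-in for a key-value store; the `GroupMembershipStore` class and the group and user IDs are hypothetical. With Redis, the same idea maps naturally onto a set per group, written with SADD and queried with SISMEMBER.

```python
from typing import Dict, Iterable, Set


class GroupMembershipStore:
    """In-memory stand-in for a KV store keyed by group ID (hypothetical)."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = {}

    def load_group(self, group_id: str, user_ids: Iterable[str]) -> None:
        """Replace the stored members of a group with a fresh selection result."""
        self._members[group_id] = set(user_ids)

    def is_member(self, group_id: str, user_id: str) -> bool:
        """Constant-time check used at coupon redemption."""
        return user_id in self._members.get(group_id, set())


store = GroupMembershipStore()
store.load_group("coupon_group_1", ["u1", "u4"])

print(store.is_member("coupon_group_1", "u1"))  # True
print(store.is_member("coupon_group_1", "u2"))  # False
```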
Complete Solution
The workflow links tag sources, group selection rules, estimation query interface, ODPS result tables, OSS files for delivery, and validation interfaces.
Core Module Design
Selection Scheduling
A dedicated scheduling module handles the thousands of group selection tasks that run each day.
The engine consists of a task pool, a scheduler, an executor, and a dependency checker.
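How the four components might interact is sketched below. This is a single-threaded toy, not the production scheduler: `SelectionTask`, `run_scheduler`, and the task names are hypothetical. The pool is a queue, the dependency check is a subset test, and the scheduler defers any task whose inputs are not ready, stopping once a full pass over the pool makes no progress.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class SelectionTask:
    name: str
    depends_on: List[str] = field(default_factory=list)


def run_scheduler(
    tasks: Iterable[SelectionTask],
    ready_sources: Set[str],
    execute: Callable[[SelectionTask], None],
) -> List[str]:
    """Run every task whose dependencies are ready; return the execution order."""
    pool = deque(tasks)            # task pool
    done = set(ready_sources)      # inputs known to be available
    order: List[str] = []
    stalled = 0                    # consecutive tasks deferred without progress
    while pool and stalled < len(pool):
        task = pool.popleft()
        if set(task.depends_on) <= done:   # dependency checker
            execute(task)                  # executor
            done.add(task.name)
            order.append(task.name)
            stalled = 0
        else:
            pool.append(task)              # retry after other tasks finish
            stalled += 1
    return order


log: List[str] = []
order = run_scheduler(
    [
        SelectionTask("group_b", ["group_a"]),
        SelectionTask("group_a", ["tag_base_table"]),
        SelectionTask("group_c", ["external_group_x"]),  # input never arrives
    ],
    ready_sources={"tag_base_table"},
    execute=lambda task: log.append(f"ran {task.name}"),
)
print(order)  # ['group_a', 'group_b']
```

In this run group_c stays in the pool because its external input is never marked ready; a real scheduler would presumably retry it later or raise an alert.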
Dependency Checking
A selection task can depend on four main kinds of input: the tag base table, existing group files or ODPS tables, external groups, and combinations of these.
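One way to represent these dependency types is as typed records, with a combined dependency expressed as a list of several of them. The names below (`DependencyType`, `Dependency`, `missing_dependencies`, and the sample table and group names) are assumptions for illustration. The check reports which inputs are still missing, so a task can be held back with a clear reason.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set


class DependencyType(Enum):
    TAG_BASE_TABLE = "tag_base_table"
    EXISTING_GROUP = "existing_group"   # group file or ODPS table
    EXTERNAL_GROUP = "external_group"


@dataclass(frozen=True)
class Dependency:
    kind: DependencyType
    name: str


def missing_dependencies(required: List[Dependency], ready: Set[Dependency]) -> List[Dependency]:
    """Return the required inputs that are not ready yet, in declared order."""
    return [dep for dep in required if dep not in ready]


# A combined dependency: the task needs all three inputs.
required = [
    Dependency(DependencyType.TAG_BASE_TABLE, "user_tags_daily"),
    Dependency(DependencyType.EXISTING_GROUP, "group_1001"),
    Dependency(DependencyType.EXTERNAL_GROUP, "partner_upload_7"),
]
ready = {
    Dependency(DependencyType.TAG_BASE_TABLE, "user_tags_daily"),
    Dependency(DependencyType.EXISTING_GROUP, "group_1001"),
}

missing = missing_dependencies(required, ready)
print([dep.name for dep in missing])  # ['partner_upload_7']
```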
ID Mapping
ID mapping converts IDs between entities (e.g., product to merchant) to support downstream operations.
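As a small illustration of the product-to-merchant case, the sketch below converts a group of product IDs into the group of merchants that own them. The `map_ids` helper and the mapping data are hypothetical. Several products can map to the same merchant, so the result is de-duplicated, and IDs without a mapping are returned separately rather than silently dropped.

```python
from typing import Dict, Iterable, List, Tuple


def map_ids(source_ids: Iterable[str], mapping: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Map source IDs to target IDs; return (sorted unique targets, unmapped sources)."""
    mapped = set()
    unmapped: List[str] = []
    for source_id in source_ids:
        target_id = mapping.get(source_id)
        if target_id is None:
            unmapped.append(source_id)
        else:
            mapped.add(target_id)
    return sorted(mapped), unmapped


# Illustrative mapping only: product ID -> merchant ID.
product_to_merchant = {"p1": "m1", "p2": "m1", "p3": "m2"}

merchants, unmapped = map_ids(["p1", "p2", "p3", "p9"], product_to_merchant)
print(merchants)  # ['m1', 'm2']
print(unmapped)   # ['p9']
```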
Conclusion
Based on Alibaba's experience, the article outlines engineering solutions for tag/profile systems, covering architecture, module design, and technology choices. It also mentions future extensions such as production management, tenant isolation, monitoring, group insight analysis, and effect evaluation.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
