Unifying Global User Portrait Data to Power Business Growth
The project consolidates fragmented user data across departments into a unified OneID system, builds a comprehensive tagging and feature platform, and leverages real‑time and offline signals for algorithmic push, custom models, and crowd selection, enabling precise, data‑driven growth initiatives.
Project Background
The company has accumulated massive user data that is scattered across multiple departments, making it difficult to obtain a unified view of each user. The goal is to integrate these fragmented, unordered data sources into a single, coherent user portrait that can unlock the full value of the data.
OneID Unification
Users often have multiple identifiers (e.g., mid, cid, idfa) because of device changes, reinstallations, or switching between guest and logged‑in states. The OneID approach normalizes these identifiers by selecting member_id as the primary key and using graph computation plus ID Mapping to associate all other IDs with it. The result is a mapping in which every identifier resolves to a single zhihu_id that uniquely represents the user, enabling accurate lifecycle tracking.
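The graph step can be illustrated with a minimal sketch. It assumes that identifiers observed together (for example, in the same login event) form edges, and that connected components are resolved with union-find. The function names, the edge format, and the rule for choosing a primary key per component are hypothetical, not the production logic.

```python
from collections import defaultdict


def build_id_mapping(edges):
    """Group identifiers that were observed together into connected components.

    edges: iterable of ((id_type, value), (id_type, value)) pairs.
    Returns a dict mapping every identifier to its component root.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in edges:
        union(a, b)
    return {node: find(node) for node in parent}


def assign_one_id(mapping):
    """Choose one primary key per component.

    Assumed rule: prefer a member_id if the component has one,
    otherwise fall back to the smallest identifier.
    """
    components = defaultdict(list)
    for node, root in mapping.items():
        components[root].append(node)
    one_id = {}
    for members in components.values():
        member_ids = sorted(m for m in members if m[0] == "member_id")
        primary = member_ids[0] if member_ids else min(members)
        for m in members:
            one_id[m] = primary
    return one_id
```

In this sketch a guest device that has never logged in still gets a stable key, so its behavior can be merged later if a member_id joins the same component.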
Algorithm Push Experiments
With the OneID data warehouse, multiple device and account IDs belonging to the same user are merged, which resolves identity fragmentation. The experiments aggregate cross‑channel behavior under the same ID and feed it into push recommendations. The reported results show significant improvements for low‑activity users, along with higher queue consumption efficiency.
Real‑Time Push Experiments
When a user is active on the M‑site or mini‑program and not yet an ODAU, the system pushes personalized content in real time, improving recall.
For non‑logged‑in or guest users, offline portraits are used to infer preferences, enhancing content matching and avoiding ineffective touches.
A layered recall system further boosts CTR and device coverage.
Feature Layer
ID Mapping connects user identifiers across growth, alliance, and search‑push business lines, enabling cross‑domain feature fusion. Examples include:
Search‑push can ingest alliance APP behavior data to enrich interest features.
Alliance can use growth conversion tags to optimize traffic distribution.
Growth can leverage search‑push behavior sequences for more precise targeting.
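A minimal sketch of cross‑domain feature fusion, assuming each business line stores features keyed by its own local identifier and that an ID mapping from local identifier to unified ID is already available. The function name, the data shapes, and the domain prefixes are hypothetical.

```python
def fuse_features(id_mapping, domain_features):
    """Merge per-domain features under one unified ID.

    id_mapping: dict of local identifier -> unified ID.
    domain_features: dict of domain name -> {local identifier: {feature: value}}.
    Feature names are prefixed with the domain to avoid collisions.
    """
    fused = {}
    for domain, rows in domain_features.items():
        for local_id, features in rows.items():
            unified = id_mapping.get(local_id)
            if unified is None:
                continue  # identifier not mapped yet; skipped in this sketch
            record = fused.setdefault(unified, {})
            for name, value in features.items():
                record[f"{domain}.{name}"] = value
    return fused
```

Prefixing feature names with the domain keeps provenance visible, so a downstream model can tell an alliance signal from a search‑push signal.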
Real‑Time Signal Construction
App usage signals (e.g., number of days an app was used in the past 7 days) are captured and synced via ID Mapping. Example: a user playing the "阴阳师" game is identified as a high‑frequency gamer, informing game‑related recommendation ranking.
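The "days used in the past 7 days" signal can be sketched as follows. The function names and the 5‑day threshold for a high‑frequency user are assumptions for illustration; the source does not state the actual cutoff.

```python
from datetime import date, timedelta


def active_days_last_7(usage_dates, today):
    """Count distinct days with usage in the 7-day window ending at `today` (inclusive)."""
    window_start = today - timedelta(days=6)
    return len({d for d in usage_dates if window_start <= d <= today})


def is_high_frequency(usage_dates, today, min_days=5):
    """Flag a user as high-frequency; min_days=5 is an assumed threshold."""
    return active_days_last_7(usage_dates, today) >= min_days
```

Counting distinct days rather than raw events keeps one long session from looking like habitual use.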
User Behavior Data
Behavioral metrics such as clicks, views, purchases, and session duration are used to build short‑term preference portraits (lastN). Applications include negative‑feedback detection, interest‑drift capture, and behavior‑sequence modeling for Transformer or RNN inputs.
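A lastN portrait can be sketched with a bounded queue that keeps only the most recent events. The class name, the event shape, and the "dislike" action used as negative feedback are hypothetical choices for this example.

```python
from collections import Counter, deque


class LastNProfile:
    """Short-term preference portrait built from the N most recent events."""

    def __init__(self, n=50):
        # deque with maxlen drops the oldest event automatically
        self.events = deque(maxlen=n)

    def record(self, category, action):
        self.events.append((category, action))

    def top_categories(self, k=3):
        """Most frequent categories among recent events, ignoring negative feedback."""
        counts = Counter(cat for cat, action in self.events if action != "dislike")
        return [cat for cat, _ in counts.most_common(k)]

    def disliked_categories(self):
        """Categories with at least one recent negative-feedback event."""
        return {cat for cat, action in self.events if action == "dislike"}

    def sequence(self):
        """Ordered event list, oldest first, usable as sequence-model input."""
        return list(self.events)
```

Because old events fall out of the window, a shift in `top_categories` over time is one simple way to observe interest drift.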
Online Learning Labels (OLR)
OLR generates dynamic prediction labels in real time based on the latest user actions, providing high‑frequency, adaptive signals. Integrated via ID Mapping, OLR supports:
Real‑time intent prediction (e.g., shopping intent, content consumption tendency).
Churn probability prediction, triggering timely retention actions.
Complementary use with offline static tags for richer profiling.
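The source does not describe the model behind OLR. As one possible illustration of a label that adapts with every new action, the sketch below assumes an online logistic regression updated by stochastic gradient descent, one event at a time. The class name, feature layout, and learning rate are hypothetical.

```python
import math


class OnlineLogisticLabel:
    """Online logistic regression; an assumed stand-in for a real-time label model."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return the predicted probability for the positive label."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """Apply one SGD step on the log loss for a single observed event."""
        error = self.predict(features) - label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error
```

A churn score produced this way could be compared against a threshold to trigger a retention action, while offline static tags supply slower‑moving context.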
Tagging System
A systematic tag hierarchy covers consumption, interaction, creation, recommendation, search, and more. Tags are categorized into base, membership, growth, and education dimensions, with lifecycle maintenance (source validation, logic updates, distribution monitoring) to ensure tag accuracy.
Custom Tags & Models
Specialized tags support business‑specific needs, such as:
Education Potential Model: Constructs positive/negative samples from users active in AI‑skill or side‑job categories over the past 180 days, using demographic, behavior, and domain‑specific features.
Membership Card Upgrade Model: Predicts preferred membership duration (monthly/seasonal/yearly) to match users with optimal plans, supporting ARPU growth, incentive‑ad optimization, and coin‑cashback strategies.
User Portrait Platform
The platform allows users to combine tags to create crowd packages, preview estimated sizes, validate precision, and push directly to marketing systems. Features include flexible logical operators (AND/OR/NOT), real‑time feedback, visual previews, high‑efficiency reuse, and seamless system integration.
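Combining tags with AND/OR/NOT can be sketched as set operations over an inverted index from tag to user IDs. The function name, the nested‑tuple expression format, and the tag names are hypothetical; the platform's actual query representation is not described in the source.

```python
def select_crowd(tag_index, expression, all_users):
    """Evaluate a nested tag expression into a set of user IDs.

    expression is either a tag name (str) or a tuple:
      ("AND", expr, expr, ...), ("OR", expr, expr, ...), ("NOT", expr)
    """
    if isinstance(expression, str):
        return set(tag_index.get(expression, set()))
    op, *operands = expression
    if not operands:
        raise ValueError(f"operator {op} needs at least one operand")
    results = [select_crowd(tag_index, e, all_users) for e in operands]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    if op == "NOT":
        (only,) = results  # NOT takes exactly one operand
        return set(all_users) - only
    raise ValueError(f"unknown operator: {op}")
```

The size preview mentioned above would then be `len(select_crowd(...))`, computed before the package is pushed to a marketing system.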
Crowd Selection & Insight
Through the platform, users can:
Select crowds by tag.
Preview crowd demographics.
Merge, intersect, or take the difference of crowd packages.
Export detailed user lists and visual insights.
Insights are transformed into actionable business decisions, such as targeted pushes, personalized messaging, and strategic optimizations.
Growth Applications
Two analytical reports illustrate the platform’s impact:
New‑User Analysis: Evaluates acquisition channel quality within 7 days using portrait dimensions (city level, age, education, occupation) and business metrics (interaction, LTV, CAC, ROI). Scoring combines standardized data, weighted dimensions, and simple averages to guide channel optimization.
Overall User Analysis: Validates that a 100k sample mirrors the full user base across key dimensions, confirming data reliability for macro‑level insights and cross‑dimensional analyses.
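The channel scoring in the New‑User Analysis can be sketched as a weighted average of standardized metrics. This is one plausible reading of "standardized data, weighted dimensions, and simple averages"; the function name, the z‑score standardization, the sign flip for cost metrics, and the weights are all assumptions.

```python
from statistics import mean, pstdev


def score_channels(metrics, weights, lower_is_better=()):
    """Score acquisition channels from standardized, weighted metrics.

    metrics: {channel: {metric: value}}
    weights: {metric: weight}
    lower_is_better: metrics where a smaller value is good (e.g. CAC).
    """
    channels = list(metrics)
    scores = {c: 0.0 for c in channels}
    total_weight = sum(weights.values())
    for name, weight in weights.items():
        values = [metrics[c][name] for c in channels]
        mu, sigma = mean(values), pstdev(values)
        for c in channels:
            # z-score puts metrics with different units on one scale
            z = 0.0 if sigma == 0 else (metrics[c][name] - mu) / sigma
            if name in lower_is_better:
                z = -z
            scores[c] += weight * z / total_weight
    return scores
```

Standardizing first keeps a metric with large raw values, such as LTV in currency units, from dominating a ratio such as ROI.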
Conclusion & Outlook
The user portrait platform turns scattered data into reusable tag assets. Its success hinges on continuous business‑tech collaboration: business defines scenarios, technology iterates rapidly, and both co‑design and validate the tag ecosystem to keep the platform aligned with real‑world needs.
Zhihu Tech Column
Sharing Zhihu tech posts and exploring community technology innovations.
