User Portrait Tagging: Construction, Feature Processing, and Evaluation
This article explains how to build user portrait tags, from basic attribute tags to business and strategy tags. It covers methods for data collection, anomaly handling, time decay, and smoothing, and it describes how to evaluate tag quality with cohesion, stability, and AUC‑related metrics to support data‑driven product decisions.
Introduction
As enterprises deepen their digital transformation, understanding users becomes critical. This talk examines user portrait tags, how they are constructed, and the role they play across business scenarios. It emphasizes accurate labeling through anomaly handling, time decay, and data distinguishability.
1. Basic Attribute Portrait Tags
These tags describe inherent user properties (e.g., gender, age, OS, city). They are sourced from user input, event tracking, model prediction, or third‑party data. They are used in daily analysis of attribute distributions and as input features for more complex models such as ranking or behavior prediction.
2. Business‑Oriented Portrait Tags
These tags are tightly linked to KPI goals and are classified as strongly or weakly associated with them. They are built either by KPI‑driven segmentation or by composite calculations over user behavior, and they enable targeted analysis of user groups for operational monitoring and differentiated strategies.
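The two construction methods can be illustrated with a minimal sketch. It assumes a weighted sum for the composite behavior score and quantile cut points for the segmentation. The function names, behavior names, and weights are hypothetical and do not come from the talk.

```python
import statistics


def composite_score(behaviors, weights):
    """Weighted sum of behavior counts; a missing behavior counts as 0."""
    return sum(w * behaviors.get(name, 0.0) for name, w in weights.items())


def assign_tiers(scores, n_tiers=3):
    """Map user_id -> tier index (0 = lowest) using quantile cut points.

    `scores` is a dict of user_id -> KPI or composite score.
    """
    cuts = statistics.quantiles(scores.values(), n=n_tiers, method="inclusive")
    # A user's tier is the number of cut points strictly below their score.
    return {uid: sum(score > c for c in cuts) for uid, score in scores.items()}


# Hypothetical weights: orders matter more than visits for this KPI.
weights = {"orders": 0.7, "visits": 0.3}
score = composite_score({"orders": 2, "visits": 10}, weights)

tiers = assign_tiers({"u1": 1, "u2": 2, "u3": 3, "u4": 4, "u5": 5, "u6": 6})
```

Quantile cut points keep the tiers roughly equal in size. Fixed business thresholds are an equally valid choice when a KPI has natural breakpoints.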
3. Strategy‑Oriented Portrait Tags
These tags are designed for specific interventions (e.g., red‑packet incentives). They help identify high‑gain, repurchase, or future‑behavior cohorts, and are often evaluated with uplift models and ROI predictions.
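As a rough illustration of what "high gain" means, the sketch below estimates uplift per segment as the difference in conversion rate between treated and control users. This is a naive estimate that assumes randomized treatment assignment. It is not a trained uplift model, and the record layout and function name are hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment from (segment, treated, converted) records.

    Uplift = conversion rate of treated users - conversion rate of control users.
    Segments missing either group are skipped because no estimate is possible.
    """
    # counts[segment][treated] = [conversions, users]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, converted in records:
        cell = counts[segment][bool(treated)]
        cell[0] += int(converted)
        cell[1] += 1

    result = {}
    for segment, groups in counts.items():
        (t_conv, t_n), (c_conv, c_n) = groups[True], groups[False]
        if t_n == 0 or c_n == 0:
            continue
        result[segment] = t_conv / t_n - c_conv / c_n
    return result


records = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1
    + [("A", False, True)] * 1 + [("A", False, False)] * 3
    + [("B", True, True)] * 1 + [("B", True, False)] * 1
    + [("B", False, True)] * 1 + [("B", False, False)] * 1
    + [("C", True, True)] * 2  # no control group, so C is skipped
)
uplift = uplift_by_segment(records)
```

In this made-up data, segment A responds to the incentive while segment B converts at the same rate with or without it, so only A would qualify as a high‑gain cohort.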
Feature Processing for Portrait Tags
Key steps are data cleaning, time decay weighting, and smoothing. Data cleaning detects outliers with box plots and AVF, then fills outliers and empty values using caps, floors, or statistical extremes. Time decay uses RFM‑based weighting so that recent actions have more influence. Smoothing applies a log transformation to mitigate long‑tail effects.
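The three steps can be sketched as follows. The sketch makes several assumptions the talk does not specify: box-plot fences at 1.5 times the interquartile range, exponential decay with a configurable half-life as the recency weighting, and `log1p` as the log transformation. All function and parameter names are hypothetical.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Box-plot fences: (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def cap_outliers(values, k=1.5):
    """Clip values outside the box-plot fences to the nearest fence."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


def decayed_score(events, half_life_days=7.0):
    """Sum event amounts, halving each weight every `half_life_days`.

    `events` is a list of (days_ago, amount) pairs.
    """
    return sum(
        amount * 0.5 ** (days_ago / half_life_days)
        for days_ago, amount in events
    )


def log_smooth(value):
    """Compress long-tailed non-negative values; log1p maps 0 to 0."""
    return math.log1p(value)


capped = cap_outliers([1, 2, 3, 4, 5, 100])
recent_weighted = decayed_score([(0, 10), (7, 10)])
smoothed = [log_smooth(v) for v in (0, 9, 999)]
```

With these inputs the extreme value 100 is pulled down to the upper fence, and the purchase from seven days ago counts half as much as the one made today.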
Portrait Result Evaluation
Long‑term assessment uses two metrics: cohesion, measured by the Silhouette Coefficient, and stability, measured by the Coefficient of Variation. High cohesion indicates that users within a segment are similar, while a low coefficient of variation indicates that segment performance is stable over time.
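Both metrics are straightforward to compute. The sketch below assumes a single numeric feature per user with absolute difference as the distance, and a per-period series of one segment metric for the stability check. The function names and sample data are hypothetical.

```python
import statistics


def silhouette_score_1d(values, labels):
    """Mean silhouette coefficient for one numeric feature.

    For each user: a = mean distance to others in the same segment,
    b = lowest mean distance to any other segment, s = (b - a) / max(a, b).
    Users in single-member segments score 0.
    """
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two segments")

    scores = []
    for value, label in zip(values, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The sum includes the user itself (distance 0), so divide by n - 1.
        a = sum(abs(value - u) for u in own) / (len(own) - 1)
        b = min(
            sum(abs(value - u) for u in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def coefficient_of_variation(series):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(series) / statistics.mean(series)


good = silhouette_score_1d([1, 2, 10, 11], [0, 0, 1, 1])
bad = silhouette_score_1d([1, 2, 10, 11], [0, 1, 0, 1])
steady = coefficient_of_variation([100, 100, 100])
noisy = coefficient_of_variation([90, 110])
```

A silhouette value near 1 means the segments are well separated, while a value near or below 0 means users are closer to another segment than to their own.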
Q&A Highlights
The answers cover how to calculate cohesion for activity levels, how to choose segmentation thresholds, the computational cost of time decay, and practical usage scenarios for tag evaluation.
Overall, the content provides a comprehensive framework for constructing, processing, and evaluating user portrait tags to enhance product and strategy effectiveness.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.