
User Portrait Tagging: Construction, Feature Processing, and Evaluation

This article explains how to build user portrait tags, from basic attribute tags to business and strategy tags. It covers data collection, anomaly handling, time decay, and smoothing, and shows how to evaluate tag quality with cohesion, stability, and AUC-related metrics in support of data-driven product decisions.

DataFunTalk

Introduction

As enterprises deepen their digital transformation, understanding users becomes critical. This talk dissects user portrait tags: what they are, how they are constructed, and what role they play across different business scenarios. It emphasizes accurate labeling through anomaly handling, time decay, and data distinguishability.

1. Basic Attribute Portrait Tags

These tags describe inherent user properties (e.g., gender, age, OS, city). They are sourced from user input, event tracking, model prediction, or third-party data. They serve two purposes: daily analysis of attribute distributions, and input features for more complex models such as ranking or behavior prediction.

2. Business-Oriented Portrait Tags

These tags are tightly linked to KPI goals and are classified as strongly or weakly associated with the KPI. They are constructed either by KPI-driven segmentation or by composite calculations over user behavior. The resulting user groups can be analyzed separately for operational monitoring and differentiated strategies.
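As a minimal sketch of KPI-driven segmentation, the snippet below cuts a single KPI into equal-frequency bins and assigns each user a tag. The KPI (`orders_30d`), the three labels, and the choice of quantile cut points are illustrative assumptions, not details from the talk.

```python
from statistics import quantiles

def kpi_segments(kpi_by_user, labels=("low", "medium", "high")):
    """Tag each user by cutting the KPI distribution into equal-frequency bins.

    kpi_by_user: dict mapping a user id to a numeric KPI value (hypothetical input).
    """
    # quantiles(n=k) returns k-1 cut points that split the data into k groups
    cuts = quantiles(kpi_by_user.values(), n=len(labels), method="inclusive")
    tags = {}
    for user, value in kpi_by_user.items():
        # The number of cut points the value exceeds is its bin index
        idx = sum(value > c for c in cuts)
        tags[user] = labels[idx]
    return tags

# Hypothetical KPI: number of orders per user in the last 30 days
orders_30d = {"u1": 0, "u2": 1, "u3": 2, "u4": 5, "u5": 9, "u6": 30}
tags = kpi_segments(orders_30d)
print(tags)
```

In practice the thresholds are often set from business rules rather than quantiles; the bucketing logic stays the same once the cut points are fixed.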

3. Strategy-Oriented Portrait Tags

These tags are designed for specific interventions (e.g., red-packet incentives). They help identify cohorts with high expected gain, likely repurchase, or a predicted future behavior, and are often evaluated with uplift models and ROI predictions.
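To make the idea of gain concrete, here is a hedged sketch that estimates uplift per segment as the difference in conversion rate between treated and control users. It assumes treatment was randomly assigned and is a simple difference in means, not a trained uplift model; the segment names and records are made up for illustration.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Estimate uplift per segment as treated rate minus control rate.

    records: iterable of (segment, treated, converted) tuples, where
    treated and converted are booleans. Assumes random assignment.
    """
    # counts[segment][treated] = [conversions, users]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, converted in records:
        cell = counts[segment][treated]
        cell[0] += int(converted)
        cell[1] += 1
    uplift = {}
    for segment, groups in counts.items():
        (t_conv, t_n), (c_conv, c_n) = groups[True], groups[False]
        if t_n == 0 or c_n == 0:
            continue  # both groups are needed to estimate a difference
        uplift[segment] = t_conv / t_n - c_conv / c_n
    return uplift

# Hypothetical experiment log: (segment, received red packet, purchased)
records = (
    [("price_sensitive", True, True)] * 3
    + [("price_sensitive", True, False)] * 1
    + [("price_sensitive", False, True)] * 1
    + [("price_sensitive", False, False)] * 3
    + [("loyal", True, True)] * 3
    + [("loyal", True, False)] * 1
    + [("loyal", False, True)] * 3
    + [("loyal", False, False)] * 1
)
result = uplift_by_segment(records)
print(result)
```

A segment with uplift near zero converts at the same rate with or without the incentive, so spending the incentive there yields little return.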

Feature Processing for Portrait Tags

Feature processing has three key steps. Data cleaning detects outliers with box plots and AVF (Attribute Value Frequency), then replaces outliers and empty values with caps, floors, or statistical extremes. Time decay weighting, based on RFM, gives recent actions more influence than older ones. Smoothing applies a log transformation to mitigate long-tail effects.
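The three steps can be sketched in a few lines of standard-library Python. The 1.5 × IQR whisker rule, the exponential half-life form of the decay, and the use of `log1p` are common choices assumed here for illustration; the talk does not specify exact formulas, and all function names are hypothetical.

```python
import math
from statistics import quantiles

def iqr_cap(values, k=1.5):
    """Cap values at the box-plot whiskers [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

def decayed_count(event_ages_days, half_life_days=30.0):
    """Count events, weighting each by 0.5 ** (age / half_life).

    An event from today counts as 1; one from a half-life ago counts as 0.5.
    """
    return sum(0.5 ** (age / half_life_days) for age in event_ages_days)

def log_smooth(value):
    """log(1 + x) compresses the long tail and maps 0 to 0."""
    return math.log1p(value)

# Hypothetical daily spend with one extreme value
spend = [10, 12, 11, 13, 12, 500]
capped = iqr_cap(spend)

# Hypothetical purchases made 0, 30, and 60 days ago
recency_weighted = decayed_count([0, 30, 60])

smoothed = [log_smooth(v) for v in capped]
print(capped, recency_weighted, smoothed)
```

Capping before the log transform keeps a single extreme value from dominating; the half-life parameter controls how quickly old behavior stops contributing to the tag.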

Portrait Result Evaluation

Long-term assessment uses two metrics: cohesion, measured with the Silhouette Coefficient, and stability, measured with the Coefficient of Variation. High cohesion indicates that users within a segment are similar to each other and distinct from other segments, while a low coefficient of variation signals that a segment's performance is stable over time.
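Both metrics are simple enough to compute by hand. The sketch below implements the mean Silhouette Coefficient for a one-dimensional feature with absolute-difference distance, and the Coefficient of Variation as population standard deviation over mean. The sample values, segment labels, and weekly rates are invented for illustration, and the one-dimensional distance is an assumption; real portraits usually have many features.

```python
from statistics import mean, pstdev

def silhouette_score_1d(values, labels):
    """Mean silhouette coefficient for 1-D values (needs at least 2 segments).

    For each point: a = mean distance to its own segment (excluding itself),
    b = smallest mean distance to any other segment, s = (b - a) / max(a, b).
    """
    clusters = {}
    for v, label in zip(values, labels):
        clusters.setdefault(label, []).append(v)
    scores = []
    for v, label in zip(values, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton segments
            continue
        a = sum(abs(v - o) for o in own) / (len(own) - 1)
        b = min(
            mean([abs(v - o) for o in other])
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)

def coefficient_of_variation(series):
    """Population standard deviation divided by the mean."""
    return pstdev(series) / mean(series)

# Hypothetical activity scores for two segments
activity = [1, 2, 3, 10, 11, 12]
segment = ["low", "low", "low", "high", "high", "high"]
cohesion = silhouette_score_1d(activity, segment)

# Hypothetical weekly conversion rate of one segment
weekly_rate = [0.10, 0.11, 0.09, 0.10]
cv = coefficient_of_variation(weekly_rate)
print(cohesion, cv)
```

A silhouette value close to 1 means the segments are well separated, and a small coefficient of variation means the segment's metric changes little from week to week.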

Q&A Highlights

The answers cover how to calculate cohesion for activity levels, how to choose segmentation thresholds, the computational cost of time decay, and practical usage scenarios for tag evaluation.

Overall, the content provides a comprehensive framework for constructing, processing, and evaluating user portrait tags to enhance product and strategy effectiveness.

Tags: Feature Engineering, user profiling, evaluation metrics, data science, tag construction
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
