
Data Intelligence in the Used‑Car Business: User Traffic Prediction and Identification (Part 1)

This article details how 58 Group applied data‑driven methods (user segmentation, interest description, clustering, and predictive modeling) to forecast and identify traffic in the used‑car scenario. It covers the end‑to‑end pipeline, the experimental results, and the practical impact on downstream business processes.

58 Tech

Background – Applying AI effectively to a scenario requires a deep understanding of the business. The used‑car platform splits user behavior into four stages (traffic acquisition, retrieval, click, call); this part focuses on the first stage, traffic acquisition, for traffic prediction.

Traffic Acquisition – User Segmentation – Five user identities are defined: personal, dealer, platform‑like, crawler, and abnormal. Statistical analysis and clustering (K‑Means, GMM, DBSCAN) filter out anomalies, and a probabilistic model (LR, XGBoost, FM) then estimates the probability of each identity. These probabilities feed downstream strategies such as phone‑number binding and resource allocation.
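
As a rough illustration of the probabilistic step, the sketch below fits a binary logistic regression (the LR option named above) that scores how dealer‑like a visitor is. It is a simplification under stated assumptions: the article describes five identities, whereas this example separates only two, and the feature names, the training rows, and the helper names `fit_logistic` and `predict_proba` are hypothetical rather than taken from the 58 system.

```python
from math import exp


def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit binary logistic regression with plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient of the log loss.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that feature vector x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical features, already scaled to [0, 1]:
# (share of sessions that post listings, share of sessions that place calls)
X = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.3), (1.0, 0.2),   # labelled dealer
     (0.1, 0.6), (0.2, 0.8), (0.0, 0.7), (0.1, 0.9)]   # labelled personal
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = dealer, 0 = personal

w, b = fit_logistic(X, y)
p_dealer_like = predict_proba(w, b, (0.85, 0.2))
p_personal_like = predict_proba(w, b, (0.1, 0.7))
print(f"dealer-like visitor:   {p_dealer_like:.3f}")
print(f"personal-like visitor: {p_personal_like:.3f}")
```

A downstream strategy would consume the probability itself rather than a hard label, which lets each consumer (for example phone‑number binding) choose its own threshold.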

Interest Description – Instead of building full user profiles, the system derives interest tags (e.g., car series + price range) from behavior logs and computes an item‑based collaborative‑filtering (ItemCF) similarity matrix over those tags, scored with the Jaccard coefficient. The top‑N tags are stored in Redis and used as recall conditions for both offline and real‑time recommendation.
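
The tag‑to‑tag similarity can be sketched as follows. For two tags A and B, the Jaccard similarity is |U_A ∩ U_B| / |U_A ∪ U_B|, where U_X is the set of users who showed interest in tag X (the Jaccard distance is one minus this value). The tag format, the sample users, and the function name `jaccard_top_n` are assumptions for illustration, and a plain dictionary stands in for the Redis store.

```python
from collections import defaultdict
from itertools import combinations


def jaccard_top_n(user_tags, top_n=2):
    """Return, for each tag, its top_n most similar tags by Jaccard similarity."""
    # Invert user -> tags into tag -> set of users who touched that tag.
    tag_users = defaultdict(set)
    for user, tags in user_tags.items():
        for tag in tags:
            tag_users[tag].add(user)

    sims = defaultdict(dict)
    for a, b in combinations(sorted(tag_users), 2):
        shared = len(tag_users[a] & tag_users[b])
        if shared == 0:
            continue  # tags with no common users are not stored
        union = len(tag_users[a] | tag_users[b])
        sims[a][b] = sims[b][a] = shared / union

    # Sort by similarity (descending), break ties by tag name for determinism.
    return {
        tag: sorted(neigh.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        for tag, neigh in sims.items()
    }


# Hypothetical interest tags in the form "series|price band".
user_tags = {
    "u1": {"civic|50k-80k", "corolla|50k-80k"},
    "u2": {"civic|50k-80k", "corolla|50k-80k", "accord|80k-120k"},
    "u3": {"civic|50k-80k", "accord|80k-120k"},
    "u4": {"corolla|50k-80k"},
}

top = jaccard_top_n(user_tags)
for tag, neighbours in sorted(top.items()):
    print(tag, neighbours)
```

In this toy data, "civic|50k-80k" and "accord|80k-120k" share two of three users, so they score 2/3, ahead of the civic and corolla pair at 2/4. At serving time, a user's recent tags would be expanded through such a lookup to form recall conditions.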

Data Pipeline – The workflow has two phases: (1) statistical analysis and rule definition on cleaned historical logs, organized into DS → DW → DM → DA layers, and (2) feature selection, clustering, and supervised learning. PCA is applied for dimensionality reduction, and the models reach AUCs of about 0.68.
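
To make the AUC figure concrete: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. An AUC of about 0.68 therefore means the model ranks the positive first in roughly 68% of such pairs. The sketch below computes this directly from pairs; the labels and scores are made‑up values, not results from the article.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = positive class) and model scores for eight visitors.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

value = auc(labels, scores)
print(f"AUC = {value:.4f}")
```

This pairwise form is quadratic in the number of examples, which is fine for a sanity check; on full logs a rank‑based implementation from a standard library would be the practical choice.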

Model‑Driven Interventions – The call‑prediction (CVR) model guides number‑recycling decisions, improving number‑resource usage by about 4% and reducing cross‑number rates. Recall‑rate and precision experiments show a 4.9% to 6% uplift in CTR when the model is applied.

Future Directions – Plans include expanding feature combinations, enriching tag semantics (vehicle attributes, dealer credit), adding multi‑matrix collaborative recall, and continuously evaluating similarity‑matrix performance.

Conclusion – Accurate traffic estimation and interest description are foundational for downstream information recall, and the presented pipeline demonstrates a practical AI solution for the used‑car business.

Tags: machine learning, user segmentation, recommendation, Data Intelligence, traffic prediction, used car
Written by 58 Tech, the official tech channel of 58, a platform for tech innovation, sharing, and communication.