User Behavior Analysis: Objectives, Implementation Methods, and Product Line Applications
The article explains the goals of user behavior analysis, details two implementation methods for single-step and multi-step path conversions using log preprocessing, adjacency matrices, and KMP pattern matching, and illustrates these techniques with a Baidu Doctor product line case study.
User behavior analysis is a method to obtain group characteristics of users during product usage, such as main browsing paths and average page dwell times, helping product teams understand overall traffic flow, identify high‑dropout pages, and set up continuous monitoring of key conversion metrics.
The implementation is divided into two scenarios: (1) path length equal to 1, where conversion is measured between adjacent pages, and (2) path length greater than 1, where conversion across multiple pages is analyzed.
1. Path length = 1
Data preprocessing
(a) Use the product line's access logs without adding extra tracking code. Each raw log line is transformed into three columns:
timestamp\tbehavior_cluster_id (e.g., UserID)\taccess_info (e.g., URL, user action)
(b) Provide a page‑node configuration file with two columns per line (URL and page name) to facilitate result interpretation:
access_info (URL)\tnode_info (page name)
Modeling
(a) After preprocessing, group logs by user ID and sort each group by timestamp.
(b) Map each user's ordered access sequence to an adjacency matrix representing page nodes.
(c) For each directed edge in the matrix, store the transition relationship and dwell time, thus preserving all users' navigation sequences.
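The preprocessing and modeling steps above can be sketched in Python as follows. This is a minimal illustration, not the article's implementation: the function name build_transition_model, the integer-second timestamps, the url_to_node dictionary (standing in for the page‑node configuration file, with pages numbered from 1), and the use of node 0 as a virtual entry/exit node are all assumptions made for the example.

```python
from collections import defaultdict

def build_transition_model(log_lines, url_to_node):
    """Group log lines by user, sort by time, and accumulate edge statistics.

    log_lines:   iterable of "timestamp\\tuser_id\\turl" strings (assumed format)
    url_to_node: dict mapping URL -> node ID in 1..n (hypothetical config)
    """
    by_user = defaultdict(list)
    for line in log_lines:
        ts, user_id, url = line.rstrip("\n").split("\t")
        if url in url_to_node:               # skip URLs missing from the node config
            by_user[user_id].append((int(ts), url_to_node[url]))

    size = max(url_to_node.values()) + 1     # n pages plus node 0 (entry/exit)
    counts = [[0] * size for _ in range(size)]
    dwell = [[0] * size for _ in range(size)]  # total seconds spent on the source page

    for visits in by_user.values():
        visits.sort()                        # order each user's visits by timestamp
        times = [ts for ts, _ in visits]
        path = [0] + [node for _, node in visits] + [0]
        for i in range(len(path) - 1):
            src, dst = path[i], path[i + 1]
            counts[src][dst] += 1
            # dwell time is only known between two real page views
            if 1 <= i < len(visits):
                dwell[src][dst] += times[i] - times[i - 1]
    return counts, dwell
```

Dividing dwell[p][q] by counts[p][q] gives the average time spent on page p before moving to page q. Dwell time on the last page of a visit is left at zero here because the log contains no later timestamp to measure it against.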
Model statistics
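Assume page A maps to node p, page B maps to node q, and there are n pages. The adjacency matrix then has order n+1; the extra node, which appears as node 0 at the start and end of each sequence in the demo below, stands for entering or leaving the product. Writing c(p, q) for the number of recorded transitions from node p to node q, the page‑to‑page transition rate and the exit rate follow directly from the matrix:
Transition rate from page A to page B: c(p, q) divided by the sum of c(p, j) over all nodes j
Exit rate from page A: c(p, 0) divided by the sum of c(p, j) over all nodes j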
Demo example
Navigation sequences of users A, B, and C after pages are mapped to node IDs:
User A: 0 2 3 1 0
User B: 0 3 1 0
User C: 0 2 3 2 0
The modeling process and resulting statistics are shown below:
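As a hedged illustration, the statistics for the three demo sequences can be recomputed with a few lines of Python. The helper names count_matrix, transition_rate, and exit_rate are hypothetical, and node 0 is read as the entry/exit node, which is an interpretation of the demo rather than something the article states explicitly.

```python
def count_matrix(sequences, n_nodes):
    """Count transitions between consecutive nodes over all sequences."""
    counts = [[0] * n_nodes for _ in range(n_nodes)]
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return counts

def transition_rate(counts, p, q):
    """Share of all transitions leaving node p that go to node q."""
    outgoing = sum(counts[p])
    return counts[p][q] / outgoing if outgoing else 0.0

def exit_rate(counts, p):
    """Share of transitions leaving node p that go to node 0 (assumed exit node)."""
    return transition_rate(counts, p, 0)

demo = {
    "A": [0, 2, 3, 1, 0],
    "B": [0, 3, 1, 0],
    "C": [0, 2, 3, 2, 0],
}
counts = count_matrix(demo.values(), 4)
for row in counts:
    print(row)
# [0, 0, 2, 1]
# [2, 0, 0, 0]
# [1, 0, 0, 2]
# [0, 2, 1, 0]
```

Under this reading, node 2 is left three times in total: twice toward node 3 and once toward the exit, so transition_rate(counts, 2, 3) is 2/3 and exit_rate(counts, 2) is 1/3. Every transition out of node 1 goes to the exit, giving it an exit rate of 1.0.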
2. Path length > 1
For longer paths, the analysis generates multiple pattern strings (e.g., page1→page2 and page1→page2→page3) and applies the KMP algorithm to detect these patterns in each user's ordered access sequence.
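A minimal sketch of KMP matching over node‑ID sequences is given below. The function names failure_table and kmp_count are hypothetical, and counting overlapping occurrences is an assumption: the article does not say whether repeated hits within one user's sequence are counted once or several times.

```python
def failure_table(pattern):
    """For each prefix of pattern, length of its longest proper border."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def kmp_count(sequence, pattern):
    """Count occurrences of pattern in sequence (overlaps allowed)."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = hits = 0
    for item in sequence:
        while k and item != pattern[k]:
            k = fail[k - 1]          # fall back without re-reading the sequence
        if item == pattern[k]:
            k += 1
        if k == len(pattern):
            hits += 1
            k = fail[k - 1]          # keep going to find overlapping matches
    return hits
```

For example, kmp_count([0, 2, 3, 1, 0], [2, 3]) returns 1 and kmp_count([0, 3, 1, 0], [2, 3]) returns 0. Because each sequence element is examined once, the cost per pattern grows linearly with the sequence length, which matters when the same patterns are run over every user's log.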
The count of each pattern represents the traffic for that level of pages, and the aggregated results are visualized as a funnel‑shaped conversion rate diagram:
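The numbers behind such a funnel could be derived roughly as follows. This sketch makes several assumptions that are not in the article: the function names contains and funnel are hypothetical, each level counts distinct users rather than total occurrences, the top level is the number of users who reached the first page at all, and a plain slice comparison stands in for KMP to keep the example short.

```python
def contains(sequence, pattern):
    """True if pattern occurs as a contiguous run in sequence (naive check, not KMP)."""
    m = len(pattern)
    return any(sequence[i:i + m] == pattern
               for i in range(len(sequence) - m + 1))

def funnel(sequences, path):
    """Return (prefix, users, step_rate) for each prefix of path.

    step_rate is the share of users from the previous level who also
    completed this level; the first level is fixed at 1.0.
    """
    rows = []
    previous = None
    for depth in range(1, len(path) + 1):
        prefix = path[:depth]
        users = sum(1 for seq in sequences if contains(seq, prefix))
        if previous is None:
            rate = 1.0
        else:
            rate = users / previous if previous else 0.0
        rows.append((prefix, users, rate))
        previous = users
    return rows

sequences = [[0, 2, 3, 1, 0], [0, 3, 1, 0], [0, 2, 3, 2, 0]]
for prefix, users, rate in funnel(sequences, [2, 3, 1]):
    print(prefix, users, rate)
# [2] 2 1.0
# [2, 3] 2 1.0
# [2, 3, 1] 1 0.5
```

With the three demo sequences, two users reach page 2, both continue to page 3, and one of them goes on to page 1, so the last step converts at 50%.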
3. Product line application example
In the Baidu Doctor product line, this methodology is used both to obtain an overall view of online traffic conversion and to evaluate small‑scale A/B tests. Comparing conversion rates and key‑page exit rates between the test and control groups helps decide whether a feature can be rolled out to full traffic.
Baidu Intelligent Testing