Unlocking Massive-Scale User Behavior Analysis: From Funnels to Intelligent Links
This talk explores how to analyze user behavior on massive data sets, compares existing analytics tools, and presents Alibaba DataWorks' end-to-end solution: funnel and link visualizations, a big-data processing architecture, and planned intelligent link capabilities for finding and resolving user-experience issues efficiently.
What Is User Behavior Analysis?
User behavior analysis is a process of discovering problems, locating them, and validating solutions by examining user actions, such as funnel drop‑offs, to understand why users leave and where they go.
Industry Status
Common tools include:
Google Analytics – provides conversion funnels and Sankey path diagrams.
GrowingIO – offers conversion funnels and simple path graphs.
Sensors Data – also supplies conversion funnels and Sankey path visualizations.
These tools can identify where users drop off but lack deep analysis of the underlying reasons.
Alibaba DataWorks Solution
Our platform adds two crucial views:
A funnel chart that shows user reach at each step.
A bar chart that displays operation latency and frequency, highlighting issues not visible in conversion metrics alone.
We also provide a link graph that visualizes user paths and supports drill-down analysis to pinpoint where users are lost.
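To make the funnel figures concrete, here is a minimal Python sketch of how per-step reach and step-to-step conversion could be computed. The function name `funnel_reach`, the input shape (a mapping from user ID to the set of steps that user reached), and the sample step names are assumptions made for illustration, not part of the platform described in the talk.

```python
def funnel_reach(user_steps, funnel):
    """Return (reach, conversion) for an ordered funnel.

    user_steps: dict mapping user_id -> set of steps the user reached
                (hypothetical input shape).
    funnel:     ordered list of step names.
    """
    reach = []
    for i in range(len(funnel)):
        # A user counts at step i only if they also reached every earlier step.
        required = set(funnel[: i + 1])
        reach.append(sum(1 for steps in user_steps.values() if required <= steps))

    # Conversion from each step to the next; 0.0 when nobody reached the previous step.
    conversion = [
        reach[i] / reach[i - 1] if reach[i - 1] else 0.0
        for i in range(1, len(reach))
    ]
    return reach, conversion


users = {
    "u1": {"view", "cart", "pay"},
    "u2": {"view", "cart"},
    "u3": {"view"},
    "u4": {"view", "cart"},
}
reach, conversion = funnel_reach(users, ["view", "cart", "pay"])
print(reach)  # [4, 3, 1]
```

Numbers like these show where users drop off. The latency and frequency bars are meant to cover what they cannot show: steps that convert well but are slow or repeated many times.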
Technical Deep‑Dive
Massive Data Analysis – Simple scenarios with a few dozen records can be handled with loops in Node.js or Java, but real-world user actions generate billions of events daily. To compute conversion rates, step lengths, durations, variances, and other metrics efficiently, we rely on a big-data stack (Hadoop/MaxCompute) that supports both real-time and batch processing. Our architecture consists of a data collection layer, a persistence layer for preprocessing and storage, and an application layer that runs analytics using open-source engines or Alibaba Cloud MaxCompute, producing reports that are stored in MySQL/RDS for downstream consumption.
Link Matching – We match defined primary paths (e.g., a-b-c) with actual user paths using a UDTF that transforms multi-row data into single rows, filters unrelated nodes, and assigns numeric identifiers to enable precise matching and metric calculation (step length, duration).
User Behavior Tree – By merging nodes at the same depth across multiple user sessions, we build a tree structure that aggregates counts, enabling deeper insight into common navigation patterns.
Visualization Techniques – Our link graph is a complex hierarchical-compound graph. We address challenges such as cycle removal (by reversing edges), long-edge cutting (introducing virtual nodes), and layer-wise node ordering using heuristic algorithms to minimize edge crossings and improve readability.
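The link-matching step can be illustrated on a small in-memory data set. The sketch below is plain Python rather than a MaxCompute UDTF, and it rests on assumptions: events arrive as `(user_id, timestamp, node)` rows, a primary path is matched as an ordered subsequence of the user's actions, step length is the number of actual steps between the first and last matched node, and duration is the time between them. The name `match_primary_path` is hypothetical.

```python
from collections import defaultdict


def match_primary_path(events, primary_path):
    """Match each user's actions against an ordered primary path.

    events: iterable of (user_id, timestamp, node) rows (assumed shape).
    """
    # Assign numeric identifiers to the primary-path nodes: a=1, b=2, c=3, ...
    node_id = {node: i + 1 for i, node in enumerate(primary_path)}

    # Collapse the multi-row event log into one ordered sequence per user.
    by_user = defaultdict(list)
    for user_id, ts, node in events:
        by_user[user_id].append((ts, node))

    results = {}
    for user_id, rows in by_user.items():
        rows.sort()  # order by timestamp
        expected = 1
        first_ts = last_ts = first_idx = last_idx = None
        for idx, (ts, node) in enumerate(rows):
            # Skip unrelated nodes and primary nodes that appear out of order.
            if node_id.get(node) != expected:
                continue
            if expected == 1:
                first_ts, first_idx = ts, idx
            last_ts, last_idx = ts, idx
            expected += 1
            if expected > len(primary_path):
                break
        reached = expected - 1
        results[user_id] = {
            "reached": reached,
            "completed": reached == len(primary_path),
            "step_length": (last_idx - first_idx) if reached else 0,
            "duration": (last_ts - first_ts) if reached else 0,
        }
    return results


events = [
    ("u1", 0, "a"), ("u1", 5, "x"), ("u1", 9, "b"), ("u1", 12, "c"),
    ("u2", 1, "a"), ("u2", 4, "c"),
]
print(match_primary_path(events, ["a", "b", "c"])["u1"])
# {'reached': 3, 'completed': True, 'step_length': 3, 'duration': 12}
```

Here user u1 completes a-b-c in three actual steps because of the detour through x, while u2 only reaches the first node. A detour like this is invisible in a conversion rate but shows up in step length.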
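The user behavior tree can be sketched as a prefix tree with counts. This is one possible reading of "merging nodes at the same depth": two visits are merged when they share both the same depth and the same preceding path. That reading, the dictionary layout, and the name `build_behavior_tree` are assumptions made for illustration.

```python
def build_behavior_tree(sessions):
    """Merge ordered session paths into a counted prefix tree.

    sessions: iterable of lists of node names, one list per user session.
    Each tree node is a dict: {"count": int, "children": {name: node}}.
    """
    root = {"count": 0, "children": {}}
    for path in sessions:
        root["count"] += 1  # every session passes through the root
        node = root
        for name in path:
            # Reuse the child at this depth if the prefix matches, else create it.
            child = node["children"].setdefault(name, {"count": 0, "children": {}})
            child["count"] += 1
            node = child
    return root


tree = build_behavior_tree([
    ["home", "search", "item"],
    ["home", "search", "cart"],
    ["home", "item"],
])
home = tree["children"]["home"]
print(home["count"], home["children"]["search"]["count"])  # 3 2
```

Reading the counts from the root downward gives the share of sessions that follow each branch, which is the aggregation the tree is meant to provide.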
Future Outlook
We aim to evolve toward intelligent link analysis, allowing analysts to specify start/end points and automatically surface the most common paths, assess alignment with expected flows, and prioritize high‑traffic routes for UI/UX improvements.
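As a rough illustration of what surfacing the most common paths between a chosen start and end point might involve, the following Python sketch counts the sub-paths observed between the two nodes in each session. It is a simplified assumption of how such a feature could work, not a description of the planned implementation. The function `common_paths` and its parameters are hypothetical.

```python
from collections import Counter


def common_paths(sessions, start, end, top_n=3):
    """Return the top_n most frequent sub-paths that run from start to end.

    sessions: iterable of lists of node names, one list per user session.
    Only the first occurrence of start, and the first end after it, are used.
    """
    counts = Counter()
    for path in sessions:
        if start not in path:
            continue
        i = path.index(start)
        try:
            j = path.index(end, i + 1)
        except ValueError:
            continue  # the session never reached the end node
        counts[tuple(path[i : j + 1])] += 1
    return counts.most_common(top_n)


sessions = [
    ["home", "search", "item", "pay"],
    ["home", "search", "item", "pay"],
    ["home", "item", "pay"],
    ["home", "search"],
]
for path, n in common_paths(sessions, "home", "pay"):
    print(n, " > ".join(path))
# 2 home > search > item > pay
# 1 home > item > pay
```

Comparing the top-ranked paths with the flow the product team expected is one way to judge whether users actually follow the intended route.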