How Youzan’s BI Platform Turns Massive Data into Interactive Visual Insights
This article explains the design, features, and technical implementation of Youzan’s BI platform, covering its target users, visualization workflow, supported chart types, filtering, permissions, drill‑down, calculated fields, SQL generation logic, and future development directions.
Overview
Youzan is a SaaS and big‑data company that needed an efficient way to extract value from massive datasets and provide visual analysis and reporting. The Youzan BI platform was created to meet this need, offering fast, convenient data analysis and visualization for various business scenarios. It currently serves over 700 UV, 30 k+ PV, and hosts more than 5 k reports.
Target Users and Application Scenarios
Report developers – typically BI analysts and data engineers.
Report consumers – operations, service, product teams and anyone needing visual data insights.
Other internal applications – products that require data display or analysis.
Typical usage scenarios are illustrated in the accompanying diagram.
Data Visualization
Terminology
Dimension : the perspective for observing data, usually categorical (e.g., date, store). It can be a row dimension (similar to Excel row headers) or a column dimension (multiple row headers above numeric columns).
Metric : the numeric value to be aggregated, such as sales amount or profit.
Quick Start
Step 1 – Add Dataset : Add dataset → select connection account → custom SQL → preview data → submit. The dataset is a logical SQL that serves as the source table for subsequent charts.
Step 2 – Add Report : Create chart → choose the previously created dataset → confirm to enter the report editor.
Step 3 – Add Permissions : Go to the dashboard → click the gear icon → add permissions, or manage permissions via the resource manager. This ensures only authorized users can operate on the data.
Tips: mark field date types in the upper‑right gear to enable time‑grain operations; add calculated fields for secondary calculations; use the report copy function to duplicate similar reports.
Chart Types
The platform supports more than ten chart types covering most daily visualization needs, including KPI cards, line charts, dual‑axis charts, area and stacked‑area charts, bar and stacked‑bar charts, pie charts, funnel charts, Sankey diagrams, radar charts, and word clouds. Each type is suited to specific scenarios, as shown in the example images.
Filtering and Sorting
Users can filter fields or metrics using operators such as >, >=, <, <=, =, <>, IS NULL, BETWEEN, LIKE, IN, etc., and can filter by time dimensions (year, month, quarter, week) after marking a field as a date type. Sorting can be applied to rows or metrics in ascending or descending order. Pagination is handled differently per database (e.g., LIMIT for MySQL, ROW_NUMBER() for Presto).
Calculated Fields
Users may write custom SQL snippets to generate virtual fields. Non‑aggregate expressions are treated as regular fields, while aggregate expressions are used directly without an extra aggregation step. When a calculated field coexists with a column dimension, the platform wraps the aggregate with a try() function and adds conditional IF clauses to handle each column‑dimension value.
Drill‑Down and Linking
Drill‑down allows users to view finer‑granularity data (e.g., from province‑level to city‑level) by applying the same filter logic across linked reports. Linking ensures that when a filter changes in one report, related reports update automatically.
Row/Column Permissions
Column permissions hide specific fields from certain users or groups, while row permissions restrict visible rows based on conditions. Two modes are supported: condition mode (predefined expressions) and free mode, where users write SQL conditions (e.g., [dep] IN {db.table.businessId.value}).
Chart Push and External Integration
The platform can schedule report pushes so users receive data snapshots without logging in. External products can embed BI dashboards, reports, or components via the provided SDK and pass parameters for flexible display.
Implementation Principles
SQL Generation Process
The front‑end drag‑and‑drop actions are translated into SQL components: row dimensions become GROUP BY, metrics become aggregation functions, filters become WHERE, sorting becomes ORDER BY, and pagination is handled per database. Example generated SQL:
SELECT *
FROM (
SELECT *, row_number() OVER () AS bi_rownum
FROM (
SELECT order_type AS bi_rowdim0,
COUNT(buyer_id) AS bi_metric0,
SUM(gmv) AS bi_metric1
FROM dm_zbk.da_self_analysis_demo
WHERE the_date >= '2020-01-01' AND the_date <= '2020-12-31'
GROUP BY order_type
ORDER BY bi_metric1 ASC
)
)
WHERE bi_rownum > 0 AND bi_rownum <= 10;The diagram below visualizes this flow.
Column Dimension Handling
Three approaches exist: (1) treat column dimensions as additional GROUP BY fields, (2) use database‑specific pivot functions, (3) generate conditional aggregation with IF statements. The platform adopts the third method, producing SQL such as:
SELECT SUM(IF(order_type='订单类型1', gmv, 0)) AS bi_metric0,
SUM(IF(order_type='订单类型2', gmv, 0)) AS bi_metric1,
SUM(IF(order_type='订单类型3', gmv, 0)) AS bi_metric2,
SUM(IF(order_type='订单类型4', gmv, 0)) AS bi_metric3
FROM dm_zbk.da_self_analysis_demo;When many column values exist, distinct values are fetched first and pagination can be applied.
Calculated Field Processing
Aggregate calculated fields are wrapped with try() to avoid runtime errors (e.g., division by zero). The platform uses ANTLR to parse the user‑provided SQL, inserts try() around aggregate functions, and adds conditional IF clauses based on column‑dimension values.
SELECT try(SUM(IF(order_type='订单类型3', gmv, 0))) AS bi_metric0,
try(SUM(IF(order_type='订单类型4', gmv, 0))) AS bi_metric1,
try(SUM(IF(order_type='订单类型2', gmv, 0))) AS bi_metric2,
try(SUM(IF(order_type='订单类型1', gmv, 0))) AS bi_metric3
FROM dm_zbk.da_self_analysis_demo;Conclusion and Outlook
The Youzan BI platform consolidates the data‑visualization capabilities of the Youzan data middle‑platform, offering efficient internal data analysis. Future plans include supporting richer chart types, adding advanced features such as subtotals, period‑over‑period comparisons, alerts, large‑screen dashboards, intelligent analysis, and further reducing user effort while improving interaction experience.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
