How OAX Revolutionizes Open Analysis in Kuaishou’s Data Platform
This article introduces OAX (Open Analysis eXpressions), Kuaishou’s unified open-analysis language. It covers the design background, the guiding principles, the five-layer language model, the syntax (data types, compute capabilities and the five analysis elements), the access protocol, the runtime architecture and its optimization steps, and the benefits OAX brings to the company’s big-data analytics ecosystem.
OAX Overview
OAX (Open Analysis eXpressions) is Kuaishou’s open analysis language that builds on the metric middle‑platform, providing a unified query engine for BI, topic and custom products.
Design Background
Kuaishou previously had siloed analysis products, each with its own query service and language, leading to high maintenance cost, lack of standards and low integration efficiency. The solution is a unified, open architecture.
Design Principles
Standardization: Define standards for modeling, syntax and access protocols.
Unification: Use a single query service and a single language for all analysis products.
Openness: Allow external products to access the service, and support extensible operators.
OAX Language Model
The model consists of five layers: data source, data table, data model, dataset, and data application.
Data source layer – manages raw sources.
Data table layer – imports physical tables, accelerates them, and materializes heavy views.
Data model layer – defines relationships (relational or automated modeling).
Dataset layer – abstracts models and adds computed columns.
Data application layer – external systems submit OAX queries via a unified access protocol.
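The layering above can be pictured as a chain of references, each layer pointing at the one below it. The following is a minimal sketch, not OAX's actual metadata schema: every class name, field name and sample value is hypothetical and chosen only to mirror the layer descriptions.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Data source layer: a raw source (hypothetical fields)."""
    name: str
    engine: str


@dataclass
class DataTable:
    """Data table layer: a physical table imported from a source."""
    name: str
    source: DataSource
    materialized: bool = False  # True if a heavy view was materialized


@dataclass
class DataModel:
    """Data model layer: tables plus the relationships between them."""
    name: str
    tables: list[DataTable]
    # Each join is (left table, right table, join key); illustrative only.
    joins: list[tuple[str, str, str]] = field(default_factory=list)


@dataclass
class Dataset:
    """Dataset layer: an abstraction over a model with computed columns."""
    name: str
    model: DataModel
    computed_columns: dict[str, str] = field(default_factory=dict)

    def physical_tables(self) -> list[str]:
        """Names of the physical tables this dataset unfolds into."""
        return [t.name for t in self.model.tables]


source = DataSource(name="warehouse", engine="clickhouse")
facts = DataTable(name="city_gdp", source=source)
dims = DataTable(name="city_dim", source=source, materialized=True)
model = DataModel("gdp_model", [facts, dims], [("city_gdp", "city_dim", "city_id")])
dataset = Dataset("gdp_dataset", model, {"gdp_per_capita": "gdp / population"})
tables = dataset.physical_tables()
```

The data application layer does not appear as a class here because, per the description above, it is the external caller that submits queries against a dataset.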
OAX Syntax Design
Data Types
Four primitive types: text, numeric, date, boolean.
Compute Capabilities
Basic compute – numeric, string, date, type, logical, aggregate, metric, advanced functions.
Dynamic granularity – EXCLUDE, INCLUDE, FIXED to change grain during calculation.
Table compute – window, running, offset functions.
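To make two of these capabilities concrete, here is a plain-Python sketch of a FIXED-style calculation and a running table computation. The source does not spell out OAX's exact semantics, so this assumes FIXED behaves like level-of-detail expressions in other BI tools: the aggregate is computed at the listed dimensions regardless of the grain of the surrounding query. The function names and the rows (toy values, not real figures) are hypothetical.

```python
from collections import defaultdict
from itertools import accumulate

# Toy rows at (city, year) grain; the numbers are placeholders.
rows = [
    {"province": "Hubei", "city": "Wuhan", "year": 2020, "gdp": 10.0},
    {"province": "Hubei", "city": "Wuhan", "year": 2021, "gdp": 12.0},
    {"province": "Hubei", "city": "Yichang", "year": 2020, "gdp": 4.0},
    {"province": "Hubei", "city": "Yichang", "year": 2021, "gdp": 6.0},
]


def fixed_sum(rows, dims, measure):
    """Sum `measure` at the grain given by `dims`, ignoring the query grain."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)


def running_sum(values):
    """Table compute: cumulative sum over an ordered list of values."""
    return list(accumulate(values))


by_city = fixed_sum(rows, ["city"], "gdp")          # coarser than (city, year)
by_province = fixed_sum(rows, ["province"], "gdp")  # coarser still
wuhan_running = running_sum([r["gdp"] for r in rows if r["city"] == "Wuhan"])
```

The point of the sketch is that the same rows can be aggregated at a grain different from the one the query displays, which is what the dynamic-granularity keywords control.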
Five Analysis Elements
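Every analysis request is described by five elements: dimension, metric, dataset, time range and filter condition, with an optional aggregation. Example: the GDP of each city in Hubei for 2020-2021, along with each city’s proportion. Here the city is the dimension, GDP and its proportion are the metrics, 2020-2021 is the time range, and "province is Hubei" is the filter condition.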
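The Hubei GDP example can be written down as a small request object. This is only an illustrative sketch: the class, its field names, the dataset name `city_gdp` and the expression strings are hypothetical and are not actual OAX syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisQuery:
    """The five analysis elements as one request (hypothetical shape)."""
    dataset: str
    dimensions: tuple[str, ...]
    metrics: tuple[str, ...]
    time_range: tuple[int, int]      # (start year, end year), inclusive
    filters: tuple[str, ...] = ()

    def validate(self) -> None:
        """Reject requests that cannot describe a meaningful analysis."""
        if not self.metrics:
            raise ValueError("at least one metric is required")
        start, end = self.time_range
        if start > end:
            raise ValueError("time range start must not be after its end")


query = AnalysisQuery(
    dataset="city_gdp",
    dimensions=("city",),
    metrics=("gdp", "proportion(gdp)"),
    time_range=(2020, 2021),
    filters=("province = 'Hubei'",),
)
query.validate()  # raises ValueError if the request is malformed
```

Keeping the request immutable (`frozen=True`) is a design choice of this sketch: later stages can rewrite the plan derived from it without mutating what the caller submitted.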
Access Protocol
Requests are expressed in OAX syntax, results are returned as a DataFrame wrapped in Apache Arrow format, and calls can be made via RPC, ADBC (Arrow Database Connectivity, a JDBC-like interface) or HTTP.
OAX Runtime Design
Initial Logic Construction
Transforms the five analysis elements into a logical plan containing high‑level operators and dataset references.
Query Orchestration
Expands high‑level operators (e.g., proportion) into sub‑queries and unfolds datasets into physical tables using model information.
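As a rough illustration of operator expansion, the sketch below rewrites a proportion operator into two aggregate sub-queries whose results are divided. The source only states that proportion is expanded into sub-queries; the specific decomposition (grouped sum over grand total), the class names and the generated SQL shape are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    """Low-level operator: SUM of a metric at a given grouping."""
    metric: str
    group_by: tuple[str, ...]


@dataclass(frozen=True)
class Proportion:
    """High-level operator: each group's share of the overall total."""
    metric: str
    group_by: tuple[str, ...]


@dataclass(frozen=True)
class Divide:
    """Result of expansion: numerator sub-query over denominator sub-query."""
    numerator: Aggregate
    denominator: Aggregate


def expand(op):
    """Expand high-level operators; leave everything else unchanged."""
    if isinstance(op, Proportion):
        return Divide(
            numerator=Aggregate(op.metric, op.group_by),
            denominator=Aggregate(op.metric, ()),  # no grouping: grand total
        )
    return op


def to_sql(agg: Aggregate, table: str) -> str:
    """Render one aggregate sub-query against a physical table."""
    cols = ", ".join(agg.group_by)
    select = f"{cols}, " if cols else ""
    group = f" GROUP BY {cols}" if cols else ""
    return f"SELECT {select}SUM({agg.metric}) AS {agg.metric} FROM {table}{group}"


plan = expand(Proportion("gdp", ("city",)))
numerator_sql = to_sql(plan.numerator, "city_gdp")
denominator_sql = to_sql(plan.denominator, "city_gdp")
```

In this sketch the table name is passed in directly; in the flow described above it would come from unfolding the dataset through its model.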
Query Optimization
Rule‑based and cost‑based optimizations (predicate push‑down, join ordering).
Computation routing – decide whether to push the whole plan to the engine or execute part in memory.
Engine‑specific plan rewrites (e.g., ClickHouse local query).
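Predicate push-down, one of the rule-based optimizations listed above, can be sketched on a toy plan tree. This is a generic illustration of the rule, not OAX's optimizer: the node classes are hypothetical, and the rule assumes the predicate only references columns that exist below the projection.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scan:
    """Leaf node: read a table, applying any pushed-down predicates."""
    table: str
    predicates: tuple[str, ...] = ()


@dataclass(frozen=True)
class Project:
    """Keep only the listed columns of the child's output."""
    child: object
    columns: tuple[str, ...]


@dataclass(frozen=True)
class Filter:
    """Keep only the child's rows that satisfy the predicate."""
    child: object
    predicate: str


def push_down(plan):
    """Move filters as close to the scans as this simple rule allows."""
    if isinstance(plan, Filter):
        child = push_down(plan.child)
        if isinstance(child, Scan):
            # Fold the predicate into the scan itself.
            return replace(child, predicates=child.predicates + (plan.predicate,))
        if isinstance(child, Project):
            # Assumed safe: the predicate uses columns available below.
            return Project(push_down(Filter(child.child, plan.predicate)), child.columns)
        return Filter(child, plan.predicate)
    if isinstance(plan, Project):
        return Project(push_down(plan.child), plan.columns)
    return plan


before = Filter(Project(Scan("city_gdp"), ("city", "gdp")), "city = 'Wuhan'")
after = push_down(before)
```

After the rewrite the filter is evaluated inside the scan, so fewer rows flow through the rest of the plan.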
Pipeline Execution
The optimized physical plan is submitted to the engine for execution; the results may be loaded into memory and further processed with DataFusion before being returned to the user.
Summary & Benefits
OAX unifies Kuaishou’s previously siloed analysis services, improves development and usage efficiency, and establishes an open ecosystem that supports diverse BI needs.
Kuaishou Big Data