How Materialized Views Supercharge Alibaba Cloud Log Service Queries
When log volumes grow from gigabytes to petabytes, traditional on-the-fly querying in Alibaba Cloud Log Service becomes slow, resource-hungry, and at times only approximate. Materialized views pre-compute and store results, delivering responses within seconds at far lower resource consumption.
Alibaba Cloud Log Service faces severe performance bottlenecks as log data grows from GB to TB/PB scale: queries take dozens of seconds, concurrent users cause resource contention, and resource limits force approximate results.
Materialized View Solution
The service introduces materialized views that pre-compute the results of high-frequency queries and store them as snapshots. When a user issues a query, the engine reads the pre-computed data instead of scanning raw logs, so responses come back tens to hundreds of times faster and results are more stable and accurate.
Core Acceleration Scenarios
Filter Acceleration: Only logs matching certain criteria (e.g., error level) are materialized, allowing subsequent error‑only queries to run on a much smaller dataset.
Pre‑aggregation Acceleration: Aggregated statistics (e.g., hourly user visits) are computed periodically and stored, turning multi‑billion‑row scans into a few‑hundred‑row reads.
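The two scenarios above can be sketched in a few lines of Python. This is only an illustration of the idea, not the service's implementation: the in-memory log tuples, the field layout, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw logs: (unix_time_seconds, level, user)
RAW_LOGS = [
    (0, "info", "alice"),
    (10, "error", "bob"),
    (3600, "info", "alice"),
    (3700, "error", "carol"),
    (3800, "info", "bob"),
]

def materialize_filter(logs, level):
    """Filter acceleration: keep only the rows matching the predicate."""
    return [row for row in logs if row[1] == level]

def materialize_hourly_visits(logs):
    """Pre-aggregation: one row per hour instead of one row per log."""
    visits = defaultdict(int)
    for ts, _level, _user in logs:
        visits[ts - ts % 3600] += 1  # truncate the timestamp to the hour
    return dict(visits)

error_view = materialize_filter(RAW_LOGS, "error")
hourly_view = materialize_hourly_visits(RAW_LOGS)
```

Later error-only queries read `error_view` (two rows here) and hourly dashboards read `hourly_view` (one row per hour), so neither has to touch `RAW_LOGS` again.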
Key Advantages
Asynchronous Refresh: Incremental updates run in the background without affecting write performance.
Automatic Data Merge: New data is merged with existing materialized results transparently.
Complex Aggregation Support: Functions such as count(distinct), approx_percentile, and approx_distinct are supported.
Dynamic View Updates: Changing the view definition does not require rebuilding existing materialized data.
Transparent Rewrite: The optimizer automatically rewrites queries to read materialized results without user intervention, including queries that add predicates such as latency > 100.
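One way to picture asynchronous refresh and automatic merge is a view that keeps mergeable state and folds in only the rows it has not seen yet. The sketch below is an assumption-laden illustration: the class name, the timestamp watermark, and the (sum, count) state for avg(latency) are hypothetical, and it assumes logs arrive in time order. The article does not describe how the service tracks refresh progress internally.

```python
class HourlyLatencyView:
    """Hypothetical incrementally refreshed view of avg(latency) per hour."""

    def __init__(self):
        self.state = {}       # hour -> [latency_sum, row_count]
        self.watermark = -1   # highest timestamp already materialized

    def refresh(self, logs):
        """Fold in only rows newer than the watermark; return how many were processed."""
        new_rows = [(ts, lat) for ts, lat in logs if ts > self.watermark]
        for ts, latency in new_rows:
            bucket = self.state.setdefault(ts - ts % 3600, [0.0, 0])
            bucket[0] += latency
            bucket[1] += 1
        if new_rows:
            self.watermark = max(ts for ts, _ in new_rows)
        return len(new_rows)

    def averages(self):
        """Derive avg(latency) per hour from the stored sums and counts."""
        return {hour: s / n for hour, (s, n) in self.state.items()}

view = HourlyLatencyView()
logs = [(0, 100.0), (60, 300.0)]   # (unix_time_seconds, latency_ms)
first = view.refresh(logs)         # materializes both rows
logs.append((3600, 50.0))
second = view.refresh(logs)        # processes only the newly arrived row
```

Storing the sum and count rather than the average itself is what makes the merge possible: two averages cannot be combined correctly without their row counts.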
Optimizer Behavior
If a matching materialized view exists, the cost‑based optimizer selects the optimal one.
For non‑aggregated queries, it performs a lightweight UNION of raw and materialized data.
For aggregated queries, it merges real‑time aggregates with pre‑computed results.
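The two merge strategies can be sketched as follows. This is a simplified model under stated assumptions, not the optimizer's actual logic: the view is assumed to cover all rows up to a timestamp watermark, and the function names, tuple layouts, and sample values are hypothetical.

```python
def query_avg_latency(view_state, view_watermark, raw_logs):
    """Aggregated case: merge pre-computed (sum, count) per host with the uncovered tail."""
    merged = {host: list(sc) for host, sc in view_state.items()}
    for ts, host, latency in raw_logs:
        if ts > view_watermark:  # only rows the view has not materialized yet
            bucket = merged.setdefault(host, [0.0, 0])
            bucket[0] += latency
            bucket[1] += 1
    return {host: s / n for host, (s, n) in merged.items()}

def query_error_rows(view_rows, view_watermark, raw_logs):
    """Non-aggregated case: union of materialized rows and the freshly filtered tail."""
    tail = [r for r in raw_logs if r[0] > view_watermark and r[1] == "error"]
    return view_rows + tail

# The view covers timestamps <= 100; later rows exist only in the raw logs.
latency_view = {"h1": (300.0, 2), "h2": (50.0, 1)}
latency_logs = [(50, "h1", 100.0), (90, "h1", 200.0), (95, "h2", 50.0),
                (150, "h1", 600.0), (160, "h3", 10.0)]
avg_by_host = query_avg_latency(latency_view, 100, latency_logs)

error_view_rows = [(10, "error")]
level_logs = [(10, "error"), (20, "info"), (150, "error"), (170, "info")]
error_rows = query_error_rows(error_view_rows, 100, level_logs)
```

In both cases the raw scan is limited to the short tail after the watermark, which is why the query stays fast while still reflecting the newest data.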
SQL Examples
level:error | select latency, host from log where message like '%xxx%'
level:error and latency > 100 | select avg(latency), host from log where message like '%xxx%' group by host
Creating a materialized view:
* | select avg(latency) as avg_latency, date_trunc('hour', __time__) as time from log group by time
* | select sum(InFlow) as in_flow, sum(OutFlow) as out_flow, avg(latency) as latency, ProjectId, RequestType, Status from log group by ProjectId, RequestType, Status
Performance Case Study
In a dashboard scenario where charts must load within seconds, enabling materialized views reduced query times from minutes to seconds. Specific benchmarks:
Hourly average latency comparison: materialized view queries returned within seconds; non‑materialized queries timed out.
Project‑level I/O and latency stats: materialized view returned in ~400 ms versus ~54 s without.
Latency by status and project, filtered to latency > 200 ms: materialized view returned in ~800 ms while the raw query timed out.
Across billions of rows, materialized view charts opened instantly, whereas raw SQL required >50 s or failed.
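As a quick check, the one benchmark above that reports both timings can be turned into a speedup factor. The variable names are illustrative, and the inputs are the approximate figures quoted in the list.

```python
# Project-level I/O and latency stats benchmark, in milliseconds (approximate)
raw_ms = 54_000   # ~54 s without a materialized view
view_ms = 400     # ~400 ms with a materialized view

speedup = raw_ms / view_ms  # how many times faster the view-backed query is
```

That works out to roughly 135x, which sits inside the "tens to hundreds of times" range claimed earlier. The other two benchmarks cannot be quantified this way because the raw queries timed out.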
Future Directions
Intelligent Recommendation: Auto‑detect high‑frequency patterns and suggest optimal materialized views.
Extended Scenarios: Support join‑based materialized views and data‑deletion use cases.
Rewrite Enhancements: Enable non‑exact expression matching for broader query rewrite coverage.