Databases 16 min read

How ByteHouse Boosted Sales Data Platform Queries Up to 16× with ACL and Optimizer

This article examines a fast‑growing company's sales data platform, outlines the data‑access pain points caused by ACL permissions, describes the migration from ClickHouse to ByteHouse, details the optimizer’s rule‑based, cost‑based, and distributed‑plan enhancements, and presents benchmark results showing query speedups of up to sixteen times.

ByteDance Data Platform
ByteDance Data Platform
ByteDance Data Platform
How ByteHouse Boosted Sales Data Platform Queries Up to 16× with ACL and Optimizer

Business Background

A rapidly expanding company unified its scattered sales data into a visual analytics platform to ensure data consistency, quality, and controlled access.

Pain Points

Sales teams faced inconsistent data visibility across the platform.

Data product managers and engineers incurred high maintenance costs for row‑level permissions.

Compliance required strict data‑access controls to limit exposure of customer information.

ACL‑Based Access Control

The team introduced an ACL (Access Control List) model to centralize permission management, allowing fine‑grained control over who can view or modify specific sales data.

ByteHouse Overview

ByteHouse, an OLAP‑oriented analytical database built on ClickHouse, offers architectural upgrades and optimizations for complex queries. After migrating to ByteHouse, the company achieved up to 16× efficiency gains in certain scenarios.

Query Examples Before and After ACL

Before ACL, a typical query looked like:

<code>SELECT (concat(one_year,'-',case when substring(one_quarter,5)='01' then 'Q1' when substring(one_quarter,5)='02' then 'Q2' when substring(one_quarter,5)='03' then 'Q3' when substring(one_quarter,5)='04' then 'Q4' else null end)) AS _1700010720846,
       sum(CAST(daily_value_usd/100000 AS Nullable(Float64))) AS _sum_1700010720933
FROM aeolus_data_db.aeolus_data_table
WHERE p_date >= '2024-02-20' AND p_date <= '2024-02-20' AND arraySetCheck(emp_ids,(1527))
GROUP BY concat(one_year,'-',case when substring(one_quarter,5)='01' then 'Q1' when substring(one_quarter,5)='02' then 'Q2' when substring(one_quarter,5)='03' then 'Q3' when substring(one_quarter,5)='04' then 'Q4' else null end)
LIMIT 10000;</code>

After introducing ACL, the query adds a sub‑query to filter rows based on permission tables:

<code>SELECT (concat(one_year,'-',case when substring(one_quarter,5)='01' then 'Q1' when substring(one_quarter,5)='02' then 'Q2' when substring(one_quarter,5)='03' then 'Q3' when substring(one_quarter,5)='04' then 'Q4' else null end)) AS _1700010720846,
       sum(CAST(daily_value_usd/100000 AS Nullable(Float64))) AS _sum_1700010720933
FROM (
    SELECT * FROM aeolus_data_db.aeolus_data_table
    WHERE oid IN (
        SELECT oid FROM db.auths
        WHERE (arraySetCheck(emp_ids,(1527))
               OR arraySetCheck(dept_ids,(123456,234567,345678,456789))
               OR arraySetCheck(sales_owner_dept_ids,(13579,246810))
               OR arraySetCheck(queue_ids,(6969968376940593921))
               OR arraySetCheck(auth_code_ids,(654321,54321,86421)))
          AND is_active = 0)
) AS filtered
WHERE p_date >= '2024-02-20' AND p_date <= '2024-02-20'
GROUP BY concat(one_year,'-',case when substring(one_quarter,5)='01' then 'Q1' when substring(one_quarter,5)='02' then 'Q2' when substring(one_quarter,5)='03' then 'Q3' when substring(one_quarter,5)='04' then 'Q4' else null end)
LIMIT 10000;</code>

Optimizer and MPP Execution Model

ByteHouse’s optimizer rewrites queries using rule‑based (RBO) and cost‑based (CBO) techniques, supports global joins, and transforms distributed plans to reduce data shuffling. The execution flow includes PlanSegment, Optimizer, Scheduler, Exchange, and PipelineExecutor components.

Four Major Optimizations

RBO : column pruning, partition pruning, predicate push‑down, sub‑query de‑correlation, join reordering, and operator push‑down.

CBO : cost‑based join enumeration, histogram‑driven cost estimation, and support for complex join/aggregate reorderings.

Distributed Plan Optimization : integrates single‑node plan generation with distributed execution, enabling global joins and efficient data placement.

Advanced Optimizations : dynamic filter push‑down, materialized‑view rewrite, CTE sharing, plan reuse, and result reuse.

Performance Results

Benchmark tests show significant improvements:

Non‑ACL queries: average latency reduced from 5.9 s to 2.3 s.

ACL‑protected queries for 60 M advertising customers: latency dropped from 53 s to 7 s (≈ 7.5×) and from 16 s to 2 s (≈ 8×).

Overall query latency improvements up to 16× in complex scenarios.

Conclusion

By addressing ACL‑induced performance bottlenecks with ByteHouse’s advanced optimizer and MPP execution model, the company achieved substantial query speedups, demonstrating ByteHouse’s suitability for demanding OLAP workloads.

performanceQuery OptimizationOLAPACLByteHouse
ByteDance Data Platform
Written by

ByteDance Data Platform

The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.