Databases 21 min read

Optimizing Hologres Data Tables and Queries for Alibaba Advertising Inventory Management

By redesigning Hologres tables with column orientation, shard‑controlled Table Groups, distribution and clustering keys, adding bitmap indexes, refreshing statistics, caching external data, and tuning optimizer join order and resource scaling, Alibaba’s Mom advertising inventory system cut query latency by up to 35 % and memory use by 98 %, achieving a 5‑10× performance boost.

Alimama Tech
Alimama Tech
Alimama Tech
Optimizing Hologres Data Tables and Queries for Alibaba Advertising Inventory Management

This article describes the performance optimization of the Alibaba Mom advertising inventory management system, which relies on the Hologres real‑time data warehouse. The goal was to improve the calculation speed of inventory queries during large‑scale promotional events (e.g., Double‑11), achieving a 5‑10× boost in query performance.

System Overview The system uses Hologres as the underlying compute engine, executing most logic via SQL. It handles two main stages: data‑table construction and data‑query processing, covering table creation, indexing, distribution, and query execution.

1. Data‑Table Construction Optimization

Hologres performance is tightly linked to table design. The article introduces the concept of Table Groups (TG) to control shard count and parallelism. Example TG creation statements:

-- 创建 64个Shard的TG
CALL HG_CREATE_TABLE_GROUP('inquiry_system_texiu_tg', 64);
-- 创建 128个Shard的TG
CALL HG_CREATE_TABLE_GROUP('inquiry_system_youku_tg', 128);

Key table properties include:

orientation: column (column‑store for analytical queries)

distribution_key: fields used for data sharding (e.g., session_id and target_date )

clustering_key: sorted columns for range filters (e.g., target_date , priority )

bitmap_columns: bitmap indexes for low‑cardinality columns (e.g., adzone_id , target_date , priority )

Final CREATE TABLE statements (simplified):

--库存预估表(简化)
BEGIN;
CREATE TABLE public.predict_base_table (
    session_id text NOT NULL,
    adzone_id bigint NOT NULL,
    pv double precision,
    target_date bigint NOT NULL
);
CALL set_table_property('public.predict_base_table','bitmap_columns','adzone_id,target_date');
CALL set_table_property('public.predict_base_table','clustering_key','target_date:asc');
CALL set_table_property('public.predict_base_table','table_group','inquiry_system_texiu_tg');
CALL set_table_property('public.predict_base_table','distribution_key','session_id,target_date');
CALL set_table_property('public.predict_base_table','orientation','column');
COMMIT;

--库存占量表(简化)
BEGIN;
CREATE TABLE public.locked_pv_table (
    session_id text NOT NULL,
    target_date bigint NOT NULL,
    business_id bigint NOT NULL,
    product_priority bigint NOT NULL
);
CALL set_table_property('public.locked_pv_table','bitmap_columns','target_date,product_priority');
CALL set_table_property('public.locked_pv_table','clustering_key','target_date:asc,product_priority:asc');
CALL set_table_property('public.locked_pv_table','table_group','inquiry_system_texiu_tg');
CALL set_table_property('public.locked_pv_table','distribution_key','session_id,target_date');
CALL set_table_property('public.locked_pv_table','orientation','column');
COMMIT;

2. Query Optimization

The article explains the three‑phase SQL execution in Hologres (Optimization → Start Query → Get Result) and shows how to analyze execution plans using HoloWeb or EXPLAIN / EXPLAIN ANALYZE . Example query:

SELECT  adzone_id,
        target_date,
        sum(pv) AS all_pv_num,
        sum(CASE WHEN lock_pv_session IS NOT NULL THEN pv ELSE 0 END) AS locked_pv_num
FROM (
    SELECT  base_pv.adzone_id,
            base_pv.target_date,
            pv * pv_ratio AS pv,
            ecpm2 AS cost,
            locked_pv.session_id AS lock_pv_session,
            locked_pv.member_id AS member_id
    FROM    predict_base_table base_pv
    LEFT JOIN locked_pv_table locked_pv
      ON  base_pv.session_id = locked_pv.session_id
      AND base_pv.target_date = locked_pv.target_date
    WHERE   base_pv.adzone_id IN (123123)
      AND   base_pv.target_date IN (20211227,20211229,20211228,20211230,20211231)
) aa
GROUP BY target_date;

Performance analysis focuses on three cost components:

Optimization Cost – time to generate the execution plan.

Start Query Cost – time waiting for resources.

Get Result Cost – actual execution time.

Key optimization techniques include:

Running ANALYZE on large tables to refresh statistics.

Caching external MaxCompute tables inside Hologres to avoid costly external‑table accesses.

Adjusting the optimizer join order algorithm (e.g., setting optimizer_join_order=exhaustive at database level).

Modifying TG shard count or scaling the cluster to alleviate resource contention.

Reordering joins, merging small dimension tables, and removing heavy UNNEST / ARRAY operations.

Applying pre‑filtering, sparse‑data compression, and pre‑computations to reduce memory blow‑up.

--数据库级别修改算法的参数,生效后数据库默认算法被调整
ALTER DATABASE db_name set optimizer_join_order=exhaustive;

--链接级别修改算法的参数,当前jdbc链接断开,参数失效
set optimizer_join_order=exhaustive;

After applying these measures, the system achieved significant reductions in latency and memory usage across multiple business scenarios (e.g., Brand Special Show, Brand Youku, Brand Search). Average request time dropped from 40 s to 25 s (‑35 %), and memory consumption for a single parallel job fell from 700 GB to 14 GB (‑98 %).

3. Results and Conclusion

The optimization was a holistic effort covering table design, index selection, execution‑plan tuning, resource scaling, and business‑logic adjustments. The article concludes that continuous performance tuning, combined with deep business understanding, is essential for large‑scale advertising inventory systems.

advertisingdata warehouseHologresSQL OptimizationDatabase Performance
Alimama Tech
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.