How Hologres Shared Cluster Powers Fine‑Grained Taobao Subscription Operations
This article explains how Alibaba's Hologres shared cluster enables Taobao's subscription system to perform precise content selection, improve recommendation quality, reduce data movement, and achieve sub‑second query performance for large‑scale, real‑time business scenarios.
1 Taobao Subscription Fine‑Grained Content Operations
Taobao Subscription is a double‑private‑domain product for users and merchants that complements recommendation‑based "Guess You Like" on the user side and builds a "My Likes" mindset, while on the merchant side it enables structured, automated high‑quality supply to help merchants operate fan memberships more effectively.
The initial workflow covered content publishing from merchant back‑ends, algorithmic distribution, front‑end consumption, and data feedback. To achieve finer‑grained operation and improve recommendation experience, a content‑feature selection system was built.
Feature Requirements for Recommendation
High‑quality content selection: Multi‑dimensional feature filtering for content distribution on the subscription front‑end.
Low‑quality content filtering: Features filter out pornographic, political, or meaningless content.
Feature Requirements for Content Operations
Core content display: Operators select a batch of core content for front‑end aggregation using the selection system.
Promotion atmosphere strengthening: Selected promotional content receives enhanced display during major sales.
Merchant traffic tilt: Core partner merchants' content is identified and given traffic priority on the front‑end.
2 Subscription Content Feature Selection Engine Selection
Content selection involves filtering massive multi‑dimensional metrics, requiring a precise and extensible design.
2.1 Current Architecture Design
The selection process is abstracted as content‑id + related‑id + multi‑dimensional filter, producing a set of target content IDs. It creates an activity instance containing a batch of content, configures filter schemas, and supplies filter values.
Thus the problem becomes translating filter schemas and values into executable query statements.
2.2 Engine Selection Core Requirements: Flexibility & High Performance
The selection engine must be easy to integrate, translate filter items into executable SQL, guarantee performance and stability for complex queries, and support flexible addition of new feature fields.
Simple integration to reduce translation complexity.
Performance and stability for fast response to evolving operational strategies.
Flexibility to accommodate changing feature dimensions.
After extensive research and comparison, the following table summarizes the evaluation of MaxCompute versus Hologres shared clusters.
Evaluation
MaxCompute
Hologres Shared Cluster
Flexibility
General – multi‑table joins require specifying table space.
High – can aggregate multi‑table joins within a single space.
Cost
Low
Medium – no data import/export needed; SSD cache acceleration.
Query Speed
Typical single query >15 s; stage‑file design with high fault tolerance.
Billions of rows, sub‑second query; in‑memory response.
3 Building the Subscription System with Hologres
3.1 Hologres Cluster: Less Data Movement + Faster Queries
Low usage cost: Quick instance creation, easy onboarding, and the ability to migrate to an independent cluster later.
Seamless development: SQL syntax aligns with standard SQL; one‑click table‑structure sync supports frequently changing schemas.
Reduced data movement: External tables read data from multiple MaxCompute projects, enabling cross‑project aggregation without import/export.
3.2 Efficiency Gains
Compared with MaxCompute, performance improves dramatically: billions‑level data with complex multi‑table JOINs execute in 8‑9 seconds; single‑table external queries take ~2 seconds; internal table queries ~60 ms.
UDF/expression push‑down reduces unnecessary data transfer and further boosts performance.
3.3 Best Practices for a Hologres‑Based Subscription System
The workflow is illustrated below:
Operators select filter items and values on the back‑end; the system automatically generates Hologres SQL (example below), executes it, returns results to the front‑end for display, and iteratively refines the selection based on performance.
SELECT feed_id
FROM qn_xxx_provider AS a
WHERE a.xxx_pv > 30000
AND a.xxx_pctr > '0.1'
AND a.last_publish_time >= '2022-06-17 08:00:00'
AND a.biz_xxx_code = '111'
AND a.ds = MAX_PT('xxxxxx_table')
AND CAST(a.owner_xxx_id AS VARCHAR) IN (
SELECT b.domain_xxx_id
FROM xxxxxxx_table AS b
WHERE b.rule_type = 12
AND b.channel_xxx_id = 137
AND b.dataset_xx_id = xxxxx
AND b.ds = MAX_PT('xxxxx_odps_channel')
)
AND a.feed_id IN (
SELECT feed_id
FROM xxxxx_submission_feed_hh
WHERE activity_id = 222
AND approval_status = 1
AND ds = MAX_PT('xxxxx_submission_hh')
AND hh = '13'
);4 Business Value
Using the Hologres shared cluster, the Taobao subscription system has supported over 1,000 operational selection tasks, powered major promotions such as Double 11 and 618, enabled subscription play scenarios, simplified configuration of multiple secondary pages, and eliminated data import/export, allowing teams to focus on growth.
Future plans include incorporating real‑time features via Hologres internal tables and reducing reliance on GUC parameter tuning to improve productization.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
