How StarRocks Supercharges Real‑Time Ad Funnel Monitoring and Creative Optimization
This article dissects the full advertising funnel, explains why CTR and eCPM are critical, and demonstrates how StarRocks combined with Flink can deliver minute‑level real‑time monitoring, material selection, anomaly alerts, A/B testing, and a successful migration from Druid for massive ad‑tech workloads.
1. Advertising Funnel Overview
The ad delivery process consists of request → identification → recall → bidding → win feedback → impression → click → data collection, forming a closed loop. The article emphasizes that the display‑to‑click stage, especially CTR, directly impacts eCPM and overall campaign success.
eCPM Formula
eCPM = bid * CTR * CVR * 1000High CTR reduces CPC and boosts eCPM, making it the most decisive metric for scaling.
2. Business Solution Analysis
The solution targets real‑time monitoring and creative selection in stage three, feeding insights back to earlier stages to improve recall and bidding.
Key Pain Points
Cold‑start difficulty – new accounts or creatives lack data, leading to low exposure.
Creative fatigue – CTR drops over time, reducing eCPM.
Empty runs – high impressions with few clicks waste budget.
3. Scenario 1: Real‑Time Metrics with StarRocks
Architecture
Flink consumes Kafka ad logs and writes them to a StarRocks detail table. Asynchronous materialized views aggregate metrics per minute.
Fact Table Definition
CREATE TABLE fact_ad_events (
event_time DATETIME NOT NULL,
ad_id BIGINT,
creative_id BIGINT,
campaign_id BIGINT,
creative_version STRING,
channel VARCHAR(64),
region VARCHAR(32),
user_tag VARCHAR(128),
event_type VARCHAR(20),
cost DECIMAL(18,6),
user_id BIGINT
) DUPLICATE KEY(event_time, ad_id, creative_id)
PARTITION BY RANGE(event_time) ()
DISTRIBUTED BY HASH(ad_id) BUCKETS 64
PROPERTIES ("replication_num" = "3");Materialized View (per‑minute aggregation)
CREATE MATERIALIZED VIEW agg_ad_realtime
PARTITION BY date_trunc('minute', event_time)
DISTRIBUTED BY HASH(creative_id)
REFRESH ASYNC EVERY (INTERVAL 1 MINUTE) AS
SELECT
date_trunc('minute', event_time) AS event_minute,
ad_id,
creative_id,
creative_version,
channel,
region,
user_tag,
COUNT_IF(event_type = 'show') AS impression_count,
COUNT_IF(event_type = 'click') AS click_count,
COUNT_IF(event_type = 'convert') AS convert_count,
SUM(cost) AS cost
FROM fact_ad_events
GROUP BY date_trunc('minute', event_time), ad_id, creative_id, creative_version, channel, region, user_tag;Use Cases
Creative selection : Rank new creatives with CTR ≥ 4% and sufficient impressions, then boost budget for top‑5.
CTR trend query (last 60 minutes):
SELECT event_minute, creative_id, creative_version, channel, region, user_tag,
impression_count, click_count,
ROUND(click_count * 100.0 / NULLIF(impression_count,0),2) AS ctr_percent,
ROUND(convert_count * 100.0 / NULLIF(click_count,0),2) AS cvr_percent
FROM agg_ad_realtime
WHERE event_minute >= NOW() - INTERVAL 60 MINUTE
ORDER BY event_minute DESC, ctr_percent DESC;Anomaly detection : Identify CTR drops >20% using a moving average window.
WITH ctr_trend AS (
SELECT creative_id,
ROUND(click_count*100.0/NULLIF(impression_count,0),2) AS ctr_percent,
AVG(ROUND(click_count*100.0/NULLIF(impression_count,0),2))
OVER (PARTITION BY creative_id ORDER BY event_minute ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) AS moving_avg
FROM agg_ad_realtime
WHERE event_minute >= NOW() - INTERVAL 30 MINUTE
)
SELECT * FROM ctr_trend
WHERE (moving_avg - ctr_percent) / moving_avg > 0.20
ORDER BY (moving_avg - ctr_percent) DESC;Empty‑run warning : Flag creatives with impressions >10 000 but clicks < 50 in the last 10 minutes.
SELECT creative_id, creative_version, impression_count, click_count,
ROUND(click_count*100.0/NULLIF(impression_count,0),2) AS ctr_percent
FROM agg_ad_realtime
WHERE impression_count > 10000 AND click_count < 50
AND event_minute >= NOW() - INTERVAL 10 MINUTE;4. Scenario 2: Creative A/B Testing with StarRocks
Architecture
Same Flink + StarRocks pipeline, but the materialized view aggregates by creative_version to support multi‑dimensional comparison.
CREATE MATERIALIZED VIEW agg_ad_abtest
PARTITION BY event_minute
DISTRIBUTED BY HASH(creative_id, creative_version)
REFRESH ASYNC EVERY (INTERVAL 1 MINUTE) AS
SELECT date_trunc('minute', event_time) AS event_minute,
campaign_id, creative_id, creative_version, channel, region, user_tag,
COUNT_IF(event_type='show') AS impression_count,
COUNT_IF(event_type='click') AS click_count,
COUNT_IF(event_type='convert') AS convert_count
FROM fact_ad_events
GROUP BY date_trunc('minute', event_time), campaign_id, creative_id, creative_version, channel, region, user_tag;Multi‑dimensional comparison
SELECT creative_version, channel, region, user_tag,
SUM(impression_count) AS impressions,
SUM(click_count) AS clicks,
ROUND(SUM(click_count)*100.0/NULLIF(SUM(impression_count),0),2) AS ctr_percent,
ROUND(SUM(convert_count)*100.0/NULLIF(SUM(click_count),0),2) AS cvr_percent
FROM agg_ad_abtest
WHERE event_minute >= NOW() - INTERVAL 24 HOUR
GROUP BY creative_version, channel, region, user_tag
ORDER BY ctr_percent DESC;Statistical significance (Chi‑square)
WITH stats AS (
SELECT creative_version,
SUM(impression_count) AS imp,
SUM(click_count) AS clk
FROM agg_ad_abtest
WHERE campaign_id = 1234
AND creative_version IN ('V1_old','V1_new','V2_old','V2_new')
AND event_minute >= NOW() - INTERVAL 12 HOUR
GROUP BY creative_version
)
SELECT a.creative_version AS winner, b.creative_version AS loser,
ROUND(a.clk*100.0/a.imp,2) AS ctr_a,
ROUND(b.clk*100.0/b.imp,2) AS ctr_b,
POWER(a.clk - a.imp*p,2)/(a.imp*p) + POWER(b.clk - b.imp*p,2)/(b.imp*p) AS chi_square
FROM stats a CROSS JOIN stats b
CROSS JOIN (SELECT SUM(clk)*1.0/SUM(imp) AS p FROM stats) total
WHERE a.creative_version <> b.creative_version
AND chi_square > 0.003 AND ctr_a > ctr_b;Dayparting analysis
SELECT HOUR(event_time) AS hour_of_day, creative_version,
COUNT(CASE WHEN event_type='show' THEN 1 END) AS impressions,
COUNT(CASE WHEN event_type='click' THEN 1 END) AS clicks,
ROUND(clicks*100.0/NULLIF(impressions,0),2) AS ctr_percent
FROM fact_ad_events
WHERE campaign_id = 1234
AND event_time >= DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY hour_of_day, creative_version
ORDER BY hour_of_day, creative_version;5. Case Study: Pinterest Report Reconstruction
Pinterest migrated its Partner Insights analytics from Druid to StarRocks. The migration replaced JSON‑based DSL with standard SQL, introduced asynchronous materialized views, and leveraged StarRocks’ vectorized engine.
Benefits
Performance: P90 query latency reduced by ~50%.
Cost: Instance count cut by 68% (only 32% of original).
Freshness: Data latency improved from minutes to ~10 seconds.
Operational simplicity: Standard SQL eliminated complex JSON configs, speeding up onboarding.
6. Conclusion and Outlook
The article concludes that real‑time, append‑only analytics are essential for ad‑tech. By adopting StarRocks, teams can turn raw impression and click streams into actionable, minute‑level decisions, dramatically lowering cold‑start costs, boosting operational efficiency, and enabling rapid creative optimization. The next article will address conversion attribution and real‑time ROI calculation.
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
