
How We Built a Low‑Latency Advertising Billing System with Kafka Streams

This article describes the design, implementation, and performance of ShouQianBa's advertising billing system, detailing the migration from Apache Druid to Kafka Streams, the architecture for real‑time event processing, data aggregation, persistence, fault tolerance, and the achieved low‑latency, high‑throughput metrics.

SQB Blog

Online advertising has evolved into a technology‑driven model targeting specific audiences and products. ShouQianBa, after establishing a mobile payment market, launched an advertising platform and needed a robust billing system to support traffic forecasting, shaping, budget control, billing settlement, and data monitoring.

Preface

The billing system is crucial for accurate traffic prediction, shaping, budget enforcement, invoice settlement, and anomaly detection.

Traffic forecasting based on historical data

Traffic shaping to improve overall ad revenue

Budget control to avoid overspending

Automated billing settlement

Data monitoring for regular patterns and anomalies

This article focuses on ShouQianBa's explorations and practices in advertising billing.

History

The previous generation used Apache Druid on ECS, which suffered high maintenance cost, frequent failures, and difficult upgrades.

Previous Druid architecture

A notable failure occurred when Druid’s Historical node lost its ZooKeeper session, causing massive timeouts.

With the rise of cloud computing, a PaaS solution proved cheaper and more flexible, leading to the replacement of Apache Druid with Kafka Streams.

Overall Architecture

Ads are fetched from an ad library, filtered, and delivered to users. Events such as delivery, impression, click, and conversion are emitted, containing ad details, pricing, user/device info, and transaction data. These events are written to Kafka, serving both billing calculations and downstream analytics.

Ad event flow

The billing system requires fast query response and low data update latency to influence real‑time traffic shaping and budget control.

Kafka Streams

The new system is built on Kafka Streams, a client library for processing and analyzing data stored in Kafka.

Kafka Streams Features

Lightweight library easily embedded in any Java application

No external dependencies beyond Kafka (v0.10+)

Offers low‑level Processor API and high‑level DSL

Leverages Kafka partitioning for horizontal scaling and ordering guarantees

Kafka Streams architecture

Data Pre‑aggregation

To improve query performance, minute‑level pre‑aggregation is performed. Events are consumed from the source Kafka topic, keyed by ad plan ID and minute timestamp, and written to a repartition topic so that all events with the same key are processed by the same task.
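As a rough sketch of the keying step (the helper name and exact key format here are assumptions; in the article's code this is delegated to TitanMetrics.generateMetricsKey()):

```java
// Build a repartition key from the ad plan ID plus the event's minute bucket,
// so all events for the same plan and minute land on the same task.
public class MetricsKeys {
    public static String minuteKey(String adPlanId, long eventTimestampMs) {
        // Truncate the event timestamp down to the start of its minute.
        long minuteStartMs = eventTimestampMs - (eventTimestampMs % 60_000L);
        return adPlanId + ":" + minuteStartMs;
    }
}
```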

Repartition and aggregation flow

Each task aggregates delivery, impression, click, and cost counts per key. Kafka Streams uses a time window (fixed 1‑minute) with a 4‑minute grace period to handle late‑arriving events.
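To make the window semantics concrete, here is a plain‑Java sketch (not the Kafka Streams API) of when a late event is still accepted: an event belongs to the 1‑minute window containing its timestamp, and it still updates that window as long as stream time has not advanced past the window end plus the 4‑minute grace period.

```java
// Illustrates TimeWindows.of(1 min).grace(4 min): a late record is still
// folded into its window until stream time passes windowEnd + grace.
public class WindowPolicy {
    static final long WINDOW_MS = 60_000L;
    static final long GRACE_MS = 4 * 60_000L;

    // Start of the 1-minute window this event falls into.
    public static long windowStart(long eventTimeMs) {
        return eventTimeMs - (eventTimeMs % WINDOW_MS);
    }

    // True if the event can still update its window given current stream time.
    public static boolean accepted(long eventTimeMs, long streamTimeMs) {
        long windowEnd = windowStart(eventTimeMs) + WINDOW_MS;
        return streamTimeMs < windowEnd + GRACE_MS;
    }
}
```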

final KStream<String, GenericRecord> stream = builder.stream(sourceTopic);

Serde<TitanMetrics> titanMetricsSerde = StreamsSerdes.titanMetricsSerde();

KTable<Windowed<String>, TitanMetrics> metricsKTable = stream
    // filter invalid events
    // filter out events missing the event-type ("事件类型") or
    // event-timestamp ("事件时间戳") fields
    .filter((k, v) -> {
        if (v.get("事件类型") == null || v.get("事件时间戳") == null) {
            log.warn("consume invalid titan metrics message, k: {}, v: {}", k, v);
            return false;
        }
        return true;
    })
    // extract fields: ad plan ID ("广告计划ID"), event type ("事件类型"),
    // price mode ("出价类型"), and bidding price ("出价价格")
    .flatMap((k, v) -> {
        List<KeyValue<String, TitanMetrics>> result = new LinkedList<>();
        String sliceId = String.valueOf(v.get("广告计划ID"));
        String event = v.get("事件类型").toString();
        Integer priceMode = (Integer) v.get("出价类型");
        Long biddingPrice = (Long) v.get("出价价格");
        TitanMetrics sm = TitanMetrics.transformSliceMetrics(sliceId, event, priceMode, biddingPrice);
        if (sm != null) {
            result.add(new KeyValue<>(sm.generateMetricsKey(), sm));
        }
        return result;
    })
    // group by key and write to repartition topic
    .groupByKey(Grouped.with(partOfRepartitionTopic, Serdes.String(), titanMetricsSerde))
    // fixed 1‑minute window with 4‑minute grace period
    .windowedBy(TimeWindows.of(Duration.ofSeconds(60L)).grace(Duration.ofMinutes(4L)))
    .reduce(TitanMetrics::sum,
        Materialized.<String, TitanMetrics, WindowStore<Bytes, byte[]>>as(partOfReduceChangelogTopic));

Data Persistence

Kafka Streams’ suppress operator holds back intermediate window results until the window closes. The suppressed stream therefore carries only the final minute‑level aggregates, which are persisted to MySQL, while the unsuppressed stream carries continuous updates destined for Redis.

Persistence flow

// final results stream
metricsKTable.suppress(Suppressed.untilWindowCloses(unbounded()).withName(partOfSuppressChangelogTopic))
    .toStream()
    .map((window, v) -> {
        Window w = window.window();
        v.setLast(true);
        v.setStartTime(w.start());
        v.setEndTime(w.end());
        return KeyValue.pair(window.key() + ":" + w.start(), v);
    })
    .to(targetTopic, Produced.with(Serdes.String(), titanMetricsSerde));

// continuous update stream
metricsKTable.toStream()
    .map((window, v) -> {
        Window w = window.window();
        v.setLast(false);
        v.setStartTime(w.start());
        v.setEndTime(w.end());
        return KeyValue.pair(window.key() + ":" + w.start(), v);
    })
    .to(targetTopic, Produced.with(Serdes.String(), titanMetricsSerde));
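Downstream of targetTopic, a consumer can route each record by its isLast flag: final window results go to MySQL, running updates go to Redis. A minimal sketch of that routing decision (the Sink interface and class names are hypothetical, not the article's actual consumer):

```java
// Routes aggregates from the output topic to the right store: records marked
// as final (window closed, emitted via suppress) go to MySQL; intermediate
// updates go to Redis for low-latency budget checks.
public class MetricsRouter {
    public interface Sink { void write(String key, long cost); }

    private final Sink mysqlSink;  // durable minute-level facts
    private final Sink redisSink;  // fast-changing running totals

    public MetricsRouter(Sink mysqlSink, Sink redisSink) {
        this.mysqlSink = mysqlSink;
        this.redisSink = redisSink;
    }

    public String route(String key, long cost, boolean isLast) {
        if (isLast) {
            mysqlSink.write(key, cost);   // window closed: persist final value
            return "mysql";
        }
        redisSink.write(key, cost);       // window still open: update live counters
        return "redis";
    }
}
```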

Data Accuracy

Kafka Streams leverages Kafka’s exactly‑once semantics and transactional mechanisms (available since Kafka 0.11) to guarantee that each record’s effects are applied exactly once, even across multiple topics.

Exactly‑once processing

Atomic operations

Stateful operation recoverability
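Enabling these guarantees in Kafka Streams is a single configuration switch. A minimal sketch, using the plain string forms of the config keys (the application id and broker address are placeholders):

```java
import java.util.Properties;

public class ExactlyOnceConfig {
    // Minimal Streams configuration enabling exactly-once processing.
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "titan-billing");   // hypothetical app id
        props.put("bootstrap.servers", "kafka:9092");   // hypothetical brokers
        // Turns on Kafka transactions (requires brokers >= 0.11): offset commits,
        // state-store changelog writes, and output records commit atomically.
        props.put("processing.guarantee", "exactly_once");
        return props;
    }
}
```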

Fault Tolerance

Kafka’s partition replication ensures that acknowledged records are not lost.

Windowed state is also written to a changelog topic for recovery.

Kafka’s rebalance mechanism provides high availability during node failures.
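These behaviors can be tuned through configuration. A sketch with illustrative values (again using the plain string forms of the Streams config keys):

```java
import java.util.Properties;

public class FaultToleranceConfig {
    // Illustrative resilience settings for a Streams application.
    public static Properties build() {
        Properties props = new Properties();
        // Replicate internal repartition/changelog topics across 3 brokers
        // so windowed state survives the loss of a broker.
        props.put("replication.factor", "3");
        // Keep one warm standby copy of each task's state store so a
        // rebalance after node failure can restore state quickly.
        props.put("num.standby.replicas", "1");
        return props;
    }
}
```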

Performance Metrics

During traffic peaks, the billing service sustains around 5,000 QPS with an average response time under 3 ms and a 99th‑percentile around 13 ms.

QPS and response time

CPU usage per pod is shown below.

CPU usage

Summary

The core goal of advertising is low‑cost user reach. ShouQianBa’s ad platform targets post‑payment traffic, where user dwell time is short; precise, timely marketing is essential to improve conversion and eCPM. The billing system performs real‑time calculations across delivery, impression, click, and conversion events, feeding data‑driven decisions that optimize overall ad profitability. Meeting requirements for data accuracy, low latency, and high availability is key to enabling effective precision marketing.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

advertising · kafka streams · Data Streaming · Real-time Billing
Written by SQB Blog