
Cold‑Hot Data Tiering and Performance Optimization in Apache Doris for JD Advertising

This article presents JD Advertising's engineering experience with Apache Doris, describing the evolution from a data‑lake cold‑data solution to a native cold‑hot tiering approach, detailing performance regressions after upgrading to Doris 2.0, and outlining a series of optimizations for query speed, CPU and memory usage, schema‑change efficiency, and automated data migration and restoration.

JD Tech

Background: JD Advertising built an ad‑data storage service on Apache Doris, accumulating nearly 1 PB of data (18 trillion rows) and handling over 80 million daily queries. Rapid data growth created storage bottlenecks, prompting a need for a cold‑hot tiering solution to reduce storage and operational costs.

Cold‑Hot Tiering V1 – Data Lake: Cold data was exported from Doris via the Spark‑Doris‑Connector (SDC) to an external lake (e.g., Iceberg). Queries were split: hot queries ran directly on Doris OLAP tables, while cold queries were rewritten to access external tables. This decoupled cold‑data processing but introduced ETL overhead, UNION operations across two stores, and schema‑change complications.
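The hot/cold query split described above can be sketched as follows. This is a minimal illustration, not JD's implementation: the function name `split_by_tier` and the `hot_days` retention window are assumptions, and the real rewrite layer would also have to generate the UNION over the two stores.

```python
from datetime import date, timedelta


def split_by_tier(start: date, end: date, today: date, hot_days: int):
    """Split the inclusive date range [start, end] into (hot, cold) sub-ranges.

    Days within the last `hot_days` days are served by Doris OLAP tables;
    older days are served by the external lake table. Either part may be None.
    """
    boundary = today - timedelta(days=hot_days)  # first day still considered hot
    cold = None
    hot = None
    if start < boundary:
        # Cold part ends the day before the boundary, or at `end` if earlier.
        cold = (start, min(end, boundary - timedelta(days=1)))
    if end >= boundary:
        # Hot part starts at the boundary, or at `start` if later.
        hot = (max(start, boundary), end)
    return hot, cold


hot, cold = split_by_tier(date(2024, 5, 1), date(2024, 6, 15),
                          today=date(2024, 6, 30), hot_days=30)
print("hot:", hot)
print("cold:", cold)
```

A query whose range produces both parts is the case that needs the UNION across two stores, which is the overhead that motivated moving to native tiering.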

Cold‑Hot Tiering V2 – Native Doris Tiering: Doris 1.2 offered a TTL‑based tiering that moves cold data to cheaper disks, but it was limited to bare‑metal deployments and required manual capacity estimation. Doris 2.0 introduced support for external distributed storage (OSS, HDFS) with configurable storage policies, simplifying the architecture while requiring careful throttling of cold queries.
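As a rough illustration of what a configurable storage policy looks like in Doris 2.0, the helper below builds the two statements typically involved: creating a policy that points at a remote storage resource, and attaching it to a table. The policy, resource, and table names are hypothetical, the resource is assumed to exist already, and the exact property names and accepted TTL formats should be checked against the Doris documentation for the version in use.

```python
def storage_policy_sql(policy: str, resource: str, cooldown_ttl: str, table: str) -> list:
    """Build the SQL for a TTL-based cold-data policy (sketch, not executed here).

    `resource` must name a storage resource created beforehand with
    CREATE RESOURCE; `cooldown_ttl` is how long data stays hot before it
    becomes eligible to move to remote storage.
    """
    create_policy = (
        f"CREATE STORAGE POLICY {policy} "
        f'PROPERTIES ("storage_resource" = "{resource}", '
        f'"cooldown_ttl" = "{cooldown_ttl}")'
    )
    attach_policy = f'ALTER TABLE {table} SET ("storage_policy" = "{policy}")'
    return [create_policy, attach_policy]


# Hypothetical names, for illustration only.
for stmt in storage_policy_sql("ad_cold_policy", "remote_store", "1d", "ad_report"):
    print(stmt + ";")
```

Compared with V1, no export job or external table is involved: the same table serves both hot and cold data, and the policy alone decides where each piece of data lives.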

Problem Solving – Performance & Resource Issues in Doris 2.0: After the upgrade, query performance dropped by roughly 50%, which the team attributed to the new optimizer; they disabled it and fixed bucket‑pruning failures (PR #38565). Prefix‑index failures caused by Date‑to‑DateTime casts were also fixed (PR #39446). FE CPU usage doubled; flame‑graph analysis led to several optimizations, including skipping unnecessary rewrite rules and improving materialized‑view handling (PR #40000). BE memory pressure from an oversized SegmentCache was mitigated by tuning the cache thresholds, which reduced resident memory from more than 60% to under 25%.
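To show why a bucket-pruning failure is so costly, here is a conceptual sketch of the technique: when a query filters on the distribution key with equality or IN predicates, only the buckets those values hash to need to be scanned. The hash function below (`zlib.crc32`) is a stand-in chosen for illustration and is not claimed to match Doris's internal hashing; the function names are likewise hypothetical.

```python
import zlib


def bucket_of(key: str, num_buckets: int) -> int:
    """Map a distribution-key value to a bucket (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def prune_buckets(predicate_values, num_buckets: int) -> set:
    """Return the buckets a query must scan.

    `predicate_values` is the list of distribution-key values from an
    equality/IN predicate, or None when no usable predicate exists, in
    which case pruning is impossible and every bucket is scanned.
    """
    if predicate_values is None:
        return set(range(num_buckets))
    return {bucket_of(v, num_buckets) for v in predicate_values}


pruned = prune_buckets(["advertiser_42"], num_buckets=32)
unpruned = prune_buckets(None, num_buckets=32)
print(f"with pruning: {len(pruned)} bucket(s); without: {len(unpruned)} buckets")
```

If the planner fails to recognize the predicate, the query silently falls back to the second case and scans every bucket, which is consistent with the kind of regression described above.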

Cold‑Data Schema Change Optimizations: The team implemented Linked Schema Change for cold data using ChubaoFS CopyObject, which avoids moving data and achieved a 40× speedup (PR #40963). They introduced single‑leader schema change (SC) for cold‑data replicas, which prevents redundant copies. They also extended Light Schema Change to support adding Key columns, enabling millisecond‑level SC for suitable tables.
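The benefit of a linked schema change on remote storage can be illustrated with a toy model. In the rewrite path, every segment file is downloaded and uploaded again; in the linked path, the object store copies each object server-side, so no segment bytes pass through the Doris node. `ToyObjectStore` and both functions are hypothetical stand-ins, not the ChubaoFS or Doris API.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store that tracks client traffic."""

    def __init__(self):
        self.objects = {}
        self.bytes_through_client = 0

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.bytes_through_client += len(data)

    def get(self, key: str) -> bytes:
        data = self.objects[key]
        self.bytes_through_client += len(data)
        return data

    def copy_object(self, src: str, dst: str) -> None:
        # Server-side copy: data never travels to the client.
        self.objects[dst] = self.objects[src]


def rewrite_schema_change(store: ToyObjectStore, keys, new_prefix: str) -> None:
    """Baseline: download each segment and upload it under the new tablet."""
    for key in keys:
        store.put(f"{new_prefix}/{key}", store.get(key))


def linked_schema_change(store: ToyObjectStore, keys, new_prefix: str) -> None:
    """Linked path: ask the store to copy each segment in place."""
    for key in keys:
        store.copy_object(key, f"{new_prefix}/{key}")


segments = {"seg_0": b"x" * 1000, "seg_1": b"y" * 500}
for name, fn in [("rewrite", rewrite_schema_change), ("linked", linked_schema_change)]:
    store = ToyObjectStore()
    store.objects.update(segments)
    fn(store, list(segments), "new_tablet")
    print(name, "bytes through client:", store.bytes_through_client)
```

This only applies when the schema change does not require rewriting segment contents; changes that alter the stored data still need the rewrite path.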

Other Solutions – Data Migration & Restoration: The team developed a data‑migrator tool that asynchronously backs up large‑scale historical data to external storage, and a narwal_cli utility that aligns schema differences during restore. Restores also conflicted with real‑time writes, which failed with the error "LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = Table xxxxx is in restore process. Can not load into it"; this was fixed in PR #39595.
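The schema-alignment step can be pictured as a diff between the schema captured in the backup and the schema of the live table. The sketch below is an assumption about the general approach, not a description of how narwal_cli works internally; `align_schema` and the column names are hypothetical.

```python
def align_schema(backup_cols: dict, live_cols: dict) -> list:
    """Return the steps needed to bring a backup schema in line with the live one.

    Both arguments map column name -> type string. Each step is a tuple
    (action, column, type) where action is "ADD", "MODIFY", or "DROP".
    """
    steps = []
    for name, col_type in live_cols.items():
        if name not in backup_cols:
            steps.append(("ADD", name, col_type))       # column added after backup
        elif backup_cols[name] != col_type:
            steps.append(("MODIFY", name, col_type))    # type changed after backup
    for name, col_type in backup_cols.items():
        if name not in live_cols:
            steps.append(("DROP", name, col_type))      # column removed after backup
    return steps


backup = {"dt": "DATE", "ad_id": "BIGINT", "clicks": "INT"}
live = {"dt": "DATE", "ad_id": "BIGINT", "clicks": "BIGINT", "cost": "DECIMAL(18,4)"}
for step in align_schema(backup, live):
    print(step)
```

An empty result means the backup can be restored as-is; otherwise the steps have to be applied before restored data can be merged with the live table.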

Summary: By applying cold‑hot tiering, JD Advertising reduced storage costs by ~87%, increased concurrent query capacity >10×, and aligned Doris 2.0 performance with the previous 1.2 release, while simplifying operations and lowering overall system complexity.
