JD Real-Time Data Product Practice: Overview, Low‑Code Platform, Stream‑Batch Integration, and Operations
This article summarizes JD's real‑time data product practice, covering product overview, low‑code real‑time platform construction, stream‑batch integrated architecture, and the three‑layer operational defense model, while highlighting challenges, evolution, user distribution, and future directions.
01 JD Real‑Time Product Overview
JD's real-time data products support a wide range of business scenarios across retail, logistics, health, and other business lines, including real-time data warehouses, dashboards, recommendations, reports, risk control, and monitoring. The first generation was built on Storm in 2014; the stack evolved through Kafka (2017) and SQL-based development (2019), and now centers on low-code development and stream-batch integration.
Key points include the product's development history, its user distribution (over 50% software and algorithm engineers, followed by product managers and analysts), and current pain points such as the high entry barrier for real-time development and the split between offline and real-time data pipelines.
02 Low‑Code Real‑Time Platform Construction
The platform faces the challenge of balancing complex business requirements with a simple product model. By standardizing data sources, modeling, and output modes, the team modularizes functionality to enable low‑code development.
Three main business scenarios are identified: statistical real‑time applications (dashboards, reports), detail‑type data business (storing real‑time results for downstream offline analysis), and complex event processing (real‑time algorithms, tagging, risk control, alerts). Standardized processing includes data source standardization, model processing standardization, and output mode standardization.
Product characteristics: configuration‑driven real‑time development, an integrated ecosystem linking real‑time products with reporting, metric, and algorithm platforms, and technology enablement via API exposure.
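To make the configuration-driven model concrete, below is a minimal, hypothetical sketch of what a standardized job definition and the SQL a platform might generate from it could look like. The field names, SQL template, and table names are illustrative assumptions, not JD's actual schema or platform API.

```java
// Hypothetical sketch: a configuration-driven job definition that mirrors the
// "data source / model processing / output mode" standardization described above.
// Field names and the SQL template are illustrative, not JD's actual schema.
import java.util.List;

public class LowCodeJobSketch {

    // A standardized job description a user could fill in from a form instead of writing code.
    record JobConfig(String sourceTable,      // standardized data source (e.g. a Kafka-backed table)
                     List<String> dimensions, // grouping keys chosen in the UI
                     String metricExpr,       // aggregation expression, e.g. "SUM(order_amount)"
                     String sinkTable) {}     // standardized output (dashboard store, OLAP table, ...)

    // Translate the configuration into a SQL statement the platform could submit to the engine.
    static String toSql(JobConfig cfg) {
        String dims = String.join(", ", cfg.dimensions());
        return "INSERT INTO " + cfg.sinkTable() + "\n"
             + "SELECT " + dims + ", " + cfg.metricExpr() + " AS metric_value\n"
             + "FROM " + cfg.sourceTable() + "\n"
             + "GROUP BY " + dims;
    }

    public static void main(String[] args) {
        JobConfig cfg = new JobConfig(
                "orders_stream",
                List.of("region", "category"),
                "SUM(order_amount)",
                "dashboard_gmv");
        System.out.println(toSql(cfg));
    }
}
```

In this shape, the user only chooses a standardized source, dimensions, a metric, and an output target; the code generation and job submission stay inside the platform, which is what keeps the product model simple despite complex business requirements.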
03 Stream‑Batch Integrated Product System
Stream-batch integration unifies real-time and batch processing on Flink, allowing a single script to serve both streaming and batch workloads, with results landing in a unified lakehouse (lake-warehouse) store. This reduces development and maintenance costs as well as the cost of reconciling consistency between the two pipelines.
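The article describes this approach at a high level. As a rough illustration, the sketch below shows how one Flink Table API script could be run in either streaming or batch mode by switching only the execution settings; the table names, SQL, and windowing are placeholders, and the connector DDL (e.g., Kafka for streaming, the lakehouse table for batch backfill) is assumed to be defined elsewhere.

```java
// Illustrative sketch of "one script, two execution modes" with Flink's Table API.
// The SQL and table names are placeholders; connector DDL for the source and sink
// tables would be registered elsewhere.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnifiedJob {

    public static void main(String[] args) {
        boolean batchMode = args.length > 0 && "batch".equals(args[0]);

        // The only difference between the streaming job and the batch backfill
        // is the runtime mode; the business logic below is shared.
        EnvironmentSettings settings = batchMode
                ? EnvironmentSettings.inBatchMode()
                : EnvironmentSettings.inStreamingMode();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // The same aggregation script serves both the real-time pipeline and the
        // offline re-computation, writing to one lakehouse table.
        tEnv.executeSql(
            "INSERT INTO dws_order_summary " +
            "SELECT region, window_start, SUM(order_amount) AS gmv " +
            "FROM TABLE(TUMBLE(TABLE dwd_orders, DESCRIPTOR(event_time), INTERVAL '1' MINUTES)) " +
            "GROUP BY region, window_start");
    }
}
```

Keeping the business logic in one script and isolating the mode switch to the environment settings is what removes the duplicated offline/real-time development and the drift between their results.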
Use cases include real‑time warehouse construction (supporting core business such as advertising and search) and real‑time risk‑control sentiment analysis, achieving sub‑minute latency and over 30% cost reduction.
04 Product Operations – Three Defense Lines
The operational model consists of three defense lines: pre-incident risk assessment (chaos engineering, stress testing), in-process health monitoring (fault attribution, auto-scaling, emergency plans), and post-incident automatic recovery. Effectiveness is evaluated with metrics such as fault rate, recovery time, and readiness duration.
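As a rough illustration of the in-process defense line, the sketch below shows a rule-based health check that chooses between no action, auto-scaling, and triggering an emergency plan; the metric names and thresholds are invented for the example and are not JD's actual rules.

```java
// Hypothetical sketch of the "in-process" defense line: periodically evaluate job
// health and decide between doing nothing, scaling out, or triggering an emergency
// plan. Metric names and thresholds are illustrative assumptions only.
public class HealthGuardSketch {

    enum Action { NONE, SCALE_OUT, EMERGENCY_PLAN }

    record JobHealth(long consumerLagRecords,   // backlog on the source
                     double checkpointFailRate, // failed / attempted checkpoints
                     double cpuUtilization) {}  // average task-manager CPU

    // Simple rule-based attribution: decide which defense action to take.
    static Action evaluate(JobHealth h) {
        if (h.checkpointFailRate() > 0.5) {
            // Repeated checkpoint failures usually call for a predefined emergency plan
            // (e.g. switching to a standby link), not just more resources.
            return Action.EMERGENCY_PLAN;
        }
        if (h.consumerLagRecords() > 1_000_000 && h.cpuUtilization() > 0.8) {
            // Backlog caused by insufficient compute: auto-scaling can absorb it.
            return Action.SCALE_OUT;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(new JobHealth(2_500_000, 0.05, 0.92))); // SCALE_OUT
        System.out.println(evaluate(new JobHealth(100_000, 0.70, 0.40)));   // EMERGENCY_PLAN
    }
}
```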
Future plans focus on expanding low‑code scenarios, building real‑time ETL tools for end‑to‑end integration, and enhancing automatic diagnosis and optimization to lower operational costs.