Real‑time Data Platform (RTDP): Concepts, Architecture and Design Considerations
This article examines the design of a real‑time data platform, discussing its background concepts, modern data‑warehouse perspective, architectural layers, unified data‑collection, streaming, compute and visualization platforms, and the functional, quality, stability, cost and agility considerations required for building an end‑to‑end real‑time pipeline.
The article opens a two‑part series on the Real‑time Data Platform (RTDP), a common and important piece of big‑data infrastructure. The first part focuses on design, viewing RTDP from the modern data‑warehouse and typical data‑processing perspectives; the second part will cover technology selection and component details.
1. Related Concepts
From a modern data‑warehouse viewpoint, RTDP inherits traditional warehouse modules but adds multi‑source ingestion, near‑real‑time processing (T+0), diverse usage patterns, and advanced capabilities such as data real‑timeization, virtualization, democratization, and collaboration.
Key capabilities identified are:
Data real‑timeization (synchronization and stream processing)
Data virtualization (virtual compute and unified services)
Data democratization (visual UI and self‑service)
Data collaboration (multi‑tenant and workflow sharing)
From a typical data‑processing angle, the article contrasts OLTP and OLAP, describes the data pipeline (ETL) that moves data from transactional to analytical stores, and treats that pipeline as a processing type in its own right, OLPP (Online Pipeline Processing), whose main challenge is achieving low latency.
2. Architectural Design
RTDP aims to provide end‑to‑end real‑time processing (millisecond/second/minute latency), supporting multi‑source ingestion, real‑time consumption, and the four core capabilities mentioned above.
The overall conceptual architecture is divided into four unified layers:
Unified Data Collection Platform
Unified Stream Processing Platform
Unified Compute Service Platform
Unified Data Visualization Platform
Each layer is described in detail:
2.1 Unified Data Collection Platform supports full and incremental extraction, uses a custom Unified Message Schema (UMS) to decouple messages from physical transports, and offers multi‑tenant, configurable cleaning.
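The decoupling idea in 2.1 can be illustrated with a self‑describing message envelope. The sketch below is an assumption made for illustration, not the actual UMS specification: the names `Envelope`, `Field`, `namespace`, and the `schema`/`payload` layout are hypothetical. The point is that the message carries its own schema and logical source name, so a consumer can decode it without knowing which topic, queue, or file delivered it.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Field:
    """One column of the schema (hypothetical layout, not the real UMS)."""
    name: str
    type: str            # logical type name such as "long" or "string"
    nullable: bool = False


@dataclass
class Envelope:
    """A self-describing message: schema and rows travel together."""
    namespace: str                       # logical source id, independent of the transport
    fields: List[Field]
    rows: List[List[Any]] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize schema and payload into one transport-neutral document.
        return json.dumps({
            "schema": {
                "namespace": self.namespace,
                "fields": [vars(f) for f in self.fields],
            },
            "payload": self.rows,
        })

    @staticmethod
    def from_json(text: str) -> "Envelope":
        # Rebuild the envelope using only what the message itself contains.
        doc = json.loads(text)
        schema = doc["schema"]
        return Envelope(
            namespace=schema["namespace"],
            fields=[Field(**f) for f in schema["fields"]],
            rows=doc["payload"],
        )

    def records(self) -> List[Dict[str, Any]]:
        # Pair each row with the field names declared in the schema.
        names = [f.name for f in self.fields]
        return [dict(zip(names, row)) for row in self.rows]
```

Because the schema is embedded, the same bytes could be sent over a message queue or written to a file, and a downstream job would decode them the same way with `Envelope.from_json(...)`.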
2.2 Unified Stream Processing Platform consumes UMS or JSON messages, provides visual/configurable/SQL‑based development, idempotent writes to heterogeneous targets, and multi‑tenant isolation.
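One common way to obtain idempotent writes is to key every record and attach a monotonically increasing version, then apply a write only when its version is newer than what the target already holds. The sketch below assumes that approach; the summary does not say how the platform actually implements idempotency, and `IdempotentSink` and its version argument are hypothetical names. An in‑memory dict stands in for the target store.

```python
from typing import Any, Dict, Tuple

Record = Dict[str, Any]


class IdempotentSink:
    """In-memory stand-in for a target store with versioned upserts."""

    def __init__(self) -> None:
        # key -> (version, record)
        self._rows: Dict[Any, Tuple[int, Record]] = {}

    def upsert(self, key: Any, version: int, record: Record) -> bool:
        """Apply the write only if it is newer than the stored version.

        Returns True when the store changed, False when the write was a
        duplicate or arrived out of order and was ignored.
        """
        current = self._rows.get(key)
        if current is not None and current[0] >= version:
            return False
        self._rows[key] = (version, record)
        return True

    def snapshot(self) -> Dict[Any, Record]:
        # Current state without the version bookkeeping.
        return {key: rec for key, (_, rec) in self._rows.items()}
```

With this rule, replaying a batch after a failure, or receiving an older update after a newer one, leaves the target in the same final state, which is what lets the stream layer retry safely.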
2.3 Unified Compute Service Platform implements data virtualization/federation, supports push‑down and pull‑up computation across heterogeneous sources, and exposes unified JDBC/REST interfaces with SQL.
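A rough illustration of push‑down versus pull‑up, under stated assumptions: two in‑memory SQLite databases stand in for heterogeneous sources, and "pull‑up" is read here as computation that runs in the service layer after rows have been fetched. The table names, columns, and the function `large_orders_with_names` are hypothetical.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database acting as one source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


orders = make_source(
    "CREATE TABLE orders (id INTEGER, user_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 25.0), (2, 11, 300.0), (3, 10, 120.0)],
)
users = make_source(
    "CREATE TABLE users (id INTEGER, name TEXT)",
    "INSERT INTO users VALUES (?, ?)",
    [(10, "ana"), (11, "bo")],
)


def large_orders_with_names(min_amount):
    # Push-down: the filter runs inside the orders source, so only
    # matching rows cross the network.
    pushed = orders.execute(
        "SELECT id, user_id, amount FROM orders WHERE amount >= ? ORDER BY id",
        (min_amount,),
    ).fetchall()
    if not pushed:
        return []

    # Push-down again: ask the users source only for the ids we need.
    user_ids = sorted({user_id for _, user_id, _ in pushed})
    placeholders = ",".join("?" * len(user_ids))
    names = dict(
        users.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})",
            user_ids,
        ).fetchall()
    )

    # Pull-up: the cross-source join cannot be delegated to either
    # source, so it is computed in the service layer.
    return [(oid, names.get(uid), amount) for oid, uid, amount in pushed]
```

A caller sees one function (or, in a real service, one SQL endpoint) and does not need to know that the result was assembled from two stores.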
2.4 Unified Data Visualization Platform adds multi‑tenant governance and enables cross‑department collaboration through visual tools.
3. Specific Issues and Design Thoughts
Functional considerations discuss which ETL operators are supported in streaming (e.g., left‑join, union, filter, map, project) and the need for hybrid processing to handle complex logic.
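The operators named above are all record‑at‑a‑time, which is why they fit a streaming engine well. A minimal sketch using Python generators follows; it assumes the left join is a stream‑to‑lookup‑table join (the summary does not specify the join semantics), and every function name here is hypothetical rather than taken from any platform API.

```python
from itertools import chain
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def filter_op(stream: Iterable[Record], predicate: Callable[[Record], bool]) -> Iterator[Record]:
    return (r for r in stream if predicate(r))


def map_op(stream: Iterable[Record], fn: Callable[[Record], Record]) -> Iterator[Record]:
    return (fn(r) for r in stream)


def project(stream: Iterable[Record], columns: List[str]) -> Iterator[Record]:
    return ({c: r[c] for c in columns} for r in stream)


def union(*streams: Iterable[Record]) -> Iterator[Record]:
    # Concatenates the inputs; a real engine would interleave by arrival.
    return chain(*streams)


def left_join(stream: Iterable[Record], lookup: Dict[Any, Record],
              key: str, right_columns: List[str]) -> Iterator[Record]:
    # Every left record is kept; missing matches yield None for right columns.
    for r in stream:
        match = lookup.get(r[key], {})
        yield {**r, **{c: match.get(c) for c in right_columns}}


def pipeline(sources: List[Iterable[Record]], users: Dict[Any, Record],
             min_amount: float) -> List[Record]:
    merged = union(*sources)
    large = filter_op(merged, lambda r: r["amount"] >= min_amount)
    enriched = left_join(large, users, "user_id", ["country"])
    tagged = map_op(
        enriched,
        lambda r: {**r, "tier": "high" if r["amount"] >= 50 else "normal"},
    )
    return list(project(tagged, ["user_id", "country", "tier"]))
```

Logic that needs global state, such as multi‑way joins over unbounded history or full re‑aggregation, does not reduce to per‑record steps like these, which is the case the hybrid processing mentioned above is meant to cover.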
Quality considerations mention Lambda and Kappa architectures, data consistency, and future discussions on data‑quality mechanisms.
Stability considerations cover high availability, SLA guarantees, elastic scaling and recovery, monitoring, automated operations, and resilience to upstream metadata (schema) changes.
Cost considerations address labor, resource utilization, operational expenses, and reducing trial‑and‑error through agile development.
Agility emphasizes configuration, SQL‑based development, and democratization, while management focuses on metadata and data‑security governance.
The article concludes that the presented RTDP design offers a comprehensive, modular, and extensible blueprint for building modern real‑time data platforms, with a forthcoming technical part that will detail concrete technology choices and open‑source implementations.
Author: Lu Shanwei
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
