How RocketMQ 5.0 Enables Lightweight Cloud‑Native Stream Processing with RStreams and RSQLDB

This article explains the evolution of message middleware, introduces core concepts of stream processing, and details RocketMQ 5.0's native lightweight stream engine RStreams and its stream database RSQLDB, showing how they simplify real‑time data integration, computation, and scaling in cloud‑native environments.

Alibaba Cloud Native

Background

Message middleware has evolved for more than three decades, expanding from early open‑source message queues to serve IoT, cloud computing, and cloud‑native architectures. RocketMQ 5.0, released in 2022, adopts a cloud‑native design and covers a broader range of business scenarios.

Stream‑Processing Fundamentals

Stream processing builds on three elements:

Stream data: continuous, ordered, unbounded records (e.g., credit‑card transactions, stock trades, IoT sensor readings).

Stream storage: append‑only, immutable logs with partition‑ and offset‑based reads. Typical implementations include RocketMQ, Apache Kafka, and Amazon Kinesis.

Stream computation: low‑latency, stateful processing engines such as Apache Flink, Spark Streaming, and Kafka Streams.
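The storage model described above (append‑only logs read by partition and offset) can be illustrated with a small in‑memory sketch. This is a toy model and not a RocketMQ client; the class name PartitionedLog and its methods are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedLog:
    """Toy in-memory stand-in for a stream store (hypothetical, not a RocketMQ API)."""
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # One independent append-only list per partition.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, partition: int, record: str) -> int:
        """Append a record and return the offset it was written at."""
        log = self.partitions[partition]
        log.append(record)
        return len(log) - 1

    def read(self, partition: int, offset: int, max_records: int = 10) -> list:
        """Return up to max_records starting at offset; existing records are never changed."""
        return list(self.partitions[partition][offset:offset + max_records])


log = PartitionedLog(num_partitions=2)
first = log.append(0, "txn-1")
second = log.append(0, "txn-2")
log.append(1, "txn-3")

replayed = log.read(0, offset=0)  # a reader can always start again from offset 0
tail = log.read(0, offset=1)      # or resume from any later offset
```

Because reads are addressed by offset and records are never modified, a consumer can re‑read the same range at any time, which is what makes replay‑based recovery possible.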

Typical use cases require real‑time response, such as fraud detection, algorithmic trading, equipment monitoring, and sentiment analysis.

RStreams – RocketMQ’s Native Lightweight Stream Engine

RStreams is a built‑in stream computation engine introduced in RocketMQ 5.0. It is designed for edge and micro‑service scenarios and relies exclusively on RocketMQ’s native stack.

Key Characteristics

Uses RocketMQ Topic types (Source, Shuffle, Compact, Sink) to implement data‑flow processing without an external stream‑processing platform.

Developers write logic with the RStreams SDK and embed it directly into applications or micro‑services.

Provides a full operator set: stateless operators (filter, map) and stateful operators (aggregation, windowing).

Data Flow

The engine follows three logical steps: input → transformation → output. A typical WordCount program reads sentences from a source topic, splits them into words, groups by word key using a Shuffle topic, counts occurrences within tumbling windows, and writes results to a sink topic.
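The WordCount flow can be sketched in plain Python to show the logic of each step. This does not use the RStreams SDK; the function name word_count, the WINDOW_MS constant, and the (timestamp, sentence) record shape are assumptions made for the illustration, and the shuffle step is collapsed into a single process.

```python
from collections import defaultdict

WINDOW_MS = 1000  # hypothetical tumbling-window size in milliseconds


def word_count(records, window_ms=WINDOW_MS):
    """Count words per tumbling window.

    records: iterable of (timestamp_ms, sentence) pairs read from the source.
    Returns a dict keyed by (window_start_ms, word).
    """
    counts = defaultdict(int)
    for ts, sentence in records:
        # Tumbling windows do not overlap: each timestamp maps to exactly one window.
        window_start = ts - ts % window_ms
        for word in sentence.lower().split():      # transformation: split into words
            counts[(window_start, word)] += 1      # stateful step: count per key and window
    return dict(counts)


source = [(100, "hello stream"), (900, "hello rocketmq"), (1200, "hello again")]
sink = word_count(source)  # output: what would be written to the sink topic
```

In a distributed deployment the grouping step would run across several instances, which is where the Shuffle topic comes in.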

Shuffle Topic (Data Exchange)

RStreams creates a Shuffle topic to ensure that records sharing the same key are routed to the same computation instance. The key is hashed to a specific partition, and RocketMQ’s consumer‑load‑balancing guarantees that all records for that key are processed by a single instance, enabling correct aggregation.
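A minimal sketch of key‑based routing follows. The hash function (CRC32 modulo the partition count) and the names partition_for and shuffle are assumptions for the example; the source does not specify which hashing scheme RStreams uses.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (assumed scheme, for illustration only)."""
    # zlib.crc32 is deterministic across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(records, num_partitions):
    """Route (key, value) records into per-partition lists, as a shuffle topic would."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


records = [("hello", 1), ("stream", 1), ("hello", 1), ("rocketmq", 1), ("hello", 1)]
shuffled = shuffle(records, num_partitions=4)

# Every record with key "hello" lands in the same partition.
hello_partitions = {
    index
    for index, part in enumerate(shuffled)
    for key, _ in part
    if key == "hello"
}
```

Since each partition is consumed by one instance at a time, routing by key is enough to guarantee that a single instance sees every record it needs to aggregate for that key.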

State Management

Fault tolerance: Checkpointing is implemented through offsets on the source topic; a failed instance recovers its processing state by replaying records from the checkpointed offset.

Stateful computation: Local state is stored in RocksDB for low‑latency reads and writes. Remote state is persisted in a Compact topic; periodic synchronization between RocksDB and the Compact topic enables recovery after node loss or redeployment.
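The interplay of local state, remote state, and offset replay can be sketched as follows. This is a simplified model under stated assumptions: a plain dict stands in for RocksDB, the RemoteState class stands in for the Compact topic, and storing the committed offset next to the state is a design choice of this example, not something the source describes.

```python
class RemoteState:
    """Toy stand-in for a Compact topic plus a committed source offset (hypothetical)."""

    def __init__(self):
        self.latest = {}           # latest value per key, as a compacted topic retains
        self.committed_offset = 0  # next source offset to read after a recovery


class CountingInstance:
    """One computation instance that counts words and syncs state periodically."""

    def __init__(self, remote):
        self.remote = remote
        self.local = dict(remote.latest)      # restore local state (dict stands in for RocksDB)
        self.offset = remote.committed_offset
        self.dirty = set()                    # keys changed since the last sync

    def process(self, source, upto):
        """Consume source[self.offset:upto] and update local counts."""
        for word in source[self.offset:upto]:
            self.local[word] = self.local.get(word, 0) + 1
            self.dirty.add(word)
        self.offset = max(self.offset, upto)

    def sync(self):
        """Publish changed keys and the matching offset to remote state."""
        for key in self.dirty:
            self.remote.latest[key] = self.local[key]
        self.remote.committed_offset = self.offset
        self.dirty.clear()


source = ["a", "b", "a", "c", "a"]
remote = RemoteState()

first = CountingInstance(remote)
first.process(source, upto=3)  # a, b, a
first.sync()
first.process(source, upto=4)  # c is processed but never synced; lost when the instance dies

second = CountingInstance(remote)         # replacement instance restores from remote state
second.process(source, upto=len(source))  # replays from offset 3: c, a
second.sync()
```

The unsynced update from the first instance is not lost for good: the replacement re‑reads the same records from the source, so the final counts match what an uninterrupted run would have produced.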

Scalable Parallelism and Elastic Scaling

RStreams inherits RocketMQ’s partition‑based horizontal scaling and consumer‑group rebalancing:

Data‑level scaling: increasing the number of partitions of the source topic spreads the input load.

Compute‑level scaling: adding or removing RStreams instances triggers consumer‑group rebalancing, which reassigns source partitions and the corresponding state partitions consistently.

During scaling events, partitions of the source and state topics are reassigned together, so each instance holds the state for exactly the key space it consumes, preserving result correctness.
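The effect of a rebalance can be sketched with a deterministic assignment function applied to both topics. The round‑robin strategy and the names assign and co_assign are assumptions for the example; RocketMQ's actual allocation strategy may differ.

```python
def assign(num_partitions, instances):
    """Assign partition ids to instances round-robin over the sorted instance names."""
    ordered = sorted(instances)
    assignment = {name: [] for name in ordered}
    for partition in range(num_partitions):
        owner = ordered[partition % len(ordered)]
        assignment[owner].append(partition)
    return assignment


def co_assign(num_partitions, instances):
    """Apply the same assignment to source and state partitions so they move together."""
    source = assign(num_partitions, instances)
    state = assign(num_partitions, instances)
    return {name: {"source": source[name], "state": state[name]} for name in source}


before = co_assign(6, ["inst-a", "inst-b"])
after = co_assign(6, ["inst-a", "inst-b", "inst-c"])  # scale out by one instance
```

With two instances, inst-a owns partitions 0, 2, and 4 of both topics; after scaling out, each of the three instances owns two partitions, and the source and state partition lists stay identical for every instance.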

RSQLDB – Stream Database Built on RStreams

RSQLDB offers a SQL‑based interface for continuous queries over dynamic streams. It maps RocketMQ topics to relational‑style tables, allowing users to define tables, create views that join multiple streams, and execute declarative SQL statements for filtering, windowing, and aggregation.

Architecture

RSQLDB combines three layers:

Bottom layer: RocketMQ stream storage (topics) and RStreams’ atomic stream‑processing operators.

Middle layer: SQL parser that translates DDL/DML statements into a physical stream‑processing topology.

Top layer: Clients (SDK, console, CLI) that submit SQL statements.

Typical SQL Example

-- Create tables that map to source topics
CREATE TABLE ticket_purchase (
    ticket_id STRING,
    user_id STRING,
    event_time BIGINT,
    PROPERTIES ('topic'='ticket_purchase_topic')
);

CREATE TABLE user_info (
    user_id STRING,
    user_name STRING,
    PROPERTIES ('topic'='user_info_topic')
);

-- Join the two streams on user_id
CREATE VIEW purchase_detail AS
SELECT t.ticket_id, u.user_name, t.event_time
FROM ticket_purchase AS t
JOIN user_info FOR SYSTEM_TIME AS OF PROCTIME() AS u
ON t.user_id = u.user_id;

-- Create the sink table that maps to the output topic
CREATE TABLE purchase_detail_sink (
    ticket_id STRING,
    user_name STRING,
    event_time BIGINT,
    PROPERTIES ('topic'='purchase_detail_sink')
);

-- Continuously write the joined rows to the output topic
INSERT INTO purchase_detail_sink SELECT * FROM purchase_detail;

This example demonstrates how a multi‑stream join can be expressed with a few SQL statements, eliminating the need for hand‑coded SDK logic.
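To make the join concrete, here is a Python sketch of one plausible reading of the query: each purchase is enriched with the latest user_info row known at processing time, and purchases with no matching user are dropped (inner join). The function name join_purchases and the event encoding are hypothetical; how RSQLDB actually evaluates the statement is not described in the source.

```python
def join_purchases(events):
    """Join purchases with the latest known user row.

    events: ordered list of (table_name, row_dict) in arrival order.
    Returns the rows that would be written to purchase_detail_sink.
    """
    users = {}  # latest user_name per user_id, updated as user_info rows arrive
    out = []
    for table, row in events:
        if table == "user_info":
            users[row["user_id"]] = row["user_name"]
        elif table == "ticket_purchase" and row["user_id"] in users:
            out.append({
                "ticket_id": row["ticket_id"],
                "user_name": users[row["user_id"]],
                "event_time": row["event_time"],
            })
    return out


events = [
    ("user_info", {"user_id": "u1", "user_name": "Ada"}),
    ("ticket_purchase", {"ticket_id": "t1", "user_id": "u1", "event_time": 1000}),
    ("ticket_purchase", {"ticket_id": "t2", "user_id": "u2", "event_time": 1500}),  # no match yet
    ("user_info", {"user_id": "u1", "user_name": "Ada L."}),
    ("ticket_purchase", {"ticket_id": "t3", "user_id": "u1", "event_time": 2000}),
]
sink_rows = join_purchases(events)
```

Even this small version has to track per‑key state and arrival order by hand, which is the bookkeeping the SQL statements hide.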

Technical Summary

RStreams provides a lightweight, cloud‑native stream computation engine that leverages RocketMQ’s native topics for input, shuffle, state, and output. It offers functional‑style APIs, fault‑tolerant checkpointing via offset replay, RocksDB‑backed local state synchronized with Compact topics, and elastic scaling through RocketMQ’s partitioning and consumer‑group rebalancing. RSQLDB builds on RStreams to expose a declarative SQL interface, enabling continuous queries, windowed aggregations, and stream joins without managing separate stream‑processing clusters.
