
Real-Time Stream Computing: Concepts, Challenges, and Tencent Cloud Solutions

As mobile and IoT data surge, real-time stream computing has to deliver low latency, accuracy, and usability at once. Flink addresses these challenges with a low-latency, high-throughput, exactly-once engine, and Tencent Cloud's managed Flink service provides elastic, secure, integrated pipelines for applications ranging from online status monitoring to fraud detection and smart transportation.

Tencent Cloud Developer

With the continuous growth of mobile and IoT devices, streaming data has exploded, and many business scenarios now require real‑time data processing. Traditional offline batch processing platforms can no longer meet the demand for low‑latency handling of massive data streams, prompting the emergence of real‑time stream processing platforms.

This article summarizes a live session in which Tencent Cloud big-data expert Zou Jianping (Mike) introduced the current state of real-time computing, shared application cases, discussed technical challenges, and explained how Tencent's big-data products address these issues.

What is big data? According to Baidu Baike, big data refers to data sets that cannot be captured, managed, or processed within a reasonable time using conventional software tools. It is characterized by the four V’s: Volume, Variety, Velocity, and Value.

Batch processing typically loads data from sources (e.g., HDFS) and runs full‑scale analytics with tools like Hive or Spark, focusing on throughput rather than latency. In contrast, stream processing handles continuously arriving data with low latency, enabling immediate insights.
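The contrast between the two models can be shown with a running average. This is a minimal sketch, not tied to any framework; the function names and the sample readings are illustrative assumptions.

```python
from typing import Iterable, Iterator, List


def batch_average(values: List[float]) -> float:
    """Batch style: wait for the full data set, then compute once."""
    return sum(values) / len(values)


def streaming_average(values: Iterable[float]) -> Iterator[float]:
    """Stream style: emit an updated result after every arriving event."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count


readings = [10.0, 20.0, 30.0, 40.0]  # illustrative sample data
batch_result = batch_average(readings)             # available only at the end
stream_results = list(streaming_average(readings))  # one result per event
```

The batch function needs the whole list before it can answer, while the streaming generator only keeps a count and a total, so it can run over an unbounded input.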

Typical stream‑processing scenarios include:

QQ real‑time online status monitoring – analyzing massive login logs instantly.

Security – detecting rapid attacks such as ticket-scalping bots on 12306, China's railway ticketing site.

Financial risk control – credit‑card fraud detection and low‑latency trading.

Internet advertising – real‑time ad placement based on recent user behavior.

IoT and smart transportation – real‑time traffic monitoring and routing.

Key challenges when moving from batch to stream:

Low latency and high throughput: Batching records raises throughput but delays each record, so balancing the two is non-trivial.

Accuracy: Ensuring exactly‑once semantics despite failures and out‑of‑order events.

Usability: Providing developer‑friendly APIs that hide low‑level complexities.
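The accuracy challenge is easier to see with a concrete case of out-of-order events. One common way to handle them is a watermark that trails the largest event time seen so far; events that fall behind it are treated as late. The sketch below is a simplified illustration of that idea, not any engine's exact rule, and the function name and the max_delay parameter are assumptions.

```python
from typing import Iterable, List, Optional, Tuple


def split_late_events(
    event_times: Iterable[int], max_delay: int
) -> Tuple[List[int], List[int]]:
    """Classify events as on-time or late using a simple watermark.

    The watermark trails the largest event time seen so far by max_delay.
    """
    on_time: List[int] = []
    late: List[int] = []
    max_seen: Optional[int] = None
    for ts in event_times:
        # An event is late if it is older than the current watermark.
        if max_seen is not None and ts < max_seen - max_delay:
            late.append(ts)
        else:
            on_time.append(ts)
        max_seen = ts if max_seen is None else max(max_seen, ts)
    return on_time, late


# Event time 4 arrives after event time 10 has already been seen.
on_time, late = split_late_events([1, 2, 5, 3, 10, 4, 9], max_delay=2)
```

A larger max_delay tolerates more disorder but forces results to wait longer, which is the same latency trade-off named in the first challenge.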

Stream processing frameworks:

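Storm – A distributed stream processing system open-sourced by Twitter. It models a job as a topology of spouts (sources) and bolts (processing steps) and uses an ack mechanism for fault tolerance, but it suffers from limited throughput, has no built-in state management, and its core API guarantees at-least-once rather than exactly-once processing.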
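Storm's ack mechanism is usually described as keeping one XOR checksum per source tuple: every tuple ID is XORed in once when the tuple is emitted and once when it is acked, so the checksum returns to zero exactly when the whole tuple tree has been processed. The following is a simplified sketch of that bookkeeping. The Acker class and its method are hypothetical names, and small fixed IDs stand in for the random 64-bit IDs a real system would use.

```python
from typing import Dict


class Acker:
    """Tracks tuple trees with one XOR checksum per root tuple."""

    def __init__(self) -> None:
        self._pending: Dict[int, int] = {}

    def update(self, root_id: int, tuple_id: int) -> bool:
        """XOR tuple_id into the root's checksum (once on emit, once on ack).

        Returns True when the checksum reaches zero, meaning every emitted
        tuple in the tree has also been acked.
        """
        value = self._pending.get(root_id, 0) ^ tuple_id
        if value == 0:
            self._pending.pop(root_id, None)
            return True
        self._pending[root_id] = value
        return False


acker = Acker()
a, b, c = 0b0101, 0b0011, 0b1001  # fixed IDs for illustration only
steps = [
    acker.update(1, a),  # spout emits a
    acker.update(1, b),  # bolt emits b, anchored to a
    acker.update(1, a),  # bolt acks a
    acker.update(1, c),  # next bolt emits c, anchored to b
    acker.update(1, b),  # next bolt acks b
    acker.update(1, c),  # final bolt acks c, tree complete
]
```

If the checksum does not reach zero within a timeout, the source tuple is replayed, which is why this scheme yields at-least-once rather than exactly-once processing.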

Spark Streaming – Implements micro-batch processing (e.g., 1-second batches), which yields high throughput, and achieves exactly-once processing through RDD lineage and checkpointing. However, latency cannot drop below the batch interval, and support for native streaming semantics is weak.
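The latency floor of micro-batching follows directly from how events are grouped. The sketch below is a framework-independent illustration with hypothetical function names; times are integer milliseconds.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (arrival time in ms, payload)


def micro_batches(events: List[Event], interval: int) -> Dict[int, List[str]]:
    """Group events by the time at which their batch closes."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for arrival, payload in events:
        close_time = (arrival // interval + 1) * interval
        batches[close_time].append(payload)
    return dict(batches)


def worst_wait(events: List[Event], interval: int) -> int:
    """Longest time any event waits before its batch can even start."""
    return max((t // interval + 1) * interval - t for t, _ in events)


events = [(100, "a"), (900, "b"), (1200, "c")]
batches = micro_batches(events, interval=1000)
wait = worst_wait(events, interval=1000)
```

Event "a" arrives at 100 ms but cannot be processed until its batch closes at 1000 ms, so it waits 900 ms before any computation begins. Shrinking the interval reduces this wait but adds per-batch scheduling overhead.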

Flink – A newer, fast‑growing stream engine that processes events natively, offers low latency, high throughput, and exactly‑once guarantees through a distributed snapshot (checkpoint) mechanism. It supports flexible windowing (tumbling, sliding, session) and provides a rich API stack: DataStream API, Table API, and SQL.
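The three window types differ only in how an event timestamp is mapped to one or more time ranges. The sketch below shows that mapping in plain Python rather than through Flink's API. It assumes non-negative integer timestamps, half-open windows [start, end), and hypothetical function names; the handling of an event that lands exactly on a session boundary is a choice made for this sketch.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open range [start, end)


def tumbling_window(ts: int, size: int) -> Window:
    """Fixed-size, non-overlapping windows: each event belongs to exactly one."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Fixed-size windows that start every `slide` units and may overlap."""
    windows: List[Window] = []
    start = ts - (ts % slide)
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return windows


def session_windows(timestamps: List[int], gap: int) -> List[Window]:
    """Windows that stay open until no event arrives for `gap` units."""
    sessions: List[Window] = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Event falls inside the current session: extend its end.
            sessions[-1] = (sessions[-1][0], max(sessions[-1][1], ts + gap))
        else:
            sessions.append((ts, ts + gap))
    return sessions


tumbling = tumbling_window(12, size=5)
sliding = sliding_windows(12, size=10, slide=5)
sessions = session_windows([1, 2, 10, 11, 3], gap=3)
```

With a size of 10 and a slide of 5, the event at time 12 is counted in two overlapping windows, while the session example splits into two bursts of activity separated by a quiet period.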

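Flink keeps operator state in a state backend (JVM heap, RocksDB, or a custom implementation). To take a checkpoint, Flink injects checkpoint barriers into the data stream at the sources; when an operator receives the barrier, it snapshots its state asynchronously and forwards the barrier downstream, so processing continues while the snapshot is written. After a failure, the job restores the latest completed checkpoint and replays the input from that point. For exactly-once output to external systems, sinks use a two-phase commit: they pre-commit during the checkpoint and issue the final commit only after all operators have persisted their state.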
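The interplay of barriers, snapshots, and the two-phase commit can be sketched with a toy simulation. This is not Flink's API: the BARRIER marker, both classes, and the single-operator pipeline are hypothetical simplifications, and the checkpoint is assumed to complete immediately after the barrier is handled.

```python
from typing import Dict, List

BARRIER = "BARRIER"  # hypothetical marker standing in for a checkpoint barrier


class CountingOperator:
    """Keeps a running count per key and snapshots it when a barrier arrives."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}
        self.snapshots: List[Dict[str, int]] = []

    def process(self, element: str) -> None:
        if element == BARRIER:
            # Copy the state; a real engine would persist this copy asynchronously.
            self.snapshots.append(dict(self.state))
        else:
            self.state[element] = self.state.get(element, 0) + 1

    def restore(self) -> None:
        """Roll back to the latest snapshot after a failure."""
        self.state = dict(self.snapshots[-1]) if self.snapshots else {}


class TwoPhaseSink:
    """Buffers output in a transaction that becomes visible only on commit."""

    def __init__(self) -> None:
        self.open_txn: List[str] = []
        self.pre_committed: List[List[str]] = []
        self.committed: List[str] = []

    def write(self, record: str) -> None:
        self.open_txn.append(record)

    def pre_commit(self) -> None:
        # Phase 1: on the barrier, seal the current transaction.
        self.pre_committed.append(self.open_txn)
        self.open_txn = []

    def commit(self) -> None:
        # Phase 2: once the whole checkpoint has completed, publish sealed data.
        for txn in self.pre_committed:
            self.committed.extend(txn)
        self.pre_committed = []

    def abort_open(self) -> None:
        # On failure, drop records written after the last barrier.
        self.open_txn = []


op, sink = CountingOperator(), TwoPhaseSink()
for element in ["a", "b", "a", BARRIER, "a"]:
    op.process(element)
    if element == BARRIER:
        sink.pre_commit()
        sink.commit()  # assumption: the checkpoint completes immediately
    else:
        sink.write(element)

# Simulate a crash after the checkpoint: work done since the barrier is undone.
op.restore()
sink.abort_open()
```

After recovery, both the operator state and the visible sink output reflect exactly the three records before the barrier. The fourth record would be replayed from the source and processed once more, without appearing twice downstream.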

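Compared with Storm and Spark Streaming, Flink combines low-latency event processing with high throughput. Records are shipped between operators in network buffers, and a configurable buffer timeout controls how long a partially filled buffer may wait before it is sent: a short timeout favors latency, a long one favors throughput, which lets one engine cover both stream-style and batch-style workloads.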
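The effect of a buffer timeout can be modeled in a few lines. This is a simplified model under stated assumptions, not Flink's actual network stack: a buffer is shipped when it is full or when the timeout has elapsed since its first record, and the function and parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def flush_times(
    arrivals: List[int], capacity: int, timeout: int
) -> List[Tuple[int, int]]:
    """Return (flush_time, record_count) for each buffer shipped downstream.

    A buffer is shipped when it holds `capacity` records or when `timeout`
    time units have passed since its first record arrived, whichever is first.
    """
    flushes: List[Tuple[int, int]] = []
    first_arrival: Optional[int] = None
    count = 0
    for t in arrivals:
        # Ship a buffer whose timeout expired before this record arrived.
        if count and t - first_arrival >= timeout:
            flushes.append((first_arrival + timeout, count))
            count = 0
        if count == 0:
            first_arrival = t
        count += 1
        if count == capacity:
            flushes.append((t, count))
            count = 0
    if count:
        # The last partial buffer leaves when its timeout expires.
        flushes.append((first_arrival + timeout, count))
    return flushes


arrivals = [0, 1, 2, 3, 50, 51]
throughput_oriented = flush_times(arrivals, capacity=4, timeout=10)
latency_oriented = flush_times(arrivals, capacity=100, timeout=2)
```

With the longer timeout, the burst of four records travels as one full buffer, but the two stragglers wait until time 60. With the short timeout, no record waits more than two time units, at the cost of sending three small buffers instead of two.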

Tencent Cloud’s Stream Computing service is built on Flink, offering a fully managed, elastic, and secure platform that integrates with other Tencent Cloud data products (storage, visualization, etc.). It provides features such as resource isolation, code encryption, fine‑grained monitoring, auto‑scaling, and advanced security checks.

Future directions focus on improving usability (more UDFs, drag‑and‑drop UI), security (dedicated clusters, VPC isolation), and intelligence (CEP, online machine learning).

In summary, as the demand for real‑time analytics grows across internet, finance, advertising, and IoT, stream computing—especially Flink—will play an increasingly critical role. Tencent Cloud’s Flink‑based service aims to provide the best developer experience for building real‑time data pipelines in the cloud.
