Tencent's Oceanus Real-Time Stream Computing Platform and Flink Optimizations
This article presents Tencent's evolution of real-time stream processing with Flink, the design of the Oceanus one-stop visual platform, and a series of deep extensions and optimizations aimed at supporting petabyte-scale data workloads, including a Web UI redesign, JobManager failover, checkpoint failure handling, enhanced windows, LocalKeyBy, idle detection, and log isolation.
In a QCon 2019 talk, Tencent senior big‑data engineer Yang Hua describes how Tencent adopted Apache Flink in 2017 to replace Storm for real‑time analytics, highlighting Storm's limitations such as lack of state, fault tolerance, and exactly‑once guarantees.
After early trials, Tencent built the Oceanus platform, first on Flink Standalone and later on Flink on YARN. It provides a unified environment for development, testing, deployment, and operations, and now handles peaks of 2.1 × 10⁸ events per second and processes roughly 20 trillion messages daily.
Oceanus supports three ways of building applications (a visual canvas, SQL, and Jar packages), together with integrated configuration, testing, and deployment tools and domain-specific services such as ETL, monitoring, and recommendation. Example workloads range from WeChat feed statistics to payment transaction aggregation.
The team has made extensive customizations to the community Flink version, including a redesigned Web UI with richer metrics, a hot‑standby JobManager failover that avoids full job restarts, a unified checkpoint‑failure manager with configurable tolerance, and enhanced window operators that support arbitrary‑delay events and incremental triggers.
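The "configurable tolerance" for checkpoint failures can be pictured as a counter of consecutive failed checkpoints that fails the job only once a threshold is exceeded. The following is a minimal sketch of that idea in plain Python, not Oceanus or Flink code. The class name `CheckpointFailureTracker`, the `tolerable_failures` parameter, and the reset-on-success policy are all assumptions made for illustration.

```python
class CheckpointFailureTracker:
    """Hypothetical sketch: decide when checkpoint failures should fail the job."""

    def __init__(self, tolerable_failures: int) -> None:
        if tolerable_failures < 0:
            raise ValueError("tolerable_failures must be >= 0")
        self.tolerable_failures = tolerable_failures
        self.consecutive_failures = 0

    def on_success(self) -> None:
        # Assumption: a completed checkpoint resets the failure streak.
        self.consecutive_failures = 0

    def on_failure(self) -> bool:
        """Record one failed checkpoint; return True if the job should fail."""
        self.consecutive_failures += 1
        return self.consecutive_failures > self.tolerable_failures


tracker = CheckpointFailureTracker(tolerable_failures=2)
outcomes = [tracker.on_failure(), tracker.on_failure()]  # still within tolerance
tracker.on_success()                                      # streak resets to zero
outcomes += [tracker.on_failure(), tracker.on_failure(), tracker.on_failure()]
print(outcomes)
```

With a tolerance of 2, only the third consecutive failure after the reset returns `True`. Handling every failure path through one such decision point is the appeal of a unified manager: the tolerance policy lives in a single place instead of being scattered across the places where checkpoints can fail.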
Additional optimizations include LocalKeyBy for hotspot mitigation, idle‑detection in watermark operators, and a log‑isolation mechanism that gives each job its own logging classloader and configuration, improving debugging in multi‑job Standalone deployments.
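The summary does not describe how LocalKeyBy works internally. One common way to mitigate hot keys, assumed here, is local pre-aggregation: each upstream subtask combines records per key before they are shuffled, so a hot key sends a few partial results downstream instead of every raw record. The sketch below illustrates that general technique in plain Python; the function names and the `flush_size` knob are hypothetical and are not part of Flink or Oceanus.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def local_pre_aggregate(
    records: Iterable[Tuple[str, int]], flush_size: int
) -> Iterator[Tuple[str, int]]:
    """Combine values per key on one upstream subtask and emit partial sums.

    flush_size (hypothetical) is the number of input records buffered
    before the partial sums are sent downstream.
    """
    buffer: Dict[str, int] = defaultdict(int)
    buffered = 0
    for key, value in records:
        buffer[key] += value
        buffered += 1
        if buffered >= flush_size:
            # Copy the items first so clearing the buffer is safe.
            yield from list(buffer.items())
            buffer.clear()
            buffered = 0
    # Emit whatever is left at end of input.
    yield from list(buffer.items())


def global_aggregate(partials: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Merge the partial sums after the keyed shuffle."""
    totals: Counter = Counter()
    for key, value in partials:
        totals[key] += value
    return dict(totals)


# Skewed input: the key "hot" dominates the stream.
records: List[Tuple[str, int]] = [("hot", 1)] * 8 + [("cold", 1)] * 2
partials = list(local_pre_aggregate(records, flush_size=5))
totals = global_aggregate(partials)
print(len(records), len(partials), totals)
```

Here ten input records shrink to three partial results before the shuffle, while the final totals are unchanged. A streaming implementation would also need to flush on a timer and before checkpoints so that buffered partial sums are not delayed indefinitely or lost; this sketch leaves that out.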
Some of these improvements have been contributed back to the Flink community, and Tencent invites contributors to help tackle trillion-scale data challenges.
This article has been distilled and summarized from source material, then republished for learning and reference.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
