Beike's Hermes Real‑Time Computing Platform: Architecture, Scale, and Future Roadmap
The article presents a comprehensive case study of Beike's Hermes real‑time computing platform, detailing its business evolution, Hermes architecture, SQL V1/V2 editors built on Spark and Flink, large‑scale deployment statistics, monitoring, diverse business use cases, and planned future enhancements.
Presented by Liu Liyun, head of real‑time computing at Beike, the talk introduces the Hermes platform that powers real‑time data processing for Beike's four core businesses: second‑hand housing, new housing, leasing, and decoration.
It outlines the business growth from the 2018 DP real‑time data bus to the development of Hermes, a unified task‑management platform that initially used Spark Structured Streaming (SQL V1) and later migrated to Flink (SQL V2) to support richer SQL syntax and custom functions.
Current deployment figures show Hermes supporting over 30 projects and 400 streaming tasks, processing up to 800 billion messages daily at an average latency of about 40 ms, and handling more than 1 trillion records per day in total.
The platform offers multi‑language task development (Java, Scala, Python), resource isolation per project, and a public queue for low‑resource tasks, along with comprehensive monitoring and alerting capabilities.
The Hermes architecture is layered: a computation engine layer (Flink and Spark Streaming); a functional component layer covering task, project, and data‑source management; and programming interfaces for StreamSQL, DataStream, and StreamCEP, which together enable features such as snapshot rollback.
Two visual SQL editors are described: SQL V1 (Spark‑based) with drag‑and‑drop UI and support for Kafka/Druid sinks, and SQL V2 (Flink‑based) offering source, sink, and dimension tables, automatic DDL generation, syntax checking, and task debugging.
The article also details the real‑time data warehouse built on top of SQL V2, its Kafka‑to‑Hive data flow, and the ability to query and analyze fresh data for business insights.
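As a sketch of the kind of DDL that SQL V2 generates for such a Kafka‑to‑Hive pipeline, a job might be declared along these lines (table names, topic, and connector options here are illustrative assumptions, not Beike's actual configuration):

```sql
-- Hypothetical source table over a Kafka topic; SQL V2 generates such DDL automatically.
CREATE TABLE kafka_orders (
    order_id   STRING,
    city       STRING,
    amount     DOUBLE,
    event_time TIMESTAMP(3)
) WITH (
    'connector' = 'kafka',
    'topic' = 'orders',
    'properties.bootstrap.servers' = 'kafka:9092',
    'format' = 'json'
);

-- Hypothetical partitioned sink table backing the Hive-side warehouse.
CREATE TABLE hive_orders (
    order_id STRING,
    city     STRING,
    amount   DOUBLE,
    dt       STRING
) PARTITIONED BY (dt);

-- Continuous query moving fresh Kafka data into the warehouse for analysis.
INSERT INTO hive_orders
SELECT order_id, city, amount, DATE_FORMAT(event_time, 'yyyy-MM-dd')
FROM kafka_orders;
```

Once the sink partitions are committed, downstream analysts can query near‑fresh data with ordinary Hive SQL, which is what enables the business insights described above.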
Several business cases are highlighted, including a real‑time transaction dashboard, broker itinerary monitoring, and real‑time user profiling that feeds recommendation engines.
Monitoring and alerting are implemented via custom listeners that collect Spark/Flink metrics, forward them to Kafka, and trigger latency or heartbeat alerts through Hermes.
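The alerting path described above (collect engine metrics, forward them, fire latency or heartbeat alerts) can be sketched in miniature. `MetricEvent`, `LatencyAlerter`, and the thresholds below are hypothetical stand‑ins for illustration, not Hermes' actual listener classes:

```python
from dataclasses import dataclass, field

@dataclass
class MetricEvent:
    """One metric record emitted by a Spark/Flink listener (hypothetical shape)."""
    task_id: str
    event_time_ms: int      # when the record was produced upstream
    processed_time_ms: int  # when the streaming job finished processing it

@dataclass
class LatencyAlerter:
    """Consumes metric events and raises latency / heartbeat alerts."""
    latency_threshold_ms: int = 1000
    heartbeat_timeout_ms: int = 30_000
    last_seen_ms: dict = field(default_factory=dict)
    alerts: list = field(default_factory=list)

    def on_metric(self, m: MetricEvent) -> None:
        # Record a heartbeat for the task, then check end-to-end latency.
        self.last_seen_ms[m.task_id] = m.processed_time_ms
        latency = m.processed_time_ms - m.event_time_ms
        if latency > self.latency_threshold_ms:
            self.alerts.append((m.task_id, "latency", latency))

    def check_heartbeats(self, now_ms: int) -> None:
        # A task that has reported nothing within the timeout is presumed stuck.
        for task_id, seen in self.last_seen_ms.items():
            if now_ms - seen > self.heartbeat_timeout_ms:
                self.alerts.append((task_id, "heartbeat", now_ms - seen))
```

In a real deployment the listener would publish these events to Kafka and a separate consumer would evaluate them; here both sides are collapsed into one object to keep the sketch self‑contained.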
Future development plans focus on dynamic resource allocation, event‑driven processing, a unified user data platform for real‑time analytics, and exploring Kappa architecture to unify stream and batch processing.