Design and Implementation of a Flink‑Based Real‑Time Data Platform at Autohome
This article describes how Autohome migrated its real‑time analytics from Storm to a Flink‑SQL platform, detailing the architectural design, development and operational advantages, practical use cases such as recommendation metrics, and future plans for ecosystem expansion and open‑source release.
Before 2019, most of Autohome's real‑time services ran on Storm, but growing data volume and the need to translate large offline SQL jobs into Storm code created high development and maintenance costs, as well as heavy reliance on external stores like Redis.
The team began evaluating Flink in 2018 and built a Flink‑SQL platform in early 2019. It now runs more than 120 online jobs across data‑warehouse, monitoring, and testing scenarios, sustaining QPS on the order of millions with low latency.
The platform offers five key benefits: (1) low development cost, because sources and sinks are declared as tables via DDL and the logic is written as SQL plus UDFs; (2) high performance, because Flink manages state in memory within the job, so intermediate results no longer round‑trip through Redis; (3) low maintenance cost, because SQL is easier to read than hand‑written Storm code; (4) native data‑lineage tracking, derived by parsing each job's SQL and DDL; and (5) support for layered data‑warehouse models.
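Benefit (4) can be made concrete with a small sketch. The article does not say how the platform parses its jobs, so the regex approach, the function name `extract_lineage`, and every table and column name below are assumptions for illustration only; the DDL is also abbreviated and omits the connector options a real job would need.

```python
import re

# Hypothetical job text: table names, columns, and connector options are
# illustrative and abbreviated, not taken from the Autohome platform.
JOB_SQL = """
CREATE TABLE user_behavior_log (item_id STRING, action STRING) WITH ('connector' = 'kafka');
CREATE TABLE item_metrics (item_id STRING, clicks BIGINT) WITH ('connector' = 'jdbc');
INSERT INTO item_metrics
SELECT item_id, COUNT(*) AS clicks
FROM user_behavior_log
WHERE action = 'click'
GROUP BY item_id;
"""


def extract_lineage(sql_text):
    """Return sorted (source_table, sink_table) edges from INSERT ... SELECT statements."""
    edges = set()
    for statement in sql_text.split(";"):
        # Only INSERT statements move data, so DDL statements are skipped.
        sink = re.search(r"\bINSERT\s+INTO\s+(\w+)", statement, re.IGNORECASE)
        if not sink:
            continue
        # Every table named after FROM or JOIN feeds the sink table.
        sources = re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", statement, re.IGNORECASE)
        for source in sources:
            edges.add((source, sink.group(1)))
    return sorted(edges)


if __name__ == "__main__":
    for source, sink in extract_lineage(JOB_SQL):
        print(f"{source} -> {sink}")
```

A regex is enough for this toy input, but it would misread subqueries, comments, and quoted identifiers; a production system would more plausibly walk the tree produced by a real SQL parser.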
Core architectural components include table management, which abstracts Kafka, MySQL, Elasticsearch, Redis, and HTTP endpoints as relational tables, and task configuration, which covers the job, launch, and cluster settings for per‑job Flink clusters. Multi‑tenant permission control applies at both the job and table levels, alongside limited UDF management. Resources are scheduled on YARN, with a future shift toward Kubernetes. Logs are collected through a customized Flume Log4j appender, and monitoring and alerting rely on the Prometheus push‑gateway and Grafana dashboards.
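To show what a push to the Prometheus push‑gateway involves, here is a minimal sketch that builds the request path and the text‑format body without sending anything. The helper names, the job name, the label, and the metric are all hypothetical, and the article does not describe how the platform's jobs actually report their metrics.

```python
from urllib.parse import quote


def pushgateway_path(job_name, labels=None):
    """Build the push path: /metrics/job/<job>{/<label>/<value>}."""
    path = "/metrics/job/" + quote(job_name, safe="")
    # Grouping labels are appended as alternating name/value path segments.
    for name, value in sorted((labels or {}).items()):
        path += "/" + quote(name, safe="") + "/" + quote(str(value), safe="")
    return path


def exposition_body(metrics):
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The format requires a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job name, label, and metric, for illustration only.
    print(pushgateway_path("recommend_metrics", {"instance": "job-1"}))
    print(exposition_body({"records_lag": 12}), end="")
```

A client would send the body with an HTTP PUT or POST to the gateway host followed by this path; Prometheus then scrapes the gateway, and Grafana dashboards query Prometheus.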
Practical usage is illustrated by a real‑time recommendation metric system that ingests JSON user‑behavior logs from Kafka, cleanses them into wide tables, and computes per‑item exposure and click counts using simple SQL statements, with results written to various sinks.
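The shape of that pipeline can be sketched in a few lines. The example below uses Python's built‑in `sqlite3` as a batch stand‑in for Flink SQL, so it shows only the cleansing step and the aggregation query, not streaming semantics. The field names (`user_id`, `item_id`, `action`), the table name `behavior_wide`, and the sample events are assumptions, not the article's actual schema.

```python
import json
import sqlite3

# Hypothetical raw log lines, including two records that cleansing should drop.
RAW_LOGS = [
    '{"user_id": "u1", "item_id": "a", "action": "exposure"}',
    '{"user_id": "u2", "item_id": "a", "action": "exposure"}',
    '{"user_id": "u1", "item_id": "a", "action": "click"}',
    '{"user_id": "u1", "item_id": "b", "action": "exposure"}',
    'not json',
    '{"item_id": "b", "action": "share"}',
]


def compute_item_metrics(raw_logs):
    """Cleanse JSON logs into a wide table, then count exposures and clicks per item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE behavior_wide (user_id TEXT, item_id TEXT, action TEXT)")
    rows = []
    for line in raw_logs:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop malformed lines
        if not isinstance(event, dict):
            continue
        # Keep only events that carry an item and a known action.
        if event.get("item_id") and event.get("action") in ("exposure", "click"):
            rows.append((event.get("user_id"), event["item_id"], event["action"]))
    conn.executemany("INSERT INTO behavior_wide VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT item_id,
               SUM(CASE WHEN action = 'exposure' THEN 1 ELSE 0 END) AS exposures,
               SUM(CASE WHEN action = 'click' THEN 1 ELSE 0 END) AS clicks
        FROM behavior_wide
        GROUP BY item_id
        ORDER BY item_id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for item_id, exposures, clicks in compute_item_metrics(RAW_LOGS):
        print(item_id, exposures, clicks)
```

In a streaming job the same `GROUP BY` would run as a continuous query whose results update as new events arrive, and the output would go to a sink table rather than a Python list.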
The conclusion highlights that the platform enables rapid, accurate real‑time metric development with minimal learning curve, solving the previous pain points of SQL‑to‑code translation and heavy external‑store dependence.
Future plans include expanding wide tables with more business partners, integrating additional storage systems, open‑sourcing the platform for companies lacking real‑time expertise, and scaling the team.
