Design and Evolution of Lianjia's Big Data Platform: Architecture, Challenges, and Solutions
This article details Lianjia's journey from a Hadoop‑based 0.0 data platform to a sophisticated 2.0 architecture, describing the three‑layer design, OLAP engine choices, transparent compression techniques, operational challenges, and practical recommendations for building and maintaining large‑scale big data systems.
Zhao Guoxian, leader of the Lianjia big data architecture team, introduces the evolution of the company's data platform from its initial Hadoop‑centric 0.0 version to the current 2.0 architecture, highlighting the need for performance optimization, distributed storage, and real‑time processing.
The platform is organized into three layers: the cluster layer (Hadoop, YARN, Spark, Presto, HBase, Oozie) provides distributed storage, resource scheduling, and compute engines; the tool‑chain layer features a self‑developed scheduler, metadata management (Meta), and an intelligent query engine that selects the most suitable engine (Presto, SparkSQL, Hive) based on SQL analysis; the API layer abstracts data access for internal analytics, business services, and generic consumption.
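The engine-selection step in the tool-chain layer can be sketched as a small rule-based router. The following is a minimal illustration only, assuming routing on the SQL text and an estimated input size; the function name, thresholds, and heuristics are hypothetical, not Lianjia's actual implementation.

```python
def choose_engine(sql: str, input_rows: int) -> str:
    """Pick an execution engine for a query (illustrative heuristics).

    Assumed rules:
      - writes and very large batch jobs -> Hive (disk-based, degrades gracefully)
      - medium analytical scans         -> SparkSQL (in-memory execution)
      - small interactive aggregations  -> Presto (low-latency MPP)
    """
    s = sql.strip().lower()
    # ETL-style statements and huge scans are safest on Hive.
    if s.startswith(("insert", "create table")) or input_rows > 10_000_000_000:
        return "hive"
    # Mid-sized analytical queries benefit from Spark's in-memory model.
    if input_rows > 100_000_000:
        return "sparksql"
    # Everything else gets Presto's interactive latency.
    return "presto"
```

A real router would also consult metadata (partition sizes, engine health) rather than a single row-count estimate, but the shape of the decision is the same.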
Key challenges included a tightly coupled architecture, long development cycles driven by ad-hoc data requests, frequent failures in Hive/SQL jobs, and the risk of big-data engineers being reduced to mere data-retrieval clerks. To address these, Lianjia introduced a unified scheduling system, dependency visualization, and a middleware layer that routes each query to the optimal engine.
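At the core of any unified scheduler is dependency resolution over a job DAG. As a minimal sketch (using Kahn's topological-sort algorithm; the function and job names are illustrative, not the self-developed scheduler's API):

```python
from collections import deque

def schedule_order(jobs, deps):
    """Return a run order for jobs, given deps as (upstream, downstream) pairs.

    A minimal Kahn's-algorithm sketch of the dependency resolution a unified
    scheduler performs; raises on cycles so a bad DAG fails fast instead of
    hanging downstream jobs.
    """
    indegree = {j: 0 for j in jobs}
    children = {j: [] for j in jobs}
    for up, down in deps:
        children[up].append(down)
        indegree[down] += 1
    ready = deque(j for j in jobs if indegree[j] == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in children[job]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(jobs):
        raise ValueError("cycle detected in job dependencies")
    return order
```

The same adjacency structure that drives scheduling can feed the dependency-visualization front end, which is one reason to centralize DAG metadata in one system.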
For OLAP processing, the article compares ROLAP (real‑time aggregation on raw data) and MOLAP (pre‑computed cubes). After evaluating options, the team selected Apache Kylin for its high concurrency and sub‑second query performance on billions of rows, complemented by Druid for real‑time ingestion and a hybrid OLAP approach that routes queries to the appropriate engine.
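The hybrid routing idea can be summarized in a few lines. This is a simplified sketch under stated assumptions: a query whose dimensions are covered by a pre-built Kylin cube is served MOLAP-style, real-time queries go to Druid, and anything else falls back to ROLAP aggregation on raw data; the function signature is hypothetical.

```python
def route_olap(query_dims, needs_realtime, cubes):
    """Route an OLAP query in a hybrid MOLAP/ROLAP setup (illustrative).

    cubes: list of dimension sets that have been pre-computed as Kylin cubes.
      - real-time freshness requirements go to Druid (streaming ingestion);
      - a cube covering all queried dimensions answers from Kylin, sub-second;
      - otherwise, aggregate raw data on demand with Presto (ROLAP).
    """
    if needs_realtime:
        return "druid"
    if any(set(query_dims) <= cube for cube in cubes):
        return "kylin"
    return "presto"
```

The trade-off this encodes is the classic MOLAP/ROLAP one: pre-computation buys concurrency and latency at the cost of cube-build time and storage, while on-demand aggregation stays flexible but slower.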
To curb rapid data growth and storage costs, a transparent compression strategy was implemented. Cold data is migrated from HDFS to a ZFS file system using gzip compression, while hot data remains on SSD or traditional disks. The solution includes hot‑cold tiering, ZFS features (ARC/L2ARC), and a migration workflow that periodically moves identified cold data.
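The periodic migration step reduces, in essence, to "find files untouched for N days and move them onto the compressed tier." A simplified local-filesystem sketch follows; the 90-day threshold is an assumption, and a real pipeline would move HDFS data rather than local files, with compression handled transparently by the target ZFS dataset (e.g. one configured with gzip compression) instead of by this script.

```python
import os
import shutil
import time

COLD_AFTER = 90 * 24 * 3600  # assumed threshold: not accessed for 90 days

def migrate_cold(src_dir, zfs_dir, now=None):
    """Move files whose last access is older than COLD_AFTER seconds
    from src_dir (hot tier) to zfs_dir (gzip-compressed ZFS mount).

    Returns the list of migrated file names. Illustrative only: error
    handling, atomicity, and HDFS specifics are omitted.
    """
    now = time.time() if now is None else now
    moved = []
    for name in os.listdir(src_dir):
        path = os.path.join(src_dir, name)
        # st_atime is the last-access timestamp; stale files are "cold".
        if os.path.isfile(path) and now - os.stat(path).st_atime > COLD_AFTER:
            shutil.move(path, os.path.join(zfs_dir, name))
            moved.append(name)
    return moved
```

Because the compression lives in the filesystem layer, readers that follow data onto the cold tier see ordinary files; that transparency is what lets the tiering stay invisible to upstream jobs.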
Future work aims to offload compression to hardware accelerators (QAT), combine erasure coding with compression for higher reliability, and implement intelligent hot‑data warming using SSD caches. The article concludes with practical advice: perform thorough requirement analysis and technology selection, maintain stable iterative development, prioritize monitoring, and continuously optimize online performance.
Beike Product & Technology
As Beike's official product and technology account, we are committed to building a platform for sharing Beike's product and technology insights, targeting internet/O2O developers and product professionals. We share high-quality original articles, tech salon events, and recruitment information weekly. Welcome to follow us.