Design and Implementation of a Serverless Data Filling Engine for UnifiedPB in Ctrip Hotel Recommendation System
This article describes how Ctrip's hotel recommendation team built a serverless, configuration‑driven data‑filling engine on top of UnifiedPB protobuf schemas. The engine improves development efficiency, reduces cost, safeguards data quality, and delivers consistent data to the domestic, overseas, and IBU environments across more than twenty recommendation scenarios.
Background and Motivation Ctrip’s hotel ranking recommendation system introduced a unified data protocol (UnifiedPB), built on protobuf, to standardize online, near‑line, and offline data streams across user, item, user‑item, and common dictionaries. While UnifiedPB solved many data‑integration issues, the data‑filling process still suffered from slow iteration, poor reuse, opaque coupling between strategy and data logic, uncontrolled cost, and inconsistent data across the domestic, overseas, and IBU environments (the "three ends").
Problems Identified The team listed five major pain points: (1) slow iteration caused by case‑by‑case, hard‑coded development (e.g., 60 features required eight days); (2) low reuse across the three ends and across scenarios; (3) tightly coupled, opaque strategy and data logic; (4) an uncontrolled data version lifecycle that led to redundant storage; (5) no unified data shared by the three recommendation ends.
Solution Overview Leveraging a serverless mindset, a filling‑engine framework was designed to isolate logic from resources. The framework consists of three core modules—Bin, Conf, and Data—driven by configuration files stored in an internal qconfig system. Strategy developers write low‑cost SQL logic, while engineers maintain the framework, enabling one‑stop, automated data filling for UnifiedPB.
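As a rough illustration of the configuration‑driven idea, the sketch below runs a SQL statement taken from a configuration entry and keys the result by a configured primary key. Everything concrete here is an assumption made for the example: the configuration keys (`primary_key`, `sql`, `field_mapping`), the `reviews` table, the target field names, and the use of SQLite as a stand‑in data source. The article does not describe the real qconfig schema or the storage behind it.

```python
import sqlite3

# Hypothetical configuration entry; the actual qconfig format is not given in the article.
FILL_CONF = {
    "scenario": "hotel_list_rank",
    "primary_key": "hotel_id",
    "sql": (
        "SELECT hotel_id, AVG(score) AS avg_score, COUNT(*) AS review_cnt "
        "FROM reviews GROUP BY hotel_id"
    ),
    # Maps SQL result columns to (assumed) target field paths in the unified schema.
    "field_mapping": {"avg_score": "item.avg_score", "review_cnt": "item.review_cnt"},
}


def fill(conn, conf):
    """Run the configured SQL and return mapped fields keyed by the primary key."""
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    filled = {}
    for row in conn.execute(conf["sql"]):
        key = row[conf["primary_key"]]
        filled[key] = {
            target: row[column] for column, target in conf["field_mapping"].items()
        }
    return filled


# In-memory SQLite database standing in for the real feature source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (hotel_id INTEGER, score REAL)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)", [(1, 4.0), (1, 5.0), (2, 3.0)]
)
result = fill(conn, FILL_CONF)
```

In this arrangement the strategy side only edits the SQL and the mapping, while the `fill` function stays unchanged, which is the separation of logic from framework that the section describes.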
Implementation Details The Bin module includes CodeGen, DataLoad, Transform, and ConfListen components, using bytecode and reflection to generate dynamic classes at initialization. The Data module produces feature data from strategy code, and the Conf module acts as a bridge, automatically generating configuration files via a web UI. The architecture ensures that changes propagate through configuration updates, supporting real‑time class construction and seamless data filling.
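The interplay of CodeGen, Transform, and ConfListen can be sketched in a few lines. This is not the team's implementation: the article says the Bin module uses bytecode and reflection, whereas the example below substitutes Python's `dataclasses.make_dataclass` for runtime class construction. The configuration keys (`class_name`, `fields`), the type names, and the `FillingEngine` class are all hypothetical.

```python
from dataclasses import asdict, field, make_dataclass

# Assumed mapping from configured type names to Python types.
TYPE_MAP = {"int": int, "float": float, "string": str}


def code_gen(conf):
    """CodeGen stand-in: build a record class at runtime from configured fields."""
    fields = [
        (name, TYPE_MAP[type_name], field(default=None))
        for name, type_name in conf["fields"].items()
    ]
    return make_dataclass(conf["class_name"], fields)


class FillingEngine:
    def __init__(self, conf):
        self.on_conf_change(conf)

    def on_conf_change(self, conf):
        """ConfListen stand-in: rebuild the generated class when the config changes."""
        self.conf = conf
        self.record_cls = code_gen(conf)

    def transform(self, raw_row):
        """Transform stand-in: copy only configured fields into a generated record."""
        known = {k: v for k, v in raw_row.items() if k in self.conf["fields"]}
        return self.record_cls(**known)


engine = FillingEngine(
    {"class_name": "HotelItemV1", "fields": {"hotel_id": "int", "avg_score": "float"}}
)
rec_v1 = engine.transform({"hotel_id": 1, "avg_score": 4.5, "ignored": "x"})

# A configuration update adds a field; no engine code changes are needed.
engine.on_conf_change(
    {
        "class_name": "HotelItemV2",
        "fields": {"hotel_id": "int", "avg_score": "float", "star": "int"},
    }
)
rec_v2 = engine.transform({"hotel_id": 1, "avg_score": 4.5, "star": 5})
```

The point of the sketch is that adding the `star` field requires only a configuration update: the listener rebuilds the record class and subsequent rows are filled into the new shape.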
Quality Assurance A comprehensive pipeline guarantees safe releases: automated tests (rule validation, unit test cases, batch compatibility tests, performance tests) before deployment; monitoring of service performance and data metrics during and after release; and a one‑click rollback mechanism. All changes are gray‑released, observable, and reversible.
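A minimal sketch of how rule validation, gray release, and rollback might fit together is shown below. The validation rules, the `ReleaseManager` class, and the hash‑based traffic bucketing are assumptions chosen for illustration; the article does not specify how its pipeline implements these steps.

```python
import hashlib


def validate(conf):
    """Rule validation: return a list of violations (empty means the config passes)."""
    errors = []
    if not conf.get("primary_key"):
        errors.append("primary_key is required")
    if not conf.get("sql", "").lstrip().upper().startswith("SELECT"):
        errors.append("sql must be a SELECT statement")
    return errors


class ReleaseManager:
    def __init__(self, stable_conf):
        self.history = [stable_conf]  # stable versions, newest last
        self.candidate = None         # version currently in gray release
        self.gray_percent = 0

    def start_gray(self, conf, percent):
        """Validate a candidate config, then expose it to a share of traffic."""
        errors = validate(conf)
        if errors:
            raise ValueError("; ".join(errors))
        self.candidate, self.gray_percent = conf, percent

    def conf_for(self, request_id):
        """Deterministically route a fixed share of requests to the candidate."""
        bucket = int(hashlib.md5(request_id.encode()).hexdigest(), 16) % 100
        if self.candidate is not None and bucket < self.gray_percent:
            return self.candidate
        return self.history[-1]

    def promote(self):
        """Make the candidate the new stable version."""
        self.history.append(self.candidate)
        self.candidate, self.gray_percent = None, 0

    def rollback(self):
        """Drop the candidate, or revert to the previous stable version."""
        if self.candidate is not None:
            self.candidate, self.gray_percent = None, 0
        elif len(self.history) > 1:
            self.history.pop()
```

Because the bucket is derived from a hash of the request identifier, a given request stays on the same side of the gray split across calls, which keeps the metrics observed during the release comparable.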
Visualization Platform A web‑based platform provides three roles—Strategy, Engineering, and Platform. Strategy users define data sources, import features, write SQL, and set primary keys; Engineering users review, gray‑release, and run the pipeline; the Platform enforces standards and visualizes the workflow.
Results and Impact The engine has been rolled out to more than twenty hotel recommendation scenarios across all three ends, achieving 100% coverage. Iteration time for new features dropped from days or weeks to hours, an efficiency improvement of more than 90%. Feature‑data switching that previously took up to 15 days now completes within a day. Sharing one set of data across the three ends eliminates duplication, reduces storage costs, and makes the logic transparent, which speeds up troubleshooting.
Conclusion and Outlook After a year of development, the filling engine provides a robust, low‑code, automated pipeline that accelerates feature delivery, controls costs, and ensures data consistency. Future work will extend the platform to other business units and continue optimizing performance, cost, and scalability.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.