Data Serviceization at JD: From Zero to One and Beyond
This article presents JD's data service platform, describing its origin, performance optimizations, flexible API generation, caching strategies, service orchestration, and governance, and includes a Q&A that addresses security, performance, and multi‑source data handling challenges.
Introduction – JD's Data Intelligence Department built a data service platform (EZD framework) to let engineers generate open data APIs by simply providing SQL, dramatically reducing development cycles for high‑frequency business demands such as the 618 promotion.
1. Origin: Data Serviceization from 0 to 1 – Business units required fast, open data APIs; the traditional two‑week development cycle was too slow. The EZD framework enables one‑click API generation from SQL while ensuring performance and dynamic parameters.
The solution replaces the traditional fixed API development model: data sources are read via JDBC, then exposed via HTTP or RPC. Engineers fill in SQL, click publish, and the system hot‑deploys the API.
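The publish-then-serve flow described above can be sketched as a registry that maps an API path to SQL text. This is only an illustration under stated assumptions: `API_REGISTRY`, `publish`, and `call_api` are hypothetical names, `sqlite3` stands in for a JDBC data source, and the HTTP or RPC layer is omitted.

```python
import sqlite3

# Hypothetical in-memory registry: API path -> SQL text.
API_REGISTRY = {}

def publish(path, sql):
    """Register (or replace) an API at runtime, without a restart."""
    API_REGISTRY[path] = sql

def call_api(conn, path, params=None):
    """Look up the SQL for a path, run it, and return rows as dicts."""
    sql = API_REGISTRY[path]
    cur = conn.execute(sql, params or ())
    cols = [c[0] for c in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

# Demo data source (sqlite3 is an assumption, used so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])

publish("/orders/total", "SELECT SUM(amount) AS total FROM orders")
result = call_api(conn, "/orders/total")
```

Because publishing only writes to the registry, a new or changed API becomes callable immediately, which is the property the article calls hot deployment.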
2. Interface Performance – Early versions suffered from high latency because the SQL definition was looked up again on every request. Caching SQL definitions in an in‑memory routing table and switching to the Hikari connection pool reduced the platform’s share of request time from 97% to 1%.
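The routing-table idea can be illustrated with a small cache in front of the definition store. This is a sketch, not the platform's implementation: `DefinitionStore` and `RoutingTable` are hypothetical names, and the connection-pool change (Hikari is a Java pool) is not shown.

```python
class DefinitionStore:
    """Stand-in for the metadata database that holds API definitions."""

    def __init__(self, definitions):
        self._definitions = definitions
        self.lookups = 0  # counts how often the slow store is hit

    def fetch(self, path):
        self.lookups += 1
        return self._definitions[path]


class RoutingTable:
    """In-memory map from API path to SQL, filled on first use."""

    def __init__(self, store):
        self._store = store
        self._routes = {}

    def resolve(self, path):
        sql = self._routes.get(path)
        if sql is None:
            # Only the first request for a path reaches the store.
            sql = self._store.fetch(path)
            self._routes[path] = sql
        return sql

    def refresh(self, path):
        """Drop a cached route so the next call reloads it, e.g. after a republish."""
        self._routes.pop(path, None)


store = DefinitionStore({"/sales": "SELECT 1"})
table = RoutingTable(store)
for _ in range(1000):
    table.resolve("/sales")
lookups_after_1000_calls = store.lookups
```

In this sketch a thousand requests cost one store lookup; a republish would call `refresh` so the next request picks up the new SQL.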
3. Interface Flexibility – Parameters are injected into SQL using colon syntax, and dynamic query conditions are expressed with FreeMarker templates (if, switch, and loop directives), which reduced 80 APIs to 5.
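To make the two mechanisms concrete, the sketch below binds colon-style named parameters and adds WHERE clauses only for the filters a caller supplies. It is an assumption-laden stand-in: plain Python string assembly replaces the FreeMarker templates, `sqlite3` replaces the real data source, and the `products` table and filter names are invented for illustration.

```python
import sqlite3

# Whitelisted optional conditions; values are bound via :name placeholders.
ALLOWED = {
    "category": "category = :category",
    "min_price": "price >= :min_price",
}

def build_query(filters):
    """Append one WHERE clause per filter that was actually supplied."""
    clauses = [ALLOWED[k] for k in ALLOWED if filters.get(k) is not None]
    sql = "SELECT name FROM products"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY name"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("phone", "digital", 3000.0), ("cable", "digital", 20.0), ("rice", "food", 50.0)],
)

def query(filters):
    """Run the generated SQL, binding only the whitelisted parameters."""
    params = {k: filters[k] for k in ALLOWED if filters.get(k) is not None}
    return [row[0] for row in conn.execute(build_query(filters), params)]

everything = query({})
digital = query({"category": "digital"})
pricey_digital = query({"category": "digital", "min_price": 100})
```

One definition serves all three filter combinations, which is how a template-driven API can replace several fixed ones. Only the clause text is assembled dynamically; values always go through bound parameters.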
4. From 1 to 10: Scaling the Platform – During JD’s 618 promotion, dozens of metrics were served via the data APIs. The platform supports rapid metric development, one‑click cache addition, and integration with various storage systems (Elasticsearch‑SQL, Redis, HBase).
4.1 NoSQL Storage API Generation – Elasticsearch‑SQL executes SQL on ES; Redis supports KV reads/writes; HBase supports get/scan operations.
4.2 One‑Click Caching – Two cache modes: passive (created on first request) and active (periodic refresh). Passive caching handles dynamic parameters but may cause QPS spikes; active caching eliminates spikes by pre‑loading data.
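The difference between the two modes can be sketched as follows. Class names, the TTL parameter, and the `slow_query` loader are all hypothetical; a real deployment would use a shared cache and a scheduler rather than in-process dictionaries.

```python
import time

class PassiveCache:
    """Fills an entry on the first request for a key and reuses it until it expires."""

    def __init__(self, loader, ttl_seconds):
        self._loader = loader
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        value = self._loader(key)  # a miss falls through to the data source
        self._entries[key] = (value, now + self._ttl)
        return value


class ActiveCache:
    """Preloads a fixed set of keys; requests never reach the data source."""

    def __init__(self, loader, keys):
        self._loader = loader
        self._keys = list(keys)
        self._entries = {}

    def refresh(self):
        """Reload every known key; a scheduler would call this periodically."""
        for key in self._keys:
            self._entries[key] = self._loader(key)

    def get(self, key):
        return self._entries[key]


calls = []

def slow_query(key):
    """Hypothetical loader that records each trip to the data source."""
    calls.append(key)
    return key.upper()

passive = PassiveCache(slow_query, ttl_seconds=60)
first = passive.get("gmv")
second = passive.get("gmv")  # served from cache
passive_loads = len(calls)

active = ActiveCache(slow_query, keys=["gmv", "uv"])
active.refresh()
value = active.get("uv")
```

The passive cache accepts any key, so it suits dynamic parameters, but every miss or expiry reaches the data source. The active cache needs its key set known in advance, and in exchange all source load comes from `refresh`.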
5. Service Orchestration – Complex business logic is encapsulated in workflow‑driven APIs, allowing conditional branching, debugging, and automatic execution without manual intervention.
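A workflow-driven API with conditional branching can be sketched as named steps that each return the name of the next step. The runner, the step functions, and the VIP rule below are illustrative assumptions, not the platform's workflow engine.

```python
def run_workflow(steps, start, context):
    """Run steps until one returns None; return the list of steps executed."""
    trace = []
    current = start
    while current is not None:
        trace.append(current)  # the trace doubles as a debugging aid
        current = steps[current](context)
    return trace

def load_user(ctx):
    # Hypothetical rule: users 1 and 2 are VIPs.
    ctx["is_vip"] = ctx["user_id"] in {1, 2}
    return "vip_offer" if ctx["is_vip"] else "standard_offer"

def vip_offer(ctx):
    ctx["discount"] = 0.2
    return None  # end of workflow

def standard_offer(ctx):
    ctx["discount"] = 0.05
    return None

STEPS = {
    "load_user": load_user,
    "vip_offer": vip_offer,
    "standard_offer": standard_offer,
}

vip_ctx = {"user_id": 1}
vip_trace = run_workflow(STEPS, "load_user", vip_ctx)

regular_ctx = {"user_id": 9}
regular_trace = run_workflow(STEPS, "load_user", regular_ctx)
```

The caller sees a single API call, while the branch taken is decided inside the workflow from the data in the context.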
6. Governance of Data Services – Data services involve three roles: producers, consumers, and governance. A service market organizes services into layers (entity, interaction, application). Governance enforces policies, service classification, quality control, and release checkpoints.
Q&A
Q1: Does high flexibility affect security or performance? A1: Data sources have owners who authorize API groups; performance bottlenecks lie in the underlying databases. The platform provides distributed rate‑limiting and circuit‑breaking to protect stability.
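As an illustration of the rate-limiting half of that answer, here is a single-process token bucket. It is a simplified stand-in: the platform's limiter is described as distributed, which would require shared state that this sketch does not model, and the class and parameter names are hypothetical.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock  # injectable so the demo is deterministic
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never beyond capacity.
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


now = [0.0]  # fake clock, advanced by hand
bucket = TokenBucket(rate=2, capacity=2, clock=lambda: now[0])

burst = [bucket.allow() for _ in range(3)]  # third call exceeds the burst size
now[0] = 0.5                                # half a second refills one token
after_wait = bucket.allow()
```

Rejected calls never reach the underlying database, which is where the answer above locates the real bottleneck.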
Q2: Can a MySQL‑based API be switched to ClickHouse or another SQL source? A2: Yes, if the SQL is compatible with the target engine; otherwise, conversion is needed.
Thank you for attending; follow the DataFunTalk channel for more technical content.