How JD’s Data Lake Uses Hudi LSM‑Tree to Power Near‑Real‑Time Data Assets
This article describes JD’s data lake architecture, which operates at 500 PB scale, and JD’s self‑developed Hudi extensions: LSM‑Tree‑based Merge‑on‑Read (MoR) tables, custom indexing, IO optimizations, Flink stream scheduling, and the NativeIO SDK. It also covers benchmarks, community contributions, and the roadmap for real‑time big‑data processing.
