vivo Internet Technology
Dec 13, 2023 · Big Data
Hudi Data Lake Implementation and Optimization Practice at vivo
vivo's big-data team deployed Apache Hudi to build a lakehouse that unifies streaming and batch workloads. The team uses both Copy-on-Write (COW) and Merge-on-Read (MOR) table types, automates small-file clustering and compaction, and applies extensive version, streaming, batch, and lifecycle optimizations. The result is minute-level latency, ingestion at the scale of a hundred million records per minute, and queries up to 20% faster than Hive.
Apache Hudi · Batch Processing · Big Data