Migrating Spark Offline Computing to Kubernetes: Architecture, Optimizations, and Lessons Learned
Youzan migrated its large-scale offline Spark workloads from YARN to a cloud-native Kubernetes architecture. The migration separated storage from compute using CephFS, added dynamic executor allocation and remote shuffle services, and applied numerous Spark and deployment tweaks. The result was elastic scaling, higher resource utilization, and lower costs, along with valuable operational lessons.
