Meituan Technology Team
Nov 10, 2022 · Big Data
Optimizing Spark mapPartitions: Memory Management and Best Practices
This article describes how Meituan's Turing machine-learning platform cut offline resource use by 80% and task time by 63% through memory-level optimizations such as column pruning and adaptive caching. It then takes a deep dive into Spark's mapPartitions operator, covering source-code analysis and GC behavior, and closes with a low-memory batch-iterator best practice.
Big Data · Memory Optimization · Spark
19 min read
