Tagged articles
2 articles
Meituan Technology Team
Nov 10, 2022 · Big Data

Optimizing Spark mapPartitions: Memory Management and Best Practices

The article details how Meituan's Turing machine-learning platform cut offline resource use by 80% and task time by 63% through memory-level techniques such as column pruning and adaptive caching. It also takes a deep dive into Spark's mapPartitions operator, covering source-code analysis, GC behavior, and a low-memory batch-iterator best practice.
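The summary does not show the article's own code, so the following is only a minimal sketch of what a low-memory batch iterator for mapPartitions might look like. It assumes the goal is to avoid materializing a whole partition at once. The names `batched_partition`, `process_batch`, and `batch_size` are hypothetical and do not come from the article.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def batched_partition(
    process_batch: Callable[[List[T]], Iterable[U]],
    batch_size: int,
) -> Callable[[Iterable[T]], Iterator[U]]:
    """Hypothetical helper: wrap a per-batch function so it can be applied
    to a partition iterator while holding at most one batch in memory."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    def run(rows: Iterable[T]) -> Iterator[U]:
        it = iter(rows)
        while True:
            # Pull at most batch_size rows; earlier batches can be garbage-collected.
            batch = list(islice(it, batch_size))
            if not batch:
                return
            # Yield results lazily instead of accumulating them in a list.
            yield from process_batch(batch)

    return run
```

In PySpark, the returned function could be passed as `rdd.mapPartitions(batched_partition(score, 1000))`, where `score` is a placeholder for whatever per-batch work is needed. The contrast is with calling `list(rows)` on the partition iterator, which keeps every row of the partition alive at the same time.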

Big Data · Memory Optimization · Spark
19 min read