Spark Job Execution Principles and Parameter Tuning for Hive on Spark
This article explains how Spark jobs run on YARN, describes the impact of stages, shuffle and task parallelism, and provides detailed recommendations for tuning Spark executor, memory, core, and parallelism settings to dramatically improve Hive‑on‑Spark TPCx‑BB benchmark performance on large datasets.
When running Hive on Spark with the TPCx-BB benchmark on a 100 GB dataset, the author observed that the POWER_TEST phase used only about 10 % of CPU, causing the job to take many hours; therefore, proper parameter tuning is essential to fully utilize cluster resources.
After submitting a Spark job with spark-submit, a Driver process is started (locally or on a cluster node) which requests Executor processes from the cluster manager (YARN in the author's environment). Each Executor receives a configurable amount of memory and CPU cores, and tasks are scheduled across these cores. Spark divides the job into stages based on shuffle operators (e.g., reduceByKey, join); each stage’s tasks may need to fetch data from the previous stage via network shuffle.
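The stage-splitting rule described above can be sketched as a toy model. This is not Spark's DAG scheduler: it assumes a simple linear chain of operators, and both `SHUFFLE_OPS` and `split_into_stages` are hypothetical names introduced for illustration only.

```python
# Toy model: split a linear chain of operators into stages at shuffle boundaries.
# SHUFFLE_OPS is an illustrative subset of shuffle-inducing operators, not a complete list.
SHUFFLE_OPS = {"reduceByKey", "join", "groupByKey", "repartition"}

def split_into_stages(operators):
    """Return a list of stages; each shuffle operator starts a new stage."""
    stages = [[]]
    for op in operators:
        if op in SHUFFLE_OPS and stages[-1]:
            # Shuffle boundary: the next stage must fetch its input
            # from the previous stage's output over the network.
            stages.append([])
        stages[-1].append(op)
    return stages

pipeline = ["textFile", "map", "filter", "reduceByKey", "map", "join", "saveAsTextFile"]
stages = split_into_stages(pipeline)
for index, stage in enumerate(stages):
    print(f"stage {index}: {stage}")
```

In this simplified view, every additional shuffle operator adds one stage and one round of network transfer, which is why shuffle-heavy queries tend to be the most sensitive to tuning.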
Task throughput is directly tied to the number of CPU cores allocated to each Executor, because each core runs one task thread at a time. An Executor with N cores can therefore run up to N tasks concurrently, so properly sizing executors and cores is what enables efficient parallel execution.
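As a rough illustration of the relationship between cores and parallel tasks, the arithmetic can be written out as follows. This is a minimal sketch that assumes one core per task (Spark's default `spark.task.cpus=1`); the function names `task_slots` and `waves` are hypothetical.

```python
import math

def task_slots(num_executors, executor_cores):
    """Number of tasks that can run at the same time across all Executors."""
    return num_executors * executor_cores

def waves(num_tasks, num_executors, executor_cores):
    """Number of scheduling rounds needed to run every task of a stage."""
    return math.ceil(num_tasks / task_slots(num_executors, executor_cores))

# Example: 40 Executors with 2 cores each, and a stage of 500 tasks.
print(task_slots(40, 2))   # 80 concurrent tasks
print(waves(500, 40, 2))   # 7 rounds
```

Fewer slots means more rounds per stage, which is one way a cluster can end up mostly idle while a job still takes hours.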
Key Spark resource parameters and tuning advice are:
num-executors / spark.executor.instances : total number of Executors; typically 50‑100 for a job. Too few underutilizes the cluster, too many may exceed queue limits.
executor-memory / spark.executor.memory : memory per Executor; 4 GB‑8 GB is a common range, but the total requested (num-executors × executor-memory) must respect the queue’s memory limit and should not exceed roughly 1/3‑1/2 of the queue’s capacity.
executor-cores / spark.executor.cores : CPU cores per Executor; 2‑4 cores is usually appropriate, again respecting overall queue core limits.
driver-memory : memory for the Driver process; usually left at the default or set to about 1 GB. Increase it if the job uses collect to pull an entire dataset back to the driver, otherwise the driver may run out of memory.
spark.default.parallelism : default number of tasks per stage; 500‑1000 tasks works well. A good rule is 2‑3 × (num-executors × executor-cores).
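spark.storage.memoryFraction : fraction of Executor memory for cached (persisted) RDDs (default 0.6). Increase when many RDDs are persisted; decrease when shuffle work dominates or when frequent GC indicates that tasks are short of execution memory.
spark.shuffle.memoryFraction : fraction of Executor memory for shuffle aggregation (default 0.2). Increase when few RDDs are persisted and shuffle dominates, so that less shuffle data spills to disk; decrease when frequent GC indicates that tasks are short of execution memory.
Note that both memoryFraction settings belong to Spark's legacy memory manager; from Spark 1.6 onward they take effect only when spark.memory.useLegacyMode is enabled.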
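The sizing rules in the list above can be combined into a small check. This is a sketch under stated assumptions: the queue capacity figures in the example are made up, applying the same budget fraction to cores as to memory is an assumption, and `suggest_settings` is a hypothetical helper, not part of Spark or Hive.

```python
def suggest_settings(queue_memory_gb, queue_cores,
                     num_executors, executor_memory_gb, executor_cores,
                     budget_fraction=0.5, parallelism_factor=3):
    """Check a candidate configuration against a share of the queue and
    derive spark.default.parallelism from the 2-3x rule of thumb."""
    total_memory_gb = num_executors * executor_memory_gb
    total_cores = num_executors * executor_cores
    return {
        "total_memory_gb": total_memory_gb,
        "total_cores": total_cores,
        # Stay within the chosen share (1/3 to 1/2) of the queue's resources.
        "memory_within_budget": total_memory_gb <= queue_memory_gb * budget_fraction,
        "cores_within_budget": total_cores <= queue_cores * budget_fraction,
        # 2-3 x (num-executors x executor-cores)
        "spark.default.parallelism": parallelism_factor * total_cores,
    }

# Hypothetical queue: 1000 GB of memory and 400 cores.
result = suggest_settings(queue_memory_gb=1000, queue_cores=400,
                          num_executors=50, executor_memory_gb=6, executor_cores=3)
for key, value in result.items():
    print(f"{key} = {value}")
```

With these example numbers the request is 300 GB and 150 cores, which fits within half of the hypothetical queue, and the suggested parallelism is 450 tasks per stage.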
In the tuning experiments, with 10 GB of data the author observed that increasing the number of cores per Executor reduced the runtime of query q04 significantly, while other queries improved less.
When scaling to 100 GB and increasing spark.executor.instances, the overall CPU utilization rose to 80‑90 %, and query runtimes dropped dramatically. The final configuration, based on recommendations from the Meituan technical blog, yielded the best performance for the TPCx-BB tests.
Performance charts (described in the original article) show that most queries benefited greatly from tuning, though some small‑table queries (e.g., q01, q04, q14) showed little change, and queries q10 and q18 still require further investigation.
Example configuration commands:
set hive.execution.engine=spark;
set spark.executor.memory=4g;
set spark.executor.cores=2;
set spark.executor.instances=40;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
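To see what the example configuration asks of the cluster, the settings can be parsed and totalled. This is a minimal sketch: `parse_settings` and `memory_gb` are hypothetical helpers, only the `g` and `m` memory suffixes are handled, and the totals cover Executor heap only, excluding any additional per-container memory overhead that YARN allocates.

```python
import re

HIVE_SETTINGS = """
set hive.execution.engine=spark;
set spark.executor.memory=4g;
set spark.executor.cores=2;
set spark.executor.instances=40;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
"""

def parse_settings(text):
    """Extract key/value pairs from 'set key=value;' statements."""
    settings = {}
    for match in re.finditer(r"set\s+([\w.\-]+)\s*=\s*([^;]+);", text):
        settings[match.group(1)] = match.group(2).strip()
    return settings

def memory_gb(value):
    """Convert a size such as '4g' or '4096m' to gigabytes."""
    number, unit = float(value[:-1]), value[-1].lower()
    if unit == "g":
        return number
    if unit == "m":
        return number / 1024
    raise ValueError(f"unsupported memory unit in {value!r}")

conf = parse_settings(HIVE_SETTINGS)
executors = int(conf["spark.executor.instances"])
total_heap_gb = executors * memory_gb(conf["spark.executor.memory"])
total_cores = executors * int(conf["spark.executor.cores"])
print(f"executors={executors}, heap={total_heap_gb} GB, cores={total_cores}")
```

For the settings shown, this works out to 40 Executors requesting 160 GB of heap and 80 cores in total.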
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
