How Uber Tuned GC to Boost Presto Cluster Stability
Uber runs over 20 Presto clusters serving more than 500,000 daily queries, but frequent full GCs and OOMs threatened stability; by analyzing G1GC behavior and adjusting IHOP, heap waste, free space, and young‑gen size on JDK 8 and JDK 11, they cut full GC occurrences by up to 80% and markedly improved overall reliability.
Presto at Uber
Uber runs roughly 20 Presto clusters across two regions, totaling more than 10,000 nodes, with about 12,000 weekly active users and ~500,000 queries per day that read ~100 PB from HDFS. Data sources include Hive, Pinot, AresDb, MySQL, Elasticsearch and Kafka.
Full GC pain points
Even with weekly memory‑fragmentation mitigation, clusters still suffer long full garbage‑collection pauses and occasional out‑of‑memory errors. A chart of daily full‑GC counts illustrates the severity.
G1GC basics
G1GC is a generational collector that partitions the heap into equally sized regions of 1–32 MB each, targeting roughly 2,048 regions. Regions are classified as young (Eden + survivors), old, or free. Objects are allocated in Eden; surviving objects are copied between survivor regions and promoted to the old generation after reaching an age threshold. Objects larger than half a region (humongous objects) are allocated directly in dedicated old‑generation regions. G1 performs concurrent marking with a snapshot‑at‑the‑beginning (SATB) strategy, followed by mixed collections that reclaim old‑gen regions along with the young generation.
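The region arithmetic above can be sketched in a few lines. This is a simplified approximation, not the JVM's actual heuristic (which also considers the initial heap size and an explicit -XX:G1HeapRegionSize); the function names are made up for illustration.

```python
# Simplified sketch of G1 region sizing; the real JVM heuristic has more inputs.
MB = 1024 * 1024
GB = 1024 * MB

TARGET_REGION_COUNT = 2048  # G1 aims for roughly this many regions
MIN_REGION = 1 * MB
MAX_REGION = 32 * MB


def region_size_bytes(heap_bytes: int) -> int:
    """Largest power of two <= heap/2048, clamped to the 1-32 MB range."""
    raw = max(heap_bytes // TARGET_REGION_COUNT, 1)
    power_of_two = 1 << (raw.bit_length() - 1)
    return min(max(power_of_two, MIN_REGION), MAX_REGION)


def humongous_threshold_bytes(heap_bytes: int) -> int:
    """Objects larger than half a region are treated as humongous."""
    return region_size_bytes(heap_bytes) // 2


if __name__ == "__main__":
    for heap_gb in (4, 32, 300):
        heap = heap_gb * GB
        size = region_size_bytes(heap)
        print(f"{heap_gb} GB heap: {size // MB} MB regions, "
              f"{heap // size} regions, "
              f"humongous above {humongous_threshold_bytes(heap) // MB} MB")
```

Under these assumptions, a 300 GB heap hits the 32 MB region cap, so it ends up with far more than 2,048 regions and any object above 16 MB counts as humongous.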
G1GC at Uber
On OpenJDK 8 the only tunable flag was -XX:InitiatingHeapOccupancyPercent=X (default 45%). High‑memory services frequently exceeded this threshold, causing constant concurrent‑mark cycles and high CPU usage.
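To see why a static threshold hurts large heaps, here is a minimal sketch that treats IHOP as a percentage of the total heap. The 300 GB heap size appears later in this article; the 160 GB live old‑gen figure is a hypothetical value chosen only for illustration.

```python
def concurrent_mark_threshold_gb(heap_gb: float, ihop_percent: int = 45) -> float:
    """Occupancy (in GB) at which G1 starts a concurrent-mark cycle."""
    return heap_gb * ihop_percent / 100


def marks_constantly(live_old_gen_gb: float, heap_gb: float,
                     ihop_percent: int = 45) -> bool:
    """True when the long-lived data alone already sits above the threshold,
    so every cycle finishes still above IHOP and another one starts."""
    return live_old_gen_gb >= concurrent_mark_threshold_gb(heap_gb, ihop_percent)


if __name__ == "__main__":
    heap_gb = 300
    live_gb = 160  # hypothetical steady-state old-gen occupancy
    print(concurrent_mark_threshold_gb(heap_gb))   # threshold at the default 45%
    print(marks_constantly(live_gb, heap_gb))      # default IHOP
    print(marks_constantly(live_gb, heap_gb, 60))  # raised IHOP
```

With the default 45%, the threshold on a 300 GB heap is 135 GB, so a service holding 160 GB of long‑lived data would never drop below it.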
JDK 11 tuning challenges
Presto migrated to JDK 11, which introduced a dynamic InitiatingHeapOccupancyPercent (IHOP) value that varies at runtime and must be read from GC logs, requiring a different tuning approach.
Tuning workflow
Experiment on a single cluster, waiting 1–2 weeks between changes to collect sufficient metrics.
Enable detailed GC logging and metrics.
Identify peak old‑gen utilization after mixed collections.
Set IHOP slightly above the observed peak (typically +5–10%).
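The last two steps of the workflow can be sketched as a small calculation. The sample values are hypothetical, and extracting post‑mixed‑collection old‑gen occupancy from GC logs is assumed to have been done separately; the function name and the default margin are illustrative choices, not Uber's tooling.

```python
import math


def recommend_ihop(post_mixed_old_gen_gb, heap_gb, margin_pct=5):
    """Suggest an IHOP percentage slightly above the observed old-gen peak.

    post_mixed_old_gen_gb: old-gen occupancy (GB) sampled right after
    mixed collections, e.g. extracted from GC logs over 1-2 weeks.
    """
    if not post_mixed_old_gen_gb:
        raise ValueError("need at least one sample")
    peak_pct = max(post_mixed_old_gen_gb) / heap_gb * 100
    # Round the peak up, add the safety margin, and never exceed 100%.
    return min(math.ceil(peak_pct) + margin_pct, 100)


if __name__ == "__main__":
    samples_gb = [96.0, 104.5, 110.2, 99.8]  # hypothetical measurements
    ihop = recommend_ihop(samples_gb, heap_gb=300, margin_pct=5)
    print(f"-XX:InitiatingHeapOccupancyPercent={ihop}")
```

A larger margin (toward the +10% end of the range) trades later marking for fewer cycles; the right value depends on how quickly the workload fills the old generation.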
Key adjustments and results
Reduce max young‑gen size from 60% to 20%. This lowered long pauses caused by large young generations, but concurrent marking still started late because more space was given to the old generation.
Increase free‑space target from 10% to 35% and lower heap‑waste percent from 5% to 1%. For a 300 GB heap this freed ~15 GB permanently; full GCs dropped by ~80% on the test cluster.
Raise free space to 40% and heap‑waste to 2%. Added ~9 GB buffer, reducing pause latency from 1–1.5 s to ~50–100 ms; overall improvement modest compared with the 35% setting.
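To get a feel for what these percentages mean in absolute terms, the sketch below converts each setting into gigabytes for a 300 GB heap. It assumes the free‑space target corresponds to -XX:G1ReservePercent and the heap‑waste percent to -XX:G1HeapWastePercent, as in the final flag list. It is plain percentage arithmetic and does not attempt to reproduce the measured ~15 GB and ~9 GB figures quoted above.

```python
HEAP_GB = 300

# (label, G1MaxNewSizePercent, G1ReservePercent, G1HeapWastePercent)
SETTINGS = [
    ("JVM defaults", 60, 10, 5),
    ("second experiment", 20, 35, 1),
    ("final", 20, 40, 2),
]


def absolute_sizes(heap_gb, max_young_pct, reserve_pct, waste_pct):
    """Convert the three percentage flags into GB for a given heap size."""
    return (
        heap_gb * max_young_pct / 100,  # upper bound on the young generation
        heap_gb * reserve_pct / 100,    # space G1 tries to keep free
        heap_gb * waste_pct / 100,      # reclaimable garbage G1 tolerates
    )


if __name__ == "__main__":
    for label, young, reserve, waste in SETTINGS:
        y, r, w = absolute_sizes(HEAP_GB, young, reserve, waste)
        print(f"{label}: max young {y:.0f} GB, "
              f"reserve {r:.0f} GB, tolerated waste {w:.0f} GB")
```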
Validation on another cluster
Applying the same configuration to a different cluster eliminated full GCs within 24 hours, confirming the effectiveness of the tuned flags.
Final configuration
After several weeks of testing Uber standardized the following JVM flags for all Presto clusters:
-XX:+UnlockExperimentalVMOptions -XX:G1MaxNewSizePercent=20 -XX:G1ReservePercent=40 -XX:G1HeapWastePercent=2
These settings dramatically reduced internal OOM incidents and improved query reliability; per‑workload tuning may still be required.
Future plans
Extend GC tuning to Uber’s storage services, which use larger heaps, and share further findings with the community.
Past Memory Big Data
A popular big-data architecture channel with over 100,000 developers. Publishes articles on Spark, Hadoop, Flink, Kafka and more. Visit the Past Memory Big Data blog at https://www.iteblog.com. Search "Past Memory" on Google or Baidu.
