Databases 13 min read

Avoiding OOM in Greenplum: Memory Configuration and Resource Queue Best Practices

This article explains why Greenplum can encounter out‑of‑memory (OOM) errors, lists the typical causes, and provides detailed recommendations for memory‑related OS and database parameters, statement_mem settings, and the proper use of Greenplum resource queues to achieve stable and efficient cluster operation.

Qunar Tech Salon
Qunar Tech Salon
Qunar Tech Salon
Avoiding OOM in Greenplum: Memory Configuration and Resource Queue Best Practices

Background: Greenplum (GPDB) is a massively parallel processing (MPP) database that consumes resources proportionally to the hardware available; while this speeds up processing, it also makes the system prone to resource exhaustion, especially memory‑related OOM failures that can crash the database.

How to Avoid OOM

OOM occurs when a process cannot allocate the requested memory. Typical scenarios include:

Insufficient memory on a database node.

Improper OS kernel memory parameters.

Data skew causing a single SEGMENT to request excessive memory.

Query skew (e.g., non‑distribution‑key grouping) leading to large memory demands on a SEGMENT.

Mitigation steps:

Rewrite queries to reduce memory consumption.

Use resource queues to limit concurrent queries and overall memory usage.

Deploy fewer SEGMENTs per host (e.g., 8 instead of 16 on a 128 GB machine).

Increase physical memory per host.

Set gp_vmem_protect_limit to cap per‑SEGMENT VMEM.

Configure statement_mem at the session level to bound memory per SQL statement.

Optionally set statement_mem at the database level.

Use resource queues to enforce a memory ceiling for a group of users.

Memory‑Related Parameter Configuration

Proper OS and database settings dramatically lower OOM risk:

Do not enable huge pages; Greenplum’s PostgreSQL version does not support them.

Set vm.overcommit_memory to 2 when swap is used, otherwise to 0.

Adjust overcommit_ratio (larger values allow more user allocations, but reduce OS reserve).

Key database parameters:

gp_vmem_protect_limit : maximum VMEM a SEGMENT may allocate.

runaway_detector_activation_percent (default 90%): terminates queries that exceed the percentage of gp_vmem_protect_limit .

statement_mem (default 125 MB): per‑SQL memory cap; excess memory spills to disk.

gp_workfile_limit_files_per_query : max number of spill files per query (0 = unlimited).

gp_workfile_compress_algorithm : compression for spill files (none or zlib).

Example calculations for gp_vmem , gp_vmem_protect_limit , and vm.overcommit_ratio are illustrated with the following diagrams:

Using Resource Queues

Greenplum resource queues limit concurrent queries and total memory usage, similar to Linux cgroups.

Creating a resource queue (syntax example):

ACTIVE_STATEMENTS : max concurrent active SQLs (‑1 = unlimited).

MEMORY_LIMIT 'memory_units KB, MB or GB' : max memory per queue (‑1 = unlimited, still subject to DB/OS limits).

Other parameters such as gp_resqueue_memory_policy , MAX_COST , COST_OVERCOMMIT , MIN_COST , and PRIORITY control cost‑based throttling and CPU share.

Modifying a queue’s limits:

Assigning users to a queue:

Key queue‑related parameters:

gp_resqueue_memory_policy : memory management strategy (none, auto, eager_free).

gp_resqueue_priority : enable/disable priority handling.

gp_resqueue_priority_cpucores_per_segment : CPU cores allocated per SEGMENT.

gp_resqueue_priority_sweeper_interval : CPU time‑slice statistics interval.

Recommended usage:

Do not rely on the default pg_default queue; create a dedicated queue per user/business.

Avoid using superuser accounts for regular workloads, as they bypass queue limits.

Set ACTIVE_STATEMENTS and MEMORY_LIMIT according to workload size.

Prioritize queues (e.g., HIGH for reporting, MEDIUM for normal queries, LOW for background jobs).

Adjust queue limits dynamically via cron jobs if workload patterns change.

Monitor queue usage with gp_toolkit views.

Author Introduction

digoal (Zhou Zhengzhong) is an Alibaba Cloud Database Kernel Architect, a founding member and standing committee of the PostgreSQL China Community, and a DBA community co‑founder.

performanceMemory ManagementoomGreenplumDatabase ConfigurationResource Queue
Qunar Tech Salon
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.