Understanding Go's Goroutine Scheduler and the GMP Model
This article explains how Go implements its own goroutine scheduler within the runtime, demonstrates it with sample code, and details the GMP (Goroutine, Machine, Processor) model, its scheduling strategies, and the evolution from the earlier GM model.
Introduction
To achieve efficient execution and scheduling of its goroutines, Go implements a dedicated scheduler inside its runtime. The following simple code snippet shows how many goroutines are present at runtime, helping readers understand the scheduler’s behavior.
The Go scheduler is part of the Go runtime, and the Go runtime is built into your application.
```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	for i := 0; i < 4; i++ {
		go func() {
			time.Sleep(time.Second)
		}()
	}
	fmt.Println(runtime.NumGoroutine())
}
```

The output of the above code is 5, indicating that the program currently has five goroutines (the main goroutine plus the four it spawned). These user-level goroutines cannot use OS resources directly; they must be mapped onto kernel-level threads under one of three threading models (1:1, N:1, M:N). How the five goroutines are dispatched onto kernel threads is decided by Go's scheduler.
GMP Model
The goroutine scheduler is built on a three‑layer GMP model.
G : goroutine.
M : machine, an OS (kernel-level) thread. The runtime caps the number of M's at 10,000 by default (adjustable via debug.SetMaxThreads), though most operating systems cannot actually create that many threads.
P : processor, which holds a local queue of ready goroutines waiting to be assigned to an M. The number of P's is set by runtime.GOMAXPROCS and defaults to the number of CPU cores.
While an M is executing Go code it is bound to exactly one P; M's blocked in system calls hold no P, so there may be more M's than P's. The following diagram illustrates the structure:
The Journey of a Goroutine
When go func(){} is executed, the GMP model works as follows (illustrated by the diagram below).
Create a new goroutine.
If the current P's local run-queue has space (it holds up to 256 G's), place the goroutine there; otherwise, enqueue it in the global queue (accessible by all M's).
Every G must run on an M. M is bound to a P; if the bound P has ready G’s, M pulls a G from that P. If the P is empty, M pulls from the global queue; if that is also empty, it steals from another P.
Allocate necessary resources and wait for CPU scheduling.
Once scheduled on a CPU, the goroutine executes its function body.
Scheduling Strategies
The most important strategy of the goroutine scheduler is reuse: avoiding frequent creation and destruction of resources to maximize throughput and concurrency, which aligns with the ultimate goal of OS thread scheduling. Reuse underpins many pooling techniques.
Based on this principle, the scheduler optimizes in several ways:
Work stealing: similar to Java’s ForkJoinPool, an idle M (through its P) steals G’s from another P’s run queue instead of being destroyed, reducing the overhead of thread creation and destruction.
Hand‑off mechanism: when an M blocks, it hands its P over to another idle M.
Since Go 1.14, the scheduler also performs asynchronous (non-cooperative) pre-emption (see issue 24543), which addresses two problems:
Prevent a long-running, CPU-bound G from starving the other G’s queued on the same P.
Keep tight loops from stalling the garbage collector’s stop-the-world phase (see the discussion in issue 10958).
From the Early GM Model to GMP
In early versions of Go, the scheduler used a GM model: a single global run queue from which all M’s pulled work. Go 1.1 replaced it with the current GMP model, adding per-P local queues. The change was driven by three problems with GM:
The single global queue was protected by one big lock, limiting concurrency.
When an M blocked on I/O, its G’s had to wait or be handed wholesale to other M’s, causing poor locality and extra overhead; with per-P queues and hand-off, a blocked M simply releases its P so other M’s can keep executing those G’s, improving I/O-bound performance.
Each M held its own memory cache (mcache), wasting memory when many M’s were blocked; in GMP the cache is attached to the P’s, of which there are only GOMAXPROCS.