
Understanding Go's Goroutine Scheduling: Design Principles, GMP Model, and Optimizations

The article reviews Dmitry Vyukov's 2019 talk on Go's goroutine scheduler, explains the GMP (Goroutine, Machine, Processor) model, walks through its evolution from a naive thread-per-goroutine mapping to thread pools and work stealing, and discusses fairness, pre-emptive scheduling, and possible future improvements.

High Availability Architecture

The article introduces Dmitry Vyukov's 2019 talk on the design of Go's goroutine scheduler and presents the author's reflections on the software design ideas behind the GMP (Goroutine, Machine, Processor) model.

Design Goals – Create a high‑efficiency concurrent programming model where a single go keyword can launch a goroutine, achieving both development and runtime efficiency, with unbounded stack size and fair scheduling.

From Zero to Multi‑Threading – Starting with the naive idea that each goroutine maps to its own OS thread, the article shows why this approach does not scale: per‑thread stack memory plus thread‑creation and context‑switch overhead become prohibitive once goroutines number in the tens of thousands.

Thread‑Pool Solution – By limiting the number of OS threads (N) and using a global run queue, threads pull goroutines to execute, but this introduces contention on the global queue when many threads compete.

Thread‑Local Queues – To reduce contention, each thread maintains a local run queue (LRQ). When an LRQ is empty, the thread steals work from others, but stealing still incurs locking overhead and can be wasteful when many cores scan for work at once.

Decoupling Resources from Threads – Introduce a Processor abstraction that owns its LRQ and related storage, achieving the classic GMP model, where the number of Processors (P) is decoupled from the number of OS threads (M).

Fairness and Pre‑emptive Scheduling – Discusses the need for time‑slice based pre‑emptive scheduling to prevent long‑running goroutines from monopolizing CPU, comparing signal‑based interruption with cooperative checks, and explains why cooperative checks are preferred in Go’s runtime.

The article also lists several design takeaways such as thread pools, resource pools, compute‑storage separation, and the use of interrupts versus polling.

Further Optimizations – Highlights open problems such as work‑stealing overhead on machines with many cores, edge cases where a goroutine contains no cooperative scheduling checks, and the handling of network/timer goroutines, which currently go through a global queue.

Additional discussion covers a simple C++ coroutine framework used in the author’s game server, contrasting its single‑threaded scheduler, fixed‑size stacks, and hand‑off strategy with Go’s more sophisticated runtime.

The author, Wu Lianhuo, is a senior engineer at Tencent Games, leading large‑scale distributed server architecture and cloud‑native transformation.

References to related articles on distributed databases, Dubbo, Rust, and concurrency pitfalls are provided at the end.

Tags: concurrency, Go, scheduler, runtime, thread pool, GMP model, goroutine