Unveiling Go’s Goroutine: Inside the M‑P‑G Scheduler and Context Switching
This article explores Go's goroutine implementation by dissecting the M‑P‑G model, scheduling strategies, preemption mechanisms, and low‑level context‑switch details, complemented with code snippets and assembly examples to illustrate how lightweight user‑level threads achieve high concurrency.
Explanation
Based on Go 1.11 source code.
The language evolves quickly; the article focuses on core concepts rather than exhaustive source reading.
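Go provides debugging tools such as go tool compile and go tool objdump. Building with -gcflags "all=-N -l" (for example, go run -gcflags "all=-N -l") disables optimizations and inlining, which makes the generated assembly easier to follow.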
Coroutine Concept and Characteristics
Wikipedia defines a coroutine as a component that enables non‑preemptive multitasking, allowing execution to be suspended and resumed. Synchronous code blocks while waiting for I/O, whereas asynchronous callbacks avoid blocking but complicate control flow. Goroutines combine asynchronous performance with synchronous‑style code.
// Synchronous mode
res = request(...) // block until result
// Asynchronous mode (epoll/select)
request(function(res) {
    // callback, no thread switch
    doSomething(res)
})
doOtherThings()
A goroutine acts like a user‑space thread; the scheduler controls switching without kernel involvement.
// Goroutine approach
res = request(...) // yield control without thread/process switch
Advantages:
Lower switch cost because it runs in user space.
Only a few extra instructions compared to kernel threads.
Non‑preemptive, allowing flexible scheduling.
Disadvantages:
Manual stack and register management.
Performance depends on scheduling strategy and fairness.
Traditional coroutines typically all run on a single thread, and therefore on a single core, which limits multi‑core utilization.
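The goroutine style described above can be sketched in Go as follows. This is a minimal illustration rather than code from the original article: request and fetchAll are hypothetical names, and time.Sleep stands in for real blocking I/O.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// request is a hypothetical stand-in for a blocking I/O call.
// time.Sleep parks the calling goroutine; the OS thread is free to run others.
func request(id int) string {
	time.Sleep(10 * time.Millisecond)
	return fmt.Sprintf("response-%d", id)
}

// fetchAll issues n requests concurrently. Each goroutine is written as
// ordinary sequential code with no callbacks.
func fetchAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = request(i) // reads like a synchronous call
		}(i)
	}
	wg.Wait() // wait for every goroutine to finish
	return results
}

func main() {
	for _, r := range fetchAll(3) {
		fmt.Println(r)
	}
}
```

Each goroutine writes to its own slice element, so no extra locking is needed in this sketch.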
Goroutines in Go
Traditional coroutines follow a 1:N model (one thread, many coroutines) and cannot fully exploit multi‑core CPUs. Go uses an M:N model: M OS threads, N goroutines, with the scheduler assigning goroutines to threads transparently, enabling true parallelism.
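To make the M:N idea concrete, the sketch below starts far more goroutines than there are processors and lets the runtime spread them over its threads. The function name runMany and the count of 10000 are illustrative choices, not part of the original article.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// runMany starts n goroutines and returns how many of them completed.
// The runtime multiplexes all n goroutines onto a small set of OS threads.
func runMany(n int) int64 {
	var done int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			atomic.AddInt64(&done, 1) // safe concurrent increment
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&done)
}

func main() {
	// GOMAXPROCS(0) only reports the current number of Ps without changing it.
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines finished:", runMany(10000))
}
```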
MPG Model
M – Machine (real OS thread).
G – Goroutine (lightweight coroutine).
P – Processor (holds runnable Gs; M obtains a P to run a G).
Key relationships illustrated by diagrams (images omitted for brevity).
What Happens When go fn() Is Called
The runtime creates a new goroutine, places it in a queue, and later a thread (M) picks it up for execution. The function is not run immediately and its order is not guaranteed.
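func main() {
    runtime.GOMAXPROCS(1)
    go printInt(1)
    go printInt(2)
    go printInt(3)
    go printInt(4)
    time.Sleep(time.Second) // keep main alive; the program exits when main returns
    // output may be: 4 1 2 3
}
Note: with a single P, the most recently created goroutine usually runs first because it occupies the P's "run next" slot, and the others follow in queue order. The order is still not guaranteed and varies with multiple Ms and preemption.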
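A self-contained variant of the example above is sketched here. It is an assumption-laden rewrite, not the article's code: runOrder is a hypothetical helper that records the order in which goroutines ran, and a sync.WaitGroup replaces any sleeping so that main reliably waits.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runOrder starts n goroutines on a single P and records the order in
// which they actually executed.
func runOrder(n int) []int {
	prev := runtime.GOMAXPROCS(1) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)

	var (
		mu    sync.Mutex
		order []int
		wg    sync.WaitGroup
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			mu.Lock()
			order = append(order, id)
			mu.Unlock()
		}(i)
	}
	wg.Wait() // none of the goroutines is guaranteed to have run before this point
	return order
}

func main() {
	// The printed order is a scheduling detail and must not be relied upon.
	fmt.Println(runOrder(4))
}
```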
When a Goroutine Yields Control
Two ways:
Explicit: during system calls, network I/O, blocking channel operations, or when the function returns.
Implicit: preemption, garbage collection, stack growth, etc.
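Blocking on a channel is one of the explicit yield points listed above, and it can be observed with a small ping-pong sketch. The function pingPong is a hypothetical name; the alternation it produces follows from unbuffered channel semantics, not from any scheduler timing.

```go
package main

import "fmt"

// pingPong alternates between the calling goroutine and a worker.
// Every send or receive on an unbuffered channel blocks until the other
// side is ready, so each goroutine gives up control at those points.
func pingPong(rounds int) []string {
	var events []string
	ping := make(chan struct{})
	pong := make(chan struct{})
	done := make(chan struct{})

	go func() {
		defer close(done)
		for i := 0; i < rounds; i++ {
			<-ping // worker is parked here until the caller sends
			events = append(events, "pong")
			pong <- struct{}{}
		}
	}()

	for i := 0; i < rounds; i++ {
		events = append(events, "ping")
		ping <- struct{}{} // caller blocks until the worker receives
		<-pong             // caller is parked until the worker replies
	}
	<-done
	return events
}

func main() {
	fmt.Println(pingPong(2)) // [ping pong ping pong]
}
```

The channel operations also order the accesses to events, so the slice needs no mutex in this sketch.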
System Call Example
// Simplified syscall entry
TEXT ·Syscall(SB),NOSPLIT,$0-56
	CALL runtime·entersyscall(SB)
	// ... syscall ...
	CALL runtime·exitsyscall(SB)
	RET
Scheduling
Kernel threads are pre‑emptively scheduled by the OS, while goroutine scheduling is cooperative and driven by the Go runtime.
Goal: find a runnable G for each M.
Scheduling triggers: creation/wakeup of M, G completion, park, etc.
Basic strategy: prioritize GC Gs, periodically pull from global queue, check local P queues, steal work from other Ps, and finally put M to sleep if no G is available.
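The lookup order in the basic strategy can be sketched as a toy model. Everything here is hypothetical and greatly simplified: the types P and Sched, the helpers pop and drain, and the string-named goroutines are invented for illustration, the GC-worker step is left out, and the interval of 61 is an assumption modelled on the runtime's periodic global-queue check.

```go
package main

import "fmt"

// P is a toy processor with a local run queue of goroutine names.
type P struct {
	local []string
	tick  int // scheduling decisions made on this P
}

// Sched is a toy scheduler: one global queue plus all Ps.
type Sched struct {
	global []string
	ps     []*P
}

// pop removes and returns the head of a queue, if any.
func pop(q *[]string) (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// findRunnable returns the next goroutine for p, or false if p's M would sleep.
func (s *Sched) findRunnable(p *P) (string, bool) {
	p.tick++
	// Periodically look at the global queue first so it is not starved.
	if p.tick%61 == 0 {
		if g, ok := pop(&s.global); ok {
			return g, true
		}
	}
	// Local queue of this P.
	if g, ok := pop(&p.local); ok {
		return g, true
	}
	// Global queue.
	if g, ok := pop(&s.global); ok {
		return g, true
	}
	// Work stealing: take half (rounded up) of another P's local queue.
	for _, victim := range s.ps {
		if victim == p || len(victim.local) == 0 {
			continue
		}
		n := (len(victim.local) + 1) / 2
		stolen := victim.local[:n]
		victim.local = victim.local[n:]
		p.local = append(p.local, stolen[1:]...)
		return stolen[0], true
	}
	return "", false
}

// drain runs findRunnable on p until no work is left anywhere.
func drain(s *Sched, p *P) []string {
	var ran []string
	for {
		g, ok := s.findRunnable(p)
		if !ok {
			return ran
		}
		ran = append(ran, g)
	}
}

func main() {
	p0 := &P{local: []string{"g1"}}
	p1 := &P{local: []string{"g2", "g3", "g4", "g5"}}
	s := &Sched{global: []string{"g6"}, ps: []*P{p0, p1}}
	fmt.Println(drain(s, p0)) // [g1 g6 g2 g3 g4 g5]
}
```

In this run p0 first empties its own queue, then the global queue, and only then steals from p1, which matches the order of the strategy described above.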
Preemption
Preemption ensures fairness when a long‑running G monopolizes an M. The runtime marks a G for preemption; the next time the G runs the stack check in a function prologue, it yields.
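// Preemption check in the compiled function prologue
if sp <= stackguard0 { // always true once the preempt flag is set
    // take the slow path, which triggers preemption
}
Preemption therefore succeeds only if the function contains a stack‑check instruction; a G that never reaches one, such as a tight loop without function calls, cannot be preempted this way.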
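The interplay between the stack check and the preempt flag can be modelled with a small toy. This is only a sketch under stated assumptions: toyG, newToyG, requestPreempt, prologue, the guard size of 128 bytes, and the sentinel value are all hypothetical; the one idea borrowed from the text is that a preemption request makes the ordinary stack check fail.

```go
package main

import "fmt"

const (
	stackGuard      = 128         // hypothetical guard size in bytes
	preemptSentinel = ^uintptr(0) // assumed: larger than any real stack pointer
)

// toyG models only the fields involved in the prologue check.
type toyG struct {
	sp          uintptr
	stackLo     uintptr
	stackguard0 uintptr
}

func newToyG(lo, sp uintptr) *toyG {
	return &toyG{sp: sp, stackLo: lo, stackguard0: lo + stackGuard}
}

// requestPreempt poisons stackguard0 so that the next prologue check fails.
func (g *toyG) requestPreempt() { g.stackguard0 = preemptSentinel }

// prologue mimics the check at function entry and reports what would happen.
func (g *toyG) prologue() string {
	if g.sp > g.stackguard0 {
		return "run" // fast path: enough stack, no preemption pending
	}
	// Slow path: decide between preemption and real stack growth.
	if g.stackguard0 == preemptSentinel {
		g.stackguard0 = g.stackLo + stackGuard // clear the request
		return "yield"
	}
	return "grow"
}

func main() {
	g := newToyG(0x1000, 0x2000)
	fmt.Println(g.prologue()) // run
	g.requestPreempt()
	fmt.Println(g.prologue()) // yield
	fmt.Println(g.prologue()) // run
	g.sp = 0x1040             // stack pointer has moved into the guard area
	fmt.Println(g.prologue()) // grow
}
```

Because the check lives in prologue, a toyG that never calls it would never yield, which is the failure case described above.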
Implementation Details
Goroutine context consists of PC, BP, SP, and a few general registers. Go stores these in the gobuf structure inside the g struct.
type g struct {
    sched gobuf
    // ...
}
type gobuf struct {
    sp uintptr
    pc uintptr
    bp uintptr
    // other fields omitted
}
Saving context:
MOVQ SI, (g_sched+gobuf_pc)(AX) // save PC
MOVQ SP, (g_sched+gobuf_sp)(AX) // save SP
MOVQ BP, (g_sched+gobuf_bp)(AX) // save BP
Restoring context:
MOVQ (gobuf_sp)(BX), SP // restore SP
MOVQ (gobuf_bp)(BX), BP // restore BP
MOVQ (gobuf_pc)(BX), BX // load saved PC
JMP BX // jump to saved PC
Stack growth is handled by checking stackguard0 and calling runtime.morestack_noctxt when needed.
// Stack guard check
CMPQ AX, 16(CX) // compare SP‑32 with stackguard0
JLS 311 // if lower or equal, jump to grow stack
Conclusion
The core ideas behind Go's goroutine implementation are simple: a lightweight user‑space thread managed by a scheduler that tracks PC, SP, and BP, with preemption and stack‑growth mechanisms ensuring efficient concurrency. Understanding these concepts opens the door to deeper runtime exploration or even building custom coroutines in other languages.
