Unveiling Go’s Goroutine: Inside the M‑P‑G Scheduler and Context Switching

This article explores Go's goroutine implementation by dissecting the M‑P‑G model, scheduling strategies, preemption mechanisms, and low‑level context‑switch details, complemented with code snippets and assembly examples to illustrate how lightweight user‑level threads achieve high concurrency.

GF Securities FinTech

May 14, 2019

Explanation

Based on Go 1.11 source code.

The language evolves quickly; the article focuses on core concepts rather than exhaustive source reading.

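Go provides tools for inspecting generated code, such as go tool compile -S and go tool objdump. Building with go run -gcflags "all=-N -l" disables optimizations and inlining, which makes the output easier to follow.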

Coroutine Concept and Characteristics

Wikipedia defines a coroutine as a component that enables non‑preemptive multitasking, allowing execution to be suspended and resumed. Synchronous code blocks while waiting for I/O, whereas asynchronous callbacks avoid blocking but complicate control flow. Goroutines combine asynchronous performance with synchronous‑style code.

// Synchronous mode
res = request(...)
// block until result

// Asynchronous mode (epoll/select)
request(function(res) {
    // callback, no thread switch
    doSomething(res)
})
doOtherThings()

A goroutine acts like a user-space thread: the Go scheduler switches between goroutines without involving the kernel.

// Goroutine approach
res = request(...)
// yield control without thread/process switch
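The goroutine style can be sketched in runnable Go as follows. This is a minimal illustration, not code from the article: request is a hypothetical stand-in that simulates I/O with a short sleep, and fetchAll is an assumed helper name.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// request is a hypothetical stand-in for a blocking I/O call.
// The sleep parks only the calling goroutine, not the OS thread.
func request(id int) string {
	time.Sleep(10 * time.Millisecond)
	return fmt.Sprintf("response-%d", id)
}

// fetchAll issues n requests concurrently. Each goroutine is written
// in plain synchronous style, with no callbacks.
func fetchAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = request(i) // reads like blocking code
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	start := time.Now()
	res := fetchAll(100)
	fmt.Println(len(res), "responses in", time.Since(start))
}
```

Each goroutine writes to its own slice element, so no lock is needed; the WaitGroup only signals completion.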

Advantages:

Lower switch cost, because switching happens in user space.

Far fewer instructions per switch than a kernel thread switch.

Non-preemptive, which allows flexible scheduling.

Disadvantages:

Manual stack and register management.

Performance depends on scheduling strategy and fairness.

In the traditional model all coroutines run on a single thread, so they cannot use more than one core.

Go's Coroutine (goroutine)

Traditional coroutines follow a 1:N model (one thread, many coroutines) and cannot fully exploit multi‑core CPUs. Go uses an M:N model: M OS threads, N goroutines, with the scheduler assigning goroutines to threads transparently, enabling true parallelism.
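To make the M:N ratio concrete, the sketch below parks many goroutines and compares their count with the number of Ps. It uses only documented runtime calls (runtime.NumCPU, runtime.GOMAXPROCS, runtime.NumGoroutine); spawnParked is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnParked starts n goroutines that block on a channel. It returns
// the goroutine count observed while they are parked, plus a function
// that releases them and waits for them to exit.
func spawnParked(n int) (during int, release func()) {
	stop := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-stop // parked here; no OS thread is held
		}()
	}
	started.Wait()
	during = runtime.NumGoroutine()
	release = func() {
		close(stop)
		finished.Wait()
	}
	return during, release
}

func main() {
	during, release := spawnParked(10000)
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines alive:", during)
	release()
}
```

On a typical machine the goroutine count is several orders of magnitude larger than the number of Ps, which is the point of the M:N design.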

MPG Model

M – Machine (real OS thread).

G – Goroutine (lightweight coroutine).

P – Processor (holds runnable Gs; M obtains a P to run a G).

Key relationships illustrated by diagrams (images omitted for brevity).

What Happens When go fn() Is Called

The runtime creates a new goroutine, places it in a queue, and later a thread (M) picks it up for execution. The function is not run immediately and its order is not guaranteed.

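// Assumes imports of "runtime" and "time"; printInt is assumed to print its argument.
func main() {
    runtime.GOMAXPROCS(1)
    go printInt(1)
    go printInt(2)
    go printInt(3)
    go printInt(4)
    time.Sleep(time.Second) // keep main alive so the goroutines get to run
    // output may be: 4 1 2 3
}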

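Note: with GOMAXPROCS(1), the most recently created goroutine typically runs first because the runtime keeps it in the P's "run next" slot, and the rest follow in creation order. With more than one P the order varies, since several Ms run goroutines in parallel and preemption can interleave them.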

When a Goroutine Yields Control

Two ways:

Explicit: during system calls, network I/O, blocking on a channel, or when the function returns.

Implicit: preemption, garbage collection, stack growth, etc.
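Blocking on a channel is the easiest of these yield points to observe. In the sketch below, two goroutines hand a counter back and forth over unbuffered channels; each send or receive that cannot complete immediately parks the goroutine. pingPong is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// pingPong bounces a counter between the caller and a worker goroutine.
// The channel operations both synchronize the two goroutines and act
// as the points where each one gives up the thread.
func pingPong(rounds int) []string {
	ping, pong := make(chan int), make(chan int)
	var log []string

	go func() {
		for n := range ping { // parks until the caller sends
			log = append(log, fmt.Sprintf("worker got %d", n))
			pong <- n + 1 // parks until the caller receives
		}
		close(pong)
	}()

	for i := 0; i < rounds; i++ {
		ping <- i * 2
		n := <-pong // receive first, then touch the shared log
		log = append(log, fmt.Sprintf("main got %d", n))
	}
	close(ping)
	<-pong // wait for the worker to exit
	return log
}

func main() {
	for _, line := range pingPong(2) {
		fmt.Println(line)
	}
}
```

The shared log slice needs no mutex here because every access is ordered by a channel operation.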

System Call Example

// Simplified syscall entry
TEXT ·Syscall(SB),NOSPLIT,$0-56
    CALL runtime·entersyscall(SB)
    // ... syscall ...
    CALL runtime·exitsyscall(SB)
    RET
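The effect of this bookkeeping can be observed from ordinary Go code: a goroutine waiting on a read does not stop other goroutines from running, even with a single P. The sketch below is an illustration under assumptions. readWhileWorking is a hypothetical helper, and whether the pipe read parks through the network poller or blocks in a real system call depends on the platform and Go version.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// readWhileWorking starts a goroutine that waits on a pipe read while
// the caller keeps doing work, then writes to the pipe to unblock it.
func readWhileWorking() (string, int, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", 0, err
	}
	defer r.Close()
	defer w.Close()

	got := make(chan string, 1)
	go func() {
		buf := make([]byte, 16)
		n, _ := r.Read(buf) // waits until data arrives
		got <- string(buf[:n])
	}()

	steps := 0
	for i := 0; i < 1000; i++ {
		steps++
		runtime.Gosched() // let the reader start and block
	}
	if _, err := w.Write([]byte("ping")); err != nil {
		return "", steps, err
	}
	return <-got, steps, nil
}

func main() {
	runtime.GOMAXPROCS(1)
	msg, steps, err := readWhileWorking()
	fmt.Println(msg, steps, err)
}
```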

Scheduling

Kernel threads are preemptively scheduled by the OS, while goroutines are scheduled in user space by the Go runtime, which in Go 1.11 relies on goroutines reaching cooperative yield points.

Goal: find a runnable G for each M.

Scheduling triggers: creation/wakeup of M, G completion, park, etc.

Basic strategy: prioritize GC Gs, periodically pull from global queue, check local P queues, steal work from other Ps, and finally put M to sleep if no G is available.
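The lookup order above can be sketched as a toy model. The types G, P, and sched below are simplified stand-ins invented for this illustration, not the runtime's real structures, and GC work is left out. The 61-tick fairness check mirrors the constant used in the runtime's scheduler; everything else is an assumption made to keep the sketch short.

```go
package main

import "fmt"

// Toy stand-ins; the real runtime types are far richer.
type G struct{ id int }

type P struct{ runq []*G } // local run queue

type sched struct {
	global []*G // global run queue
	ps     []*P
}

// pop removes and returns the head of a queue, or nil if it is empty.
func pop(q *[]*G) *G {
	if len(*q) == 0 {
		return nil
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g
}

// findRunnable picks the next G for the P at index self.
func (s *sched) findRunnable(self, tick int) *G {
	p := s.ps[self]
	// 1. Periodically check the global queue first so it cannot starve.
	if tick%61 == 0 {
		if g := pop(&s.global); g != nil {
			return g
		}
	}
	// 2. Local queue.
	if g := pop(&p.runq); g != nil {
		return g
	}
	// 3. Global queue.
	if g := pop(&s.global); g != nil {
		return g
	}
	// 4. Steal half of another P's queue.
	for i, victim := range s.ps {
		if i == self || len(victim.runq) == 0 {
			continue
		}
		n := (len(victim.runq) + 1) / 2
		stolen := append([]*G(nil), victim.runq[:n]...)
		victim.runq = victim.runq[n:]
		p.runq = append(p.runq, stolen[1:]...)
		return stolen[0]
	}
	return nil // the real runtime would put the M to sleep here
}

func main() {
	s := &sched{ps: []*P{{}, {runq: []*G{{id: 1}, {id: 2}, {id: 3}, {id: 4}}}}}
	fmt.Println("stolen:", s.findRunnable(0, 1).id)
	fmt.Println("local:", s.findRunnable(0, 2).id)
}
```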

Preemption

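Preemption ensures fairness when a long-running G monopolizes an M. A monitor thread (sysmon) marks a G that has run for too long (about 10 ms) by setting its stackguard0 to a special value, stackPreempt. The next time the G runs the stack check in a function prologue, the check fails and the G enters the scheduler instead of growing its stack.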

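// Stack check emitted in a function prologue (pseudocode)
if sp <= stackguard0 { // true when the stack is nearly full or the preempt flag is set
    // call morestack, which either grows the stack or yields to the scheduler
}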

Whether preemption succeeds therefore depends on whether the running code reaches a stack check: in Go 1.11 a tight loop with no function calls is never preempted, while a loop that calls a non-inlined function can be.
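A busy loop can be made safe on any Go version by giving it an explicit yield point. The sketch below is an illustration, not code from the article: spinUntilStopped and runSpinDemo are hypothetical names, and runtime.Gosched stands in for whatever call would otherwise introduce a stack check. On Go 1.11, removing the Gosched call from this loop would be expected to hang the program under GOMAXPROCS(1).

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinUntilStopped busy-loops until stop is set and returns the number
// of iterations. The Gosched call is an explicit yield point.
func spinUntilStopped(stop *int32) int {
	n := 0
	for atomic.LoadInt32(stop) == 0 {
		n++
		runtime.Gosched()
	}
	return n
}

// runSpinDemo runs the spinner on a single P and stops it after a
// short delay. It only returns if the spinner lets other code run.
func runSpinDemo() int {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan int)
	go func() { done <- spinUntilStopped(&stop) }()

	time.Sleep(10 * time.Millisecond)
	atomic.StoreInt32(&stop, 1)
	return <-done
}

func main() {
	fmt.Println("iterations before stop:", runSpinDemo())
}
```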

Implementation Details

Goroutine context consists of PC, BP, SP, and a few general registers. Go stores these in the gobuf structure inside the g struct.

type g struct {
    sched gobuf
    // ...
}

type gobuf struct {
    sp   uintptr
    pc   uintptr
    bp   uintptr
    // other fields omitted
}

Saving context:

MOVQ SI, (g_sched+gobuf_pc)(AX)   // save PC
MOVQ SP, (g_sched+gobuf_sp)(AX)   // save SP
MOVQ BP, (g_sched+gobuf_bp)(AX)   // save BP

Restoring context:

MOVQ (gobuf_sp)(BX), SP   // restore SP
MOVQ (gobuf_bp)(BX), BP   // restore BP
MOVQ (gobuf_pc)(BX), BX   // load saved PC
JMP BX                    // jump to saved PC
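The save and restore sequences can be mirrored in plain Go as a toy model. Real context switching needs assembly because Go code cannot set SP or PC directly; toyGobuf, toyCPU, save, restore, and switchTo below are all hypothetical names that only mimic the data movement.

```go
package main

import "fmt"

// toyGobuf mirrors the three gobuf fields shown above.
type toyGobuf struct{ sp, pc, bp uintptr }

// toyCPU stands in for the registers of the thread running a goroutine.
type toyCPU struct{ sp, pc, bp uintptr }

// save copies the registers into the goroutine's saved context.
func save(cpu *toyCPU, buf *toyGobuf) {
	buf.pc, buf.sp, buf.bp = cpu.pc, cpu.sp, cpu.bp
}

// restore loads a saved context back into the registers. Setting pc
// plays the role of the final JMP.
func restore(cpu *toyCPU, buf *toyGobuf) {
	cpu.sp, cpu.bp, cpu.pc = buf.sp, buf.bp, buf.pc
}

// switchTo suspends the current goroutine and resumes another.
func switchTo(cpu *toyCPU, from, to *toyGobuf) {
	save(cpu, from)
	restore(cpu, to)
}

func main() {
	cpu := &toyCPU{sp: 0x1000, pc: 0x400, bp: 0x1010}
	var g1 toyGobuf
	g2 := toyGobuf{sp: 0x2000, pc: 0x500, bp: 0x2010}

	switchTo(cpu, &g1, &g2)
	fmt.Printf("cpu now runs g2: %+v\n", *cpu)
	fmt.Printf("g1 saved as:     %+v\n", g1)
}
```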

Stack growth is handled by checking stackguard0 and calling runtime.morestack_noctxt when needed.

// Stack guard check
CMPQ AX, 16(CX)   // compare AX (here SP-32) with g.stackguard0
JLS 311            // if lower or same (unsigned), jump to the morestack call
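Stack growth is easy to observe from Go code: a goroutine starts with a small stack (a few kilobytes) yet deep recursion simply works. In this sketch, depth and deepInGoroutine are hypothetical helpers, and the 64-byte array exists only to make each frame larger.

```go
package main

import "fmt"

// depth recurses n levels and returns n. The addition after the
// recursive call keeps every frame alive until the recursion unwinds.
func depth(n int) int {
	var pad [64]byte
	pad[n%len(pad)] = 1
	if n == 0 {
		return int(pad[0]) - 1 // 0
	}
	return depth(n-1) + int(pad[n%len(pad)]) // adds 1 per level
}

// deepInGoroutine runs the recursion on a fresh goroutine, whose stack
// has to grow many times to hold all the frames.
func deepInGoroutine(n int) int {
	done := make(chan int)
	go func() { done <- depth(n) }()
	return <-done
}

func main() {
	fmt.Println(deepInGoroutine(100000))
}
```

Each call runs the prologue check; when it fails, the runtime allocates a larger stack, copies the frames over, and resumes the function.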

Conclusion

The core ideas behind Go's goroutine implementation are simple: a lightweight user‑space thread managed by a scheduler that tracks PC, SP, and BP, with preemption and stack‑growth mechanisms ensuring efficient concurrency. Understanding these concepts opens the door to deeper runtime exploration or even building custom coroutines in other languages.
