
In-depth Analysis of Go Language Scheduler: G, M, P Concepts and Scheduling Loop

The article walks through the Go 1.9.2 scheduler on Linux, explaining how the three abstractions—G (goroutine), M (OS thread), and P (processor)—are created, bound, and used in the scheduling loop to run, yield, and balance goroutines via local, global, and stolen queues.


During process startup, the bootstrap thread M0 and its system goroutine G0 are set up first; runtime.osinit then determines the number of CPU cores, and runtime.schedinit sets the maximum M count and initializes the array of P objects according to GOMAXPROCS or the detected core count. The runtime then creates the main goroutine (whose entry point, runtime.main, is referenced through runtime·mainPC) and M0 enters the scheduling loop to run it.

The scheduling loop repeatedly binds an M to a P, obtains a runnable G from the P’s local run queue, the global run queue, or by stealing from other Ps, executes the G via the gogo assembly routine, and upon completion returns the G to a free list via goexit, which calls schedule again to pick the next G.
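The source order described above—local queue, then global queue, then stealing—can be sketched with a toy model (all types and names here are illustrative, not the runtime's actual data structures):

```go
package main

import "fmt"

// Toy model of where schedule() finds a runnable G: the P's local
// queue first, then the global queue, then stealing from a sibling P.
type g struct{ id int }

type p struct {
	local []*g
}

var globalQueue []*g

// findRunnable mirrors the order in which the scheduler tries sources.
func findRunnable(self *p, others []*p) *g {
	// 1. Local run queue.
	if len(self.local) > 0 {
		gp := self.local[0]
		self.local = self.local[1:]
		return gp
	}
	// 2. Global run queue.
	if len(globalQueue) > 0 {
		gp := globalQueue[0]
		globalQueue = globalQueue[1:]
		return gp
	}
	// 3. Steal half of another P's local queue.
	for _, victim := range others {
		if n := len(victim.local); n > 0 {
			half := (n + 1) / 2
			stolen := victim.local[:half]
			victim.local = victim.local[half:]
			self.local = append(self.local, stolen[1:]...)
			return stolen[0]
		}
	}
	return nil // the real runtime would poll the network or park the M here
}

func main() {
	p0 := &p{}
	p1 := &p{local: []*g{{1}, {2}, {3}, {4}}}
	globalQueue = []*g{{5}}

	// p0's local queue is empty, so it drains the global queue first...
	fmt.Println(findRunnable(p0, []*p{p1}).id) // 5
	// ...then steals half of p1's goroutines, running one immediately.
	fmt.Println(findRunnable(p0, []*p{p1}).id) // 1
	fmt.Println(len(p0.local))                 // 1 goroutine kept locally
}
```

The real scheduler adds refinements this sketch omits, such as periodically checking the global queue even when local work exists so that global Gs are not starved.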

A G can yield the CPU in several ways:

- Normal termination, after which the G is recycled via goexit.
- Active yielding through constructs such as time.Sleep, mutex locks, or channel operations, all of which eventually call gopark to transition the G to _Gwaiting and invoke schedule.
- Forced preemption by the sysmon background thread, which sets a stack-guard flag so that the G's next function-call stack check triggers a stack-expansion-based preemption.
- System-call entry and exit, which temporarily mark the G as _Gsyscall and hand control back to the scheduler.

Runnable Gs originate from two main sources: the creation of new goroutines via go func (handled by newproc → newproc1, which places the G onto the current P’s run queue) and I/O readiness reported by the netpoll mechanism, where the sysmon thread extracts ready Gs from epoll and injects them into the global run queue.

Understanding these mechanisms clarifies how Go achieves efficient, lightweight concurrency with minimal overhead, and how the scheduler balances fairness, low latency, and utilization of multi‑core hardware.

Tags: concurrency, Go, Scheduler, Runtime, GMP, goroutine, Preemption
Written by Didi Tech
Official Didi technology account