Understanding Scheduler Architectures: From Batch to Shared‑State Designs

This article surveys the evolution of schedulers—from early batch systems and OS process schedulers to modern centralized, two‑level, and shared‑state designs—explaining their core concepts, trade‑offs, and real‑world examples such as YARN, Mesos, Spark, Borg, and Kubernetes.

360 Zhihui Cloud Developer

May 3, 2018

1. Definition of Scheduler

A scheduler is a core component of both single‑machine and distributed systems that decides when and where tasks run. The term covers batch schedulers, preemptive process schedulers, cron‑like tools, language runtimes (e.g., the Go goroutine scheduler), cluster resource managers such as Hadoop YARN, and workflow schedulers such as Airflow.

2. Scheduler Design Overview

System design often repeats similar abstractions at different layers: caches in a CPU, memory hierarchies in a machine, and storage tiers in a cluster. As scale grows, problems that were trivial become challenging, especially state synchronization, fault tolerance, and scalability.

3. Types of Distributed Schedulers

3.1 Centralized Scheduler

A single (monolithic) instance manages all resources and tasks. It is simple and keeps state synchronization straightforward because only one component holds the cluster state, but it is a single point of failure and has limited scalability.
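As a minimal sketch of the centralized model, the Python below keeps all node and placement state in one object and places tasks with a first‑fit policy. The names `Node`, `CentralScheduler`, and `submit`, and the choice of first‑fit, are assumptions made for illustration; they do not come from any particular system.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpus: int


@dataclass
class CentralScheduler:
    """Hypothetical monolithic scheduler: one object owns all cluster state."""
    nodes: list[Node]
    placements: dict[str, str] = field(default_factory=dict)

    def submit(self, task: str, cpus: int) -> Optional[str]:
        # First-fit: scan nodes in order and take the first with enough room.
        for node in self.nodes:
            if node.free_cpus >= cpus:
                node.free_cpus -= cpus
                self.placements[task] = node.name
                return node.name
        return None  # no node can host the task right now


scheduler = CentralScheduler([Node("n1", 4), Node("n2", 8)])
first = scheduler.submit("etl", 3)     # fits on n1
second = scheduler.submit("train", 6)  # n1 has 1 CPU left, so n2 is used
third = scheduler.submit("big", 16)    # larger than any node
```

Because every decision goes through `submit`, there is nothing to synchronize, but every task in the cluster also waits on this one loop, which is where the scalability limit comes from.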

3.2 Two‑Level Scheduler

Combines a central scheduler with partitioned sub‑schedulers. The central scheduler handles coarse‑grained allocation, while partitions manage fine‑grained tasks, improving flexibility and supporting both high‑throughput and low‑latency workloads, but increasing state‑sync complexity.
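The split between coarse‑grained and fine‑grained decisions might look like the following sketch. `CentralAllocator` and `SubScheduler` are hypothetical names, and CPU counts stand in for all resources; the point is only that the top level hands out blocks of capacity while each partition places its own tasks.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class SubScheduler:
    """Hypothetical partition scheduler: places tasks inside its own share."""
    name: str
    cpus: int = 0
    running: dict[str, int] = field(default_factory=dict)

    def place(self, task: str, cpus: int) -> bool:
        if cpus > self.cpus:
            return False
        self.cpus -= cpus
        self.running[task] = cpus
        return True


@dataclass
class CentralAllocator:
    """Hypothetical top level: grants coarse CPU blocks and never sees tasks."""
    free_cpus: int

    def grant(self, sub: SubScheduler, cpus: int) -> bool:
        if cpus > self.free_cpus:
            return False
        self.free_cpus -= cpus
        sub.cpus += cpus
        return True


central = CentralAllocator(free_cpus=16)
batch, stream = SubScheduler("batch"), SubScheduler("stream")
central.grant(batch, 12)
central.grant(stream, 4)

etl_ok = batch.place("etl", 8)          # fits inside the batch share
alerts_ok = stream.place("alerts", 6)   # stream only holds 4 CPUs
extra_ok = central.grant(stream, 2)     # central has nothing left to give
```

In this run `batch` still holds 4 idle CPUs that `stream` cannot use, because the central allocator only knows what it granted, not what each partition actually consumed. That gap is one concrete form of the state‑sync complexity mentioned above.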

3.3 Shared‑State Scheduler

All schedulers share a common cluster state service; individual schedulers are independent services that read/write this state. This micro‑kernel style improves extensibility, fault tolerance, and scalability. Kubernetes, Borg, and the newer Ray system follow this model.
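A rough sketch of this arrangement follows, with an in‑memory `ClusterState` standing in for the shared state service. All names here (`ClusterState`, `bind`, `run_scheduler`, and the `spread` and `pack` policies) are assumptions for illustration, not the API of Kubernetes, Borg, or Ray.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ClusterState:
    """Hypothetical shared state service (in-memory stand-in)."""
    free_cpus: dict[str, int]
    # task name -> (name of the scheduler responsible, CPUs requested)
    pending: dict[str, tuple[str, int]] = field(default_factory=dict)
    bindings: dict[str, str] = field(default_factory=dict)

    def bind(self, task: str, node: str) -> bool:
        # The store has the final say: reject bindings that no longer fit.
        if task not in self.pending:
            return False
        _, cpus = self.pending[task]
        if self.free_cpus.get(node, 0) < cpus:
            return False
        del self.pending[task]
        self.free_cpus[node] -= cpus
        self.bindings[task] = node
        return True


def run_scheduler(state: ClusterState, name: str,
                  pick: Callable[[dict[str, int]], str]) -> None:
    """One pass of an independent scheduler that only talks to the store."""
    for task, (owner, cpus) in list(state.pending.items()):
        if owner != name:
            continue  # another scheduler is responsible for this task
        candidates = {n: f for n, f in state.free_cpus.items() if f >= cpus}
        if candidates:
            state.bind(task, pick(candidates))


def spread(candidates: dict[str, int]) -> str:
    return max(candidates, key=candidates.get)  # emptiest node


def pack(candidates: dict[str, int]) -> str:
    return min(candidates, key=candidates.get)  # fullest node that fits


state = ClusterState(free_cpus={"n1": 4, "n2": 8})
state.pending["web"] = ("spread", 2)
state.pending["batch"] = ("pack", 3)

run_scheduler(state, "spread", spread)
run_scheduler(state, "pack", pack)
```

The two schedulers never call each other. Adding a third policy means writing another `pick` function and pointing it at the same store, which is the extensibility argument in miniature.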

4. Representative Cases

OS Process Scheduler : Centralized management of CPU, memory, and I/O for processes and threads.

Hadoop YARN : Central ResourceManager with per‑node NodeManagers; supports high availability via a standby ResourceManager.

Mesos : Two‑level design with a Master offering resources to independent Frameworks that run their own schedulers.

Spark : A central Driver schedules tasks onto Executors; Drizzle, a research extension to Spark, adds a local scheduler on each worker to reduce streaming latency.

Borg / Kubernetes : Evolved from a centralized BorgMaster to a shared‑state architecture where schedulers are separate services; uses containers/cgroups for isolation.

Omega : Google's research scheduler; treats resource allocation and task scheduling as database‑style transactions against shared cluster state, providing optimistic concurrency control, deadlock detection, and procedural checks.
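The transactional idea described for Omega can be illustrated with a small optimistic‑concurrency sketch. The `CellState` class, its per‑node version counters, and the all‑or‑nothing `commit` are assumptions chosen to show the technique; they are not taken from Omega's actual implementation.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class NodeRecord:
    free_cpus: int
    version: int = 0


class CellState:
    """Hypothetical shared cluster state with per-node version counters."""

    def __init__(self, nodes: dict[str, int]) -> None:
        self.nodes = {n: NodeRecord(c) for n, c in nodes.items()}

    def snapshot(self) -> dict[str, NodeRecord]:
        # Each scheduler plans against a private copy, without taking locks.
        return {n: NodeRecord(r.free_cpus, r.version)
                for n, r in self.nodes.items()}

    def commit(self, snap: dict[str, NodeRecord],
               claims: dict[str, int]) -> bool:
        # Validate every claim first so the transaction is all-or-nothing.
        for node, cpus in claims.items():
            live = self.nodes[node]
            if live.version != snap[node].version or live.free_cpus < cpus:
                return False  # conflict: caller should re-snapshot and retry
        for node, cpus in claims.items():
            self.nodes[node].free_cpus -= cpus
            self.nodes[node].version += 1
        return True


state = CellState({"n1": 4, "n2": 4})
snap_a = state.snapshot()
snap_b = state.snapshot()  # scheduler B plans from the same starting view

a_ok = state.commit(snap_a, {"n1": 3})            # first writer wins
b_ok = state.commit(snap_b, {"n1": 2, "n2": 2})   # stale view of n1
b_retry = state.commit(state.snapshot(), {"n2": 2})
```

Scheduler B's first attempt is rejected as a whole, so `n2` is left untouched even though that part of the claim would have fit. Rejecting on any version mismatch is deliberately conservative here; a finer‑grained check would accept a stale view whenever the claim still fits.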

5. Summary

For small‑scale systems, a centralized scheduler is simple and effective. As clusters grow or custom scheduling policies are needed, two‑level designs become attractive, though they add complexity. Shared‑state schedulers are now mainstream, offering simple APIs and high scalability, exemplified by Kubernetes.

6. Outlook

Future work includes precise task‑demand prediction (potentially leveraging AI) and efficient large‑scale artifact distribution (e.g., container images) using peer‑to‑peer techniques.
