
Design and Implementation of a Multi‑Task Scheduling System Based on Apache Mesos

This article describes how we identified underutilized CPU and memory on our company's servers, evaluated Kubernetes against Apache Mesos, and built a non-intrusive, Mesos-based multi-task scheduling system. The system provides dynamic resource reservation, monitoring, task isolation, and cluster-wide observability; the article also covers the deployment challenges we ran into.

360 Tech Engineering

We found that the CPU and memory of our company's servers were not fully utilized. Using those idle resources without affecting business services could save a significant number of machines.

This article, originally from the HULK technical talks, introduces our Mesos-based multi-task scheduling system.

Background : Our internal cloud platform provides many physical and virtual machines, but resource utilization is low. We needed a solution that could use idle resources while ensuring service stability.

Selection : Kubernetes + Docker is the common choice, but our machines run CentOS 6.2 with Linux 2.6 kernels, and Docker requires kernel 3.10 or later. We therefore chose Apache Mesos, which can run tasks without Docker.

Mesos Overview : Mesos acts as an operating system for the entire data center. It manages CPU, memory, disk, and network resources across machines, allocates them to tasks, provides fault tolerance, and supports multiple frameworks such as Spark, Storm, Hadoop, and Marathon.

Mesos Features (two‑level scheduling):

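Agents report their available resources to the Master.

The Master offers those resources to the registered frameworks.

Each framework matches its pending tasks against the offers and replies with the tasks it wants to launch.

The Master forwards the accepted tasks to the appropriate Agent.

The Agent runs each task through an Executor, with resource limits applied.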
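The offer-matching step can be illustrated with a small simulation. This is only a sketch of what a framework might do with one round of offers, not Mesos or Marathon code: the `Offer` and `Task` classes, the `match_tasks` function, and the first-fit policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str      # hypothetical agent name
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str       # hypothetical task name
    cpus: float
    mem_mb: int


def match_tasks(offers, tasks):
    """First-fit matching: return (launches, unplaced) for one offer round."""
    remaining = {o.agent: [o.cpus, o.mem_mb] for o in offers}
    launches = []   # (task name, agent) pairs the framework would accept
    unplaced = []   # tasks that must wait for a later offer round
    for task in tasks:
        for agent, (cpus, mem) in remaining.items():
            if task.cpus <= cpus and task.mem_mb <= mem:
                # Deduct the task's share so later tasks see what is left.
                remaining[agent] = [cpus - task.cpus, mem - task.mem_mb]
                launches.append((task.name, agent))
                break
        else:
            unplaced.append(task.name)
    return launches, unplaced


offers = [Offer("agent-1", 2.0, 4096), Offer("agent-2", 4.0, 8192)]
tasks = [Task("a", 1.5, 1024), Task("b", 1.0, 2048), Task("c", 8.0, 1024)]
launches, unplaced = match_tasks(offers, tasks)
print(launches, unplaced)
```

Task "c" asks for more CPU than any single offer holds, so it stays unplaced until a later round. In a real cluster the framework would decline the unused portion of each offer so the Master can offer it elsewhere.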


System Architecture : The core components are Mesos Master and Agents, with Marathon as the second‑level scheduler. A custom Monitor runs alongside each Agent to collect and adjust available resources in real time.
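As a rough illustration of what such a Monitor might compute, the sketch below derives the resources an Agent could safely expose from the host totals and the current usage of the business services. The function name, the parameters, and the 20% headroom policy are assumptions; the article does not describe the actual formula.

```python
def offerable_resources(total_cpus, total_mem_mb, used_cpus, used_mem_mb,
                        headroom=0.2):
    """Return (cpus, mem_mb) that could be offered to Mesos tasks.

    headroom is the fraction of total capacity kept free for the
    business services (an assumed policy, not taken from the article).
    """
    cpus = total_cpus - used_cpus - headroom * total_cpus
    mem = total_mem_mb - used_mem_mb - headroom * total_mem_mb
    # Never advertise negative resources when the host is already busy.
    return max(0.0, round(cpus, 2)), max(0, int(mem))


# A 32-core, 64 GB host where business services use 8 cores and 16 GB.
print(offerable_resources(32, 65536, 8.0, 16384))
# A nearly saturated host: nothing is left to offer.
print(offerable_resources(8, 16384, 7.5, 15000))
```

Recomputing this value periodically and pushing it to the Agent is one way to let the offered resources shrink as business load rises and grow again when it falls.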

Agents are deployed non‑intrusively: all required libraries are placed in a dedicated libs directory, and the Mesos binary’s rpath is patched using patchelf to point to this directory, avoiding changes to system paths.
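The rpath patch can be scripted along the following lines. This is a minimal sketch that assumes `patchelf` is installed and on `PATH`; the paths are hypothetical examples, and the article does not show the exact command that was used.

```python
import subprocess
from pathlib import Path


def patchelf_command(binary, libs_dir):
    """Build the patchelf invocation that points a binary at libs_dir."""
    return ["patchelf", "--set-rpath", str(Path(libs_dir)), str(Path(binary))]


def patch_rpath(binary, libs_dir):
    """Rewrite the binary's rpath in place (requires patchelf on PATH)."""
    subprocess.run(patchelf_command(binary, libs_dir), check=True)


# Hypothetical install layout; adjust to wherever the Agent is unpacked.
cmd = patchelf_command("/opt/mesos/sbin/mesos-agent", "/opt/mesos/libs")
print(" ".join(cmd))
```

After patching, `patchelf --print-rpath` on the same binary shows the stored search path, which is a quick way to confirm that the dynamic loader will look in the private libs directory rather than the system paths.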

Compiling Mesos on CentOS 6.2 required GCC 5.4, and we disabled the JavaDoc build step to avoid failures caused by missing protobuf jars.

Challenges :

Non‑intrusive Agent deployment without polluting the host environment.

Real-time monitoring and dynamic adjustment of Agent resources: we extended the Mesos HTTP API so that resource reservations can be modified without restarting Agents.

Fast task deployment and isolation using Marathon, attributes, and etcd for dynamic attribute management.

Resource isolation via cgroups (CPU shares, a relative weight, vs. CFS quota, a hard cap) and rootfs + chroot for filesystem isolation.

Cluster‑wide monitoring using mesos‑exporter, Prometheus, and Grafana, leveraging Mesos HTTP APIs.
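The difference between the two CPU isolation modes named in the list above comes down to simple arithmetic on the cgroup values. The sketch below uses the Linux cgroup v1 defaults (1024 shares per CPU, a 100 ms CFS period) and kernel minimums as assumptions; the function names are hypothetical and the settings used in our cluster are not stated here.

```python
CPU_SHARES_PER_CPU = 1024      # assumed cpu.shares granted per requested CPU
CFS_PERIOD_US = 100_000        # assumed 100 ms CFS enforcement period


def cpu_shares(cpus):
    """Relative weight: only matters when the host CPU is contended."""
    return max(2, int(cpus * CPU_SHARES_PER_CPU))


def cfs_quota_us(cpus, period_us=CFS_PERIOD_US):
    """Hard cap: CPU time in microseconds the task may use per period."""
    return max(1_000, int(cpus * period_us))


# A task that requests half a CPU.
print(cpu_shares(0.5), cfs_quota_us(0.5))
```

With shares alone, a half-CPU task can still burst onto idle cores, which improves utilization but makes its footprint less predictable. With a CFS quota it is throttled once it has used 50 ms of CPU time in each 100 ms period, which protects co-located business services at the cost of leaving idle cycles unused.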

Remaining issues: manual task packaging, lack of support for periodic tasks, and limited framework diversity.

In summary, we built a Mesos‑based multi‑task scheduling system that efficiently utilizes idle resources, provides dynamic resource reservation, task isolation, and comprehensive monitoring, while acknowledging areas for further improvement.

Tags: Monitoring, Resource Scheduling, Cluster Management, Mesos, Task Isolation, Docker Alternative