How Qcmd Revolutionizes Large‑Scale Server Automation Compared to SaltStack
This article explains how 360's Qcmd, a Golang‑based real‑time command execution system, overcomes SaltStack's limitations to reliably manage tens of thousands of servers with high success rates, flexible scripting, detailed monitoring, and efficient message handling.
Background
360 Private Cloud originally used SaltStack for automated operations, which initially met most needs. As network architecture grew more complex and the demand for higher task success rates increased, SaltStack became cumbersome, especially when shifting from bulk task execution to running small‑batch commands across massive clusters.
Why Qcmd Was Created
To address these challenges, the team combined proven technologies and built a SaltStack-like tool called Qcmd.
Qcmd Overview
Qcmd is a real-time, asynchronous command execution system written in Golang. It supports tens of thousands of servers, scales well, and executes commands quickly, with communication between servers completing within seconds. Its source code is simple and readable, it is easy to deploy, and it includes an automatic update module.
Key Advantages
Qcmd delivers a very high command‑execution success rate (near‑zero message loss even under poor network conditions), more reasonable timeout handling (timeouts trigger callbacks), detailed APIs that expose master status, and superior performance and task processing.
Command Execution System
The system can execute scripts on multiple hosts and combine multiple scripts. It currently covers more than 20,000 hosts, and the Qcmd engine supports execution across tens of thousands of machines.
Script Editing
In script‑editing mode, users can freely write scripts, categorize them, and define attributes such as script name, timeout, and parameter list.
Script Execution
Users can select multiple hosts to run commands. Hosts with abnormal agents are automatically placed in an “uncontrollable” list and filtered out during execution. Any host can be chosen as a test host.
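The filtering step can be sketched as a simple partition over the selected hosts. The Host type and its AgentOK field are assumptions made for this example; the article does not describe how Qcmd models hosts or detects an abnormal agent.

```go
package main

import "fmt"

// Host is a hypothetical record; Qcmd's real host model is not shown in the article.
type Host struct {
	Name    string
	AgentOK bool // false when the agent on the host is reported as abnormal
}

// partition splits the selected hosts into those that can receive commands
// and those that belong on the "uncontrollable" list.
func partition(hosts []Host) (controllable, uncontrollable []Host) {
	for _, h := range hosts {
		if h.AgentOK {
			controllable = append(controllable, h)
		} else {
			uncontrollable = append(uncontrollable, h)
		}
	}
	return controllable, uncontrollable
}

func main() {
	selected := []Host{
		{Name: "web01", AgentOK: true},
		{Name: "web02", AgentOK: false},
		{Name: "db01", AgentOK: true},
	}
	ok, bad := partition(selected)
	fmt.Println("will run on:", len(ok), "filtered out:", len(bad))
}
```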
Execution History
The history module displays details of all past tasks, allowing users to redo or cancel tasks.
Qcmd System – Principle
Each Minion corresponds to a Topic and a Channel. When the Master publishes a command to the appropriate Topic/Channel, Minions subscribe to the messages and execute the corresponding actions.
Process Flow
The master starts a local Unix socket to listen for requests. Valid requests are written to the appropriate topic; invalid ones are discarded. If no valid message exists, the system returns “No minions matched the target”.
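A rough sketch of this step in Go follows. The wire format (a comma-separated target list, a space, then the command), the helper names, and the set of known minions are all assumptions made for illustration; only the Unix socket listener and the "No minions matched the target" reply come from the description above.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// errNoMinions carries the reply text quoted in the article.
var errNoMinions = errors.New("No minions matched the target")

// matchTargets keeps the requested targets that are known minions and
// discards the rest. It fails when nothing valid remains.
func matchTargets(targets []string, known map[string]bool) ([]string, error) {
	var valid []string
	for _, t := range targets {
		if known[t] {
			valid = append(valid, t)
		}
	}
	if len(valid) == 0 {
		return nil, errNoMinions
	}
	return valid, nil
}

// handle reads one request line ("t1,t2 command", an assumed format) and replies.
func handle(conn net.Conn, known map[string]bool) {
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return
	}
	targets, cmd, _ := strings.Cut(strings.TrimSpace(line), " ")
	valid, err := matchTargets(strings.Split(targets, ","), known)
	if err != nil {
		fmt.Fprintln(conn, err)
		return
	}
	// A real master would write cmd to each matched minion's topic here.
	fmt.Fprintf(conn, "queued %q for %s\n", cmd, strings.Join(valid, ","))
}

// serve accepts connections until the listener is closed.
func serve(l net.Listener, known map[string]bool) {
	for {
		conn, err := l.Accept()
		if err != nil {
			return
		}
		go handle(conn, known)
	}
}

func main() {
	dir, err := os.MkdirTemp("", "qcmd-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	l, err := net.Listen("unix", filepath.Join(dir, "master.sock"))
	if err != nil {
		panic(err)
	}
	defer l.Close()
	go serve(l, map[string]bool{"web01": true, "db01": true})

	// Act as a local client: one known target and one unknown target.
	conn, err := net.Dial("unix", l.Addr().String())
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Fprintln(conn, "web01,ghost uptime")
	reply, _ := bufio.NewReader(conn).ReadString('\n')
	fmt.Print(reply)
}
```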
The topic launches a messageDump goroutine that continuously reads from memoryChan. If there is no data, it blocks; otherwise, it formats the data into a message and writes it to the channel's memoryChan.
The channel also launches a messageDump goroutine that reads from its memoryChan and writes to minionChan, a channel with length 1. When data is written, it blocks until a Minion reads it.
Messages in the channel live for half of the task timeout. If this exceeds max_message_expired_time, it is capped at that value. When the channel is full, the oldest message is dropped (incrementing dropcount) and the new message is inserted. Minions read from minionChan; if data is present they return it, otherwise they receive nothing.
Comparison with SaltStack
The following diagram compares Qcmd with SaltStack.
Conclusion
Future posts will periodically share the latest results and ideas about Qcmd and automated operations. Stay tuned for more insights.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.