
Introducing oc-ops: A One‑Stop OS Operations Toolset for Linux Kernel Management

The article presents oc-ops, a unified command‑line toolset for OpenCloudOS that streamlines Linux kernel management by offering standardized syntax, sub‑commands for memory cost analysis, I/O latency monitoring, and IRQ latency detection, along with detailed usage parameters and best‑practice recommendations.

Tencent Architect

1. Development Background

Linux kernel management is essential in modern computing, but the problems operators face are diverse and the tools for handling them are scattered and inconsistent, which makes troubleshooting difficult.

To improve operational efficiency, the OpenCloudOS Operations Tools SIG created oc-ops, a one-stop OS operations suite that brings these tools under a single command and speeds up problem diagnosis.

2. oc-ops Syntax

2.1 Basic Syntax

Usage: oc-ops [opt] <subcmd> [opt] [cmdargs]

Supported global options:

oc-ops -h: show usage and list all subcommands.

oc-ops -v: display version.

oc-ops -d: run in debug mode.

Design strategies for ease of use:

Tab‑key auto‑completion.

Limit subcommand arguments and prefer natural‑language aliases.

Restrict command depth to four levels.

2.2 Subcommand Usage

Use oc-ops <subcmd> -h to view detailed help for each subcommand.

3. Supported Subcommands Overview

3.1 Current Status

Install or upgrade the latest package with yum install -y opencloudos-tools.

Implemented subcommands include:

oc-ops mem checkcost
oc-ops io latency
oc-ops cpu irq latency
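
A quick way to confirm the installation and browse the help for each subcommand is sketched below. It relies only on the package name, the -v option, and the <subcmd> -h form described in this article. The OC_OPS_SUBCMDS variable and the show_all_help function are illustrative names, not part of oc-ops.

```shell
#!/bin/sh
# Subcommands listed in this article, one per line.
OC_OPS_SUBCMDS='mem checkcost
io latency
cpu irq latency'

# Hypothetical helper: print the version, then the help text of every
# listed subcommand. Returns 1 if oc-ops is not on PATH.
show_all_help() {
    if ! command -v oc-ops >/dev/null 2>&1; then
        echo "oc-ops not found; try: yum install -y opencloudos-tools" >&2
        return 1
    fi
    oc-ops -v
    echo "$OC_OPS_SUBCMDS" | while IFS= read -r sub; do
        echo "== oc-ops $sub -h =="
        # $sub is deliberately unquoted so "cpu irq latency" splits into
        # separate words; stdin is redirected so the loop input is not consumed.
        oc-ops $sub -h </dev/null
    done
}
```

Calling show_all_help after installation prints one help section per subcommand, which is a convenient starting point before reading the per-command details that follow.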

3.2 oc-ops mem checkcost

Function: Diagnose memory consumption, identify which memory domains (Buffers/Cached, AnonPages, Shmem, Slab, Vmalloc, HugePages) are responsible, and provide follow‑up suggestions.

Command: oc-ops mem checkcost [-n topn] [-d] [-h]

Parameters:

-n: show the top n memory consumers (default 3).

-d: drop caches (echo 3 > /proc/sys/vm/drop_caches) before analysis.

-h: display help.
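
To get a feel for what such a ranking looks like, the sketch below sorts the same memory domains by their sizes in /proc/meminfo. It is a rough manual approximation for illustration, not the logic oc-ops itself uses. The top_mem_domains function name and the mapping of domains to meminfo fields (Buffers plus Cached, VmallocUsed, HugePages_Total multiplied by Hugepagesize) are assumptions.

```shell
#!/bin/sh
# top_mem_domains [N] [FILE]
# Print the N largest memory domains, in kB, from a meminfo-format file.
# Defaults: N=3 (mirroring the documented -n default), FILE=/proc/meminfo.
# Note: the kernel counts Shmem inside Cached, so the domains overlap.
top_mem_domains() {
    n=${1:-3}
    file=${2:-/proc/meminfo}
    awk '
        $1 == "Buffers:" || $1 == "Cached:" { v["Buffers/Cached"] += $2 }
        $1 == "AnonPages:"       { v["AnonPages"] = $2 }
        $1 == "Shmem:"           { v["Shmem"] = $2 }
        $1 == "Slab:"            { v["Slab"] = $2 }
        $1 == "VmallocUsed:"     { v["Vmalloc"] = $2 }
        $1 == "HugePages_Total:" { pages = $2 }
        $1 == "Hugepagesize:"    { pagekb = $2 }
        END {
            # Huge page memory = number of pages * page size in kB.
            v["HugePages"] = pages * pagekb
            for (k in v) printf "%.0f %s\n", v[k], k
        }
    ' "$file" | sort -rn | head -n "$n"
}
```

Running top_mem_domains with no arguments lists the three largest domains on the local machine; passing a saved copy of /proc/meminfo as the second argument allows offline comparison.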

3.3 oc-ops io latency

Function: Monitor one or more storage devices for I/O latency exceeding defined thresholds.

Command:

oc-ops io latency -d device [-s size] [-l logdir] [-a average] [-m max] [-p period] [-Q Q2C] [-D D2C] [-r] [-k] [-h]

Key parameters:

-d: target device(s) (e.g., sda or sda,sdb).

-s: log size limit (default 1048576 KB).

-l: absolute log directory (default /data/oc-ops/io/latency).

-a: average latency threshold in seconds (default 0.2 s).

-m: maximum latency threshold in seconds (default 5 s).

-p: monitoring period in seconds (default 60 s).

-Q / -D: Q2C/D2C latency thresholds (default same as -m).

-r: retain intermediate logs for deeper analysis.

-k: kill all monitoring processes and stop collection.

-h: display help.

Usage notes: do not place the log directory on a monitored device or on tmpfs. Logging to a monitored device adds I/O to the device being measured, and logging to tmpfs adds memory pressure.
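
The usage note can be turned into a pre-flight check, sketched below. The logdir_is_safe and start_io_latency functions are hypothetical helpers, not part of oc-ops, and the wrapper assumes findmnt from util-linux is available. The device match is a simple name-prefix comparison, so it will not see through LVM or device-mapper volumes.

```shell
#!/bin/sh
# logdir_is_safe FSTYPE SOURCE DEVICES
# FSTYPE and SOURCE describe the filesystem holding the log directory;
# DEVICES is the comma-separated list that would be passed to -d.
# Returns 0 if the directory looks safe, 1 otherwise.
logdir_is_safe() {
    if [ "$1" = "tmpfs" ]; then
        return 1
    fi
    src=${2#/dev/}          # /dev/sda1 -> sda1
    old_ifs=$IFS
    IFS=,
    for dev in $3; do
        [ -n "$dev" ] || continue
        case $src in
            # Prefix match: partition sda1 lives on monitored device sda.
            "$dev"*) IFS=$old_ifs; return 1 ;;
        esac
    done
    IFS=$old_ifs
    return 0
}

# Hypothetical wrapper: verify the log directory, then start monitoring
# with the documented default thresholds spelled out explicitly.
start_io_latency() {
    devices=$1
    logdir=${2:-/data/oc-ops/io/latency}
    mkdir -p "$logdir" || return 1
    mnt_fs=$(findmnt -n -o FSTYPE -T "$logdir") || return 1
    mnt_src=$(findmnt -n -o SOURCE -T "$logdir") || return 1
    if ! logdir_is_safe "$mnt_fs" "$mnt_src" "$devices"; then
        echo "refusing: $logdir is on tmpfs or a monitored device" >&2
        return 1
    fi
    oc-ops io latency -d "$devices" -l "$logdir" -a 0.2 -m 5 -p 60
}
```

For example, start_io_latency sda,sdb would monitor both disks while logging to the default directory, and oc-ops io latency -k stops collection afterwards.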

3.4 oc-ops cpu irq latency

Function: Monitor interrupt response latency.

Command:

oc-ops cpu irq latency [-e val] [-f freq_ms] [-t threshold_ms] [-c] [-k] [-h]

Parameters:

-e: enable (1) or disable (0) IRQ latency detection.

-f: sampling frequency in ms (4 ms–1000 ms).

-t: latency threshold in ms (freq_ms–30000 ms); stacks are printed when it is exceeded.

-c: clear accumulated stack traces.

-k: keep detection alive indefinitely (by default it stops automatically after 3600 s).

-h: display help.

To view collected data:

cat /proc/irq_latency/trace_dist
cat /proc/irq_latency/trace_stack

Example, enabling IRQ latency detection with a 50 ms threshold:

oc-ops cpu irq latency -e 1 -t 50
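
Because the -t range depends on the value of -f, it can help to validate both before enabling detection. The sketch below encodes the documented ranges, assuming the bounds are inclusive. The irq_latency_args_valid, start_irq_latency, and show_irq_latency functions are hypothetical helpers, not part of oc-ops.

```shell
#!/bin/sh
# irq_latency_args_valid FREQ_MS THRESHOLD_MS
# Returns 0 when 4 <= FREQ_MS <= 1000 and FREQ_MS <= THRESHOLD_MS <= 30000
# (bounds assumed inclusive), 1 otherwise.
irq_latency_args_valid() {
    # Reject empty values and anything that is not a plain integer.
    for v in "$1" "$2"; do
        case $v in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    [ "$1" -ge 4 ] && [ "$1" -le 1000 ] &&
        [ "$2" -ge "$1" ] && [ "$2" -le 30000 ]
}

# Hypothetical wrapper: validate the values, then enable detection.
start_irq_latency() {
    if ! irq_latency_args_valid "$1" "$2"; then
        echo "invalid -f/-t values: $1 $2" >&2
        return 1
    fi
    oc-ops cpu irq latency -e 1 -f "$1" -t "$2"
}

# Hypothetical helper: dump the collected distribution and stack traces.
show_irq_latency() {
    cat /proc/irq_latency/trace_dist /proc/irq_latency/trace_stack
}
```

For example, start_irq_latency 10 50 samples every 10 ms with a 50 ms threshold, whereas start_irq_latency 100 50 is rejected because the threshold is below the sampling frequency.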

4. Conclusion

On the OpenCloudOS platform, oc-ops helps users analyze memory usage, I/O latency, and interrupt latency, which makes it faster to locate the source of a problem. The toolset is still evolving; planned additions include health checks, network packet loss detection, jitter analysis, recovery of accidentally deleted files, and signal tracing.

For more information, visit the OpenCloudOS community site and consider contributing to the project.
