
Simplify Monitoring with Categraf: All‑in‑One Agent for Metrics, Logs, and Traces

Categraf is an all‑in‑one, Go‑based monitoring agent that consolidates metric, log, and trace collection, offering remote_write support, lightweight deployment, and extensive plugin configurations to replace multiple exporters in Prometheus‑based observability stacks.

What Is Categraf

Categraf is a monitoring collection agent similar to Telegraf, Grafana‑Agent, and Datadog‑Agent, designed to provide out‑of‑the‑box data collection for common monitoring targets, including metrics, logs, and traces, using an all‑in‑one architecture.

Key Advantages

Supports the remote_write protocol and can write to Prometheus, M3DB, VictoriaMetrics, InfluxDB, etc.

Collects only numeric metric values; string values are omitted, and tags maintain a stable structure.

All‑in‑one design: a single agent handles metrics, logs, and future trace collection.

Pure Go implementation with static compilation, minimal dependencies, easy distribution and installation.

Implements best‑practice defaults, avoiding unnecessary data collection and reducing high‑cardinality issues at the source.

Provides ready‑made dashboards and alert rules for quick import.

Planned as a core component of the Flashcat (KuaiMao) SaaS product, with community contributions encouraged.
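As a sketch of the remote_write fan-out described above, conf/config.toml can declare multiple [[writers]] blocks and samples are forwarded to each of them; the VictoriaMetrics endpoint below is an illustrative assumption, not from the original article:

<code># conf/config.toml — each [[writers]] block receives every sample
[[writers]]
# n9e / Prometheus-compatible remote_write endpoint
url = "http://127.0.0.1:17000/prometheus/v1/write"

[[writers]]
# hypothetical VictoriaMetrics endpoint; adjust host/port for your setup
url = "http://127.0.0.1:8428/api/v1/write"</code>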

Installation

Installation is straightforward using binary releases:

<code># download
$ wget https://download.flashcat.cloud/categraf-v0.2.38-linux-amd64.tar.gz
# extract
$ tar xf categraf-v0.2.38-linux-amd64.tar.gz
# enter directory
$ cd categraf-v0.2.38-linux-amd64/</code>

After extraction, edit conf/config.toml to set the remote‑write URL and heartbeat options, then start the agent:

<code>$ nohup ./categraf &>categraf.log &</code>

Configuration Details

The default configuration directory conf contains several TOML/YAML files:

config.toml – main configuration.
logs.toml – log‑agent settings.
prometheus.toml – Prometheus‑agent settings.
traces.yaml – trace‑agent settings.
conf/input.*/ – plugin‑specific configurations, one directory per plugin (e.g. conf/input.procstat/procstat.toml).

Main Config (config.toml)

<code>[global]
print_configs = false
hostname = ""
omit_hostname = false
precision = "ms"
interval = 15
providers = ["local"]

[log]
file_name = "stdout"
max_size = 100
max_age = 1
max_backups = 1
local_time = true
compress = false

[writer_opt]
batch = 1000
chan_size = 1000000

[[writers]]
url = "http://127.0.0.1:17000/prometheus/v1/write"
basic_auth_user = ""
basic_auth_pass = ""
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100

[http]
enable = false
address = ":9100"
print_access = false
run_mode = "release"

[ibex]
enable = false
interval = "1000ms"
servers = ["127.0.0.1:20090"]
meta_dir = "./meta"

[heartbeat]
enable = true
url = "http://127.0.0.1:17000/v1/n9e/heartbeat"
interval = 10
basic_auth_user = ""
basic_auth_pass = ""
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
</code>

Log Collection (logs.toml)

<code>[logs]
api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
enable = false
send_to = "127.0.0.1:17878"
send_type = "http"
topic = "flashcatcloud"
use_compress = false
send_with_tls = false
batch_wait = 5
run_path = "/opt/categraf/run"
open_files_limit = 100
scan_period = 10
frame_size = 9000
collect_container_all = true

[[logs.items]]
type = "file"
path = "/opt/tomcat/logs/*.txt"
source = "tomcat"
service = "my_service"
</code>

Log processing rules can be defined globally under logs.Processing_rules or per item via logs.items.logs_processing_rules. Supported rule types include:

exclude_at_match – drop matching log lines.
include_at_match – keep only matching lines.
mask_sequences – replace sensitive patterns.
multi_line – merge multi‑line logs based on a start‑line pattern.

Metric Collection (prometheus.toml)

Categraf can also scrape Prometheus‑style metrics. To scrape kube-state-metrics, first enable the embedded Prometheus agent in prometheus.toml, then define the scrape job in the referenced in_cluster_scrape.yaml:

<code>[prometheus]
enable = false
scrape_config_file = "/path/to/in_cluster_scrape.yaml"
log_level = "info"
</code>
<code>global:
  scrape_interval: 15s
  external_labels:
    scraper: ksm-test
    cluster: test
scrape_configs:
  - job_name: "kube-state-metrics"
    metrics_path: "/metrics"
    kubernetes_sd_configs:
      - role: endpoints
        api_server: "https://172.31.0.1:443"
        tls_config:
          ca_file: /etc/kubernetes/pki/ca.crt
          cert_file: /etc/kubernetes/pki/apiserver-kubelet-client.crt
          key_file: /etc/kubernetes/pki/apiserver-kubelet-client.key
          insecure_skip_verify: true
    scheme: http
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace,__meta_kubernetes_service_name,__meta_kubernetes_endpoint_port_name]
        action: keep
        regex: kube-system;kube-state-metrics;http-metrics
remote_write:
  - url: "http://172.31.62.213/prometheus/v1/write"
</code>
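Because the scrape file uses standard Prometheus configuration syntax, it can be sanity‑checked with promtool (shipped with Prometheus distributions) before pointing Categraf at it; the path below matches the scrape_config_file setting above:

<code># validate syntax and structure of the scrape configuration
$ promtool check config /path/to/in_cluster_scrape.yaml</code>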

Trace Collection (traces.yaml)

The trace configuration wraps an OpenTelemetry Collector, allowing integration with various back‑ends; detailed settings are omitted for brevity.

Plugin Example: Process Monitoring

To monitor an Nginx process, edit conf/input.procstat/procstat.toml:

<code># collect interval
interval = 15

[[instances]]
search_exec_substring = "nginx"
metrics_name_prefix = "nginx"
labels = { region="cloud", product="n9e" }
gather_total = true
gather_per_pid = false
</code>
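As with other plugins, one way to verify the procstat configuration is the agent's test mode, which gathers and prints the nginx_* series without writing them to any backend (assuming your release supports the --test flag):

<code># run only the procstat plugin once and print the results
$ ./categraf --test --inputs procstat</code>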

After updating the plugin configuration, restart Categraf to apply changes. The collected metrics will appear in your monitoring dashboard, and you can add custom labels (e.g., group="ops") to enrich the data.

Conclusion

Categraf supports around 60 plugins covering most middleware and cloud platforms, making it a comprehensive replacement for multiple exporters. While it simplifies many monitoring scenarios, the impact on system resources and performance should be evaluated for heavily‑instrumented environments.

Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
