How to Build Docker Container Monitoring with CAdvisor, InfluxDB & Grafana
This article explains how to design and implement a Docker container monitoring system using CAdvisor for metric collection, InfluxDB for time‑series storage, and Grafana for visualization, covering deployment, integration, common issues, and practical configuration details.
1. Container Monitoring Solution Selection
When evaluating container monitoring tools, options such as docker stats, Scout, DataDog, Sysdig Cloud, and Sensu were considered. docker stats provides real‑time metrics but lacks persistence and alerting. Hosted services like Scout and DataDog are paid, while Sensu is complex to deploy. The open‑source CAdvisor was chosen for its comprehensive metrics, ease of deployment, and official Docker image.
2. Container Resource Monitoring – CAdvisor
2.1 Deployment and Operation
CAdvisor monitors container memory, CPU, network I/O, and disk I/O and offers a web UI. It stores only two minutes of data locally but can export to external databases. Deployment is simple: run the provided Docker image.
<code>docker run -d --name=cadvisor -p 8080:8080 \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
google/cadvisor:latest</code>
After starting, access http://<em>host_ip</em>:8080 to view container metrics.
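Besides the web UI, CAdvisor also serves its data over a REST API, which is useful for a quick check that collection works. The following is a minimal Python sketch, assuming the v1.3 endpoint `/api/v1.3/containers/` and a response whose `stats` list is ordered oldest to newest. The helper names `fetch_container_info` and `latest_memory_usage` are hypothetical and not part of CAdvisor.

```python
import json
from urllib.request import urlopen


def fetch_container_info(host, port=8080, timeout=5.0):
    """Fetch the root container info from CAdvisor (endpoint path is an assumption)."""
    url = f"http://{host}:{port}/api/v1.3/containers/"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def latest_memory_usage(info):
    """Return memory usage in bytes from the newest sample, or None if there is none."""
    stats = info.get("stats") or []
    if not stats:
        return None
    # Assumption: samples are appended chronologically, so the last one is newest.
    return stats[-1]["memory"]["usage"]
```

A call such as `latest_memory_usage(fetch_container_info("host_ip"))` would then report the most recent memory reading for the root container, with `host_ip` replaced by the address of the Docker host.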
2.2 Integration with InfluxDB
To persist data, CAdvisor is configured to send metrics to InfluxDB. The container is launched with the following JSON configuration:
<code>{
"binds": [
"/:/rootfs:ro",
"/var/run:/var/run:rw",
"/sys:/sys:ro",
"/home/docker/var/lib/docker/:/var/lib/docker:ro"
],
"image": "forum-cadvisor",
"labels": {"type": "cadvisor"},
"command": "-docker_only=true -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=influxdb.service.consul:8086 -storage_driver_user=testuser -storage_driver_password=testpwd",
"tag": "latest",
"hostname": "cadvisor-{{lan_ip}}"
}</code>
2.3 Common Issues
1) Runtime Errors
Missing findutils caused container start failures; installing the package resolves the problem.
2) Missing Memory Statistics
Debian disables the cgroup memory controller by default. Adding cgroup_enable=memory to /etc/default/grub, updating GRUB, and rebooting restores memory metrics.
<code>GRUB_CMDLINE_LINUX="cgroup_enable=memory"</code>
3) Incorrect Network Traffic Data
CAdvisor originally reported only the first network interface. Modifying its source to aggregate all interfaces and rebuilding the binary fixed the discrepancy.
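The idea behind that fix can be illustrated with a short Python sketch that sums the receive and transmit byte counters of every interface listed in a `/proc/<PID>/net/dev` dump. This is not the actual patch, which was made in CAdvisor's Go source. The function name `sum_interface_traffic` is hypothetical, and skipping the loopback interface is an assumption.

```python
def sum_interface_traffic(proc_net_dev_text, skip=("lo",)):
    """Sum rx/tx bytes over all interfaces in /proc/<PID>/net/dev output."""
    rx_total = 0
    tx_total = 0
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no "name:" prefix
        name, counters = line.split(":", 1)
        if name.strip() in skip:
            continue  # assumption: loopback traffic is not of interest
        fields = counters.split()
        rx_total += int(fields[0])  # first receive column is bytes
        tx_total += int(fields[8])  # first transmit column is bytes
    return rx_total, tx_total
```

Reading only the first interface would report the counters of a single line, while this version adds up every interface that is not explicitly skipped.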
2.4 CAdvisor Principles
CAdvisor mounts the host’s root and Docker directories and reads container information from cgroup files under /sys/fs/cgroup. Example of reading CPU usage:
<code># cat /sys/fs/cgroup/cpu/docker/b1f25723c5c3a17df5026cb60e1d1e1600feb293911362328bd17f671802dd31/cpuacct.stat
user 95191
system 5028</code>
Network statistics are read from /proc/<PID>/net/dev for each container.
<code># cat /proc/6748/net/dev
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth0: 6266314 512 0 0 0 0 0 0 22787 292 0 0 0 0 0 0
lo: 5926805 5601 0 0 0 0 0 0 5926805 5601 0 0 0 0 0 0</code>
3. Container Monitoring Data Storage – InfluxDB
InfluxDB is an open‑source distributed time‑series database written in Go, ideal for storing CAdvisor metrics. It runs as a Docker container with volume mounts for data persistence and Consul service registration.
Database and user setup via the InfluxDB CLI:
<code># influx
Connected to http://localhost:8086 version 1.3.5
> create database cadvisor
> create user testuser with password 'testpwd'
> grant all on cadvisor to testuser
> create retention policy "cadvisor_retention" on "cadvisor" duration 30d replication 1 default</code>
3.2 Important InfluxDB Concepts
database: logical storage (e.g., cadvisor).
timestamp: time column for each point.
fields: key‑value pairs storing metric values.
tags: indexed key‑value pairs for efficient queries.
retention policy: data lifespan configuration.
measurement: similar to a table, grouping fields and tags.
series: collection of points sharing measurement, tags, and retention policy.
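These concepts come together in InfluxDB's line protocol, where one point is written as `measurement,tags fields timestamp`. The Python sketch below formats a single point that way. The function name `to_line_protocol`, the measurement `memory_usage`, and the tag names in the usage note are illustrative assumptions, not a statement of what CAdvisor writes.

```python
def _escape(text, chars):
    """Backslash-escape the characters that are special in line protocol."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable series key
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    parts = []
    for key, val in fields.items():
        if isinstance(val, bool):
            text = "true" if val else "false"
        elif isinstance(val, int):
            text = f"{val}i"  # integers carry an "i" suffix
        elif isinstance(val, float):
            text = repr(val)
        else:
            # string field values are double-quoted
            text = '"' + str(val).replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(_escape(key, ",= ") + "=" + text)
    return f"{head} {','.join(parts)} {timestamp_ns}"
```

For example, `to_line_protocol("memory_usage", {"container_name": "web"}, {"value": 1048576}, 1500000000000000000)` yields one measurement name, one tag, one field, and a nanosecond timestamp. All points sharing that measurement and tag set belong to the same series.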
3.3 InfluxDB Features
InfluxDB provides query functions such as FILL(), INTEGRAL(), and STDDEV(), as well as continuous queries for down‑sampling historical data.
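To make down‑sampling concrete, the following Python sketch averages raw points into fixed time buckets, which is roughly what a continuous query using a mean over `GROUP BY time(...)` produces. It only illustrates the idea and is not InfluxDB's implementation. The function name `downsample_mean` and the use of integer second timestamps are assumptions.

```python
from collections import defaultdict


def downsample_mean(points, interval_s):
    """Average (timestamp, value) points into buckets of interval_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of its bucket (a multiple of interval_s).
        buckets[ts - ts % interval_s].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

With a 300 second interval, samples taken every few seconds collapse into one averaged point per five minutes, so older data can be kept at a fraction of its original size.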
4. Data Visualization with Grafana
Grafana runs as a Docker container, connecting to InfluxDB as a data source. After starting the container, access http://<em>IP</em>:8888 to configure the InfluxDB source and create dashboards. Panels can display CPU, memory, and network metrics, with proper unit selection (e.g., data (IEC) for byte‑based fields).
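Unit selection matters because byte counters are raw integers. Grafana performs the conversion itself once an IEC unit is chosen; the Python sketch below only illustrates the 1024‑based scaling such a unit implies. The function name `format_iec` and the two‑decimal formatting are assumptions for illustration.

```python
def format_iec(num_bytes):
    """Render a byte count with 1024-based (IEC) units such as KiB and MiB."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        if abs(value) < 1024 or unit == units[-1]:
            if unit == "B":
                return f"{int(value)} B"
            return f"{value:.2f} {unit}"
        value /= 1024  # move up one unit
```

A memory reading of 1048576 bytes is displayed as 1.00 MiB rather than as a seven‑digit number, which keeps panel axes readable.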
5. Conclusion
Combining CAdvisor, InfluxDB, and Grafana provides a lightweight, container‑native monitoring solution that is easy to deploy and scales with Dockerized services. The collected metrics are also valuable for anomaly detection and intelligent container scheduling algorithms.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.