How to Collect and Analyze JuiceFS Access Logs with Volcengine TLS
This article explains how to gather JuiceFS access logs with the LogCollector agent, parse and structure them in TLS, design index fields, build analytical dashboards, run advanced SQL queries for write-IO distribution, sequential-read ratios, overwrite detection, and file-lifecycle analysis, and set up real-time monitoring and alerting for performance anomalies.
Business Background
JuiceFS is a high‑performance, cloud‑native distributed file system released under the Apache 2.0 license, offering full POSIX compatibility and the ability to mount object storage as local disks across multiple hosts.
Its access logs record every operation (type, UID, GID, inode, duration, etc.) and are useful for performance analysis, auditing, and troubleshooting, but the logs are scattered across many clients and servers.
Volcengine Log Service (TLS) Features
Unified collection and management of distributed JuiceFS logs.
Parsing to a uniform structure for easier analysis.
SQL‑based deep query capabilities.
Pre‑built analysis dashboards.
Real‑time monitoring and alerting.
JuiceFS Log Format
2021.01.15 08:26:11.003330 [uid:0,gid:0,pid:4403] write (17669,8666,4993160): OK <0.000010>
The fields represent the timestamp, user/group/process IDs, operation type, operation parameters (inode, size, offset), result status, and execution time in seconds.
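For reference, a line in this format can be parsed with a regular expression; below is a minimal Python sketch (the pattern and field names are illustrative, not the exact rules used in the TLS console):

```python
import re

# One possible pattern for a JuiceFS access-log line (illustrative sketch).
LOG_RE = re.compile(
    r"(?P<time>\S+ \S+) "
    r"\[uid:(?P<uid>\d+),gid:(?P<gid>\d+),pid:(?P<pid>\d+)\] "
    r"(?P<op>\w+) \((?P<args>[^)]*)\): "
    r"(?P<result>\S+) <(?P<delay>[\d.]+)>"
)

line = ("2021.01.15 08:26:11.003330 [uid:0,gid:0,pid:4403] "
        "write (17669,8666,4993160): OK <0.000010>")
fields = LOG_RE.match(line).groupdict()
# For a write, the positional parameters are inode, length, offset.
inode, length, offset = (int(x) for x in fields["args"].split(","))
```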
Log Collection with LogCollector
LogCollector is a high‑performance, low‑resource log collector that can be installed on each JuiceFS client. After installation, you configure collection rules and start parameters via the TLS console.
Log Parsing Rules
Using TLS conditional processor plugins, the raw log line (field __content__) is parsed with regular expressions to extract the following fields:
time: timestamp
uid, gid, pid: user, group, and process IDs
op: operation type (write, read, open, etc.)
result: operation result (OK or an error)
Operation‑specific parameters such as inode, length, offset, filename, mode, filehandle, status, delay, etc.
Example regular expressions for the different operation types are provided in the original article.
TLS Index Design
Key index fields include uid, gid, pid, op, inode, length, offset, filename, filehandle, status, mode, parent_inode, delay, and time. These enable fast filtering and aggregation.
Dashboard Examples
Pre‑built dashboard templates cover common scenarios such as write‑operation counts, IO size distribution, sequential‑read ratios, overwrite analysis, and file‑lifecycle statistics. Users can also create custom dashboards.
SQL Queries
Write‑Operation Statistics
op:"write" |
SELECT COUNT(*) AS "total_count",
AVG(length) AS "avg_size",
AVG(delay)*1000 AS "avg_time_ms",
MAX(delay)*1000 AS "max_time_ms",
MIN(delay)*1000 AS "min_time_ms"
Write IO Size Distribution
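The same bucket boundaries used by the query in this section can be applied offline to exported logs; a small Python sketch of the bucketing logic (boundaries mirror the CASE expression):

```python
from collections import Counter

# Bucket boundaries, in KiB, matching the CASE expression of the IO-size query.
BOUNDS = [(4, '0~4K'), (8, '4K~8K'), (16, '8K~16K'), (32, '16K~32K'),
          (64, '32K~64K'), (128, '64K~128K'), (256, '128K~256K')]

def bucket(length: int) -> str:
    """Map a write size in bytes to its histogram bucket label."""
    for kib, label in BOUNDS:
        if length < kib * 1024:
            return label
    return '>256K'

# Example: histogram over a few write sizes.
sizes = [1000, 5000, 70000, 300000]
hist = Counter(bucket(s) for s in sizes)
```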
op: "write" |
SELECT length,
COUNT(*) AS cnt
FROM (
SELECT CASE
WHEN length < 4*1024 THEN '0~4K'
WHEN length < 8*1024 THEN '4K~8K'
WHEN length < 16*1024 THEN '8K~16K'
WHEN length < 32*1024 THEN '16K~32K'
WHEN length < 64*1024 THEN '32K~64K'
WHEN length < 128*1024 THEN '64K~128K'
WHEN length < 256*1024 THEN '128K~256K'
ELSE '>256K' END AS length
FROM log
WHERE op = "write"
) t
GROUP BY length
Sequential‑Read Ratio
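A read counts as sequential when its offset equals the previous read's offset plus length for the same (client, inode). A minimal Python sketch of that rule, which the SQL query in this section expresses with the LAG window function:

```python
def sequential_ratio(reads):
    """reads: list of (offset, length) pairs in time order for one (path, inode).
    Returns the fraction of bytes read sequentially."""
    seq_bytes = total = 0
    prev_end = None
    for offset, length in reads:
        total += length
        # Sequential if this read starts exactly where the previous one ended.
        if prev_end is not None and offset == prev_end:
            seq_bytes += length
        prev_end = offset + length
    return seq_bytes / total if total else 0.0

# Three back-to-back reads followed by a random seek: half the bytes are sequential.
ratio = sequential_ratio([(0, 4096), (4096, 4096), (8192, 4096), (100000, 4096)])
```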
op: "read" |
WITH flagged AS (
SELECT __path__, inode, length,
CASE WHEN offset = LAG(offset) OVER (PARTITION BY __path__, inode ORDER BY time)
+ LAG(length) OVER (PARTITION BY __path__, inode ORDER BY time)
THEN length ELSE 0 END AS sequentialReadSize
FROM log
WHERE op = "read"
)
SELECT CASE
WHEN ratio < 0.2 THEN '0~20%'
WHEN ratio < 0.4 THEN '20%~40%'
WHEN ratio < 0.6 THEN '40%~60%'
WHEN ratio < 0.8 THEN '60%~80%'
ELSE '80%~100%'
END AS "scope",
COUNT(*) AS cnt
FROM (
SELECT __path__, inode,
SUM(sequentialReadSize) * 1.0 / SUM(length) AS ratio
FROM flagged
GROUP BY __path__, inode
) t
GROUP BY 1
Overwrite Detection
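An overwrite is the byte-range overlap between a write and the previous write to the same inode: max(0, min(end1, end2) - max(start1, start2)). A small Python sketch of this overlap computation, which the query in this section applies to each adjacent pair of writes via LAG:

```python
def overlap(off1: int, len1: int, off2: int, len2: int) -> int:
    """Overlapping bytes between two write ranges [off, off+len)."""
    return max(0, min(off1 + len1, off2 + len2) - max(off1, off2))

# A 4 KiB write at offset 0 followed by a 4 KiB write at offset 2048
# rewrites 2048 bytes.
dup = overlap(0, 4096, 2048, 4096)
```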
op: "write" |
WITH flagged AS (
SELECT __path__, inode, offset, length,
CASE WHEN LAG(offset) OVER (PARTITION BY __path__, inode ORDER BY offset, time) IS NULL
THEN -1 ELSE LAG(offset) OVER (PARTITION BY __path__, inode ORDER BY offset, time) END AS lastOffset,
CASE WHEN LAG(length) OVER (PARTITION BY __path__, inode ORDER BY offset, time) IS NULL
THEN -1 ELSE LAG(length) OVER (PARTITION BY __path__, inode ORDER BY offset, time) END AS lastLength
FROM log
WHERE op = "write"
)
SELECT SUM( GREATEST(0, LEAST(offset+length, lastOffset+lastLength) - GREATEST(offset, lastOffset)) ) / (1024*1024*1024) AS duplicateSize
FROM flagged
WHERE lastOffset != -1 AND lastLength != -1
File Lifecycle Distribution
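File lifetime is the interval between a file's create and unlink events for the same inode. The JuiceFS timestamp format parsed by DATE_PARSE in the query below can likewise be handled with Python's datetime, for example:

```python
from datetime import datetime

FMT = '%Y.%m.%d %H:%M:%S.%f'  # matches the JuiceFS log timestamp format

def lifetime_minutes(create_time: str, unlink_time: str) -> float:
    """Minutes between the create and unlink events of one inode."""
    delta = datetime.strptime(unlink_time, FMT) - datetime.strptime(create_time, FMT)
    return delta.total_seconds() / 60

mins = lifetime_minutes('2021.01.15 08:26:11.003330', '2021.01.15 08:41:11.003330')
```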
op:"unlink" OR op:"create" |
SELECT lifeTime, COUNT(*) AS cnt
FROM (
SELECT CASE
WHEN DATE_DIFF('MINUTE', createTime, unlinkTime) < 10 THEN '0~10min'
WHEN DATE_DIFF('MINUTE', createTime, unlinkTime) < 30 THEN '10~30min'
WHEN DATE_DIFF('MINUTE', createTime, unlinkTime) < 60 THEN '30~60min'
ELSE '>60min' END AS lifeTime
FROM (
SELECT __path__, inode, DATE_PARSE(time, '%Y.%m.%d %H:%i:%s.%f') AS unlinkTime
FROM log WHERE op = 'unlink'
) AS u
JOIN (
SELECT __path__, inode, DATE_PARSE(time, '%Y.%m.%d %H:%i:%s.%f') AS createTime
FROM log WHERE op = 'create'
) AS c ON u.inode = c.inode AND u.__path__ = c.__path__
) t
GROUP BY lifeTime
ORDER BY cnt
Real‑Time Monitoring & Alerting
TLS provides alarm rules that run periodic SQL queries. When a condition (e.g., write latency > 2 seconds) is met, an alert is sent via a notification group (e.g., Feishu webhook) using a custom content template.
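As an illustration only (the webhook URL and message layout below are hypothetical examples, not part of TLS), an alert payload for a Feishu webhook could be assembled and posted like this:

```python
import json
import urllib.request

def build_alert(task: str, inode: int, consumed_time: float) -> dict:
    """Assemble a simple Feishu text-message payload (illustrative layout)."""
    text = (f"JuiceFS write latency alert: task={task}, inode={inode}, "
            f"consumedTime={consumed_time:.2f}s exceeds 2s")
    return {"msg_type": "text", "content": {"text": text}}

def send_alert(payload: dict, webhook_url: str) -> None:
    """POST the JSON payload to a webhook; the URL must be a real bot endpoint."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

payload = build_alert("/var/log/juicefs.log", 17669, 3.5)
```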
op: "write" |
SELECT __path__ AS task, inode,
SUM(length) AS writeSize,
SUM(delay) AS consumedTime
FROM log
WHERE op = "write"
GROUP BY __path__, inode
ORDER BY writeSize DESC
LIMIT 1000
The alarm condition checks consumedTime > 2 and triggers a warning-level alert every 10 minutes.
Conclusion
By leveraging Volcengine TLS for unified collection, parsing, indexing, dashboarding, SQL analysis, and alerting, JuiceFS users can turn scattered access logs into a powerful observability platform covering basic statistics, sequential‑read detection, overwrite analysis, lifecycle insights, and real‑time performance monitoring.
Volcano Engine Developer Services
The Volcano Engine Developer Community, Volcano Engine's TOD community, connects the platform with developers, offering cutting-edge tech content and diverse events, nurturing a vibrant developer culture, and co-building an open-source ecosystem.