How to Build a Scalable Business Operation Log System with Canal and Elasticsearch
This article walks through the design and implementation of a decoupled, high‑performance business operation log solution that captures MySQL binlog changes via Canal, streams them through Kafka, and stores and queries them in Elasticsearch, addressing challenges such as batch operations, multi‑table transactions, and non‑business data filtering.
Problem Origin
Business systems need operation logs to trace user actions, record data changes, and support rollback. The required features include recording user, time, function, log type, description, and before‑after data, providing a visual query page, and supporting erroneous operation rollback.
Log Types
System logs : debug, info, warn, error levels, mainly for developers and ops.
Operation logs : record actual business actions (who, when, what) and are stored in databases for users or admins.
Limitations of Traditional Implementations
Embedding logs in business code
Manually adding log statements around each DB operation increases code complexity and lacks generality.
AOP (Aspect‑Oriented Programming)
While AOP separates logging from business logic, it struggles with capturing before‑and‑after values, batch operations, and multi‑table associations. An example using newData and oldData annotations shows issues such as needing extra queries for old values, difficulty handling List types, and inability to log multi‑table operations.
Solution Exploration
To overcome these problems, the article proposes using Canal , an open‑source Alibaba component that subscribes to MySQL binlog events.
Canal Technical Principle
Canal captures real‑time changes (INSERT, UPDATE, DELETE) from MySQL binlog and converts them into consumable events.
Why Choose Canal?
Decouples business code – no code changes required.
Supports batch operations and multi‑table associations by reading changes at the database layer.
Language‑agnostic – works with any tech stack.
Canal Pros and Cons
Advantages : removes coupling of new/old data, handles batch and multi‑table scenarios, language independent.
Disadvantages : requires unified table design, may face compatibility issues with complex cascades, and needs filtering of non‑business changes.
Implementation
Data Parsing and Transformation
Canal reads the business database binlog, parses it, and pushes the events to Kafka. A sample parsed JSON includes fields such as data, old, database, table, type, and ts:
{
"data": [{
"id": "122158992930664499",
"goodsName": "新商品名称",
"update_time": "2020-08-26 13:45:46"
}],
"old": [{
"goodsName": "旧商品名称",
"update_time": "2020-08-26 09:15:13"
}],
"database": "db_business",
"table": "goods",
"type": "UPDATE",
"ts": 1587879945698
}Defining a Generic Interface
A unified API returns before‑and‑after data along with field descriptions. An example payload for a product update includes id, groupID, newObject, oldObject, fieldsDescription, action, description, operator, databaseName, tableName, module, txID, and timestamp:
{
"id": "10001",
"groupID": 1700,
"system": "01",
"newObject": {"goodsName": "商品名称001", "goodsCode": "商品编码001"},
"oldObject": {"goodsName": "商品名称", "goodsCode": "商品编码"},
"fieldsDescription": {"goodsID": "商品ID", "goodsName": "商品名称", "goodsCode": "商品编码"},
"action": 2,
"description": "修改商品信息",
"operator": "user001",
"databaseName": "db_business",
"tableName": "goods",
"module": "商品管理",
"txID": "36aef98585db4e7a98f9694c8ef28b8c",
"timestamp": 1587879945698
}Field explanations (e.g., groupID = group ID, operator = operator, action = 0‑add, 1‑delete, 2‑modify) are listed to aid front‑end display.
Storing Logs in Elasticsearch
Because the schema can vary, a NoSQL store is chosen. Elasticsearch indices are created per month per business line. The mapping defines properties such as id, groupID, bizSource, action, description, operator, databaseName, tableName, bizmodule, txId, and nested objects newObject, oldObject, and fieldsDescription:
PUT goods-nested
{
"mappings": {
"properties": {
"id": {"type": "integer"},
"groupID": {"type": "integer"},
"bizSource": {"type": "keyword"},
"action": {"type": "integer"},
"description": {"type": "keyword"},
"operator": {"type": "keyword"},
"databaseName": {"type": "keyword"},
"tableName": {"type": "keyword"},
"bizmodule": {"type": "keyword"},
"txId": {"type": "keyword"},
"newObject": {
"type": "nested",
"properties": {
"goodsID": {"type": "integer"},
"goodsName": {"type": "keyword"},
"goodsCode": {"type": "keyword"}
}
},
"oldObject": {
"type": "nested",
"properties": {
"goodsID": {"type": "integer"},
"goodsName": {"type": "keyword"},
"goodsCode": {"type": "keyword"}
}
},
"fieldsDescription": {
"type": "nested",
"properties": {
"goodsID": {"type": "integer"},
"goodsName": {"type": "keyword"},
"goodsCode": {"type": "keyword"}
}
}
}
}
}Data Insertion Example
POST goods-nested/_bulk
{ "index": { "_index": "goods-nested", "_id": "10001" } }
{ "id":"10001","groupID":1700,"bizSource":"Scm","newObject":{"goodsID":1001,"goodsName":"商品名称001","goodsCode":"商品编码001"},"oldObject":{"goodsID":1001,"goodsName":"商品名称","goodsCode":"商品编码"},"fieldsDescription":{"goodsName":"商品名称","goodsCode":"商品编码"},"action":2,"description":"修改集团品相","operator":"001","databaseName":"db_supply_chain_basic","tableName":"tbl_chain_distribution","bizmodule":"集团品相","txId":"36aef98585db4e7a98f9694c8ef28b8c" }
... (additional bulk entries omitted for brevity)Data Query Example
Query logs for a specific product ID:
GET goods-nested/_search
{
"query": {
"nested": {
"path": "newObject",
"query": {
"bool": {
"must": [
{ "match": { "newObject.goodsID": "1001" } }
]
}
}
}
}
}A more complex query filters by groupID and multiple goodsName values using terms inside a bool must clause.
Handling Multi‑Table Associations
When an operation spans multiple tables, the binlog order may be inconsistent. The solution aggregates operations belonging to the same transaction using the transaction ID ( txID) to enable complete tracing and rollback.
Filtering Non‑Business Changes
Binlog may contain changes from DB maintenance tasks. A filtering step removes these non‑business events, retaining only true business operation logs.
Conclusion
The presented design successfully decouples business logic from logging, captures database‑level changes with Canal, and stores them in Elasticsearch for efficient querying and visualization. Remaining challenges include more sophisticated multi‑table aggregation and ensuring accurate operator attribution, which require ongoing refinement.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Mingyi World Elasticsearch
The leading WeChat public account for Elasticsearch fundamentals, advanced topics, and hands‑on practice. Join us to dive deep into the ELK Stack (Elasticsearch, Logstash, Kibana, Beats).
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
