Build a Production-Ready Rule Engine with Gray Release Using Go, Kafka, and Redis
Learn how to design and implement a ready-to-use rule engine combined with a gray release system using Golang, Kafka, Redis, and CEL, complete with Docker‑compose deployment, edge execution, token‑bucket throttling, and webhook actions, plus full source code for a production‑grade marketing strategy platform.
This article guides you step‑by‑step to create a ready‑to‑use rule engine together with a gray release system, targeting marketing, recommendation, and risk‑control scenarios where rules must be updated quickly and safely.
Why Use a Rule Engine + Gray Release?
Traditional rule handling suffers from hard‑coded logic, slow deployment, lack of gray testing, inconsistent distribution, and weak observability, leading to operational risk and financial loss. A modern platform should provide visual rule management, a compiled DSL (e.g., CEL), real‑time distribution via Kafka, support for gray, full, and rollback releases, edge caching with degradation, token‑bucket rate limiting, and webhook‑based actions.
System Architecture
┌──────────────┐        ┌──────────────┐
│  Dashboard   │  POST  │  Publisher   │
│(rule config) │ ─────▶ │(gray release)│
└──────────────┘        └──────┬───────┘
                               │ Kafka
                               ▼
                       ┌───────────────┐
                       │     Edge      │
                       │(rule executor)│───▶ Webhook Server
                       └───────────────┘
                               ▲ Redis
                               │ (rule cache + rate limit)
Rule Compilation with CEL
CEL (Common Expression Language), an open-source expression language from Google, offers simple syntax, safety, and high performance (5-20× faster than Lua). Rules are written as JSON objects containing an id, a version, an enabled flag, a condition expressed in CEL, an action, a strategy that sets the gray percentage, and a limit_per_second for throttling.
{
  "id": "rule_notify",
  "version": "v1",
  "enabled": true,
  "condition": "user[\"register_days\"] < 30 && user[\"order_count\"] == 0 && ml[\"score\"] < 0.3",
  "action": {"type": "call_webhook", "params": {"url": "http://webhook-server:9090/coupon"}},
  "strategy": {"type": "percentage", "percentage": 10},
  "limit_per_second": 100
}
Compilation on the Edge side:
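env, _ := cel.NewEnv(
	cel.Variable("user", cel.MapType(cel.StringType, cel.DynType)),
	cel.Variable("ml", cel.MapType(cel.StringType, cel.DynType)), // the sample condition also reads ml["score"]
)
ast, iss := env.Compile(rule.Filter) // rule.Filter holds the CEL source ("condition" in the JSON rule)
if iss != nil && iss.Err() != nil {
	return iss.Err() // reject rules that fail to parse or type-check
}
prog, _ := env.Program(ast)
// mlInfo is assumed to hold the model output as a map, e.g. {"score": 0.12}.
out, _, _ := prog.Eval(map[string]interface{}{"user": userInfo, "ml": mlInfo})
Gray Release Mechanism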
The Publisher distributes rules via Kafka. It can publish:
100% – full rollout.
10% – gray rollout using deterministic hashing of a stable key (e.g., userId), so the same users are selected on every Edge node.
Rollback – revert to a previous version.
All Edge nodes receive the update within five seconds and apply it instantly.
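The percentage-based gray check can be sketched as a stable hash of the user ID mapped onto 100 buckets. This is a minimal sketch under stated assumptions, not the repository's implementation: the name HitGray mirrors the call used in the executor code, while the string user ID, the FNV-1a hash, and the 100-bucket scheme are illustrative choices.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// HitGray reports whether userID falls into the first `percent` of 100
// hash buckets. The same user always lands in the same bucket, so every
// Edge node makes the same decision without coordination.
// (Assumption: user IDs are strings; FNV-1a is an illustrative hash choice.)
func HitGray(percent int, userID string) bool {
	if percent <= 0 {
		return false
	}
	if percent >= 100 {
		return true
	}
	h := fnv.New32a()
	h.Write([]byte(userID)) // hash.Hash.Write never returns an error
	return int(h.Sum32()%100) < percent
}

func main() {
	hits := 0
	for i := 0; i < 10000; i++ {
		if HitGray(10, fmt.Sprintf("user-%d", i)) {
			hits++
		}
	}
	fmt.Printf("%d of 10000 users hit the 10%% gray bucket\n", hits)
}
```

Because a user's bucket does not depend on the percentage, raising the rollout from 10% to 50% keeps every user who was already in the gray group and only adds new ones.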
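msg := RuleMessage{
	RuleID: "rule_notify",
	// region is a top-level variable, so the Edge CEL environment must declare it as well.
	Expr:        `user.register_days < 30 && user.order_count == 0 && ml.score < 0.3 && (region == "CN" || region == "HK")`,
	GrayPercent: 10,
	LimitPerSec: 100,
	ActionURL:   "http://webhook-server:9090/coupon",
}
value, err := json.Marshal(msg)
if err != nil {
	return err
}
if err := writer.WriteMessages(ctx, kafka.Message{Value: value}); err != nil {
	return err // the rule was not published; surface the failure to the caller
}
Edge Local Rule Executor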
When a client request reaches an Edge node, the executor runs four steps: gray hit check, CEL expression evaluation, Redis token‑bucket rate limiting, and webhook action execution.
// 1. Gray hit?
if !HitGray(rule.GrayPercent, user.ID) {
	return "SKIPPED_GRAY"
}
// 2. Expression match? Eval returns (ref.Val, *cel.EvalDetails, error), not a bool.
out, _, err := rule.Program.Eval(vars)
if err != nil {
	return "SKIPPED_EXPR"
}
if matched, ok := out.Value().(bool); !ok || !matched {
	return "SKIPPED_EXPR"
}
// 3. Rate limit (Redis Token Bucket)
if !TokenBucketAllow(rule.RuleID, rule.LimitPerSec) {
	return "LIMITED"
}
// 4. Execute action (Webhook)
CallWebhook(rule.ActionURL, vars)
return "MATCHED"
Rate limiting is crucial in marketing bursts to prevent system overload; the token-bucket algorithm allows short spikes while protecting the coupon service.
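To make the token-bucket behaviour concrete, here is a minimal in-process sketch of the algorithm. It is an illustration only: the article keeps the bucket in Redis so that all Edge nodes share one limit, and a Redis-backed version would need the refill-and-take step to run atomically. The TokenBucket type and its methods are hypothetical names, not taken from the repository.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket is a hypothetical in-memory stand-in for the Redis-backed limiter.
type TokenBucket struct {
	mu       sync.Mutex
	rate     float64   // tokens added per second
	capacity float64   // maximum burst size
	tokens   float64   // tokens currently available
	last     time.Time // time of the last refill
}

// NewTokenBucket returns a bucket that starts full.
func NewTokenBucket(ratePerSec, burst int) *TokenBucket {
	return &TokenBucket{
		rate:     float64(ratePerSec),
		capacity: float64(burst),
		tokens:   float64(burst),
		last:     time.Now(),
	}
}

// AllowAt refills the bucket for the time elapsed since the last call,
// caps it at capacity, and then tries to take one token.
func (b *TokenBucket) AllowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// Allow is AllowAt evaluated at the current wall-clock time.
func (b *TokenBucket) Allow() bool { return b.AllowAt(time.Now()) }

func main() {
	bucket := NewTokenBucket(100, 100) // limit_per_second: 100, burst of 100
	now := time.Now()
	allowed := 0
	for i := 0; i < 150; i++ { // 150 requests arriving at the same instant
		if bucket.AllowAt(now) {
			allowed++
		}
	}
	fmt.Printf("allowed %d of 150 simultaneous requests\n", allowed)
}
```

The burst capacity is what lets a short spike through, while the refill rate bounds the sustained load that reaches the coupon service. A per-rule limiter could keep one such bucket per rule ID.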
Webhook Action Executor
The Webhook Server receives callbacks and performs business logic such as sending coupons, triggering pushes, launching marketing campaigns, notifying users, or applying risk‑control actions.
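The receiving side can be sketched as a small HTTP handler. This is a hedged illustration, not the repository's server: the /coupon path comes from the rule's action URL, but the payload shape (a user map with an id field), the response body, and the names webhookPayload, couponHandler, and invoke are assumptions. The example exercises the handler in-process so it runs without a network listener.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// webhookPayload is an assumed shape for the variables the Edge posts.
type webhookPayload struct {
	User map[string]any `json:"user"`
}

// couponHandler validates the callback and acknowledges it. The real
// business logic (issuing a coupon, sending a push) would replace the comment.
func couponHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var p webhookPayload
	if err := json.NewDecoder(r.Body).Decode(&p); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	id, ok := p.User["id"].(string) // assumption: the user map carries a string id
	if !ok || id == "" {
		http.Error(w, "missing user id", http.StatusBadRequest)
		return
	}
	// Business logic goes here, ideally idempotent so retries are safe.
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "accepted", "user_id": id})
}

// invoke runs the handler in-process and returns the status code and body.
func invoke(method, body string) (int, string) {
	req := httptest.NewRequest(method, "/coupon", strings.NewReader(body))
	rec := httptest.NewRecorder()
	couponHandler(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	code, body := invoke(http.MethodPost, `{"user":{"id":"u42","order_count":0}}`)
	fmt.Println(code, body)
	// A standalone server would instead register the handler and listen:
	//   http.HandleFunc("/coupon", couponHandler)
	//   http.ListenAndServe(":9090", nil)
}
```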
http.Post(rule.ActionURL, "application/json", bytes.NewBuffer(body))
Advantages of the Architecture
Rule authors can publish instantly without code changes.
Edge nodes update silently, ensuring zero‑downtime.
Kafka delivers each rule update to every Edge node, in the same order within a partition, which keeps the nodes consistent.
Redis provides reliable rate limiting for safety.
Gray rollout (10%) mitigates full‑scale failures.
Webhook decouples marketing actions from core logic.
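One way an Edge node can swap rules without pausing request handling is copy-on-write behind an atomic pointer. The sketch below is an assumption about how this could be done, not the repository's code: Rule, RuleStore, and their methods are hypothetical names, and atomic.Pointer requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Rule is a trimmed-down, hypothetical view of a published rule.
type Rule struct {
	ID          string
	Version     string
	GrayPercent int
}

// RuleStore holds an immutable snapshot of all rules. Readers load the
// snapshot without locking; writers build a new map and swap it in.
type RuleStore struct {
	current atomic.Pointer[map[string]Rule]
}

// NewRuleStore returns a store with an empty snapshot.
func NewRuleStore() *RuleStore {
	s := &RuleStore{}
	empty := map[string]Rule{}
	s.current.Store(&empty)
	return s
}

// Apply installs r, replacing any earlier version of the same rule.
// A rollback is simply an Apply of the previous version.
func (s *RuleStore) Apply(r Rule) {
	for {
		old := s.current.Load()
		next := make(map[string]Rule, len(*old)+1)
		for id, rule := range *old {
			next[id] = rule
		}
		next[r.ID] = r
		if s.current.CompareAndSwap(old, &next) {
			return // no concurrent writer interfered
		}
	}
}

// Get returns the rule currently active for id.
func (s *RuleStore) Get(id string) (Rule, bool) {
	r, ok := (*s.current.Load())[id]
	return r, ok
}

func main() {
	store := NewRuleStore()
	store.Apply(Rule{ID: "rule_notify", Version: "v1", GrayPercent: 100})
	store.Apply(Rule{ID: "rule_notify", Version: "v2", GrayPercent: 10})  // gray release
	store.Apply(Rule{ID: "rule_notify", Version: "v1", GrayPercent: 100}) // rollback
	r, _ := store.Get("rule_notify")
	fmt.Printf("active: %s at %d%%\n", r.Version, r.GrayPercent)
}
```

In-flight requests keep using the snapshot they loaded, so an update never exposes a half-applied rule set.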
Source Code
GitHub: https://github.com/louis-xie-programmer/rule-engine-gray
Gitee: https://gitee.com/louis_xie/rule-engine-gray
Code Wrench
Focuses on code debugging, performance optimization, and real-world engineering, sharing efficient development tips and pitfall guides. We break down technical challenges in a down-to-earth style, helping you craft handy tools so every line of code becomes a problem‑solving weapon. 🔧💻
