Implementing Gray Release in Spring Cloud with Nacos, Gateway, and Custom Load Balancer
This article is a step-by-step guide to implementing gray (canary) release in a Spring Cloud microservice architecture. It uses Nacos as the service registry and configuration center, Spring Cloud Gateway for request routing, custom Ribbon load-balancing rules for instance selection, and OpenFeign interceptors for propagating the gray tag between services, with configuration details and code examples.
Core Component Overview
Registration Center: Nacos
Gateway: Spring Cloud Gateway
Load Balancer: Ribbon (or Spring Cloud LoadBalancer)
Service‑to‑Service RPC: OpenFeign
The gray release logic relies on a GrayFlagRequestHolder that stores a GrayStatusEnum (ALL, PROD, GRAY) in a ThreadLocal. The holder is populated in a pre-filter, consulted by the custom load balancer, and cleared in a post-filter or global exception handler to avoid memory leaks.
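The holder and enum are described above but not shown. The following is a minimal sketch of what they might look like, assuming the string values "ALL", "PROD", and "GRAY" and a getByVal that returns null for unknown input; only the type and method names come from the article, and the bodies are a reconstruction rather than the author's code.

```java
import java.util.Arrays;

// Reconstruction for illustration: the string values and the null-on-unknown
// behavior of getByVal are assumptions, not taken from the article's source.
enum GrayStatusEnum {
    ALL("ALL"), PROD("PROD"), GRAY("GRAY");

    private final String val;

    GrayStatusEnum(String val) {
        this.val = val;
    }

    String getVal() {
        return val;
    }

    // Maps a header value back to a status; unknown or null input yields null.
    static GrayStatusEnum getByVal(String val) {
        return Arrays.stream(values())
                .filter(status -> status.val.equals(val))
                .findFirst()
                .orElse(null);
    }
}

// Per-thread storage for the gray status of the request being processed.
final class GrayFlagRequestHolder {
    private static final ThreadLocal<GrayStatusEnum> GRAY_TAG = new ThreadLocal<>();

    private GrayFlagRequestHolder() {
    }

    static void setGrayTag(GrayStatusEnum tag) {
        GRAY_TAG.set(tag);
    }

    static GrayStatusEnum getGrayTag() {
        return GRAY_TAG.get();
    }

    // Must be called when the request ends; pooled threads otherwise keep the old tag.
    static void remove() {
        GRAY_TAG.remove();
    }
}
```

Because the value is bound to the current thread, a pooled worker thread keeps it until remove() is called, which is why the article clears it in both a post-filter and the exception handler.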
Gray Release Implementation
When a request reaches the gateway, the GrayGatewayBeginFilter checks whether the gray switch is enabled. If enabled, it evaluates request headers, IP address, city, or user ID against the configured gray criteria and sets the appropriate GrayStatusEnum in the holder.
public class GrayGatewayBeginFilter implements GlobalFilter, Ordered {

    @Autowired
    private GrayGatewayProperties grayGatewayProperties;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        // Switch off: any version may serve the request.
        GrayStatusEnum grayStatusEnum = GrayStatusEnum.ALL;
        if (grayGatewayProperties.getEnabled()) {
            // Switch on: default to production, upgrade to gray on a criteria match.
            grayStatusEnum = GrayStatusEnum.PROD;
            if (checkGray(exchange.getRequest())) {
                grayStatusEnum = GrayStatusEnum.GRAY;
            }
        }
        GrayFlagRequestHolder.setGrayTag(grayStatusEnum);
        // Forward the tag to downstream services as a request header.
        ServerHttpRequest newRequest = exchange.getRequest().mutate()
                .header(GrayConstant.GRAY_HEADER, grayStatusEnum.getVal())
                .build();
        ServerWebExchange newExchange = exchange.mutate().request(newRequest).build();
        return chain.filter(newExchange);
    }

    // ... methods to check header, IP, city, user number ...

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE;
    }
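}

A corresponding post-filter (GrayGatewayAfterFilter) runs at the lowest precedence and removes the holder entry. As written, it calls remove() before delegating to the rest of the chain, so the entry is cleared once the earlier filters have run rather than after the downstream response arrives.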
public class GrayGatewayAfterFilter implements GlobalFilter, Ordered {

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        GrayFlagRequestHolder.remove();
        return chain.filter(exchange);
    }

    @Override
    public int getOrder() {
        return Ordered.LOWEST_PRECEDENCE;
    }
}

The global exception handler (GrayGatewayExceptionHandler) also clears the holder to prevent leaks when an exception occurs.
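The criteria check that GrayGatewayBeginFilter delegates to (checkGray) is elided in the listing above. As a rough, framework-free sketch of how such a check could work, the class below treats a request as gray when any one configured criterion matches. The class name GrayCriteria, the header names X-City and X-User-Id, and the OR semantics are all assumptions made for illustration and are not taken from the article's source.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the article's checkGray implementation.
final class GrayCriteria {
    // Assumed header names carrying the caller's city and user ID.
    static final String CITY_HEADER = "X-City";
    static final String USER_ID_HEADER = "X-User-Id";

    private final String headerKey;
    private final String headerValue;
    private final List<String> grayIps;
    private final List<String> grayCities;
    private final List<String> grayUserIds;

    GrayCriteria(String headerKey, String headerValue,
                 List<String> grayIps, List<String> grayCities, List<String> grayUserIds) {
        this.headerKey = headerKey;
        this.headerValue = headerValue;
        this.grayIps = grayIps;
        this.grayCities = grayCities;
        this.grayUserIds = grayUserIds;
    }

    // Returns true if ANY criterion matches (assumed OR semantics).
    // Header lookup here is case-sensitive, unlike real HTTP header handling.
    boolean matches(Map<String, String> headers, String clientIp) {
        return headerValue.equals(headers.get(headerKey))
                || contains(grayIps, clientIp)
                || contains(grayCities, headers.get(CITY_HEADER))
                || contains(grayUserIds, headers.get(USER_ID_HEADER));
    }

    // Null-safe membership test; immutable lists reject contains(null).
    private static boolean contains(List<String> configured, String value) {
        return value != null && configured.contains(value);
    }
}
```

In the gateway filter, the header map and client IP would come from the ServerHttpRequest; they are passed in as plain values here so the matching logic can be read and tested in isolation.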
Custom Ribbon Load‑Balancing
The abstract class AbstractGrayLoadBalancerRule extends Ribbon’s AbstractLoadBalancerRule and overrides getReachableServers and getAllServers to filter instances based on the version metadata stored in Nacos and the current gray status.
protected List<Server> getGrayServers(List<Server> servers) {
    List<Server> result = new ArrayList<>();
    String currentVersion = metaVersion;
    GrayStatusEnum grayStatusEnum = GrayFlagRequestHolder.getGrayTag();
    if (grayStatusEnum != null) {
        switch (grayStatusEnum) {
            case ALL: return servers;
            case PROD: currentVersion = grayVersionProperties.getProdVersion(); break;
            case GRAY: currentVersion = grayVersionProperties.getGrayVersion(); break;
        }
    }
    for (Server server : servers) {
        NacosServer nacosServer = (NacosServer) server;
        String version = nacosServer.getMetadata().get("version");
        if (version != null && version.equals(currentVersion)) {
            result.add(server);
        }
    }
    return result;
}

Specific implementations such as GrayRoundRobinRule reuse Ribbon's round-robin algorithm while delegating server selection to the gray-aware rule.
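To see the version filtering apart from Ribbon and Nacos, the sketch below applies the same decision to plain objects. Instance is a hypothetical stand-in for NacosServer, Status mirrors GrayStatusEnum, and ownVersion plays the role of metaVersion (the fallback when no tag is set). It illustrates the selection logic under those assumptions and is not a drop-in load-balancer rule.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, dependency-free illustration of gray-aware instance filtering.
final class GrayVersionFilter {
    enum Status { ALL, PROD, GRAY }

    // Stand-in for a registered instance; only the "version" metadata matters here.
    static final class Instance {
        final String address;
        final Map<String, String> metadata;

        Instance(String address, Map<String, String> metadata) {
            this.address = address;
            this.metadata = metadata;
        }
    }

    // ALL returns every instance; PROD/GRAY keep only the matching version;
    // a missing status falls back to the caller's own version.
    static List<Instance> select(List<Instance> all, Status status,
                                 String prodVersion, String grayVersion, String ownVersion) {
        if (status == Status.ALL) {
            return all;
        }
        String wanted = ownVersion;
        if (status == Status.PROD) {
            wanted = prodVersion;
        } else if (status == Status.GRAY) {
            wanted = grayVersion;
        }
        final String target = wanted;
        return all.stream()
                .filter(instance -> target != null
                        && target.equals(instance.metadata.get("version")))
                .collect(Collectors.toList());
    }
}
```

With one V1 and one V2 instance registered, GRAY leaves only the V2 instance, PROD leaves only V1, and ALL leaves both for the underlying round-robin algorithm to choose from.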
Spring MVC and Feign Interceptors
A GrayMvcHandlerInterceptor extracts the gray header from incoming HTTP requests and stores it in the holder, ensuring downstream Feign calls propagate the same gray tag.
public class GrayMvcHandlerInterceptor implements HandlerInterceptor {

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        String grayTag = request.getHeader(GrayConstant.GRAY_HEADER);
        if (grayTag != null) {
            GrayFlagRequestHolder.setGrayTag(GrayStatusEnum.getByVal(grayTag));
        }
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        GrayFlagRequestHolder.remove();
    }
}

The Feign interceptor (GrayFeignRequestInterceptor) adds the gray header to outbound Feign requests when a gray tag is present.
public class GrayFeignRequestInterceptor implements RequestInterceptor {

    @Override
    public void apply(RequestTemplate template) {
        GrayStatusEnum grayStatusEnum = GrayFlagRequestHolder.getGrayTag();
        if (grayStatusEnum != null) {
            template.header(GrayConstant.GRAY_HEADER, Collections.singleton(grayStatusEnum.getVal()));
        }
    }
}

Configuration Classes
Two property classes hold the gray‑related settings:
GrayGatewayProperties – enables the gateway gray switch, defines header key/value, and lists IPs, cities, and user IDs that trigger gray mode.
GrayVersionProperties – defines the production and gray version identifiers (e.g., V1 and V2).
Auto-configuration classes (GrayAutoConfiguration, GrayGatewayFilterAutoConfiguration, GrayWebMvcAutoConfiguration, GrayFeignInterceptorAutoConfiguration) conditionally register the filters and interceptors based on the presence of required classes and the kerwin.tool.gray.load flag.
Deployment and Demo
The article provides YAML snippets for Nacos global configuration, gateway configuration, and VM options to start five services (gateway, user‑app V1/V2, order‑app V1/V2). It demonstrates three scenarios:
Gray switch disabled – all traffic goes to any available version.
Gray switch enabled, no matching criteria – only production version (V1) is called.
Gray switch enabled with matching header/IP/city – traffic is routed to the gray version (V2).
Each scenario includes screenshots of the service responses showing the port and version information, confirming that the gray routing works as expected.
Discussion and Limitations
The author notes open questions such as handling distributed scheduled tasks (e.g., XXL‑Job) and message queues in a gray deployment, suggesting separate executors or MQ clusters per version. They also mention an alternative approach using Nginx + Lua scripts with separate Nacos namespaces for production and gray environments.
Source code is available at https://gitee.com/kerwin_code/spring-cloud-gray-example