How Polaris‑Mesh Server Handles Service Instance Heartbeats and Health Checks
This article explores Polaris‑Mesh’s server‑side health‑check mechanism, detailing how client heartbeat requests are received via gRPC, processed through Apiserver, Resource Auth Filter, Service, Healthcheck, and Checker Plugin, and how the system’s configuration, plugins, and time‑wheel ensure reliable instance health monitoring.
Introduction
PolarisMesh (Polaris) is Tencent’s open‑source service governance platform that addresses service management, traffic control, configuration, fault tolerance, and observability for distributed and micro‑service architectures. This article dives into the server‑side health‑check module to understand how Polaris‑server processes client heartbeat requests and stores heartbeat data.
Prerequisites
Go environment 1.20.x or higher.
VSCode or GoLand.
Clone the Polaris‑server source code from GitHub (prefer a release branch; this article uses the main branch).
Clone the Polaris‑Java source code for reference.
Client Heartbeat Registration (Code Example)
public InstanceRegisterResponse registerInstance(InstanceRegisterRequest request,
        RegisterFunction registerFunction, HeartbeatFunction heartbeatFunction) {
    // Send the registration request to the Polaris server
    InstanceRegisterResponse instanceRegisterResponse =
            registerFunction.doRegister(request, createRegisterV2Header());
    // Save the registration state locally
    RegisterState registerState = RegisterStateManager.putRegisterState(sdkContext, request);
    if (registerState != null) {
        // Schedule a periodic heartbeat task for the newly registered instance
        registerState.setTaskFuture(asyncRegisterExecutor.scheduleWithFixedDelay(
                () -> doRunHeartbeat(registerState, registerFunction, heartbeatFunction),
                request.getTtl(), request.getTtl(), TimeUnit.SECONDS));
    }
    return instanceRegisterResponse;
}

The client reports heartbeats at the TTL interval configured by the user. The heartbeat protocol between client and server is based on gRPC.
gRPC Heartbeat Service Definition
service PolarisGRPC {
    // Server receives heartbeat reports
    rpc Heartbeat(Instance) returns (Response) {}
}

Server Processing Flow
When a client sends a heartbeat packet, it passes through the following components in order:
Apiserver – adapts various network protocols and converts them into Polaris’s internal heartbeat request model.
Resource Auth Filter – performs resource authentication, e.g., verifies whether the instance is allowed to register under the target service.
Service – the control plane for service discovery and governance; handles registration, health‑status checks, rule retrieval, and CRUD operations for governance data.
Healthcheck – validates the heartbeat request, determines if the packet should be accepted, and periodically checks each instance’s health status.
Checker Plugin – performs CRUD operations on heartbeat data and, if no valid heartbeat is received within 3 × TTL, marks the instance as unhealthy.
Healthcheck Component Details
The polaris-server.yaml file configures the health‑check module:
healthcheck:
  # Enable the health-check component on this Polaris node
  open: true
  # Service name of the health-check cluster
  service: polaris.checker
  # Time-wheel parameters
  slotNum: 30
  minCheckInterval: 1s
  maxCheckInterval: 30s
  clientReportInterval: 120s
  batch:
    # Controls for batching heartbeat modifications
    heartbeat:
      open: true
      queueSize: 10240
      waitTime: 32ms
      maxBatchCount: 32
      concurrency: 64
  # Health-check plugins to enable
  checkers:
    - name: heartbeatMemory # Map-based plugin for single-node scenarios
    - name: heartbeatLeader # DB-based leader-election plugin for cluster scenarios

Key sub-components inside Healthcheck:
DefaultChecker – the default heartbeat‑based health‑check plugin.
Checkers – list of supported health‑check plugins.
CacheProvider – syncs enabled‑health‑check instance data from the Cache module into a local cache.
TimeAdjuster – aligns heartbeat timestamps across cluster nodes using a unified time source from the storage layer.
Dispatcher – builds a consistent hash ring based on healthcheck.service to determine which node is responsible for checking each instance.
CheckScheduler – receives instance data from Dispatcher, performs Add/Update/Delete, and uses a time‑wheel to trigger health checks within 3 × TTL, batching state changes to the storage layer.
LocalHost – records the IP of the current node for the health‑checker.
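The time wheel that CheckScheduler relies on can be sketched as a simple hashed wheel: each pending check is hashed into one of slotNum buckets (30 in the config above), and each tick advances a cursor one slot, firing tasks whose remaining rotations reach zero. All names below are illustrative; the actual polaris-server implementation differs in detail:

```go
package main

import "fmt"

// task is a scheduled callback; rounds counts full wheel rotations
// left before it fires.
type task struct {
	rounds int
	run    func()
}

// timeWheel is a minimal single-level time wheel.
type timeWheel struct {
	slots   [][]task
	slotNum int
	cursor  int
}

func newTimeWheel(slotNum int) *timeWheel {
	return &timeWheel{slots: make([][]task, slotNum), slotNum: slotNum}
}

// addTask schedules fn to fire after delayTicks ticks.
func (tw *timeWheel) addTask(delayTicks int, fn func()) {
	pos := (tw.cursor + delayTicks) % tw.slotNum
	tw.slots[pos] = append(tw.slots[pos], task{rounds: delayTicks / tw.slotNum, run: fn})
}

// tick advances the wheel one slot, firing due tasks and keeping the rest.
func (tw *timeWheel) tick() {
	tw.cursor = (tw.cursor + 1) % tw.slotNum
	var remaining []task
	for _, t := range tw.slots[tw.cursor] {
		if t.rounds == 0 {
			t.run()
		} else {
			t.rounds--
			remaining = append(remaining, t)
		}
	}
	tw.slots[tw.cursor] = remaining
}

func main() {
	tw := newTimeWheel(30) // mirrors slotNum: 30 from the config above
	tw.addTask(3, func() { fmt.Println("health check fired") })
	for i := 0; i < 3; i++ {
		tw.tick()
	}
}
```

A time wheel keeps insertion and expiry at O(1) per tick regardless of how many instances are being tracked, which is why it suits checking large instance sets within the 3 × TTL deadline.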
Dispatcher and Consistent Hashing
The Dispatcher creates a consistent hash ring to distribute responsibility for health‑checking enabled instances across nodes, enabling horizontal scaling of the health‑check service. Each instance is assigned a responsible node; that node stores the instance in its local map and notifies the CheckScheduler.
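The ring described above can be sketched as a generic consistent-hash ring with virtual replicas. The types and names here are hypothetical, not the actual polaris-server Dispatcher code:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// ring places health-check nodes on a hash circle; each instance is
// owned by the first node point clockwise from the instance's hash.
// Virtual replicas smooth out the load distribution across nodes.
type ring struct {
	points []uint32          // sorted node hashes on the circle
	owner  map[uint32]string // hash point -> node address
}

func newRing(nodes []string, replicas int) *ring {
	r := &ring{owner: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", n, i)))
			r.points = append(r.points, h)
			r.owner[h] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// nodeFor returns the node responsible for checking the given instance.
func (r *ring) nodeFor(instanceID string) string {
	h := crc32.ChecksumIEEE([]byte(instanceID))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the circle
	}
	return r.owner[r.points[i]]
}

func main() {
	r := newRing([]string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, 100)
	// Every instance deterministically maps to exactly one responsible node.
	fmt.Println(r.nodeFor("instance-42") != "")
}
```

Consistent hashing means that when a health-check node joins or leaves, only the instances on the affected arc of the circle are reassigned, rather than reshuffling responsibility for every instance.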
Conclusion
Polaris‑server’s health‑check architecture combines protocol adaptation, authentication, service‑level processing, and a pluggable checker framework to reliably monitor service instance liveness. The configurable time‑wheel, consistent‑hash based dispatcher, and extensible plugins allow the system to scale horizontally while providing precise, timely health status updates for micro‑service environments.
Tencent Cloud Middleware
Official account of Tencent Cloud Middleware. Focuses on microservices, messaging middleware and other cloud‑native technology trends, publishing product updates, case studies, and technical insights. Regularly hosts tech salons to share effective solutions.