
How Polaris‑Mesh Server Handles Service Instance Heartbeats and Health Checks

This article explores Polaris‑Mesh’s server‑side health‑check mechanism, detailing how client heartbeat requests are received via gRPC, processed through Apiserver, Resource Auth Filter, Service, Healthcheck, and Checker Plugin, and how the system’s configuration, plugins, and time‑wheel ensure reliable instance health monitoring.


Introduction

PolarisMesh (Polaris) is Tencent’s open‑source service governance platform that addresses service management, traffic control, configuration, fault tolerance, and observability for distributed and micro‑service architectures. This article dives into the server‑side health‑check module to understand how Polaris‑server processes client heartbeat requests and stores heartbeat data.

Prerequisites

Go environment 1.20.x or higher.

VSCode or GoLand.

Clone the Polaris‑server source code from GitHub (a release branch is preferable; this article uses the main branch).

Clone the Polaris‑Java source code for reference.

Client Heartbeat Registration (Code Example)

public InstanceRegisterResponse registerInstance(InstanceRegisterRequest request, RegisterFunction registerFunction, HeartbeatFunction heartbeatFunction) {
    // Send registration request to Polaris server
    InstanceRegisterResponse instanceRegisterResponse = registerFunction.doRegister(request, createRegisterV2Header());
    // Save registration state locally
    RegisterState registerState = RegisterStateManager.putRegisterState(sdkContext, request);
    if (registerState != null) {
        // Schedule periodic heartbeat task for the newly registered instance
        registerState.setTaskFuture(asyncRegisterExecutor.scheduleWithFixedDelay(
                () -> doRunHeartbeat(registerState, registerFunction, heartbeatFunction),
                request.getTtl(), request.getTtl(), TimeUnit.SECONDS));
    }
    return instanceRegisterResponse;
}

The client reports heartbeats at the TTL interval configured by the user. The heartbeat protocol between client and server is based on gRPC.

gRPC Heartbeat Service Definition

service PolarisGRPC {
    // Server receives heartbeat reports
    rpc Heartbeat(Instance) returns (Response) {}
}

Server Processing Flow

When a client sends a heartbeat packet, it passes through the following components in order:

Apiserver – adapts various network protocols and converts them into Polaris’s internal heartbeat request model.

Resource Auth Filter – performs resource authentication, e.g., verifies whether the instance is allowed to register under the target service.

Service – the control plane for service discovery and governance; handles registration, health‑status checks, rule retrieval, and CRUD operations for governance data.

Healthcheck – validates the heartbeat request, determines if the packet should be accepted, and periodically checks each instance’s health status.

Checker Plugin – performs CRUD operations on heartbeat data and, if no valid heartbeat is received within 3 × TTL, marks the instance as unhealthy.
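
To make the last point concrete, here is a minimal, self-contained Go sketch of the 3 × TTL expiry rule. The HeartbeatRecord type and IsHealthy helper are invented for this illustration and are not polaris-server's actual API; only the threshold itself comes from the behaviour described above.

// Illustrative sketch of the 3 × TTL expiry rule; HeartbeatRecord and IsHealthy
// are made up for this example and are not polaris-server types.
package main

import (
    "fmt"
    "time"
)

// HeartbeatRecord is the last heartbeat a checker plugin stored for one instance.
type HeartbeatRecord struct {
    InstanceID    string
    TTL           time.Duration // TTL the client registered with
    LastHeartbeat time.Time     // time of the most recent heartbeat report
}

// IsHealthy reports whether a valid heartbeat arrived within 3 × TTL.
func IsHealthy(r HeartbeatRecord, now time.Time) bool {
    return now.Sub(r.LastHeartbeat) <= 3*r.TTL
}

func main() {
    rec := HeartbeatRecord{
        InstanceID:    "ins-1",
        TTL:           5 * time.Second,
        LastHeartbeat: time.Now().Add(-20 * time.Second), // last beat 20s ago
    }
    // 20s > 3 × 5s, so the instance would be marked unhealthy.
    fmt.Println("healthy:", IsHealthy(rec, time.Now()))
}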

Healthcheck Component Details

The polaris-server.yaml file configures the health‑check module:

healthcheck:
  # Enable health‑check component on this Polaris node
  open: true
  # Service name of the health‑check cluster
  service: polaris.checker
  # Time‑wheel parameters
  slotNum: 30
  minCheckInterval: 1s
  maxCheckInterval: 30s
  clientReportInterval: 120s
  batch:
    # Control for batch heartbeat modifications
    heartbeat:
      open: true
      queueSize: 10240
      waitTime: 32ms
      maxBatchCount: 32
      concurrency: 64
  # Types of health‑check plugins
  checkers:
    - name: heartbeatMemory   # Map‑based plugin for single‑node scenarios
    - name: heartbeatLeader   # DB‑based leader election plugin for cluster scenarios

Key sub‑components inside Healthcheck:

DefaultChecker – the default heartbeat‑based health‑check plugin.

Checkers – list of supported health‑check plugins.

CacheProvider – syncs enabled‑health‑check instance data from the Cache module into a local cache.

TimeAdjuster – aligns heartbeat timestamps across cluster nodes using a unified time source from the storage layer.

Dispatcher – builds a consistent hash ring based on healthcheck.service to determine which node is responsible for checking each instance.

CheckScheduler – receives instance data from the Dispatcher, performs Add/Update/Delete operations, and uses a time‑wheel (sketched after this list) to trigger health checks within 3 × TTL, batching state changes to the storage layer.

LocalHost – records the IP of the current node for the health‑checker.
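
The time‑wheel used by CheckScheduler (and sized by slotNum above) can be pictured as a ring of slots advanced by a fixed tick, with each slot holding the check tasks due at that tick. Below is a simplified, self-contained Go sketch of such a wheel; it is a teaching example, not the implementation inside polaris-server.

// A minimal hashed time-wheel sketch, illustrating how slotNum buckets can spread
// per-instance check tasks over time. Teaching example only.
package main

import (
    "fmt"
    "time"
)

type task struct {
    instanceID string
    rounds     int // full wheel rotations remaining before the task fires
}

type TimeWheel struct {
    slots   [][]task
    tick    time.Duration
    current int
}

func NewTimeWheel(slotNum int, tick time.Duration) *TimeWheel {
    return &TimeWheel{slots: make([][]task, slotNum), tick: tick}
}

// Add schedules a check for instanceID after the given delay.
func (tw *TimeWheel) Add(instanceID string, delay time.Duration) {
    ticks := int(delay / tw.tick)
    slot := (tw.current + ticks) % len(tw.slots)
    tw.slots[slot] = append(tw.slots[slot], task{
        instanceID: instanceID,
        rounds:     ticks / len(tw.slots),
    })
}

// Advance moves the wheel one tick and returns the tasks that are due.
func (tw *TimeWheel) Advance() []string {
    tw.current = (tw.current + 1) % len(tw.slots)
    var due []string
    var remaining []task
    for _, t := range tw.slots[tw.current] {
        if t.rounds == 0 {
            due = append(due, t.instanceID)
        } else {
            t.rounds--
            remaining = append(remaining, t)
        }
    }
    tw.slots[tw.current] = remaining
    return due
}

func main() {
    tw := NewTimeWheel(30, time.Second) // 30 slots, 1s per tick, as in the sample config
    tw.Add("ins-1", 3*time.Second)
    for i := 1; i <= 5; i++ {
        if due := tw.Advance(); len(due) > 0 {
            fmt.Printf("tick %d: check %v\n", i, due)
        }
    }
}

With 30 slots and a 1‑second tick, as in the sample configuration, delays of up to 30 seconds fit in one rotation; longer delays are carried across rotations by the rounds counter.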

Dispatcher and Consistent Hashing

The Dispatcher creates a consistent hash ring to distribute responsibility for health‑checking enabled instances across nodes, enabling horizontal scaling of the health‑check service. Each instance is assigned a responsible node; that node stores the instance in its local map and notifies the CheckScheduler.
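
A compact Go sketch of that idea follows: nodes are placed on a hash ring (with virtual points for balance), and each instance is owned by the first node clockwise from its hash. The hash function, virtual-node count, and type names are arbitrary choices for illustration, not the dispatcher's actual implementation.

// Compact consistent-hash sketch mapping instance IDs to the health-check node
// responsible for them. Illustration only; not polaris-server's dispatcher code.
package main

import (
    "fmt"
    "hash/fnv"
    "sort"
)

type HashRing struct {
    points []uint32          // sorted hash points on the ring
    owner  map[uint32]string // hash point -> node address
}

func hashOf(s string) uint32 {
    h := fnv.New32a()
    h.Write([]byte(s))
    return h.Sum32()
}

// NewHashRing places each node on the ring with several virtual points.
func NewHashRing(nodes []string, virtual int) *HashRing {
    r := &HashRing{owner: map[uint32]string{}}
    for _, n := range nodes {
        for i := 0; i < virtual; i++ {
            p := hashOf(fmt.Sprintf("%s#%d", n, i))
            r.points = append(r.points, p)
            r.owner[p] = n
        }
    }
    sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
    return r
}

// NodeFor returns the node responsible for checking the given instance.
func (r *HashRing) NodeFor(instanceID string) string {
    h := hashOf(instanceID)
    i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
    if i == len(r.points) {
        i = 0 // wrap around the ring
    }
    return r.owner[r.points[i]]
}

func main() {
    ring := NewHashRing([]string{"10.0.0.1:8091", "10.0.0.2:8091", "10.0.0.3:8091"}, 100)
    for _, id := range []string{"ins-1", "ins-2", "ins-3"} {
        fmt.Printf("%s -> %s\n", id, ring.NodeFor(id))
    }
}

Because only the keys adjacent to a joining or leaving node move, adding or removing a health-check node redistributes only a small fraction of instances, which is what makes horizontal scaling of the checker cluster practical.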

Conclusion

Polaris‑server’s health‑check architecture combines protocol adaptation, authentication, service‑level processing, and a pluggable checker framework to reliably monitor service instance liveness. The configurable time‑wheel, consistent‑hash based dispatcher, and extensible plugins allow the system to scale horizontally while providing precise, timely health status updates for micro‑service environments.

Tags: Go, Service Mesh, Heartbeat, PolarisMesh, Health Check

Written by

Tencent Cloud Middleware

Official account of Tencent Cloud Middleware. Focuses on microservices, messaging middleware and other cloud‑native technology trends, publishing product updates, case studies, and technical insights. Regularly hosts tech salons to share effective solutions.
