
Publishing, Registering, Discovering, Monitoring, Tracing and Governing RPC Services in Microservice Architecture

This article explains how to describe, publish, register, discover, invoke, monitor, trace, and govern RPC services in a microservice architecture, covering RESTful API, XML configuration, IDL files, registry principles, Zookeeper deployment, connection methods, server processing models, monitoring metrics, tracing concepts, and common governance techniques such as load balancing and fault tolerance.

Architects' Tech Alliance

Oct 8, 2018

1. How to Publish and Reference Services

Service description is the first step for service invocation and can be done via three common methods: RESTful API, XML configuration, and IDL files.

RESTful API

Advantage: Uses the public HTTP/HTTPS protocol, which has almost no learning cost for consumers.

Disadvantage: Relatively low performance.

XML Configuration

Typically used by private RPC frameworks, whose custom binary protocols generally perform better than HTTP. The XML files only describe which interfaces are exposed and imported. Publishing and referencing steps:

Provider defines and implements the interface.

Provider loads server.xml at startup to expose the interface.

Consumer loads client.xml at startup to import the interface.

Advantage: High performance from the private RPC protocol.

Disadvantage: Intrusive to application code, and any change to the XML description requires both provider and consumer to update.
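The provider-side loading step can be sketched as follows. This is only an illustration: the `server.xml` schema shown here (the `services` and `service` elements and the `interface`, `ref`, and `port` attributes) is hypothetical and not taken from any particular RPC framework.

```python
import xml.etree.ElementTree as ET

# Hypothetical server.xml content; element and attribute names are
# illustrative, not the schema of any real RPC framework.
SERVER_XML = """
<services>
  <service interface="com.example.UserService" ref="userServiceImpl" port="8002"/>
  <service interface="com.example.OrderService" ref="orderServiceImpl" port="8003"/>
</services>
"""


def load_exposed_services(xml_text):
    """Parse the provider's XML and return the interfaces it exposes."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "interface": node.attrib["interface"],
            "ref": node.attrib["ref"],
            "port": int(node.attrib["port"]),
        }
        for node in root.findall("service")
    ]


services = load_exposed_services(SERVER_XML)
```

A consumer would read its client.xml in the same way to learn which interfaces to import.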

IDL Files

Interface Description Language (IDL) provides a neutral way to describe interfaces across platforms and languages. Common IDL‑based frameworks include Thrift (originally from Facebook) and gRPC (from Google, which uses Protocol Buffers as its IDL).

Advantage: Enables cross‑language service calls.

Disadvantage: Large IDL files become hard to maintain, and any change forces all consumers to update.

Summary: Choose XML for simple internal Java services, IDL for multi‑language environments, and RESTful API for external exposure.

2. How to Register and Discover Services

Registry Principle

A microservice architecture has three roles: Service Provider (RPC Server), Service Consumer (RPC Client), and Registry. Providers register the services described in server.xml with the registry; consumers subscribe to the services listed in client.xml. The registry pushes node changes to subscribed consumers, which then select a node from the returned list using a load‑balancing algorithm.

Registry Implementation

Service registration API (register, deregister, heartbeat, subscribe, query, modify).

Cluster deployment for high availability, often using Zookeeper.
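A minimal single-process sketch of part of the registration API listed above (register, deregister, query, subscribe). The class and method names are hypothetical, heartbeat and modify are left out, and a real registry would run as a replicated cluster rather than in memory.

```python
from collections import defaultdict


class InMemoryRegistry:
    """Toy registry: keeps node lists in memory and notifies subscribers."""

    def __init__(self):
        self._nodes = defaultdict(set)          # service name -> {"host:port", ...}
        self._subscribers = defaultdict(list)   # service name -> callbacks

    def register(self, service, address):
        if address not in self._nodes[service]:
            self._nodes[service].add(address)
            self._notify(service)

    def deregister(self, service, address):
        if address in self._nodes[service]:
            self._nodes[service].remove(address)
            self._notify(service)

    def query(self, service):
        return sorted(self._nodes[service])

    def subscribe(self, service, callback):
        self._subscribers[service].append(callback)
        callback(self.query(service))  # deliver the current list right away

    def _notify(self, service):
        nodes = self.query(service)
        for callback in self._subscribers[service]:
            callback(nodes)


registry = InMemoryRegistry()
seen = []  # every node list pushed to the consumer
registry.subscribe("UserService", seen.append)
registry.register("UserService", "10.0.0.1:8002")
registry.register("UserService", "10.0.0.2:8002")
registry.deregister("UserService", "10.0.0.1:8002")
```

The consumer receives a fresh node list on every change, so it never has to poll the registry.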

Zookeeper Working Principle

Each server keeps a copy of data in memory; clients can read from any server.

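A leader is elected with ZooKeeper's own election protocol (often described as Paxos‑like); the leader handles all data updates and replicates them to followers via ZAB (ZooKeeper Atomic Broadcast).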

Ensures high availability and consistency.

Directory Storage

Zookeeper stores service information in a hierarchical znode structure. Each znode has a unique path, can hold data and child znodes, and carries a version number that increases with each data update.
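The hierarchy can be pictured with a small in-memory tree. This is a sketch only: the `ZNode` and `ZNodeTree` classes and the `/services/<name>/<address>` path layout are assumptions for illustration, not ZooKeeper's client API.

```python
class ZNode:
    def __init__(self, data=b""):
        self.data = data
        self.version = 0      # incremented on every data update
        self.children = {}    # child name -> ZNode


class ZNodeTree:
    def __init__(self):
        self.root = ZNode()

    def _walk(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]  # KeyError if the path does not exist
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rstrip("/").rpartition("/")
        parent = self._walk(parent_path)    # the parent must already exist
        if name in parent.children:
            raise KeyError(f"{path} already exists")
        parent.children[name] = ZNode(data)

    def set_data(self, path, data, expected_version):
        node = self._walk(path)
        if node.version != expected_version:
            raise ValueError("version mismatch")  # someone else updated first
        node.data = data
        node.version += 1

    def get_children(self, path):
        return sorted(self._walk(path).children)


tree = ZNodeTree()
tree.create("/services")
tree.create("/services/UserService")
tree.create("/services/UserService/10.0.0.1:8002", b"weight=100")
tree.create("/services/UserService/10.0.0.2:8002", b"weight=100")
tree.set_data("/services/UserService/10.0.0.1:8002", b"weight=50", expected_version=0)
```

Listing the children of a service path yields its provider nodes, and the version check rejects an update that was based on stale data.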

Health Check

Registry monitors provider health via long‑lived TCP sessions and heartbeat messages; unresponsive sessions cause the node to be removed.
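A sketch of heartbeat-based removal, assuming a fixed timeout and explicit timestamps so the behavior is easy to follow. The `HeartbeatTracker` name and its methods are hypothetical; ZooKeeper itself handles this through session timeouts.

```python
class HeartbeatTracker:
    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._last_seen = {}  # address -> timestamp of the last heartbeat

    def heartbeat(self, address, now):
        self._last_seen[address] = now

    def expire(self, now):
        """Remove and return nodes whose last heartbeat is older than the timeout."""
        dead = sorted(
            address
            for address, seen_at in self._last_seen.items()
            if now - seen_at > self.timeout
        )
        for address in dead:
            del self._last_seen[address]
        return dead

    def alive(self):
        return sorted(self._last_seen)


tracker = HeartbeatTracker(timeout_seconds=10)
tracker.heartbeat("10.0.0.1:8002", now=0)
tracker.heartbeat("10.0.0.2:8002", now=0)
tracker.heartbeat("10.0.0.1:8002", now=8)   # only the first node keeps reporting
removed = tracker.expire(now=15)
```

In this run only the node that stopped sending heartbeats is removed; the other stays in the list.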

Change Notification

When a node is added or removed, the registry notifies all subscribed consumers via Zookeeper Watcher.

Whitelist Mechanism

Only nodes listed in a whitelist are allowed to register, preventing accidental test nodes from entering production.

Summary: The registry is the glue that decouples providers and consumers, offering high‑availability node management, health detection, and change notification.

3. How to Implement RPC Remote Calls

Client‑Server Network Connection

HTTP Communication

Based on the application‑layer HTTP protocol, which runs over TCP. Opening a connection requires a TCP three‑way handshake and closing it requires a four‑way termination, so without connection reuse every request pays this cost.

Socket Communication

Uses TCP/IP sockets. Steps:

Server binds a port with bind() and starts listening via listen().

Client connects using connect().

Server accepts the connection with accept().

Data exchange occurs via send() and recv().

Network anomalies are handled by link‑alive detection (heartbeat) and reconnection retries with back‑off intervals.
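The socket steps above can be sketched with Python's standard `socket` module on the loopback interface. The uppercase-echo behavior and the `backoff_delays` helper are illustrative assumptions, not part of any RPC framework.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection and echo back what the client sent, uppercased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def backoff_delays(base, factor, retries, cap):
    """Reconnect intervals that grow geometrically up to a cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


# Server side: bind() then listen().
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
server.listen()
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: connect(), then exchange data.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect(("127.0.0.1", port))
client.sendall(b"ping")
reply = client.recv(1024)
client.close()

worker.join()
server.close()
```

A client that loses its connection could sleep for each interval from `backoff_delays(1, 2, 5, 10)` in turn before retrying, which avoids hammering a provider that is still recovering.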

Server Request Handling Models

Synchronous Blocking (BIO)

Each request creates a new thread; suitable for low‑concurrency scenarios.

Synchronous Non‑Blocking (NIO)

Uses I/O multiplexing (select) to handle many connections with a single thread; lower overhead but more complex.

Asynchronous Non‑Blocking (AIO)

The application initiates an I/O operation and is notified when it completes; best for high‑concurrency, heavy‑I/O workloads but hardest to program.
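To make the multiplexing model concrete, here is a small sketch in which one thread serves several connections using Python's `selectors` module. In-process `socket.socketpair()` connections stand in for remote clients, and the "ok:" reply format is an arbitrary choice for the example.

```python
import selectors
import socket

selector = selectors.DefaultSelector()

# Three in-process connections stand in for three remote clients.
pairs = [socket.socketpair() for _ in range(3)]
for index, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    selector.register(server_side, selectors.EVENT_READ, data=index)

# Each client sends one request.
for index, (_server_side, client_side) in enumerate(pairs):
    client_side.sendall(f"request-{index}".encode())

# A single thread waits on all connections and handles whichever are ready.
handled = {}
for _ in range(10):  # bounded loop so the sketch cannot spin forever
    if len(handled) == len(pairs):
        break
    for key, _events in selector.select(timeout=1):
        conn = key.fileobj
        payload = conn.recv(1024)
        handled[key.data] = payload.decode()
        conn.sendall(b"ok:" + payload)

replies = [client_side.recv(1024) for _server_side, client_side in pairs]

selector.close()
for server_side, client_side in pairs:
    server_side.close()
    client_side.close()
```

Unlike the thread-per-request model, no thread is ever parked waiting on a single idle connection.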

Recommendation: Use mature frameworks such as Netty or Apache MINA.

4. How to Monitor Microservice Calls

Monitoring Objects

User‑side monitoring.

Interface (RPC) monitoring.

Resource monitoring (e.g., Redis).

Infrastructure monitoring (CPU, MEM, I/O, bandwidth).

Metrics

Request volume (QPS, PV).

Response time (average, percentile, slow‑request buckets).

Error rate.
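The metrics above can be computed from raw call samples roughly as follows. This is a sketch that assumes each sample is a `(latency_ms, ok)` tuple and uses the nearest-rank percentile definition; the `summarize` function and its output keys are hypothetical.

```python
import math


def summarize(samples, window_seconds):
    """samples: list of (latency_ms, ok) tuples collected during the window."""
    latencies = sorted(latency for latency, _ok in samples)
    total = len(samples)
    errors = sum(1 for _latency, ok in samples if not ok)

    def percentile(p):
        # Nearest rank: the smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * total))
        return latencies[rank - 1]

    return {
        "qps": total / window_seconds,
        "avg_ms": sum(latencies) / total,
        "p50_ms": percentile(50),
        "p99_ms": percentile(99),
        "error_rate": errors / total,
    }


# Ten calls in a 5-second window: latencies 10..100 ms, the slowest one failed.
samples = [(10 * i, i != 10) for i in range(1, 11)]
stats = summarize(samples, window_seconds=5)
```

Percentiles matter because an average hides the slow tail: here the average is 55 ms while the 99th percentile is 100 ms.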

Dimensions

Global, data‑center, machine, time, and core‑business dimensions.

Monitoring System Workflow

Data collection (agent or proxy).

Data transmission (UDP or Kafka).

Data processing (real‑time via Storm/Spark Streaming, offline via MapReduce/Spark).

Data storage (Elasticsearch for indexing, OpenTSDB for time‑series).

Data visualization (line charts, pie charts, heatmaps).

Sampling rate must balance real‑time accuracy with system overhead.

5. How to Trace Microservice Calls

Purpose of Tracing

Identify bottlenecks (network latency, gateway failures, service crashes, DB/cache issues).

Optimize call paths and reduce cross‑data‑center latency.

Generate topology graphs for dependency analysis.

Propagate business context (e.g., A/B testing flags) across services.

Tracing Fundamentals

traceId: Unique identifier for a user request.

spanId: Identifier for a specific RPC call within the trace.

annotation: Custom business data attached to a span.

These concepts originate from Google’s Dapper paper; modern implementations include Zipkin, Pinpoint, Alibaba EagleEye, etc.
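The relationship between traceId, spanId, and annotations can be sketched as follows. The `Span` dataclass, the dotted span-id scheme ("0", "0.1", "0.2.1"), and the copying of annotations from parent to child are illustrative assumptions rather than the format used by any specific tracing system.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    trace_id: str                 # shared by every span of one user request
    span_id: str                  # position of this call within the trace
    parent_id: Optional[str]
    name: str
    annotations: dict = field(default_factory=dict)


def start_trace(name):
    """Create the root span when a user request enters the system."""
    return Span(trace_id=uuid.uuid4().hex, span_id="0", parent_id=None, name=name)


def child_span(parent, name, index):
    """Create the span for the index-th downstream call made by `parent`."""
    return Span(
        trace_id=parent.trace_id,
        span_id=f"{parent.span_id}.{index}",
        parent_id=parent.span_id,
        name=name,
        annotations=dict(parent.annotations),  # carry business context downstream
    )


root = start_trace("GET /order")
root.annotations["ab_group"] = "B"
auth = child_span(root, "AuthService.check", 1)
order = child_span(root, "OrderService.get", 2)
db = child_span(order, "OrderDB.query", 1)
```

Because every span carries the same trace id and a parent id, a collector can reassemble the full call tree from spans reported independently by each service.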

Tracing Architecture

Data collection layer – instrument code and report spans.

Data processing layer – aggregate and store spans (e.g., HBase, Hive).

Data presentation layer – visual call‑chain graphs and topology maps.

6. Service Governance Techniques

Node Management

Provider failures (crash, process exit) and network failures.

Registry‑driven heartbeat removal and client‑side removal mechanisms.

Load‑balancing algorithms: random, round‑robin, least‑active, consistent hash.
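The four load-balancing algorithms can be sketched in a few lines each. The function and class names are hypothetical, and the consistent-hash ring uses SHA-256 with 100 virtual nodes per server as an arbitrary choice for the example.

```python
import bisect
import hashlib
import itertools
import random


def pick_random(nodes, rng=random):
    return rng.choice(nodes)


def round_robin(nodes):
    """Return an iterator that cycles through the nodes in order."""
    return itertools.cycle(nodes)


def pick_least_active(active_calls):
    """active_calls: node -> number of in-flight requests."""
    return min(active_calls, key=active_calls.get)


class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring many times to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [point for point, _node in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def pick(self, request_key):
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._keys, self._hash(request_key)) % len(self._keys)
        return self._ring[index][1]


nodes = ["10.0.0.1:8002", "10.0.0.2:8002", "10.0.0.3:8002"]
rr = round_robin(nodes)
first_four = [next(rr) for _ in range(4)]
least = pick_least_active({"10.0.0.1:8002": 3, "10.0.0.2:8002": 1, "10.0.0.3:8002": 2})
ring = ConsistentHashRing(nodes)
chosen = ring.pick("user-42")
```

With consistent hashing the same request key always maps to the same node, and removing a different node does not move that key, which is useful when providers keep per-key local caches.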

Service Routing

Static vs. dynamic routing rules (gray release, IDC‑aware routing).
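A sketch of one possible routing rule: prefer providers in the consumer's own IDC, fall back to all providers when none is local, and optionally restrict to a given version for a gray release. The node dictionary shape (`address`, `idc`, `version`) and the `route` function are assumptions made for this example.

```python
def route(nodes, consumer_idc, version=None):
    """Return candidate addresses for a consumer located in `consumer_idc`."""
    if version is not None:
        # Gray release: only nodes running the requested version qualify.
        candidates = [n for n in nodes if n["version"] == version]
    else:
        candidates = list(nodes)
    # IDC-aware routing: prefer same-IDC nodes, otherwise use every candidate.
    local = [n for n in candidates if n["idc"] == consumer_idc]
    return [n["address"] for n in (local or candidates)]


nodes = [
    {"address": "10.0.0.1:8002", "idc": "bj", "version": "v1"},
    {"address": "10.0.0.2:8002", "idc": "bj", "version": "v2"},
    {"address": "10.1.0.1:8002", "idc": "sh", "version": "v1"},
]
```

With these nodes, `route(nodes, "sh")` returns only the Shanghai node, while `route(nodes, "sh", version="v2")` falls back to the single v2 node in the other IDC. The load-balancing algorithm then picks one address from the returned list.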

Fault Tolerance

FailOver – retry on failure.

FailBack – delayed retry based on failure details.

FailCache – cache failures and retry later.

FailFast – immediate failure for non‑critical calls.

Idempotent calls can use FailOver or FailCache; non‑idempotent calls should prefer FailBack or FailFast.
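FailOver and FailFast can be contrasted in a short sketch. It assumes that a FailOver retry goes to the next node in the list, which is one common interpretation; the `CallFailed` exception and both helper functions are hypothetical.

```python
class CallFailed(Exception):
    pass


def fail_over(call, nodes, max_attempts):
    """Try nodes in order, moving on to the next one when a call fails."""
    last_error = None
    tried = nodes[:max_attempts]
    for node in tried:
        try:
            return call(node)
        except CallFailed as error:
            last_error = error
    raise CallFailed(f"all {len(tried)} attempts failed") from last_error


def fail_fast(call, node):
    """Single attempt; any error propagates to the caller immediately."""
    return call(node)


down = {"10.0.0.1:8002"}   # simulated unavailable provider
attempts = []              # records which nodes were actually called


def call(node):
    attempts.append(node)
    if node in down:
        raise CallFailed(node)
    return f"ok from {node}"


result = fail_over(call, ["10.0.0.1:8002", "10.0.0.2:8002", "10.0.0.3:8002"], max_attempts=3)
```

Because FailOver may execute the same request more than once when a failure happens after the provider has already processed it, it suits idempotent calls, in line with the guidance above.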

Overall Summary: The article provides a comprehensive guide to service description, registration, discovery, RPC communication, monitoring, tracing, and governance in microservice systems, illustrating best‑practice patterns and common open‑source tools.
