
Connection Management in ZhiZhi RPC Framework SCF

This article explains how the ZhiZhi RPC framework SCF handles connection lifecycle, including establishment timing, connection pooling versus multiplexing, keep‑alive mechanisms, automatic recovery, and graceful shutdown, providing practical guidance for designing robust RPC protocols.

Zhuanzhuan Tech

1. Timing of Connection Establishment

When the RPC client creates a dynamic proxy, it must also discover service nodes and establish connections. SCF can be configured to do both eagerly at startup or lazily on the first invocation: eager discovery and connection reduce first‑request latency, while the lazy options help break circular startup dependencies and speed up test scenarios.
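The eager/lazy trade‑off can be sketched with a client whose connect step either runs at construction time or is deferred to the first call. This is an illustrative sketch, not SCF's actual API; `RpcClient` and the `connector` callback are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: RpcClient and connector are illustrative names,
// not SCF's real classes.
class RpcClient {
    private final Supplier<String> connector;                    // performs discovery + connect
    private final AtomicReference<String> connection = new AtomicReference<>();

    RpcClient(Supplier<String> connector, boolean eager) {
        this.connector = connector;
        if (eager) {
            connection.set(connector.get());                     // connect at proxy creation
        }
    }

    // Lazy mode pays the discovery/connection cost on the first invocation.
    String invoke(String request) {
        connection.compareAndSet(null, connector.get());
        return "sent " + request + " over " + connection.get();
    }
}
```

The eager flag trades slower startup for a faster first request; the lazy path is what lets two services that call each other start up without deadlocking on discovery.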

2. How Many Connections Are Needed

2.1 Connection Pool

Connection pools (e.g., DB, Redis, HTTP) enable reuse of TCP connections, which are heavyweight due to the three‑way handshake and OS resources. Pools also increase concurrency because a single HTTP/1.1 connection can handle only one outstanding request, while multiplexed protocols can handle many.
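The reuse idea can be sketched as a minimal borrow/return pool over a blocking queue. The generic connection type here is a placeholder, not a real SCF or JDBC class.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Illustrative pool sketch; the pooled type C is a placeholder for a real connection.
class ConnectionPool<C> {
    private final BlockingQueue<C> idle;

    ConnectionPool(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());            // pay the handshake cost once, up front
        }
    }

    // Blocks when all connections are checked out: concurrency is capped by the
    // pool size, mirroring HTTP/1.1's one-outstanding-request-per-connection limit.
    C borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    void giveBack(C conn) {
        idle.offer(conn);
    }
}
```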

HTTP/1.1 pipelining and Redis pipelining improve throughput, but because responses must still come back in request order they suffer from head‑of‑line blocking; multiple connections are therefore often required.

2.2 Multiplexing

Multiplexing allows concurrent requests on a single TCP connection when the application protocol supports request IDs (e.g., HTTP/2 stream IDs). SCF’s protocol is similar to HTTP/2 but does not split a request/response into multiple frames.
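The request‑ID correlation at the heart of multiplexing can be sketched as a map from ID to pending future; each response completes the right caller regardless of arrival order. All names here are illustrative, not SCF's internals.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of multiplexing bookkeeping on a single connection: each outgoing
// request gets a unique ID, and the matching response completes the right
// future even when responses arrive out of order.
class Multiplexer {
    private final AtomicLong nextId = new AtomicLong();
    private final ConcurrentHashMap<Long, CompletableFuture<String>> pending =
            new ConcurrentHashMap<>();

    long send(String request) {                  // the frame would carry this ID on the wire
        long id = nextId.incrementAndGet();
        pending.put(id, new CompletableFuture<>());
        return id;
    }

    CompletableFuture<String> futureOf(long id) {
        return pending.get(id);
    }

    void onResponse(long id, String body) {      // called by the reader thread
        CompletableFuture<String> f = pending.remove(id);
        if (f != null) f.complete(body);
    }
}
```

This is also why the protocol must carry a request ID: without it, the reader has no way to match an out‑of‑order response to its caller.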

When multiplexing is supported, a connection pool becomes unnecessary; SCF normally creates a single TCP connection per client‑server pair, adding more only when large data volumes demand it.

3. Connection Keep‑Alive

3.1 TCP Keep‑Alive

Keep‑alive is not part of the TCP specification; it may close healthy connections, waste bandwidth, or increase costs. Many implementations provide it, but it is often better handled at the application layer. — TCP/IP Illustrated, Volume 1, Xie Xiren

TCP keep‑alive can be enabled via SO_KEEPALIVE, but the default interval (≈2 hours) is too long for most services.
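Enabling the option on a plain `java.net.Socket` (Netty exposes the same switch as `ChannelOption.SO_KEEPALIVE`) looks like this; note that the probe interval is a kernel setting (e.g. `tcp_keepalive_time` on Linux) and cannot be changed from this portable API:

```java
import java.io.IOException;
import java.net.Socket;

// Enables the OS-level TCP keep-alive probe on a socket. The ~2-hour default
// probe interval is configured in the kernel, not through this API.
class KeepAliveDemo {
    static boolean demoKeepAlive() {
        try (Socket socket = new Socket()) {     // not yet connected; the option still applies
            socket.setKeepAlive(true);
            return socket.getKeepAlive();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```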

3.2 Application‑Level Keep‑Alive

SCF uses Netty’s IdleStateHandler with two parameters: readerIdleTime (default 3 s) and idleTimeout (default 10 s). If no read occurs within readerIdleTime, the connection is marked idle; if it stays idle beyond idleTimeout, it is closed after a heartbeat exchange.

The handler records the last read time in a channel attribute. Upon an IdleStateEvent, if the elapsed time since that read is less than idleTimeout, a heartbeat is sent; otherwise the connection is terminated.
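The decision made on each idle event reduces to a comparison against the last recorded read time. This is a sketch of that logic only, not SCF's actual handler code:

```java
// Pure decision logic behind the idle handling described above: on each idle
// event, either send a heartbeat or close, depending on how long the
// connection has been silent relative to idleTimeout.
class IdlePolicy {
    enum Action { SEND_HEARTBEAT, CLOSE }

    private final long idleTimeoutMillis;

    IdlePolicy(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    Action onIdleEvent(long lastReadMillis, long nowMillis) {
        long silentFor = nowMillis - lastReadMillis;   // time since the last read
        return silentFor < idleTimeoutMillis ? Action.SEND_HEARTBEAT : Action.CLOSE;
    }
}
```

With the defaults from the text (readerIdleTime 3 s, idleTimeout 10 s), a healthy connection sees an idle event roughly every 3 s of silence and answers it with a heartbeat; only when heartbeats go unanswered past 10 s does the policy switch to closing.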

On the server side, readerIdleTime defaults to 20 s; the server replies to client heartbeats and closes the channel when an idle event exceeds the timeout.

Healthy‑connection heartbeat sequence: after readerIdleTime of silence the client sends a heartbeat, the server replies, and the idle timer resets, so the connection stays open.

Unhealthy‑connection heartbeat sequence: heartbeats go unanswered, the silence exceeds idleTimeout, and the client closes the connection.

4. Automatic Recovery

When a connection is detected as broken (e.g., network fault, machine crash, GC pause), SCF schedules a reconnection task that runs every 5 seconds to re‑establish the link.
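A fixed‑interval reconnection loop can be sketched with a ScheduledExecutorService; the interval is parameterized here (SCF's is 5 s), and the tryConnect callback is a hypothetical stand‑in for the actual reconnect attempt.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Illustrative reconnect loop: tryConnect is a hypothetical callback that
// returns true once the link is back; the task then cancels itself.
class Reconnector {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void start(BooleanSupplier tryConnect, long intervalMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            if (tryConnect.getAsBoolean()) {
                scheduler.shutdown();            // connection restored; stop retrying
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    boolean awaitDone(long timeoutMillis) {
        try {
            return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Using scheduleWithFixedDelay rather than a fixed rate means a slow connect attempt never piles up overlapping retries.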

5. Graceful Shutdown

Netty separates boss and worker event‑loop groups. The boss group stops accepting new connections, then the server sends a shutdown event to existing channels without closing them immediately, allowing in‑flight requests to finish.

Clients receiving the shutdown event mark the channel as closing, stop routing new requests, and start a timer that checks for pending responses and writer idle time. When no pending requests remain and the writer idle time exceeds a silent period, the channel is closed.
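The client‑side check described above — close only when nothing is pending and the writer has been idle past a silent period — reduces to a small predicate. This sketches the condition only; the names are illustrative.

```java
// Predicate behind the client's graceful-close timer: the channel may be
// closed only once all in-flight requests have completed AND the writer
// has been silent longer than the configured grace period.
class GracefulCloseCheck {
    private final long silentPeriodMillis;

    GracefulCloseCheck(long silentPeriodMillis) {
        this.silentPeriodMillis = silentPeriodMillis;
    }

    boolean mayClose(int pendingRequests, long lastWriteMillis, long nowMillis) {
        boolean drained = pendingRequests == 0;
        boolean writerIdle = nowMillis - lastWriteMillis > silentPeriodMillis;
        return drained && writerIdle;
    }
}
```

The writer‑idle condition matters because "no pending responses" alone is racy: a request written just before the check could still be in flight, so the silent period gives it time to register as pending.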

After broadcasting the shutdown event, the server periodically checks the number of active connections; once all are closed, it shuts down the worker group.
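The server's final step — wait for active connections to drain, then stop the workers — can be sketched as a counter decremented by channel‑close callbacks. The busy‑wait is shown for brevity; a real implementation would poll on a timer as the text describes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative server-side drain: activeConnections would be decremented by
// channel-close callbacks; once it reaches zero, the worker event-loop group
// can safely be shut down.
class ServerDrain {
    private final AtomicInteger activeConnections;

    ServerDrain(int active) {
        this.activeConnections = new AtomicInteger(active);
    }

    void onChannelClosed() {
        activeConnections.decrementAndGet();
    }

    // Returns true when all connections have closed within the timeout.
    boolean awaitDrained(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (activeConnections.get() > 0) {
            if (System.currentTimeMillis() > deadline) return false;
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;       // safe to shut down the worker event-loop group
    }
}
```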

6. Summary

The article provides a detailed overview of SCF’s connection management, covering establishment timing, connection‑count strategies, keep‑alive mechanisms, automatic recovery, and graceful shutdown, offering practical insights for developers designing their own RPC protocols.

Written by

Zhuanzhuan Tech

A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
