Designing Efficient RPC Message Protocols: Boundaries, Structures, and Compression

This article explains the core principles of RPC message protocol design, covering how to determine message boundaries, choose between delimiter and length‑prefix methods, handle explicit and implicit message structures, apply compression, and implement varint and zigzag encoding for optimal traffic efficiency.

Java Backend Technology

This section covers the basic principles behind RPC message protocol design and the main points to consider when developing an RPC protocol. Once these principles are clear, we can design our own protocol for a custom RPC system.

For a stream of messages we must be able to determine message boundaries, extract each message's byte‑segment, and deserialize it according to defined rules.

Message representation refers to the visual form of the serialized byte stream; text is human‑friendly, binary is computer‑friendly.

Each message has an internal field structure that dictates the order of field serialization.

Message Boundaries

RPC transmits multiple messages over a single TCP connection, and TCP delivers an undifferentiated byte stream, so a clear rule is required for separating consecutive messages. Two common splitting methods are the special delimiter method and the length-prefix method.

Special delimiter: the sender appends a unique delimiter (commonly \r\n) to each message and ensures the payload does not contain this delimiter. When the receiver encounters \r\n, it knows the preceding bytes form a complete message. Protocols such as HTTP and Redis use this delimiter, which works well for text messages.
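
The delimiter method can be sketched roughly as follows. This is a minimal illustration, not taken from any real protocol implementation: the class name DelimiterFraming and its two methods are hypothetical, and it assumes UTF-8 text messages terminated by \r\n.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating \r\n-delimited text messages. */
final class DelimiterFraming {
    /** Appends the \r\n delimiter; rejects payloads that already contain it. */
    static byte[] frame(String text) {
        if (text.contains("\r\n")) {
            throw new IllegalArgumentException("payload must not contain \\r\\n");
        }
        return (text + "\r\n").getBytes(StandardCharsets.UTF_8);
    }

    /** Splits received bytes into complete messages; an unterminated tail is ignored. */
    static List<String> split(byte[] received) {
        List<String> messages = new ArrayList<>();
        int start = 0;
        for (int i = 0; i + 1 < received.length; i++) {
            if (received[i] == '\r' && received[i + 1] == '\n') {
                messages.add(new String(received, start, i - start, StandardCharsets.UTF_8));
                start = i + 2;
                i++; // skip the '\n' that was just consumed
            }
        }
        return messages;
    }
}
```

A real receiver would keep the unterminated tail and prepend it to the next chunk read from the socket; this sketch simply drops it to stay short.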

Length‑prefix: the sender adds a 4‑byte integer at the start of each message indicating the body length. The receiver reads the length first, then reads that many bytes to obtain the full message. This method suits binary protocols.
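
A minimal sketch of length-prefix framing is shown here. The class and method names are hypothetical, and the byte order of the prefix (big-endian, the ByteBuffer default) is an assumption, since the text only specifies a 4-byte integer.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper illustrating 4-byte length-prefixed messages. */
final class LengthPrefixFraming {
    /** Prepends a 4-byte big-endian length (assumed byte order) to the body. */
    static byte[] frame(byte[] body) {
        return ByteBuffer.allocate(4 + body.length)
                .putInt(body.length)
                .put(body)
                .array();
    }

    /**
     * Tries to read one message from the buffer. Returns null and leaves the
     * position unchanged when the full message has not arrived yet.
     */
    static byte[] tryRead(ByteBuffer in) {
        if (in.remaining() < 4) {
            return null; // length prefix incomplete
        }
        in.mark();
        int length = in.getInt();
        if (length < 0) {
            throw new IllegalStateException("negative length: " + length);
        }
        if (in.remaining() < length) {
            in.reset(); // body incomplete: rewind to before the prefix
            return null;
        }
        byte[] body = new byte[length];
        in.get(body);
        return body;
    }
}
```

Because the receiver knows exactly how many bytes to expect, the body can contain any byte values, including \r\n. A production decoder would usually also cap the accepted length so that a corrupt prefix cannot trigger a huge allocation.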

The two methods have complementary strengths: delimiters keep messages readable but are unsuitable for binary data, which may itself contain the delimiter bytes; length prefixes handle any data but are less readable. HTTP combines both: headers are text lines separated by \r\n, while the body length is given by the Content-Length header.

HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.13
Date: Thu, 10 May 2018 02:38:03 GMT
Content-type: text/html; charset=utf-8
Content-Length: 10393
# 10393 bytes of message body omitted

Message Structure

Messages may have explicit or implicit structures. JSON is an explicit structure where the format is directly visible in the payload.

{
  "firstName": "John",
  "lastName": "Smith",
  "gender": "male",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "phoneNumber": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "fax", "number": "646 555-4567"}
  ]
}

JSON is readable but redundant; each message repeats keys even when only values change. To reduce traffic, protocols like Avro negotiate the structure once and then transmit only values.

Implicit structures rely on code to define field order; the payload is pure binary and the meaning of each byte is determined by the program.

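// java.io data streams stand in here for the writeByte/writeStr and
// readByte/readStr helpers, which the article does not show.
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sender writes message: fields go out in a fixed order, with no field names
class AuthUserOutput {
    int platformId;
    long deviceId;
    String productId;
    String channelId;
    String versionId;
    String phoneModel;

    public void writeImpl(DataOutputStream out) throws IOException {
        out.writeByte(platformId);   // only the low 8 bits are written
        out.writeLong(deviceId);
        out.writeUTF(productId);     // 2-byte length, then modified UTF-8 bytes
        out.writeUTF(channelId);
        out.writeUTF(versionId);
        out.writeUTF(phoneModel);
    }
}

// Receiver reads message: fields must be read in exactly the order they were written
class AuthorizeInput {
    int platformId;
    long deviceId;
    String productId;
    String channelId;
    String versionId;
    String phoneModel;

    public void readImpl(DataInputStream in) throws IOException {
        this.platformId = in.readByte();
        this.deviceId = in.readLong();
        this.productId = in.readUTF();
        this.channelId = in.readUTF();
        this.versionId = in.readUTF();
        this.phoneModel = in.readUTF();
    }
}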

Implicit structures save bandwidth because no structural metadata is transmitted.

Message Compression

When messages become large, compression can reduce bandwidth at the cost of increased CPU usage. Prefer compression libraries with native (C or C++) implementations for performance; Google's Snappy is a popular choice and is used by Alibaba's SOFA RPC.
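
One way to apply this trade-off is to compress only bodies above a size threshold and mark the result with a flag byte, as in the sketch below. Everything here is an assumption for illustration: the class name BodyCodec, the 1024-byte threshold, and the flag layout are hypothetical, and the JDK's built-in Deflater is used as a stand-in for Snappy, which is not part of the standard library.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical codec: flag byte (0 = raw, 1 = deflate) followed by the body. */
final class BodyCodec {
    static final int THRESHOLD = 1024; // hypothetical cut-off in bytes

    static byte[] encode(byte[] body) {
        if (body.length < THRESHOLD) {
            return withFlag((byte) 0, body); // small bodies are sent as-is
        }
        Deflater deflater = new Deflater();
        deflater.setInput(body);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return withFlag((byte) 1, out.toByteArray());
    }

    static byte[] decode(byte[] encoded) {
        byte[] payload = Arrays.copyOfRange(encoded, 1, encoded.length);
        if (encoded[0] == 0) {
            return payload;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(payload);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated compressed body");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed body", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    private static byte[] withFlag(byte flag, byte[] data) {
        byte[] result = new byte[data.length + 1];
        result[0] = flag;
        System.arraycopy(data, 0, result, 1, data.length);
        return result;
    }
}
```

The flag byte lets the receiver decide per message whether to decompress, so small messages avoid the CPU cost entirely.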

Extreme Traffic Optimization

Many RPC protocols use variable‑length integers (varint) to encode small numbers in fewer bytes. The most‑significant bit of each byte indicates whether more bytes follow (1 = continue, 0 = last).
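
A minimal varint sketch for 32-bit values follows. The class name Varint is hypothetical, and the group order (lowest 7 bits first, as in Protocol Buffers) is an assumption, since the text only describes the continuation bit.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical varint codec: 7 data bits per byte, lowest group first. */
final class Varint {
    /** Encodes the int as an unsigned 32-bit value in 1 to 5 bytes. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit 1: more bytes follow
            value >>>= 7;
        }
        out.write(value); // high bit 0: last byte
        return out.toByteArray();
    }

    /** Decodes a varint that starts at index 0 of the array. */
    static int decode(byte[] bytes) {
        int result = 0;
        for (int i = 0; i < bytes.length && i < 5; i++) {
            result |= (bytes[i] & 0x7F) << (7 * i);
            if ((bytes[i] & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }
}
```

Under these assumptions, values up to 127 take one byte, 300 is encoded as the two bytes 0xAC 0x02, and a negative int such as -1 takes the full five bytes.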

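Negative numbers need special handling: in two's complement their high bits are set, so a plain varint would always use the maximum number of bytes. Zigzag encoding solves this by mapping signed integers to unsigned values before varint encoding, so that values of small magnitude stay small.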

0  => 0
-1 => 1
1  => 2
-2 => 3
2  => 4
-3 => 5
3  => 6

Zigzag maps negative integers to odd numbers and non-negative integers to even numbers; decoding reverses the mapping.
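
The mapping in the table can be computed with two bit operations each way. The sketch below covers 32-bit integers; the class name ZigZag is hypothetical.

```java
/** Hypothetical zigzag codec for 32-bit integers. */
final class ZigZag {
    /** Maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... */
    static int encode(int n) {
        // (n >> 31) is all ones for negative n and all zeros otherwise.
        return (n << 1) ^ (n >> 31);
    }

    /** Reverses encode: the lowest bit carries the sign. */
    static int decode(int z) {
        return (z >>> 1) ^ -(z & 1);
    }
}
```

The encoded result should be treated as an unsigned value when it is passed on to a varint encoder.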

Summary

We have covered the fundamental principles of RPC message protocol design, including how to define clear message boundaries, choose appropriate structure representations, apply compression wisely, and employ varint and zigzag techniques for traffic optimization. Following these guidelines enables the creation of efficient, custom RPC protocols.

The next section will analyze the widely used Redis message protocol as a concrete example.

Exercise

Implement a varint and zigzag encoder/decoder. Efficiency is not required; focus on correct input and output handling.
