Choosing and Optimizing Serialization for High‑Performance Messaging

The article explains why serialization is essential for inter‑process communication, compares common formats like JSON, Protobuf, Kryo, and custom binary schemes, outlines selection criteria such as readability, complexity, speed and density, and provides code examples and interview‑style Q&A for high‑performance messaging systems.

JavaEdge

Why Serialization Matters

When processes communicate over a network, they exchange binary streams. Programming languages and network frameworks expose APIs that send and receive bytes, but the data we need to transmit is usually structured—commands, text, or messages represented as objects. Converting these objects to a byte stream (serialization) and back (deserialization) is therefore essential.
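As a minimal sketch of that round trip, the example below turns a small object into bytes and back using only the JDK's DataOutputStream and DataInputStream. The ChatMessage class and its two fields are hypothetical, chosen only for illustration; they do not come from any particular framework.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical message type, used only for illustration.
final class ChatMessage {
    final int command;
    final String text;

    ChatMessage(int command, String text) {
        this.command = command;
        this.text = text;
    }

    // Serialization: structured object -> byte stream.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(command); // 4 bytes, big-endian
            out.writeUTF(text);    // 2-byte length prefix + modified UTF-8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: byte stream -> structured object.
    static ChatMessage fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int command = in.readInt();
            String text = in.readUTF();
            return new ChatMessage(command, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = new ChatMessage(1, "hello").toBytes();
        ChatMessage copy = ChatMessage.fromBytes(wire);
        System.out.println(wire.length + " bytes -> " + copy.command + " " + copy.text);
    }
}
```

The byte array is what would actually be handed to a socket; the receiver needs to know the same field order and encoding to rebuild the object.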

Common Uses of Serialization

Beyond network transmission, serialization is used to persist objects to files. In large‑scale data scenarios, objects are serialized to disk to free memory and later deserialized, ensuring data durability and reducing memory pressure.
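The spill-to-disk idea can be sketched with the JDK alone. The SpillFile helper below is hypothetical: it writes a list of string records to a temporary file, which would let the in-memory list be released, and later reads the records back. A real system would also need to handle versioning and partial writes, which this sketch ignores.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: spills a list of records to disk and loads it back.
final class SpillFile {
    static void write(Path file, List<String> records) {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(records.size()); // record count first
            for (String record : records) {
                out.writeUTF(record);     // length-prefixed string
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static List<String> read(Path file) {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int count = in.readInt();
            List<String> records = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                records.add(in.readUTF());
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the records to a temp file, reads them back, then deletes the file.
    static List<String> spillAndReload(List<String> records) {
        try {
            Path file = Files.createTempFile("spill", ".bin");
            try {
                write(file, records);
                return read(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(spillAndReload(Arrays.asList("order-1", "order-2")));
    }
}
```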

Choosing a Serialization Technique

Many serialization options exist. Simple approaches convert an object to a string and then to bytes, which works but is inefficient. Popular built‑in or open‑source solutions include:

Google Protobuf, Kryo, Hessian

Text‑based formats such as JSON and XML

Custom private implementations

Selection criteria typically consider:

Readability of the serialized data

Implementation complexity

Serialization / deserialization speed

Information density (smaller byte size)

No single format excels in all dimensions; trade‑offs must be balanced based on business needs.

Readability vs. Density

JSON / XML: highest readability, lowest density.

Kryo / Hessian: binary, good performance, moderate density.

Practical Recommendation

For most business systems (e‑commerce, social apps) where performance requirements are moderate, JSON is recommended because it is easy to use and human‑readable, despite higher CPU and storage costs.

Example: Serializing a User object with JSON.

User:
  name: "zhangsan"
  age: 23
  married: true

Resulting JSON string: {"name":"zhangsan","age":"23","married":"true"}

Code to serialize in Java (using a JSON library):

// Assumes the Jackson library (com.fasterxml.jackson.databind.ObjectMapper); writeValueAsBytes emits UTF-8
byte[] serializedUser = new ObjectMapper().writeValueAsBytes(user);

If JSON performance is insufficient, binary serializers such as Kryo can be used with similar implementation effort but better speed and smaller payloads.

Example: Kryo serialization of the same User object.

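// Requires com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Output and java.io.FileOutputStream
Kryo kryo = new Kryo();
kryo.register(User.class); // register the class with Kryo before writing
try (Output output = new Output(new FileOutputStream("file.bin"))) {
    kryo.writeObject(output, user); // closing the Output flushes buffered bytes to the file
}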

Performance‑Focused Custom Serialization

Message‑queue (MQ) systems often require higher throughput than generic serializers provide, prompting custom binary formats. By fixing field order and omitting field names, payload size can be dramatically reduced.

Custom binary representation of the User object (illustrative):

03   | 08 7a 68 61 6e 67 73 61 6e | 17 | 01
User | z  h  a  n  g  s  a  n     | 23 | true

Explanation:

First byte 03 identifies the object type (User).

Next byte 08 stores the length of the name, followed by the 8‑byte name "zhangsan".

Age is stored as a single byte 17 (hex for 23).

Marital status uses one byte: 01 for married, 00 for single.

This custom format uses 12 bytes versus 47 bytes for the JSON representation, roughly a quarter of the size. The smaller payload transmits faster, at the cost of readability and higher implementation complexity.
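The layout described above can be sketched with java.nio.ByteBuffer. This is one possible encoding under stated assumptions, not the format of any particular MQ product: the UserCodec and User classes are hypothetical, the name is assumed to be at most 255 bytes of UTF-8, and the age is assumed to fit in one unsigned byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical codec for the fixed-order layout: type | name length | name | age | married.
final class UserCodec {
    static final byte TYPE_USER = 0x03; // type tag taken from the layout above

    // Hypothetical value class, used only for illustration.
    static final class User {
        final String name;
        final int age;
        final boolean married;

        User(String name, int age, boolean married) {
            this.name = name;
            this.age = age;
            this.married = married;
        }
    }

    static byte[] encode(User user) {
        byte[] name = user.name.getBytes(StandardCharsets.UTF_8);
        if (name.length > 255 || user.age < 0 || user.age > 255) {
            throw new IllegalArgumentException("field does not fit in one byte");
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + 1 + name.length + 1 + 1);
        buf.put(TYPE_USER);                       // object type
        buf.put((byte) name.length);              // name length
        buf.put(name);                            // name bytes, no field name written
        buf.put((byte) user.age);                 // age as one unsigned byte
        buf.put((byte) (user.married ? 1 : 0));   // 01 = married, 00 = single
        return buf.array();
    }

    static User decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.get() != TYPE_USER) {
            throw new IllegalArgumentException("not a User record");
        }
        byte[] name = new byte[buf.get() & 0xFF]; // mask to read the length as unsigned
        buf.get(name);
        int age = buf.get() & 0xFF;
        boolean married = buf.get() != 0;
        return new User(new String(name, StandardCharsets.UTF_8), age, married);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] binary = encode(new User("zhangsan", 23, true));
        String json = "{\"name\":\"zhangsan\",\"age\":\"23\",\"married\":\"true\"}";
        System.out.println(toHex(binary));
        System.out.println(binary.length + " bytes vs "
                + json.getBytes(StandardCharsets.UTF_8).length + " bytes as JSON");
    }
}
```

The decoder only works because both sides agree on the field order in advance; adding or reordering a field breaks compatibility unless a version marker is introduced, which is part of the implementation cost mentioned above.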

Summary

Inter‑process communication requires converting structured objects to binary data via serialization. When selecting a serializer, balance readability, implementation effort, speed, and payload size. In most cases, a high‑performance generic binary serializer (e.g., Kryo) or JSON suffices; custom binary formats should be reserved for scenarios with extreme performance or bandwidth constraints.

Interview Quick‑Q&A

Why not transmit raw in‑memory binary data directly?

In‑memory representations are language‑specific (e.g., Java vs. PHP) and contain pointers and layout details that other languages cannot interpret. Serialization defines a language‑agnostic protocol, enabling cross‑language communication and persistent storage.

Key challenges of raw binary transmission:

Network byte order vs. host byte order (endianness) must be handled.

Platform differences: primitive sizes, struct alignment and padding, and CPU byte order vary across architectures and compilers, which affects compatibility.

Pointer and reference handling: objects may reference other objects via memory addresses, which are meaningless on another machine.

Addressing these issues essentially leads to building a custom serialization framework.
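The byte-order problem is easy to reproduce. The sketch below, built on the JDK's ByteBuffer and ByteOrder, writes the same int in both orders and then reads big-endian bytes as if they were little-endian. The EndianDemo class is hypothetical and exists only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo of why sender and receiver must agree on byte order.
final class EndianDemo {
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        byte[] big = encode(value, ByteOrder.BIG_ENDIAN);       // bytes 01 02 03 04
        byte[] little = encode(value, ByteOrder.LITTLE_ENDIAN); // bytes 04 03 02 01

        // Reading with the matching order recovers the value.
        System.out.println(Integer.toHexString(decode(big, ByteOrder.BIG_ENDIAN)));
        System.out.println(Integer.toHexString(decode(little, ByteOrder.LITTLE_ENDIAN)));

        // Reading with the wrong order silently yields a different number.
        System.out.println(Integer.toHexString(decode(big, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

A serialization format removes this ambiguity by fixing one byte order in its specification, so neither side depends on the host CPU.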
