Backend Development 14 min read

How to Compress Large Java int/long Arrays for Massive Memory Savings

This article explains how to reduce memory usage of massive Java int/long arrays by applying real‑time compression, eliminating redundancy, using indexed buckets, offset storage, and a series of low‑level optimizations that boost TPS from dozens to over a thousand while preserving random‑access capabilities.

Alibaba Cloud Developer

Jul 12, 2021

How to Compress Large Java int/long Arrays for Massive Memory Savings

Data Compression

When developing backend applications or middleware, large amounts of data are often kept in memory to speed up access. As the data volume grows, storing it in the heap becomes costly, so real‑time compression of integer arrays is introduced.

1. Removing Redundancy

Instead of always using the full 4‑byte or 8‑byte representation of int or long, we can drop unused high‑order bytes dynamically. By borrowing a few bits from the original value to indicate the number of bytes used, we can store most numbers in fewer bytes.

We represent the byte count using either extra bits of the original number or the highest bit of each byte, similar to Facebook’s Thrift compression. The goal is to compress long[] or int[] into a byte[] and later reconstruct the original numbers.

2. Offset Storage

After compression, we still need random‑access by index. Because compressed numbers have variable length, we cannot compute an address directly. We solve this by building an index:

Divide the data into adjustable buckets (e.g., 1 KB or 4 KB each).

Maintain a secondary array that stores the starting index of the first element in each bucket.

During lookup, first binary‑search the bucket, then linearly scan inside the bucket.

2. Access Optimizations

1. Variable‑Length Buckets

Switching from fixed‑size to variable‑length buckets allows us to compute the bucket index in O(1) using division and remainder, replacing the previous O(log n) binary search. This change raised batch‑interface TPS from ~72 to ~320.

2. Dedicated Iterator

For sequential batch reads we create a custom iterator that records the current bucket position and advances by the known length of each element, achieving O(1) per step and raising TPS to ~680.

3. Reduce Intermediate Data – Use Stack Directly

Instead of creating temporary ByteBuffer objects, we reference the original bucket data directly and pass it via the stack to the decompression routine, cutting overhead and improving performance by ~45% (TPS ≈ 910).

4. Move Heap Data to Stack

We eliminate heap‑allocated temporary arrays by operating on primitive long values directly, using bit‑shifts and masks to assemble the value. This removes the need for byte[] num and further boosts TPS by ~35% (≈ 1250).

longData = (longData & ~ (0xffL << (2 * 8))) | (0x02L << (2 * 8));

5. Inline Short Functions

Small helper methods such as updateOffset() or Byte.toUnsignedInt() are inlined, removing call overhead and contributing to the final performance gains.

3. Performance Results

After all optimizations, the batch interface TPS increased from 72 to about 1380, roughly a 20‑fold improvement. Single‑call random access reaches ~260 M TPS, and sequential single‑call access reaches ~6500 M TPS, comfortably meeting production demands.

4. Optimization Summary

Primitive data structures are far faster than object‑based ones; low‑level access wins.

Heap allocations are slow and trigger frequent Young GC; prefer stack allocations.

Method calls add stack‑frame overhead; inlining simple logic yields measurable gains.

Minimize intermediate data and temporary variables.

Even tiny inefficiencies explode under massive call volumes; batch paths must be meticulously optimized.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

performance Memory optimization Array data compression

Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.