Boost Recommendation Engine Performance with Off‑Heap Cache (OHC) in Java

This article explains the principles and practical implementation of OHC, a Java off‑heap cache framework: its architecture, memory allocation, serialization, and configuration. It then presents real‑world performance results from the MaFengWo recommendation engine, where OHC reduced latency and improved cache hit rates.

Mafengwo Technology

Part 1: Introduction to OHC

In recommendation systems, the engine performs recall and ranking stages that require massive data reads; fast data access is crucial for performance. Caching is widely used in enterprise web systems to reduce network latency by storing frequently accessed database results locally.

OHC (off‑heap cache) is a Java‑based key‑value cache library that runs in a single‑process, off‑heap mode. Originally developed for Apache Cassandra in 2015, it is now an independent library (https://github.com/snazy/ohc).

1. Heap vs. Off‑Heap

Java heap memory is managed by the JVM garbage collector (GC), which can pause application threads during collection. Heap‑based caches (e.g., HashMap) increase GC overhead when large. Off‑heap memory is allocated and freed by the application itself, avoiding GC impact and benefiting large caches (multi‑gigabyte scale).
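Plain Java can exercise off‑heap memory without any third‑party library via a direct ByteBuffer; OHC itself goes lower‑level through JNA's Native.malloc or Unsafe.allocateMemory, but the GC‑avoidance principle is the same. A minimal sketch:

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    /** Writes and reads back a long in off-heap (direct) memory. */
    static long roundTrip(long value) {
        // allocateDirect reserves native memory outside the GC-managed heap;
        // only the small ByteBuffer wrapper object lives on-heap.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);
        offHeap.putLong(0, value);   // absolute write at offset 0
        return offHeap.getLong(0);   // absolute read back
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L)); // prints 42
    }
}
```

The native memory backing the buffer is released only when the wrapper object itself is collected, which is why large caches prefer explicit allocators such as the ones OHC provides.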

2. OHC Features

Data stored off‑heap, does not affect GC.

Per‑entry expiration support.

Configurable eviction policies (LRU, W‑TinyLFU).

Can hold millions of entries.

Asynchronous loading.

Read/write latency in microseconds.

These characteristics make OHC suitable for the high‑throughput, low‑latency needs of a recommendation engine.

3. Usage Example

Typical steps to use OHC in a Java project:

Add OHC dependency to the Maven POM.

Implement org.caffinitas.ohc.CacheSerializer to serialize/deserialize objects.

Pass the serializer to the OHCache constructor.

Use get and put methods for cache operations.

A demo project is available at https://github.com/chebacca/ohc-example.
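The serializer contract can be sketched without the OHC dependency. The local Serializer interface below mirrors the three methods of org.caffinitas.ohc.CacheSerializer (serializedSize, serialize, deserialize over a ByteBuffer); in a real project you would implement the OHC interface directly. Class and field names here are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StringSerializerDemo {
    /** Local mirror of org.caffinitas.ohc.CacheSerializer's three methods. */
    interface Serializer<T> {
        int serializedSize(T t);
        void serialize(T t, ByteBuffer buf);
        T deserialize(ByteBuffer buf);
    }

    /** UTF-8 string serializer: 4-byte length prefix followed by the payload. */
    static final Serializer<String> STRING = new Serializer<String>() {
        public int serializedSize(String s) {
            // Must agree exactly with what serialize() writes,
            // otherwise off-heap memory is wasted or overrun.
            return 4 + s.getBytes(StandardCharsets.UTF_8).length;
        }
        public void serialize(String s, ByteBuffer buf) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            buf.putInt(bytes.length);
            buf.put(bytes);
        }
        public String deserialize(ByteBuffer buf) {
            byte[] bytes = new byte[buf.getInt()];
            buf.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    };

    /** Serialize then deserialize, as OHC does on put/get. */
    static String roundTrip(String s) {
        ByteBuffer buf = ByteBuffer.allocate(STRING.serializedSize(s));
        STRING.serialize(s, buf);
        buf.flip();
        return STRING.deserialize(buf);
    }
}
```

In an actual OHC project the serializers are passed to the builder, along the lines of OHCacheBuilder.newBuilder().keySerializer(...).valueSerializer(...).build().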

Part 2: OHC Implementation

1. Overall Architecture

OHC exposes the org.caffinitas.ohc.OHCache interface. Two implementations exist:

org.caffinitas.ohc.chunked.OHCacheChunkedImpl
org.caffinitas.ohc.linked.OHCacheLinkedImpl

The linked implementation stores each key‑value pair in a separate off‑heap block, suitable for medium‑to‑large entries, and is the one used in production.

2. OHCacheLinkedImpl Details

Key components:

Segment array: OffHeapLinkedMap[]

Serializer/deserializer: CacheSerializer

Operations:

Compute key hash and locate the segment.

Retrieve the off‑heap pointer for the entry.

For get, read the byte array from off‑heap and deserialize.

For put, serialize the object to a byte array and write it to the allocated off‑heap memory.
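The routing in the steps above (hash → segment → bucket) can be sketched as follows; SEGMENTS, BUCKETS_PER_SEGMENT, and the bit‑splitting scheme are illustrative assumptions, not OHC's exact internals:

```java
public class SegmentRouting {
    // Hypothetical power-of-two sizes; OHC derives these from its configuration.
    static final int SEGMENTS = 16;
    static final int BUCKETS_PER_SEGMENT = 1024;

    /** Pick a segment from the high bits of the 64-bit key hash. */
    static int segmentIndex(long hash) {
        return (int) ((hash >>> 32) & (SEGMENTS - 1));
    }

    /** Pick a bucket within that segment from the low bits. */
    static int bucketIndex(long hash) {
        return (int) (hash & (BUCKETS_PER_SEGMENT - 1));
    }
}
```

Using different bit ranges for segment and bucket selection avoids all keys of one segment collapsing into the same bucket.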

3. Segment Implementation (OffHeapLinkedMap)

Each segment contains multiple buckets; each bucket is a linked list of off‑heap pointers. Lookup proceeds by hashing to a bucket, then linearly scanning the list.

An example layout (original figure not reproduced here) shows two buckets holding four key‑value pairs and their off‑heap addresses.
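The bucket scan can be modeled on‑heap for illustration. In OHC each entry would be a block of native memory reached through a 64‑bit address, but the search logic is the same: compare the cheap 64‑bit hash first and the full key only on a hash match. Entry here is a hypothetical stand‑in:

```java
public class BucketScanDemo {
    /** Toy on-heap stand-in for an off-heap entry block. */
    static final class Entry {
        final long hash;
        final String key;
        final String value;
        final Entry next;   // next node in the bucket's chain
        Entry(long hash, String key, String value, Entry next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    /** Linear scan of one bucket's chain: check the hash first (cheap),
     *  and compare the full key only when the hashes match. */
    static String lookup(Entry head, long hash, String key) {
        for (Entry e = head; e != null; e = e.next)
            if (e.hash == hash && e.key.equals(key))
                return e.value;
        return null;   // cache miss
    }
}
```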

4. Space Allocation

OHC provides two allocators: JNANativeAllocator (uses Native.malloc) and UnsafeAllocator (uses Unsafe.allocateMemory). Each entry occupies:

off‑heap size = aligned key size + value size + 64 bytes metadata
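Assuming 8‑byte alignment of the key and the flat 64‑byte header stated above (the exact overhead varies by OHC version), the formula translates to:

```java
public class EntrySize {
    // Per-entry metadata overhead as stated in the article;
    // the exact figure depends on the OHC version.
    static final int METADATA_BYTES = 64;

    /** Round n up to the next multiple of 8 (assumed alignment). */
    static long align8(long n) {
        return (n + 7) & ~7L;
    }

    /** Approximate off-heap footprint of one cache entry. */
    static long entrySize(long keySize, long valueSize) {
        return align8(keySize) + valueSize + METADATA_BYTES;
    }
}
```

For example, a 10‑byte key with a 100‑byte value would occupy roughly 16 + 100 + 64 = 180 bytes off‑heap.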

Part 3: OHC in MaFengWo Recommendation Engine

1. Engine Workflow

The engine performs recall, ranking, and re‑ranking, each requiring thousands of items and hundreds of features per item. Local caching of these features dramatically reduces network latency.

2. Data Types Stored in OHC

Offline features (e.g., daily click‑through rates) are updated hourly or daily and are ideal for OHC caching, avoiding repeated Redis reads. Real‑time features are kept in Redis with short TTLs to maintain freshness. Small hot data is cached in Guava (heap).

3. Serialization Choice

Keys are String; values are Object. Keys are serialized to UTF‑8 bytes; values use Kryo (wrapped in ThreadLocal because Kryo is not thread‑safe). Consistency between CacheSerializer#serializedSize and #serialize is essential; mismatched size estimates can waste off‑heap memory.
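The ThreadLocal wrapping looks like the sketch below. NotThreadSafeCodec is a hypothetical stand‑in so the example is self‑contained; in production the same pattern is applied verbatim as ThreadLocal&lt;Kryo&gt; with ThreadLocal.withInitial(Kryo::new):

```java
import java.nio.charset.StandardCharsets;

public class ThreadLocalSerializer {
    /** Stand-in for a non-thread-safe serializer such as
     *  com.esotericsoftware.kryo.Kryo; it must never be shared
     *  across threads without synchronization. */
    static final class NotThreadSafeCodec {
        byte[] encode(String s) {
            return s.getBytes(StandardCharsets.UTF_8);
        }
    }

    // One codec instance per thread, created lazily on first use.
    static final ThreadLocal<NotThreadSafeCodec> CODEC =
            ThreadLocal.withInitial(NotThreadSafeCodec::new);

    static byte[] serialize(String s) {
        return CODEC.get().encode(s);  // each thread uses its own instance
    }
}
```

This trades a small amount of memory (one instance per thread) for lock‑free serialization on the hot path.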

4. Production Configuration

Key tuning parameters:

Total capacity: grew from ~4 GB to ~10 GB to cover hot data.

Segment count: balanced to reduce lock contention while limiting heap metadata overhead.

Hash algorithm: CRC32C chosen for low CPU usage.

Eviction policy: LRU, given stable workload and low churn.
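The hash choice is easy to exercise from the JDK directly: java.util.zip.CRC32C has been available since Java 9 and is hardware‑accelerated on CPUs with the SSE4.2 crc32 instruction. A quick sketch of hashing a cache key:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

public class Crc32cHash {
    /** Hashes a cache key with CRC32C (hardware-accelerated on most CPUs). */
    static long hash(String key) {
        CRC32C crc = new CRC32C();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();   // 32-bit checksum widened to long
    }
}
```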

5. Online Performance

With a 10 GB off‑heap cache, the engine stores millions of entries and achieves a hit rate above 95%. Average get latency is ~20 µs and put latency ~100 µs. Entry size limits are enforced via org.caffinitas.ohc.maxEntrySize to reject oversized objects.

6. Optimizations in Practice

(1) Asynchronous expiration removal: expired entries are queued and cleaned by a background thread instead of blocking the read path.

(2) Lock refinement: switched from TAS (test‑and‑set) to TTAS (test‑test‑and‑set) locks, reducing contention and improving throughput.
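The difference between the two lock styles can be sketched with an AtomicBoolean; this is an illustrative minimal spin lock, not OHC's actual lock code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class TTASLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    /** TTAS: spin on a cheap read first, and only attempt the expensive
     *  atomic compareAndSet once the lock looks free. A plain TAS lock
     *  would call compareAndSet on every iteration, generating far more
     *  cache-coherence traffic under contention. */
    public void lock() {
        while (true) {
            while (locked.get()) {        // "test": plain read, cache-friendly
                Thread.onSpinWait();
            }
            if (locked.compareAndSet(false, true)) {  // "test-and-set"
                return;
            }
        }
    }

    public void unlock() {
        locked.set(false);
    }
}
```

The inner read loop lets contending threads spin on their local cache line; the atomic write (which invalidates that line on every core) happens only when acquisition is likely to succeed.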

Conclusion

The article presented OHC’s design, off‑heap memory management, and its successful deployment in MaFengWo’s recommendation engine. OHC offers low latency, GC‑independent caching with configurable eviction and expiration, making it well‑suited for storing large offline feature sets while preserving real‑time data freshness through complementary Redis and Guava caches.
