How We Built a Resilient Local Cache for a High‑Performance Recommendation System

When the recommendation service hits database disconnections, third‑party timeouts, or network jitter, cached results can stand in for a failed pipeline. We built an off‑heap local disaster‑recovery cache with OHC and SpringBoot that isolates cache logic from the online flow, writes asynchronously, backs up to disk, and restores availability on failure, keeping average latency around 10 ms and preserving the user experience.


Background

The recommendation system in the Mafengwo app must continuously serve users personalized content. The pipeline consists of user feature extraction, recall via various machine‑learning algorithms, and ranking, after which results are returned to the front end. Along the way it queries MySQL and Redis, calls REST services, and processes data, all under strict latency requirements (average around 10 ms, 99th percentile under 1 s).

Design and Implementation

Design ideas and technology selection

To handle cases where external or internal services time out or fail, we introduced a disaster‑recovery cache that returns previously cached data to the front end, reducing empty results and preserving user interest.

We chose an off‑heap cache based on OHC combined with SpringBoot for the following reasons:

1. Isolate cache logic from online services – CacheService is encapsulated and placed at the end of the existing flow, exposing read/write APIs while keeping business logic separate.

2. Asynchronous writes to improve performance – Write tasks are submitted to a ThreadPoolExecutor, allowing the main thread to continue without blocking.

3. Local cache for fast access – Recommendations do not require strong consistency, so a local (non‑distributed) cache suffices; we use the open‑source OHC library.

4. Backup cache instances for availability – In addition to in‑memory storage, cached data is periodically persisted to the file system using SpringBoot scheduled tasks and ApplicationRunner.

Overall Architecture

The existing recommendation logic remains unchanged. At the end of the flow we added a CacheModule and CacheService that handle all cache‑related operations.

Module Details

1. CacheModule

After the original recommendation processing, CacheModule examines the response. If no exception occurred and the result is non‑empty, it submits a cache‑write task to CacheService. If an exception occurs, it attempts to read cached data based on the business‑scenario key and returns it to the caller.

Submit cache task – Non‑blocking submission of a key/value pair (business scenario → processed content) to CacheService.

Read cache data – When the application or a dependent service throws an exception, CacheModule retrieves cached data using the same key and returns it, unless the user has already exhausted all available content.
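To make the flow concrete, here is a minimal sketch of how a CacheModule like this can wrap the existing pipeline; the class and method names are illustrative, not our production code:

```java
import java.util.List;

// Illustrative sketch: wraps the original recommendation call with
// cache write-on-success and read-on-failure. Names are hypothetical.
public class CacheModule {

    private final CacheService cacheService;
    private final Recommender recommender; // the existing recommendation flow

    public CacheModule(CacheService cacheService, Recommender recommender) {
        this.cacheService = cacheService;
        this.recommender = recommender;
    }

    public List<String> recommend(String sceneKey) {
        try {
            List<String> screen = recommender.process(sceneKey);
            if (screen != null && !screen.isEmpty()) {
                // non-blocking: submits a write task to CacheService
                cacheService.submitWrite(sceneKey, screen);
            }
            return screen;
        } catch (Exception e) {
            // fall back to cached content keyed by the business scenario
            return cacheService.read(sceneKey);
        }
    }

    // Stand-ins for the real collaborators described above.
    interface Recommender { List<String> process(String sceneKey); }
    interface CacheService {
        void submitWrite(String key, List<String> screen);
        List<String> read(String key);
    }
}
```

The key point is that the cache sits entirely after the original flow: a success feeds the cache, a failure reads from it, and neither path blocks the other.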

2. CacheService

CacheService implements the cache using OHC (an off‑heap cache developed for Apache Cassandra) together with SpringBoot utilities.

(1) Data format – Recommendations are returned per “screen” (a set of content items). The cache stores a key‑set mapping where the key represents a business scenario (e.g., homepage video channel) and the value is the collection of screens.

(2) Storage location – For Java applications, cache can reside in heap, off‑heap, or on disk. We compared these options and selected off‑heap to achieve fast read/write while avoiding GC impact on the online service.
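As a rough sketch, an off‑heap OHC instance can be built as follows; the capacity numbers are placeholders rather than our production settings, and the String value type stands in for the serialized screen sets described above:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.caffinitas.ohc.CacheSerializer;
import org.caffinitas.ohc.OHCache;
import org.caffinitas.ohc.OHCacheBuilder;

public class OffHeapCacheFactory {

    // Serializer for String keys; real values would use a Kryo-backed
    // CacheSerializer, as discussed in the Pitfalls section below.
    static final CacheSerializer<String> STRING_SERIALIZER = new CacheSerializer<String>() {
        public void serialize(String s, ByteBuffer buf) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            buf.putInt(bytes.length);
            buf.put(bytes);
        }
        public String deserialize(ByteBuffer buf) {
            byte[] bytes = new byte[buf.getInt()];
            buf.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
        public int serializedSize(String s) {
            return 4 + s.getBytes(StandardCharsets.UTF_8).length;
        }
    };

    public static OHCache<String, String> build() {
        return OHCacheBuilder.<String, String>newBuilder()
                .keySerializer(STRING_SERIALIZER)
                .valueSerializer(STRING_SERIALIZER) // placeholder value type
                .capacity(64L * 1024 * 1024)        // placeholder: total off-heap bytes
                .maxEntrySize(1024 * 1024)          // placeholder: largest single entry
                .build();
    }
}
```

The capacity and maxEntrySize settings here are the same knobs discussed in the Pitfalls section below.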

(3) File backup – After an application restart, the off‑heap cache is empty. We use SpringBoot scheduled tasks to periodically back up the cache to the file system, and an ApplicationRunner to load the backup back into off‑heap memory at startup.
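A hedged sketch of that backup/restore pairing, assuming a hypothetical CacheSnapshotter helper, a placeholder file path, and a placeholder interval (scheduling also requires @EnableScheduling on a configuration class):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Illustrative: periodically snapshot the off-heap cache to disk and
// reload it on startup so a restart does not begin with an empty cache.
@Component
public class CacheBackup implements ApplicationRunner {

    private static final Path BACKUP = Paths.get("/data/cache/backup.bin"); // assumed path

    // Hypothetical helper that serializes/deserializes cache contents.
    private final CacheSnapshotter snapshotter;

    public CacheBackup(CacheSnapshotter snapshotter) {
        this.snapshotter = snapshotter;
    }

    // Back up every 10 minutes (interval is a placeholder).
    @Scheduled(fixedRate = 10 * 60 * 1000)
    public void backup() throws IOException {
        Files.write(BACKUP, snapshotter.dump());
    }

    // Runs once after the application context is ready: restore the snapshot.
    @Override
    public void run(ApplicationArguments args) throws IOException {
        if (Files.exists(BACKUP)) {
            snapshotter.load(Files.readAllBytes(BACKUP));
        }
    }

    public interface CacheSnapshotter {
        byte[] dump();
        void load(byte[] bytes);
    }
}
```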

(4) API for CacheModule

Read: provide a key, and CacheService returns a random item from the associated set.

Write: package key and value into a task and submit it to the asynchronous task queue.
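The read path boils down to picking one cached screen at random; a minimal sketch (the list‑backed value type is an assumption):

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative: pick one cached "screen" at random for a business-scenario key.
public class RandomReader {
    public static <T> T readRandom(List<T> screens) {
        if (screens == null || screens.isEmpty()) {
            return null; // nothing cached for this key
        }
        return screens.get(ThreadLocalRandom.current().nextInt(screens.size()));
    }
}
```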

(5) Task queue and asynchronous write – Implemented with a JDK ThreadPoolExecutor using a LinkedBlockingQueue. The pool size is fixed at 1 (QPS < 100). When the queue is full, DiscardPolicy drops new tasks.
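In code, that executor amounts to the following; the queue capacity is a placeholder, while the single worker thread and DiscardPolicy follow the description above:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WriteExecutor {
    // One worker thread is enough at QPS < 100; when the bounded queue
    // fills up, DiscardPolicy silently drops new write tasks rather than
    // blocking the recommendation flow.
    static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(
            1, 1,                               // fixed pool of one thread
            0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(1024),    // placeholder capacity
            new ThreadPoolExecutor.DiscardPolicy());
}
```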

(6) Cache size control – To prevent memory overuse, each business scenario can be configured with a maximum cache entry count. When the limit is reached, new entries replace existing ones via random sampling.
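A sketch of the random‑replacement policy, assuming a list‑backed per‑scenario store (the real entries live in OHC; this only illustrates the policy):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative: bounded per-scenario store; once full, a new entry
// overwrites a randomly chosen existing one instead of growing the set.
public class BoundedRandomStore<T> {
    private final List<T> entries;
    private final int maxEntries;

    public BoundedRandomStore(int maxEntries) {
        this.maxEntries = maxEntries;
        this.entries = new ArrayList<>(maxEntries);
    }

    public synchronized void add(T entry) {
        if (entries.size() < maxEntries) {
            entries.add(entry);
        } else {
            entries.set(ThreadLocalRandom.current().nextInt(maxEntries), entry);
        }
    }
}
```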

Online Performance

We instrumented cache hits and observed hourly hit counts via Kibana. During a period of increased timeouts (18:00‑19:00), the cache mitigated impact and improved system availability.

Write latency is at the millisecond level (asynchronous), and read latency is at the microsecond level, adding negligible overhead.

Pitfalls

Before writing to OHC we serialize objects with Kryo. Classes that Kryo cannot serialize out of the box (e.g., the JDK‑internal classes returned by List#subList) can cause deserialization failures; registering the custom serializers from the kryo‑serializers repository resolves this. Additionally, improper configuration of OHC's capacity and maxEntrySize can lead to write failures, so cache size should be estimated and set appropriately before deployment.
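A hedged sketch of the Kryo setup: ArraysAsListSerializer and UnmodifiableCollectionsSerializer are documented in the kryo‑serializers repository, and SubListSerializers.addDefaultSerializers reflects our reading of its README, so verify the calls against the version you use:

```java
import java.util.Arrays;

import com.esotericsoftware.kryo.Kryo;

import de.javakaffee.kryoserializers.ArraysAsListSerializer;
import de.javakaffee.kryoserializers.SubListSerializers;
import de.javakaffee.kryoserializers.UnmodifiableCollectionsSerializer;

public class KryoFactory {
    public static Kryo create() {
        Kryo kryo = new Kryo();
        // Arrays.asList(...) returns a private class with no no-arg constructor.
        kryo.register(Arrays.asList("").getClass(), new ArraysAsListSerializer());
        // Collections.unmodifiable* wrappers also need custom handling.
        UnmodifiableCollectionsSerializer.registerSerializers(kryo);
        // List#subList returns JDK-internal classes; this registers serializers
        // for them (call name per our reading of the kryo-serializers docs).
        SubListSerializers.addDefaultSerializers(kryo);
        return kryo;
    }
}
```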

Optimization Directions

While the current cache works, we plan to improve it in three areas:

Replace random overwriting when the cache is full with an LRU‑style eviction of the oldest entries.
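For this first direction, an access‑ordered LinkedHashMap is the classic compact way to get LRU eviction; a sketch with a placeholder capacity (the real implementation would need to work against the off‑heap store rather than a heap map):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU replacement for the current random overwrite:
// an access-ordered LinkedHashMap evicts the least recently used
// entry once the configured limit is exceeded.
public class LruStore<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruStore(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true enables LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```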

Increase cache granularity by using destination‑specific keys instead of a single key for all destination pages.

Migrate some MySQL‑dependent configuration data to local file‑based caches.

References:

Java Caching Benchmarks 2016 – Part 1

On Heap vs Off Heap Memory Usage

OHC – An off‑heap cache

Kryo‑serializers

SpringBoot scheduling tasks guide
