
Alluxio Tiered Metadata Management and Asynchronous Cache Eviction Implementation

The article explains Alluxio's tiered metadata management architecture, describing how the system separates hot and cold metadata into cached and persisted layers, and details the custom asynchronous eviction thread and cache implementation that replace Guava cache for efficient large‑scale metadata handling.

Big Data Technology & Architecture

Alluxio (formerly Tachyon) is a memory‑centric distributed storage system that provides a unified data access layer between compute frameworks and underlying storage. It originated from UC Berkeley AMPLab and is widely adopted in big‑data environments.

The article focuses on Alluxio's tiered metadata management, which splits metadata into two layers: a cached layer for recently accessed hot metadata kept in memory, and a persisted layer (backed by RocksDB) for cold metadata that is rarely accessed.

Alluxio implements this design with two stores: a cache store for active metadata and a backing store (RocksDB) for persisted metadata. Because the cache holds only active entries, the total amount of metadata is no longer bounded by the master's memory.
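
To make the read path concrete, here is a minimal sketch of a two-tier lookup. It is not Alluxio code: the class name TieredStoreSketch, its methods, and the backingReads counter are hypothetical, and a plain HashMap stands in for RocksDB. It only illustrates the idea that a read is served from memory when possible and falls through to the persisted layer otherwise.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical two-tier lookup: a small hot map in front of a slower persisted map. */
class TieredStoreSketch {
  // Stand-in for the persisted layer (RocksDB in the article); a plain map here.
  private final Map<Long, String> backingStore = new HashMap<>();
  // Holds only entries that were read recently.
  private final Map<Long, String> cache = new HashMap<>();
  // Counts how often a read had to fall through to the backing store.
  int backingReads = 0;

  void persist(long id, String name) {
    backingStore.put(id, name);
  }

  Optional<String> get(long id) {
    String hot = cache.get(id);
    if (hot != null) {
      return Optional.of(hot);
    }
    backingReads++;
    String cold = backingStore.get(id);
    if (cold != null) {
      cache.put(id, cold); // promote to the cached layer
    }
    return Optional.ofNullable(cold);
  }

  public static void main(String[] args) {
    TieredStoreSketch store = new TieredStoreSketch();
    store.persist(42L, "/data/part-0");
    store.get(42L); // miss: served from the backing store, then cached
    store.get(42L); // hit: served from memory
    System.out.println("backing reads: " + store.backingReads);
  }
}
```

The sketch never evicts, so its cache grows without bound. Bounding it is the job of the eviction machinery described in the rest of the article.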

The core of the cache is the CachingInodeStore class. It creates three specialized caches (InodeCache, EdgeCache, and ListingCache) from a CacheConfiguration that defines the maximum size, the high-water and low-water marks, and the eviction batch size.

public final class CachingInodeStore implements InodeStore, Closeable {
  private final InodeStore mBackingStore;
  private final InodeLockManager mLockManager;
  final InodeCache mInodeCache;
  final EdgeCache mEdgeCache;
  final ListingCache mListingCache;
  private volatile boolean mBackingStoreEmpty;
  ...
}
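
The four sizing parameters relate to each other in a simple way. The sketch below is an assumption-laden illustration, not the actual CacheConfiguration: the class name CacheSizing, the percentage-based constructor, and the strict "greater than" comparison are all hypothetical choices made to match the wording of this article.

```java
/** Hypothetical holder for the four sizing parameters the article mentions. */
final class CacheSizing {
  final int maxSize;
  final int highWaterMark;
  final int lowWaterMark;
  final int evictBatchSize;

  // Assumption: the water marks are given as percentages of maxSize.
  CacheSizing(int maxSize, int highPercent, int lowPercent, int evictBatchSize) {
    if (maxSize <= 0 || evictBatchSize <= 0) {
      throw new IllegalArgumentException("sizes must be positive");
    }
    if (lowPercent < 0 || lowPercent > highPercent || highPercent > 100) {
      throw new IllegalArgumentException("need 0 <= low <= high <= 100");
    }
    this.maxSize = maxSize;
    // Integer arithmetic in long avoids floating-point rounding and int overflow.
    this.highWaterMark = (int) ((long) maxSize * highPercent / 100);
    this.lowWaterMark = (int) ((long) maxSize * lowPercent / 100);
    this.evictBatchSize = evictBatchSize;
  }

  /** Eviction starts once the entry count surpasses the high-water mark. */
  boolean overHighWaterMark(int currentSize) {
    return currentSize > highWaterMark;
  }

  /** Number of entries to remove so the cache drops back to the low-water mark. */
  int toEvict(int currentSize) {
    return Math.max(0, currentSize - lowWaterMark);
  }

  public static void main(String[] args) {
    CacheSizing sizing = new CacheSizing(1000, 85, 80, 50);
    System.out.println("high=" + sizing.highWaterMark + " low=" + sizing.lowWaterMark);
  }
}
```

Keeping a gap between the two marks means one eviction run frees a whole band of entries, so the eviction thread does not have to wake up for every single insert.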

The Cache<K,V> abstract class underlies these caches. It uses a ConcurrentHashMap to store entries and an EvictionThread to asynchronously write out stale entries when the cache exceeds its high‑water mark.

public abstract class Cache<K, V> implements Closeable {
  private final int mMaxSize;
  private final int mHighWaterMark;
  private final int mLowWaterMark;
  private final int mEvictBatchSize;
  final ConcurrentHashMap<K, Entry> mMap;
  final EvictionThread mEvictionThread;
  ...
}

The eviction thread does not poll. It sleeps on its own monitor until a writer notifies it that the number of entries has surpassed the high‑water mark. Once awake, it logs a warning if the cache is completely full and then evicts entries until the size is back at the low‑water mark.

class EvictionThread extends Thread {
  @Override
  public void run() {
    while (!Thread.interrupted()) {
      // Sleep until the cache is over the high-water mark.
      while (!overHighWaterMark()) {
        synchronized (mEvictionThread) {
          // Re-check while holding the monitor before going to sleep.
          if (!overHighWaterMark()) {
            try {
              mIsSleeping = true;
              mEvictionThread.wait();
              mIsSleeping = false;
            } catch (InterruptedException e) {
              return;
            }
          }
        }
      }
      if (cacheIsFull()) { /* log warning */ }
      evictToLowWaterMark();
    }
  }
}

private void evictToLowWaterMark() {
  int toEvict = mMap.size() - mLowWaterMark;
  int evictionCount = 0;
  while (evictionCount < toEvict) {
    if (!mEvictionHead.hasNext()) { mEvictionHead = mMap.values().iterator(); }
    fillBatch(toEvict - evictionCount);
    evictionCount += evictBatch();
  }
  if (evictionCount > 0) { LOG.debug("{}: Evicted {} entries", mName, evictionCount); }
}
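
The sleep-and-notify handshake above can be reproduced in isolation. The following is a simplified sketch, not the Alluxio implementation: WatermarkTrimmer and all of its members are hypothetical names, a queue stands in for the cache, and "eviction" simply discards the oldest items. What it does show is the pattern itself: the worker checks the condition under a lock before waiting, and the producer notifies under the same lock, so a wake-up cannot be lost.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical background trimmer that sleeps until a high-water mark is crossed. */
class WatermarkTrimmer {
  private final Object lock = new Object();
  private final ConcurrentLinkedQueue<Integer> items = new ConcurrentLinkedQueue<>();
  private final AtomicInteger size = new AtomicInteger();
  private final int high;
  private final int low;
  private final Thread worker;

  WatermarkTrimmer(int high, int low) {
    this.high = high;
    this.low = low;
    this.worker = new Thread(this::run, "trimmer");
    this.worker.setDaemon(true); // do not keep the JVM alive
    this.worker.start();
  }

  private void run() {
    try {
      while (true) {
        synchronized (lock) {
          // Check and wait under the same lock the producer uses to notify.
          while (size.get() <= high) {
            lock.wait();
          }
        }
        // Trim outside the lock so producers are never blocked by eviction work.
        while (size.get() > low) {
          if (items.poll() != null) {
            size.decrementAndGet();
          }
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // exit quietly on stop()
    }
  }

  void add(int item) {
    items.add(item);
    if (size.incrementAndGet() > high) {
      synchronized (lock) {
        lock.notifyAll();
      }
    }
  }

  int size() {
    return size.get();
  }

  /** Spins until the size is at most target, or gives up after the timeout. */
  boolean awaitSizeAtMost(int target, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (size.get() > target) {
      if (System.nanoTime() > deadline) {
        return false;
      }
      Thread.yield();
    }
    return true;
  }

  void stop() {
    worker.interrupt();
  }

  public static void main(String[] args) {
    WatermarkTrimmer trimmer = new WatermarkTrimmer(8, 4);
    for (int i = 0; i < 10; i++) {
      trimmer.add(i);
    }
    System.out.println("trimmed: " + trimmer.awaitSizeAtMost(5, 5000));
    trimmer.stop();
  }
}
```

Because the trimming happens on the worker thread, the cost of eviction is paid in the background rather than by whichever caller happens to trigger it.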

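During eviction, the cache walks its entries and gives recently referenced ones a second chance: their reference flag is cleared and they are skipped on this pass. Unreferenced entries become eviction candidates. Dirty candidates are flushed to the backing store first, and then every candidate that is clean at that point is removed.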

private void fillBatch(int count) {
  int targetSize = Math.min(count, mEvictBatchSize);
  while (mEvictionCandidates.size() < targetSize && mEvictionHead.hasNext()) {
    Entry candidate = mEvictionHead.next();
    if (candidate.mReferenced) { candidate.mReferenced = false; continue; }
    mEvictionCandidates.add(candidate);
    if (candidate.mDirty) { mDirtyEvictionCandidates.add(candidate); }
  }
}

private int evictBatch() {
  int evicted = 0;
  if (mEvictionCandidates.isEmpty()) return evicted;
  flushEntries(mDirtyEvictionCandidates);
  for (Entry entry : mEvictionCandidates) {
    if (evictIfClean(entry)) { evicted++; }
  }
  mEvictionCandidates.clear();
  mDirtyEvictionCandidates.clear();
  return evicted;
}
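
The reference-flag handling in fillBatch resembles the classic second-chance (clock) policy. The sketch below shows that policy on its own under simplifying assumptions: SecondChanceSketch, Slot, touch, and evict are hypothetical names, every call restarts its sweep from the oldest entry instead of keeping a persistent iterator like mEvictionHead, and dirty tracking and flushing are left out entirely.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical second-chance eviction over an insertion-ordered map. */
class SecondChanceSketch {
  static final class Slot {
    final String value;
    boolean referenced; // set on access, cleared when the sweep passes over it

    Slot(String value) {
      this.value = value;
    }
  }

  private final Map<String, Slot> map = new LinkedHashMap<>();

  void put(String key, String value) {
    map.put(key, new Slot(value));
  }

  /** Marks an entry as recently used so the next sweep spares it once. */
  void touch(String key) {
    Slot slot = map.get(key);
    if (slot != null) {
      slot.referenced = true;
    }
  }

  int size() {
    return map.size();
  }

  /** Evicts up to count entries and returns their keys in eviction order. */
  List<String> evict(int count) {
    List<String> evicted = new ArrayList<>();
    // Two sweeps suffice: the first clears every flag it meets, so the second
    // finds only unreferenced entries.
    for (int sweep = 0; sweep < 2 && evicted.size() < count; sweep++) {
      Iterator<Map.Entry<String, Slot>> it = map.entrySet().iterator();
      while (it.hasNext() && evicted.size() < count) {
        Map.Entry<String, Slot> entry = it.next();
        if (entry.getValue().referenced) {
          entry.getValue().referenced = false; // second chance
          continue;
        }
        evicted.add(entry.getKey());
        it.remove();
      }
    }
    return evicted;
  }

  public static void main(String[] args) {
    SecondChanceSketch cache = new SecondChanceSketch();
    cache.put("a", "1");
    cache.put("b", "2");
    cache.touch("a");
    System.out.println(cache.evict(1));
  }
}
```

A single boolean per entry is much cheaper to maintain than a strict LRU ordering, which is a common reason to choose this kind of approximation for large concurrent caches.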

Each cache entry is represented by an Entry object that tracks its key, value, dirty flag, and a reference flag indicating recent access.

protected class Entry {
  protected K mKey;
  @Nullable protected V mValue;
  protected volatile boolean mDirty = true;
  protected boolean mReferenced = false;
  ...
}

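The put method updates the map atomically through compute. If the key is new and the cache is already full, the value is written straight to the backing store and is not cached. Otherwise the entry is created, or updated in place and marked referenced and dirty. Finally, put wakes the eviction thread if necessary so that cleanup happens promptly in the background, avoiding the latency spikes observed with Guava's lazy cleanup.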

public void put(K key, V value) {
  mMap.compute(key, (k, entry) -> {
    onPut(key, value);
    // New key while the cache is full: bypass the cache and write through.
    if (entry == null && cacheIsFull()) {
      writeToBackingStore(key, value);
      return null;
    }
    // No cached value yet: create a fresh entry.
    if (entry == null || entry.mValue == null) {
      onCacheUpdate(key, value);
      return new Entry(key, value);
    }
    // Existing entry: update in place, mark it recently used and out of sync.
    entry.mValue = value;
    entry.mReferenced = true;
    entry.mDirty = true;
    return entry;
  });
  wakeEvictionThreadIfNecessary();
}
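
The "write through when full" branch relies on a property of ConcurrentHashMap.compute: if the remapping function returns null for an absent key, nothing is inserted. The sketch below isolates that behavior. It is a hypothetical illustration: WriteThroughWhenFullSketch and its members are invented names, values are plain strings rather than Entry objects, and a second ConcurrentHashMap stands in for the backing store.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical bounded cache that writes new keys through once it is full. */
class WriteThroughWhenFullSketch {
  private final int maxSize;
  private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
  // Stand-in for the persisted layer.
  private final ConcurrentHashMap<String, String> backingStore = new ConcurrentHashMap<>();

  WriteThroughWhenFullSketch(int maxSize) {
    this.maxSize = maxSize;
  }

  void put(String key, String value) {
    cache.compute(key, (k, old) -> {
      if (old == null && cache.size() >= maxSize) {
        backingStore.put(k, value);
        return null; // returning null leaves the key absent from the cache
      }
      return value; // insert a new key or overwrite an existing one
    });
  }

  String cached(String key) {
    return cache.get(key);
  }

  String persisted(String key) {
    return backingStore.get(key);
  }

  public static void main(String[] args) {
    WriteThroughWhenFullSketch store = new WriteThroughWhenFullSketch(1);
    store.put("a", "1");
    store.put("b", "2"); // cache is full, so "b" goes to the backing store
    System.out.println("b cached? " + (store.cached("b") != null));
  }
}
```

Updates to keys that are already cached still succeed when the cache is full, because they do not increase the entry count.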

Overall, the article demonstrates how Alluxio replaces Guava cache with a custom two‑layer cache and an asynchronous eviction thread to achieve scalable metadata management for billions of files, providing both high performance for hot metadata and reliable persistence for cold metadata.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
