How Speculative Path Resolution Cuts Metadata Latency in InfiniFS
This article explains speculative path resolution in InfiniFS: how predictable directory IDs and parallel lookups replace the traditional component-by-component chain of RPCs. A path with no renamed directories resolves in a single round of parallel lookups, which sharply reduces metadata access latency in large, deep directory trees.
1. Predictable Directory IDs
This section describes how InfiniFS generates and maintains directory IDs that can be predicted from the pathname.
1) Creating
When a directory is created, InfiniFS hashes its "birth triple" <parent directory ID, directory name, name version> to produce a unique directory ID, as illustrated below.
The parent directory at creation time is called the "birth parent directory". InfiniFS uses a version number to guarantee the uniqueness of the birth triple.
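As a minimal sketch of the idea, the snippet below hashes a birth triple with SHA-256. The string encoding of the triple, the function name `directory_id`, and the root ID of 1 are assumptions made for illustration, not details taken from InfiniFS.

```python
import hashlib

ROOT_ID = 1  # assumed ID of the root directory


def directory_id(parent_id: int, name: str, version: int = 0) -> int:
    """Hash the birth triple <parent ID, name, version> into a directory ID."""
    # Length-prefixing the name keeps the encodings of distinct triples distinct.
    payload = f"{parent_id}:{len(name)}:{name}:{version}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


# A client can derive the IDs along /A/B from the pathname alone,
# as long as every directory on the path still has name version 0.
a_id = directory_id(ROOT_ID, "A")
b_id = directory_id(a_id, "B")
```

Because the ID depends only on the birth triple, anyone who knows the parent's ID and the name can recompute it without contacting a metadata server.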
2) Renaming
Renaming a directory only updates the key in its access metadata; the content metadata and ID remain unchanged, so all descendant directories keep their metadata unchanged.
When a directory is renamed for the first time, its birth parent adds an entry <directory name, name version> to its "rename list" (RL), and the directory itself records a "back‑pointer" (BP) <birth parent ID, name version>.
The RL records child directories that were "born" under this directory but later moved away. The BP points back to the birth parent directory. When a new directory is created, InfiniFS consults the RL of its parent to choose a name version that no moved-away directory is still using.
The diagram below shows the first rename of /A/B to /B, where the root directory has ID 1 and A has ID 2. The access‑metadata key of B changes from 2:B to 1:B, while B’s ID stays the same. A records RL<B,0> and B records BP<2,0>. If a new B is later created under /A, its name version will be 1, because the RL indicates that an earlier B with version 0 was moved away.
Further renames only update the access‑metadata key; the content metadata, the BP, and the birth parent’s RL remain unchanged. The RL entry is retained for as long as the renamed directory exists. When that directory is deleted, its access and content metadata are cleared and the RL entry is removed via the BP.
3) Deleting
When a renamed directory is deleted, InfiniFS uses its BP to remove the corresponding entry from the birth parent’s RL, as shown in the diagram. After deletion, creating a new B under /A resets the name version to the default 0.
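The create, rename, and delete rules above can be sketched as a small in-memory model. Everything here is illustrative: the class and method names are hypothetical, the version rule (smallest version not present in the parent's RL) is an assumption, and the BP also stores the birth name so that the matching RL entry can be found, which the description above does not specify.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional

ROOT_ID = 1  # assumed ID of the root directory


def birth_id(parent_id, name, version=0):
    payload = f"{parent_id}:{len(name)}:{name}:{version}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


@dataclass
class Content:
    version: int                                   # name version used at birth
    rename_list: set = field(default_factory=set)  # RL entries: (name, version)
    back_pointer: Optional[tuple] = None           # BP: (birth parent ID, birth name, version)


class Namespace:
    def __init__(self):
        self.access = {}                           # (parent ID, name) -> directory ID
        self.content = {ROOT_ID: Content(version=0)}

    def mkdir(self, parent_id, name):
        if (parent_id, name) in self.access:
            raise FileExistsError(name)
        used = {v for n, v in self.content[parent_id].rename_list if n == name}
        version = 0
        while version in used:                     # assumed rule: smallest unused version
            version += 1
        dir_id = birth_id(parent_id, name, version)
        self.access[(parent_id, name)] = dir_id
        self.content[dir_id] = Content(version=version)
        return dir_id

    def rename(self, old_parent, old_name, new_parent, new_name):
        dir_id = self.access.pop((old_parent, old_name))
        meta = self.content[dir_id]
        if meta.back_pointer is None:              # first rename: record RL and BP
            self.content[old_parent].rename_list.add((old_name, meta.version))
            meta.back_pointer = (old_parent, old_name, meta.version)
        self.access[(new_parent, new_name)] = dir_id  # the ID itself never changes

    def rmdir(self, parent_id, name):
        dir_id = self.access.pop((parent_id, name))
        meta = self.content.pop(dir_id)
        if meta.back_pointer is not None:          # follow the BP to clear the RL entry
            birth_parent, birth_name, version = meta.back_pointer
            parent_meta = self.content.get(birth_parent)
            if parent_meta is not None:
                parent_meta.rename_list.discard((birth_name, version))


ns = Namespace()
a = ns.mkdir(ROOT_ID, "A")
b = ns.mkdir(a, "B")
ns.rename(a, "B", ROOT_ID, "B")   # /A/B -> /B: A records RL<B,0>, B records a BP
b2 = ns.mkdir(a, "B")             # the new /A/B is born with name version 1
ns.rmdir(ROOT_ID, "B")            # deleting the renamed B clears RL<B,0>
```

In this model the renamed directory keeps the ID it was born with, while the new /A/B receives a different ID because its birth triple carries version 1.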
ID Uniqueness
Predictable directory IDs are globally unique; hash collisions are extremely rare and InfiniFS can detect and handle them.
Directory IDs are generated by hashing the birth triple. As long as the birth triple is unique, the ID is unique, unless a hash collision occurs.
File‑system semantics forbid two directories with the same name under the same parent at any moment, so without renaming, the pair <birth parent ID, directory name> is sufficient for uniqueness. Once a directory has been renamed away, a new directory with the same name can be created under the same parent; the version number distinguishes the two, ensuring that each birth triple remains unique. InfiniFS uses a cryptographic hash (e.g., SHA‑256) to generate IDs, which makes collisions extremely unlikely. Since the directory ID is the primary key of the content metadata, a collision is detected at creation time, and the same version‑number, RL, and BP mechanisms used for renames also handle hash collisions.
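The detection step can be illustrated with a deliberately weak 8-bit hash, so that collisions actually occur. This is only a sketch under stated assumptions: the helper names are hypothetical, and retrying with the next name version is one plausible reading of "handled by the version-number mechanism"; the RL and BP bookkeeping that would accompany it is not shown.

```python
import hashlib


def tiny_id(parent_id, name, version, bits=8):
    """Deliberately weak hash (8 bits by default) so collisions are easy to provoke."""
    payload = f"{parent_id}:{len(name)}:{name}:{version}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % (1 << bits)


def create_directory(content, parent_id, name, max_versions=1 << 16):
    """Store content metadata under the first version whose ID is still free."""
    for version in range(max_versions):
        dir_id = tiny_id(parent_id, name, version)
        if dir_id not in content:  # the primary-key check is what detects a collision
            content[dir_id] = {"birth": (parent_id, name, version)}
            return dir_id, version
    raise RuntimeError("no free ID found")


content = {}
created = [create_directory(content, 1, f"dir{i}") for i in range(100)]
# Directories whose version-0 ID was already taken end up with a higher version.
bumped = [version for _, version in created if version > 0]
```

With a real 256-bit hash the retry path would almost never run; the point is only that the primary-key check makes a collision visible at creation time rather than silently overwriting another directory's metadata.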
2. Parallel Path Resolution
Based on the predictable directory IDs introduced above, the client can parallelize path resolution with the following steps:
1) Predict Directory IDs: The client reconstructs the birth triple for each intermediate directory using version 0, hashes it to obtain the predicted ID, and then builds the keys for all components of the path.
2) Parallel Lookup: The client sends lookup requests for all intermediate directories in parallel. Each request checks permissions and compares the predicted ID with the server’s actual ID. If they differ, the server returns the true ID.
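3) Repeat steps 1 and 2, starting from the first directory whose prediction was wrong and using the true ID returned by the server, until the entire path is resolved.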
The diagram below illustrates the speculative path resolution mechanism. If an intermediate directory has been renamed, its speculative ID will be incorrect (e.g., the predicted ID h(2,X,0) differs from X's real ID 12). The lookup for X itself still succeeds, because its key 2:X depends only on the parent's ID, which was predicted correctly. The server returns the real ID, and the client continues resolving the sub‑path under X from there.
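The resolution loop can be sketched with an in-memory stand-in for the metadata servers and a thread pool for the parallel lookups. The names `MetadataServer`, `predict_id`, and `resolve` are hypothetical, permission checks are omitted, and for brevity the client compares predicted and actual IDs itself, whereas in the description above the server performs that comparison.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

ROOT_ID = 1  # assumed ID of the root directory


def predict_id(parent_id, name, version=0):
    payload = f"{parent_id}:{len(name)}:{name}:{version}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


class MetadataServer:
    """In-memory stand-in for the access-metadata store."""

    def __init__(self):
        self.access = {}  # (parent ID, name) -> actual directory ID

    def lookup(self, parent_id, name):
        return self.access.get((parent_id, name))  # None if the key does not exist


def resolve(server, components):
    """Return (ID of the last component, number of lookup rounds)."""
    known_id, start, rounds = ROOT_ID, 0, 0
    with ThreadPoolExecutor(max_workers=8) as pool:
        while start < len(components):
            rounds += 1
            suffix = components[start:]
            # Step 1: predict the ID of every unresolved directory with version 0.
            ids = [known_id]
            for name in suffix:
                ids.append(predict_id(ids[-1], name))
            # Step 2: issue all lookups in parallel, keyed by the predicted parent ID.
            futures = [pool.submit(server.lookup, ids[k], name)
                       for k, name in enumerate(suffix)]
            actual = [f.result() for f in futures]
            # Step 3: accept results up to and including the first wrong prediction.
            for k, real_id in enumerate(actual):
                if real_id is None:
                    raise FileNotFoundError("/" + "/".join(components[:start + k + 1]))
                known_id, resolved = real_id, start + k + 1
                if real_id != ids[k + 1]:
                    break  # later lookups used a wrong parent ID; retry from here
            start = resolved
    return known_id, rounds


server = MetadataServer()
a = predict_id(ROOT_ID, "A")
x = predict_id(ROOT_ID, "X")      # X was born as /X and later renamed to /A/X
y = predict_id(x, "Y")
z = predict_id(y, "Z")
server.access.update({(ROOT_ID, "A"): a, (a, "X"): x, (x, "Y"): y, (y, "Z"): z})

final_id, rounds = resolve(server, ["A", "X", "Y", "Z"])  # rounds == 2
```

In this example the first round resolves A and X and discovers that the prediction for X was wrong; the second round resolves Y and Z from X's real ID. A path with no renamed directory finishes in a single round.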
Big Data Technology Tribe
