Big Data 11 min read

How eBay Scaled HDFS to 800 PB Using Federation and Router‑Based Architecture

This article details eBay's evolution of its massive HDFS storage—from a single‑cluster design to ViewFS Federation, then to Router‑Based Federation—highlighting the performance bottlenecks, optimization techniques, FastCopy integration, and future plans for further scaling and automation.

ITPUB

May 7, 2022

How eBay Scaled HDFS to 800 PB Using Federation and Router‑Based Architecture

eBay HDFS Scale and Challenges

eBay operates more than 10 HDFS clusters with over 20,000 nodes. The largest cluster contains >5,000 machines, stores >800 PB of data and processes >100 k MapReduce/YARN jobs per day. Rapid data growth created three major pressure points:

Continuous increase in both file data and metadata storage.

Single‑point performance bottleneck at the NameNode.

Operational complexity of managing many independent clusters.

Early Performance Optimizations (single‑cluster mode)

Before federation, eBay tuned the original NameNode to improve throughput:

Heavy API call reduction : avoided costly Balancer block queries to standby NameNode, limited batch‑size deletes, used ListStatus without block locations, and split snapshot directories.

Asynchronous RPC response handling (JIRA HDFS‑15486): moved encryption of RPC responses to a separate thread, freeing the NameNode handler thread pool.

Namespace lock refinements (JIRA HDFS‑15553): removed redundant directory locks, converted SetTimes write locks to read locks, and introduced a read‑write call‑queue.

ViewFS‑Based Federation

To break the NameNode bottleneck, eBay introduced a ViewFS federation in 2019. Metadata was split into multiple namespaces, each served by its own NameNode. This horizontal scaling increased overall metadata throughput but introduced two management drawbacks:

High maintenance cost – every new namespace required client‑side configuration updates.

Lack of client transparency – data migration between namespaces forced application changes.

FastCopy Data‑Migration Process

FastCopy was built to move large volumes of data between federated namespaces without copying block bytes.

Revoke directory permissions.

Close any open file handles.

Execute FastCopy to create hard‑links for blocks.

Retry on failure.

Restore permissions.

Update the ViewFS mount‑table.

FastCopy workflow:

Client queries the source NameNode for block locations.

Target NameNode creates the destination file and block entries.

A copyBlock request is sent to the owning DataNode.

The DataNode creates a hard link to the existing block file.

The DataNode reports the new block to the target NameNode, completing the migration.

Integration with DistCp

FastCopy logic was embedded into the Hadoop DistCp tool, adding:

Uniform handling of large and small files to avoid long‑tail tasks.

ACL preservation before copy.

Support for whitelist / exclude lists.

Job‑parameter tuning for optimal parallelism.

These changes yielded up to a 7× speedup for bulk data moves.

Router‑Based Federation (RBF)

In 2021 eBay migrated to the community Router‑Based Federation architecture. A stateless Router service sits between clients and the set of federated NameNodes, providing:

Transparent namespace routing – clients see a single logical namespace.

Cloud‑native deployment and horizontal scalability.

Foundation for data‑split strategies built on the Router layer.

RBF Request Flow

Clients send all HDFS RPCs to the Router. The Router forwards metadata‑related calls to the appropriate NameNode(s), aggregates responses, and returns a unified result to the client.

Secure Token Workflow (YARN RM security)

Client obtains a delegation token from the source NameNode.

Client submits a YARN job to the ResourceManager (RM) carrying the token.

RM requests a Router‑level token and distributes it to all worker nodes.

Worker nodes use the Router token for HDFS authentication.

After job completion, RM revokes the token.

RBF Performance Enhancements

Increased Router RPC throughput; removed SASL encryption between Router and NameNode to reduce latency.

Preserved client IP and clientId in RPCs, maintaining data‑locality awareness for YARN tasks.

Fixed moveToTrash semantics in multi‑mount scenarios.

Future Work

Planned improvements for the RBF layer include:

Asynchronous RPC handling inside the Router to further boost throughput.

Automated data‑split generation based on namespace usage patterns.

Tiered storage integration to move cold data to cheaper media.

Isolation of inter‑namespace RPC traffic for better security and performance.

Key Diagrams

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance Optimization HDFS Router-based Federation Federation eBay

Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.