Optimizing Object Storage and Impala Engine in NetEase NDH: Performance Enhancements and Feature Additions
This presentation covers NetEase's NDH big-data platform: its background, object-storage upload and rename optimizations, Impala engine adaptations (file-handle caching, transparent URI handling, and getFileBlockLocations improvements), and a set of operational enhancements such as dynamic proxy-user configuration and audit-log extensions.
1. Background Introduction
NDH, NetEase's data platform, is an in-house distribution comparable to Cloudera CDH and is widely used within the company. Its OLAP engine is Impala, an MPP query engine with strong query performance. Alluxio serves as a distributed cache layer, which improves performance and lowers the cost of integrating with cloud environments.
2. Object‑Storage Scenario Optimizations
Streaming Upload Optimization : Replaced the original sequential upload mechanism with a chunked, asynchronous UploadPart approach, improving I/O utilization by 40% and eliminating worker‑disk size limits.
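The chunked, asynchronous upload described above can be sketched roughly as follows. This is a minimal illustration, not the NDH or Alluxio implementation: `FakeMultipartStore`, `streaming_upload`, and the tiny `part_size` are hypothetical stand-ins for a real multipart-upload API, and a production version would also bound the number of parts in flight.

```python
import io
from concurrent.futures import ThreadPoolExecutor


class FakeMultipartStore:
    """In-memory stand-in for an object store's multipart API (hypothetical)."""

    def __init__(self):
        self.parts = {}    # part_number -> bytes
        self.objects = {}  # key -> assembled bytes

    def upload_part(self, part_number, data):
        self.parts[part_number] = data
        return part_number

    def complete(self, key, part_numbers):
        # Parts may finish in any order; assemble them by part number.
        self.objects[key] = b"".join(self.parts[n] for n in sorted(part_numbers))
        self.parts.clear()


def streaming_upload(stream, store, key, part_size=4, max_workers=3):
    """Read the stream in fixed-size parts and upload them concurrently."""
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        part_number = 1
        while True:
            chunk = stream.read(part_size)
            if not chunk:
                break
            # Submit the part and keep reading instead of waiting for it.
            futures.append(pool.submit(store.upload_part, part_number, chunk))
            part_number += 1
        uploaded = [f.result() for f in futures]
    store.complete(key, uploaded)
    return len(uploaded)


store = FakeMultipartStore()
part_count = streaming_upload(io.BytesIO(b"hello object storage"), store, "demo/key")
```

Because each part is handed off as soon as it is read, the whole file never has to be staged locally before the upload starts, which is presumably what removes the dependence on worker disk size.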
Rename Performance Optimization : Leveraged UFS batch‑delete APIs to merge copy‑delete operations, achieving a 30% speedup for large directory renames.
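To illustrate the idea, here is a minimal sketch of a directory rename on an object store that copies every key and then removes the sources with batched delete requests instead of one request per key. `FakeObjectStore`, `rename_dir`, and `batch_size` are hypothetical names chosen for this example; the actual UFS interfaces used in NDH are not shown in the source.

```python
class FakeObjectStore:
    """In-memory stand-in for an object store with a batch-delete call (hypothetical)."""

    def __init__(self, objects):
        self.data = dict(objects)
        self.delete_requests = 0  # counts delete round trips

    def list(self, prefix):
        return sorted(k for k in self.data if k.startswith(prefix))

    def copy(self, src, dst):
        self.data[dst] = self.data[src]

    def delete_batch(self, keys):
        # One request removes many keys.
        self.delete_requests += 1
        for key in keys:
            self.data.pop(key, None)


def rename_dir(store, src_prefix, dst_prefix, batch_size=1000):
    """Copy every key under src_prefix, then delete the sources in batches."""
    keys = store.list(src_prefix)
    for key in keys:
        store.copy(key, dst_prefix + key[len(src_prefix):])
    for i in range(0, len(keys), batch_size):
        store.delete_batch(keys[i:i + batch_size])
    return len(keys)


store = FakeObjectStore({f"warehouse/t1/part-{i}": b"x" for i in range(5)})
moved = rename_dir(store, "warehouse/t1/", "warehouse/t2/", batch_size=2)
```

With five keys and a batch size of two, the sources are removed in three delete requests rather than five; on large directories the saving in round trips grows accordingly.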
OBS Deletion Performance Optimization : Utilized Huawei OBS's native rename interface and delayed delete strategy to accelerate large‑directory deletions.
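One plausible reading of this combination, shown here purely as an assumption, is that a delete first renames the directory into a hidden location (cheap when the store has a native rename) and the objects are physically removed later. `FakeRenameStore`, `DelayedDeleter`, and the `.pending-delete/` prefix are hypothetical and do not reflect the real OBS SDK.

```python
class FakeRenameStore:
    """In-memory store with a native prefix rename (hypothetical)."""

    def __init__(self, keys):
        self.keys = set(keys)

    def rename_prefix(self, src, dst):
        # Modeled as a single metadata operation on the whole prefix.
        self.keys = {dst + k[len(src):] if k.startswith(src) else k for k in self.keys}

    def purge_prefix(self, prefix):
        doomed = {k for k in self.keys if k.startswith(prefix)}
        self.keys -= doomed
        return len(doomed)


class DelayedDeleter:
    """Delete = rename into a hidden area now, physically remove later."""

    def __init__(self, store, trash_root=".pending-delete/"):
        self.store = store
        self.trash_root = trash_root
        self.pending = []
        self.counter = 0

    def delete_dir(self, prefix):
        self.counter += 1
        target = f"{self.trash_root}{self.counter}/"
        self.store.rename_prefix(prefix, target)
        self.pending.append(target)  # remembered for the background purge

    def purge(self):
        removed = sum(self.store.purge_prefix(p) for p in self.pending)
        self.pending.clear()
        return removed


store = FakeRenameStore(["logs/a", "logs/b", "keep/c"])
deleter = DelayedDeleter(store)
deleter.delete_dir("logs/")
visible = sorted(k for k in store.keys if not k.startswith(".pending-delete/"))
purged = deleter.purge()
```

The caller sees the directory disappear as soon as the rename returns, while the slow per-object removal happens off the critical path.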
3. Impala Engine Adaptation
File Handle Cache : Added support for adjusting Alluxio I/O threads and file‑handle caching in Impala. Modified the Hadoop client to implement an unbuffer method, releasing connections and reducing memory pressure.
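The interaction between a handle cache and unbuffer can be sketched as below. This is an illustrative model only: `FakeStream` and `FileHandleCache` are hypothetical, and the sketch assumes that `unbuffer` releases the connection while leaving the handle reusable, which is the behavior the paragraph above describes.

```python
from collections import OrderedDict


class FakeStream:
    """Stand-in for an input stream that supports unbuffer (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.connected = True
        self.closed = False

    def unbuffer(self):
        # Release the connection and buffers but keep the handle reusable.
        self.connected = False

    def read(self):
        self.connected = True  # reconnects lazily on the next read
        return f"data from {self.path}"

    def close(self):
        self.connected = False
        self.closed = True


class FileHandleCache:
    """LRU cache of open handles; idle handles are unbuffered, not closed."""

    def __init__(self, capacity, opener=FakeStream):
        self.capacity = capacity
        self.opener = opener
        self.handles = OrderedDict()  # path -> idle handle, oldest first
        self.opens = 0

    def acquire(self, path):
        handle = self.handles.pop(path, None)
        if handle is None:
            handle = self.opener(path)
            self.opens += 1
        return handle

    def release(self, handle):
        handle.unbuffer()  # drop the connection while the handle sits idle
        self.handles[handle.path] = handle
        self.handles.move_to_end(handle.path)
        while len(self.handles) > self.capacity:
            _, evicted = self.handles.popitem(last=False)
            evicted.close()


cache = FileHandleCache(capacity=1)
first = cache.acquire("/t/a")
first.read()
cache.release(first)
reused = cache.acquire("/t/a")
```

Reusing `first` avoids a second open, and because idle handles hold no connection, a large cache does not pin a large number of sockets or buffers.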
Transparent URI : Implemented a whitelist‑based location‑prefix conversion and later a transparent URI mechanism, allowing selective Alluxio acceleration without engine‑specific code changes.
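A whitelist-based prefix conversion of this kind might look like the following sketch. The rule table, the host name `alluxio-master:19998`, and the function `rewrite_location` are assumptions made for illustration; the source does not specify the actual rule format.

```python
# Hypothetical whitelist: storage prefix -> Alluxio prefix.
PREFIX_RULES = {
    "s3a://warehouse/hot/": "alluxio://alluxio-master:19998/hot/",
    "s3a://warehouse/hot/archive/": "alluxio://alluxio-master:19998/archive/",
}


def rewrite_location(location, rules=PREFIX_RULES):
    """Rewrite a table location if it falls under a whitelisted prefix."""
    # Try the longest prefixes first so the most specific rule wins.
    for src in sorted(rules, key=len, reverse=True):
        if location.startswith(src):
            return rules[src] + location[len(src):]
    # Anything outside the whitelist keeps going directly to the store.
    return location


hot = rewrite_location("s3a://warehouse/hot/t1/part-0")
archived = rewrite_location("s3a://warehouse/hot/archive/t9/part-0")
cold = rewrite_location("s3a://warehouse/cold/t2/part-0")
```

Only locations under a listed prefix are redirected to Alluxio, so acceleration can be enabled table by table while everything else is read as before.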
getFileBlockLocations Optimization : Removed redundant RPC calls by using Alluxio’s embedded block location data, reducing metadata loading latency.
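The saving can be illustrated with a small model that counts RPCs. `FakeMaster` and both loader functions are hypothetical; the sketch assumes, as the paragraph above states, that the file status returned by a listing already carries block location data.

```python
class FakeMaster:
    """Stand-in for a metadata service that counts RPCs (hypothetical)."""

    def __init__(self, files):
        self.files = files  # path -> list of (offset, length, hosts)
        self.rpc_count = 0

    def list_status(self, directory):
        self.rpc_count += 1
        # Each status already embeds the file's block locations.
        return [
            {"path": path, "blocks": blocks}
            for path, blocks in sorted(self.files.items())
            if path.startswith(directory)
        ]

    def get_block_locations(self, path):
        self.rpc_count += 1
        return self.files[path]


def load_locations_naive(master, directory):
    """One listing plus one extra RPC per file."""
    return {
        s["path"]: master.get_block_locations(s["path"])
        for s in master.list_status(directory)
    }


def load_locations_embedded(master, directory):
    """Reuse the block locations already present in the listing."""
    return {s["path"]: s["blocks"] for s in master.list_status(directory)}


files = {f"/t/part-{i}": [(0, 128, ["worker-1"])] for i in range(3)}
naive_master = FakeMaster(files)
naive = load_locations_naive(naive_master, "/t/")
embedded_master = FakeMaster(files)
embedded = load_locations_embedded(embedded_master, "/t/")
```

Both loaders return the same mapping, but the naive version issues one RPC per file on top of the listing, which dominates metadata loading time for tables with many files.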
4. General Feature Enhancements
Data Asset Service : Developed a Ratis listener that writes Alluxio metadata to HBase, enabling real-time metadata queries and offline analysis via Spark.
Hadoop Ecosystem Compatibility – Recycle Bin & Directory Freeze : Added recycle‑bin‑like cleanup and directory‑freeze capabilities to prevent accidental deletions.
Operational Improvements : Implemented dynamic proxy‑user configuration, UFS performance metrics, enhanced audit logs with client version info, file‑upload awareness commands, and cache‑behavior controls for free/load operations.
5. Conclusion and Outlook
NetEase NDH uses Alluxio as its unified storage entry point and serves customers on AWS, Alibaba Cloud, and Huawei Cloud. The team actively contributes to the open-source community and has submitted over 40 PRs to Alluxio.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.