
Apache Hudi Asia Summit Successfully Held

The first Apache Hudi Asia Summit in Beijing attracted over 230 attendees, featuring technical discussions on data lake optimization and case studies from companies like Fastly and Meituan.

Kuaishou Tech

Key topics included the upgrades in Apache Hudi 1.0, such as storage format optimization, a restructured index system, and incremental processing capabilities. Speakers from Fastly, Meituan, and other companies shared best practices and implementation details.

Fastly's data architecture team described how it supports AI and BI scenarios, highlighting end-to-end vectorization, real-time subscription, and column concatenation for logical wide tables as ways to optimize data organization and reduce costs.
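To make the column concatenation idea more concrete, here is a minimal sketch in plain Python. It is not the speakers' implementation and does not use any Hudi API; the function `stitch_columns`, the key name `id`, and the two sample streams are hypothetical. It assumes each upstream job writes only a subset of columns, and that rows sharing a key should be merged into one wide row, with later non-null values winning.

```python
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Optional[Any]]


def stitch_columns(updates: Iterable[Row], key: str = "id") -> Dict[Any, Row]:
    """Merge partial-column updates that share a key into one wide row per key.

    Later updates win for the columns they carry. Columns that an update
    omits (or sets to None) keep the value written earlier.
    """
    wide: Dict[Any, Row] = {}
    for update in updates:
        # Start a new wide row the first time a key is seen.
        row = wide.setdefault(update[key], {key: update[key]})
        for column, value in update.items():
            if value is not None:
                row[column] = value
    return wide


# Two hypothetical upstream jobs, each writing a different column subset.
profile_stream = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
stats_stream = [{"id": 1, "clicks": 10}, {"id": 1, "clicks": 12}]

wide_table = stitch_columns(profile_stream + stats_stream)
```

The point of the design is that no job has to read the full row before writing; the merge happens once, keyed on the record key, so the wide table stays logical rather than being rebuilt by a join on every read.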

Meituan presented its Beluga architecture, which is built around a "one table, three modes" approach that pairs row-based HFile for stream writes with column-based Parquet for batch processing, improving data processing efficiency and reducing operational complexity.
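The split between a row-oriented write path and a columnar read path can be illustrated with a small in-memory sketch. This is only an analogy under stated assumptions: Python lists stand in for HFile and Parquet, the class `HybridTable` and its methods are hypothetical, and nothing here reflects how Beluga or Hudi is actually implemented.

```python
from collections import defaultdict
from typing import Any, Dict, List


class HybridTable:
    """Row log for cheap streaming appends, columnar base for batch scans."""

    def __init__(self) -> None:
        self.row_log: List[Dict[str, Any]] = []  # stand-in for a row-based file
        self.columns: Dict[str, List[Any]] = defaultdict(list)  # columnar stand-in
        self.compacted_rows = 0

    def append(self, row: Dict[str, Any]) -> None:
        """Streaming write: append one whole row without touching the base."""
        self.row_log.append(dict(row))

    def compact(self) -> None:
        """Move everything in the row log into the columnar base."""
        names = set(self.columns) | {c for r in self.row_log for c in r}
        for name in names:
            col = self.columns[name]
            # Pad columns that appear for the first time in this compaction.
            col.extend([None] * (self.compacted_rows - len(col)))
            col.extend(r.get(name) for r in self.row_log)
        self.compacted_rows += len(self.row_log)
        self.row_log.clear()

    def scan_column(self, name: str) -> List[Any]:
        """Batch read: one column from the base plus not-yet-compacted rows."""
        base = self.columns.get(name, [None] * self.compacted_rows)
        return list(base) + [r.get(name) for r in self.row_log]


table = HybridTable()
table.append({"id": 1, "v": 10})
table.append({"id": 2, "v": 20})
table.compact()
table.append({"id": 3, "v": 30, "w": 1})
```

A scan after these calls sees all three rows, whether or not the last one has been compacted, which is the property that lets one table serve both streaming and batch consumers.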

Douyin's SampleCenter platform was also showcased. It addresses the challenges of exabyte-scale recommendation data through unified lake storage, real-time concatenation of samples and labels, and dynamic bucket strategies.
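As a rough illustration of what a dynamic bucket strategy can look like, the sketch below sizes the bucket count from a partition's data volume and assigns keys with a stable hash. The function names, the 256 MiB target, and the doubling scheme are assumptions made for this example; they are not taken from the talk or from Hudi's bucket index.

```python
import math
import zlib


def bucket_count_for(partition_bytes: int,
                     target_bucket_bytes: int = 256 * 1024 * 1024) -> int:
    """Pick enough buckets that each holds roughly target_bucket_bytes."""
    return max(1, math.ceil(partition_bytes / target_bucket_bytes))


def bucket_of(key: str, num_buckets: int) -> int:
    """Map a record key to a bucket with a hash that is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


# Doubling the bucket count splits each bucket in two: a key in bucket b
# either stays in b or moves to b + old_count, and never goes anywhere else.
old_count, new_count = 4, 8
moves = {
    key: (bucket_of(key, old_count), bucket_of(key, new_count))
    for key in (f"user-{i}" for i in range(100))
}
```

With modulo hashing, growing by doubling means a resize only has to rewrite the buckets being split, rather than reshuffling every record. Production systems may use consistent hashing or other schemes instead.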

Huawei described several optimizations: storing raw row data as byte arrays to reduce GC impact, improving stream reading performance, and introducing a column cluster concept for sparse matrix storage.
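The byte-array idea can be sketched as follows. The GC benefit described in the talk applies to JVM-style runtimes, where millions of small row objects are expensive to track; this Python version only illustrates the layout, not the performance gain. The fixed two-field schema (a 64-bit id and a 64-bit float score) and the class `PackedRows` are hypothetical.

```python
import struct

# Fixed-width layout: little-endian int64 id followed by float64 score.
ROW = struct.Struct("<qd")
SCORE_OFFSET = 8  # the score starts after the 8-byte id


class PackedRows:
    """Keep all rows in one contiguous buffer instead of one object per row."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, user_id: int, score: float) -> None:
        self._buf += ROW.pack(user_id, score)

    def __len__(self) -> int:
        return len(self._buf) // ROW.size

    def row_at(self, index: int) -> tuple:
        """Decode a full row only when a caller asks for it."""
        return ROW.unpack_from(self._buf, index * ROW.size)

    def score_at(self, index: int) -> float:
        """Decode a single field without materializing the whole row."""
        return struct.unpack_from("<d", self._buf, index * ROW.size + SCORE_OFFSET)[0]


rows = PackedRows()
rows.append(7, 0.5)
rows.append(8, 1.25)
```

Because the rows live in a single buffer, the runtime tracks one large allocation rather than one object per row and per field, and fields are decoded lazily at read time.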

JD.com (Jingdong) introduced a data lake architecture with a multi-model storage approach that combines HDFS, Kafka/HBase, and other storage systems for seamless integration and enhanced data processing capabilities.

The summit concluded with a call for continued community collaboration and innovation in data lake technologies.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: data engineering, Big Data, data optimization, Data Lake, Technical Conference, Apache Hudi
Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
