Past Memory Big Data
Dec 27, 2024 · Big Data
How Uber Cuts Storage Costs with ZSTD Compression in Apache Parquet
Uber’s Hadoop data lake stores hundreds of petabytes in Parquet files. By adopting ZSTD compression, column pruning, and column reordering, Uber cuts storage by up to 79% and saves significant vCores. The article covers the benchmarks used to choose compression levels, along with Uber’s open-source contributions.
Apache Parquet · Big Data · Hadoop
