Tagged articles
1 article
dbaplus Community
May 26, 2016 · Big Data

Mastering Apache Parquet: Columnar Storage, Nested Data, and Performance Gains

This article explains Apache Parquet’s columnar storage format, its support for nested data models, the underlying striping/assembly algorithm, file structure, push‑down optimizations, and performance advantages within the Hadoop ecosystem, providing a comprehensive guide for big‑data practitioners.

Apache Parquet · Big Data · Hadoop
22 min read