Big Data Technology Architecture
May 19, 2020 · Big Data

An Overview of Apache Parquet: Architecture, Features, and Comparison with ORC

Apache Parquet is a language-agnostic columnar storage format for the Hadoop ecosystem. It offers high compression, efficient I/O through column projection and predicate push-down, support for nested structures, and a three-layer architecture. The article also compares Parquet with ORC and covers tooling for schema inspection.

Apache Hadoop · Columnar Storage · Data Formats