Tagged articles

ORC

7 articles · Page 1 of 1
vivo Internet Technology
Mar 26, 2025 · Big Data

Reading Encrypted ORC Files in StarRocks: Architecture and Implementation Details

The article details how StarRocks extends the Apache ORC C++ library to decrypt column‑level encrypted ORC files, describing the file hierarchy, AES‑128‑CTR key handling, the query‑time master‑key retrieval, a decorator‑based decryption/decompression pipeline, and the block‑skip‑read mechanism that enables efficient predicate push‑down.

Big Data · Encryption · File Format
19 min read
Past Memory Big Data
Jun 20, 2024 · Big Data

How Meituan Scaled Spark with Vectorized Execution Using Gluten + Velox

This article details Meituan's production‑grade adoption of Spark vectorized execution via the open‑source Gluten and Velox stack, explaining SIMD fundamentals, performance motivations, the end‑to‑end integration workflow, staged rollout, encountered challenges, and the resulting resource savings and speedups.

Big Data · Gluten · ORC
33 min read
StarRocks
Feb 29, 2024 · Databases

How a Student Built an ORC Chunk Writer for StarRocks: Insights from Open Source Summer

In this interview, graduate student Sun Yinzhen shares how he selected, designed, and implemented an ORC Chunk Writer for the StarRocks database during the Open Source Summer program, detailing the technical challenges, learning outcomes, and his perspective on open‑source contributions for computer science students.

ORC · StarRocks · Student Contribution
12 min read
dbaplus Community
Jan 5, 2021 · Big Data

How Ctrip Built a Scalable Unified Log Framework for Payment Data

Facing massive, heterogeneous logs from numerous payment services, Ctrip’s data team designed a unified logging framework that extends Log4j2, streams logs through Kafka to HDFS with a customized Camus pipeline, and partitions and stores the data in ORC for efficient Hive analysis. The article also covers the format, storage, and performance challenges the team addressed.

Big Data · Camus · Hadoop
16 min read
Ctrip Technology
Sep 10, 2020 · Big Data

Design and Implementation of a Unified Log Framework for Ctrip Payment Center

The article describes the design, architecture, and operational details of a unified logging framework at Ctrip's payment center, covering log production via a Log4j2 extension, Kafka‑Camus collection, Hive/ORC storage, MapReduce parsing optimizations, and governance strategies for terabyte‑scale daily log volumes.

Big Data · Camus · Data Governance
15 min read