Tagged articles
75 articles
Alibaba Cloud Developer
Feb 27, 2026 · Databases

How DuckDB Compression Supercharges AliSQL Storage and Cuts MySQL Costs

AliSQL integrates DuckDB as a storage engine to achieve high‑density columnar compression and fast analytical scans. The article details DuckDB’s multi‑layer storage format, adaptive compression algorithm selection, performance benchmarks against InnoDB, HBase, ClickHouse, and OceanBase, and the engineering optimizations AliSQL adds for throughput and cost reduction.

AliSQL · Columnar Storage · Database Optimization
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Dec 24, 2025 · Big Data

How Paimon’s Column‑Separation Architecture Powers Real‑Time Multi‑Modal Lakehouse for AI

This article explains the challenges of frequent column changes in AI feature engineering, introduces Paimon’s column‑separation storage with a global continuous Row ID, details its Blob data type for efficient multi‑modal handling, and outlines production results and future roadmap for building an AI‑native data lakehouse.

Apache Paimon · Big Data · Blob
0 likes · 11 min read
Data STUDIO
Dec 5, 2025 · Big Data

Why Parquet Is the Default Choice for Big Data Storage

The article explains how Apache Parquet’s columnar layout, multi‑level row‑group structure, projection and predicate push‑down, and advanced compression and encoding make it the high‑performance, space‑efficient storage format that powers modern big‑data ecosystems and tools like Spark, Python pandas, and ClickHouse.

Big Data · ClickHouse · Columnar Storage
0 likes · 11 min read
ITPUB
Oct 11, 2025 · Databases

How OceanBase Achieves Real‑Time HTAP: Inside Its Unified Storage and Vectorized Engine

This article details OceanBase's evolution from a distributed OLTP system to a unified HTAP database, covering its cost‑based optimizer, vectorized execution, integrated row‑column storage, bypass import, materialized views, external tables, full‑text search, and real‑world use cases for real‑time analytics.

Columnar Storage · HTAP · OceanBase
0 likes · 12 min read
JD Tech Talk
Sep 2, 2025 · Databases

Unlock ClickHouse’s Secret Weapons: The 9 Techniques Behind Lightning‑Fast Queries

This article explores ClickHouse’s high‑performance OLAP architecture, covering its MPP design, columnar storage, vectorized execution, pre‑sorting, table engines, data types, sharding and replication strategies, as well as index designs that together enable rapid analysis of massive datasets.

ClickHouse · Columnar Storage · Vectorized Execution
0 likes · 15 min read
JD Cloud Developers
Sep 2, 2025 · Databases

Unlocking ClickHouse’s Lightning‑Fast Queries: The ‘Nine Swords’ Architecture Explained

This article explores ClickHouse’s high‑performance OLAP design—including its MPP architecture, columnar storage, vectorized execution, pre‑sorting, sharding, replication, index strategies, and compute engine—showing how each innovation contributes to ultra‑fast, scalable data analysis in the big‑data era.

ClickHouse · Columnar Storage · OLAP
0 likes · 14 min read
Tech Freedom Circle
Sep 1, 2025 · Databases

How ClickHouse Executes GROUP BY and Handles Real‑Time Analytics on Billions of Rows

This article explains ClickHouse’s core architecture—including its storage‑compute integration, MPP parallelism, columnar storage, vectorized execution, data pre‑sorting, table engines, sparse and auxiliary indexes, and the two‑stage aggregation pipeline—then walks through the exact GROUP BY execution flow for both local and distributed tables, illustrating each step with diagrams, SQL demos, and code snippets.

ClickHouse · Columnar Storage · Distributed Query
0 likes · 29 min read
JD Tech
May 13, 2025 · Databases

Unlock ClickHouse’s Lightning‑Fast Queries: Architecture, Storage, and Index Secrets

This article examines ClickHouse’s high‑performance OLAP design, covering its MPP architecture, columnar storage, vectorized execution, pre‑sorting, table engines, extensive data‑type system, sharding and replication strategies, as well as its sparse and skip‑index mechanisms that together enable ultra‑fast analytics on massive datasets.

Big Data · ClickHouse · Columnar Storage
0 likes · 16 min read
JD Retail Technology
Apr 8, 2025 · Databases

ClickHouse Architecture and Core Technologies Overview

ClickHouse is an open‑source, massively parallel, column‑oriented OLAP database that integrates its own columnar storage, vectorized batch processing, pre‑sorted data, diverse table engines, extensive data types, sharding with replication, sparse primary‑key and skip indexes, and a multithreaded query engine, delivering high‑throughput real‑time analytics on massive datasets.

Big Data · ClickHouse · Columnar Storage
0 likes · 15 min read
JD Tech Talk
Dec 26, 2024 · Databases

Using ClickHouse for Efficient Tag Bitmap Storage and Group Computation in a CDP

This article explains how ClickHouse’s columnar storage, bitmap functions, and distributed architecture can be leveraged to store billions of tag bitmaps, combine them efficiently, and support fast group calculations for customer data platforms, while addressing data‑warehouse integration, storage format, and performance challenges.

Bitmap · Columnar Storage · OLAP
0 likes · 10 min read
Tencent Cloud Developer
Nov 1, 2024 · Databases

How TDSQL Dominated Global OLAP & OLTP Benchmarks: Inside the Technical Secrets

Tencent Cloud's TDSQL shattered world records in both TPC‑DS (OLAP) and TPC‑C (OLTP) benchmarks, achieving a 7260 M QphDS score at a cost of 37.52 CNY/kQphDS, and the article explains the three self‑developed technologies—MPP execution, parallel execution framework, and columnar‑vectorized engine—that made this performance possible.

Columnar Storage · Database Performance · MPP
0 likes · 7 min read
ITPUB
Aug 29, 2024 · Databases

How TeleDB Evolved from Centralized to Native Distributed Architecture

TeleDB’s journey from a centralized MySQL/PostgreSQL‑based system to a native distributed HTAP database showcases innovations such as share‑nothing architecture, columnar storage, vectorized execution, Remote Data Access, global caching, and advanced dead‑lock detection, dramatically improving query performance, storage efficiency, and scalability.

Columnar Storage · HTAP · TeleDB
0 likes · 13 min read
21CTO
Jul 30, 2024 · Databases

How Database Architectures Evolved Over 20 Years: From Columnar to Cloud & Beyond

This article surveys two decades of database system architecture innovations—including columnar stores, cloud databases, data lakes, NewSQL, hardware accelerators, and blockchain databases—highlighting their motivations, trade‑offs, and the shifting landscape that shapes modern DBMS design.

Columnar Storage · DBMS · NewSQL
0 likes · 23 min read
vivo Internet Technology
Jul 10, 2024 · Databases

HBase Optimization Practice in Vivo's Unified Content Platform

Vivo's unified content platform replaced its unwieldy 60 TB MongoDB store with HBase, then upgraded the cluster, introduced table‑specific connection pools, column‑only reads, tuned compaction, and leveraged multi‑version cells, cutting response times from seconds to under ten milliseconds and dramatically lowering operational costs while boosting read/write performance.

Columnar Storage · Compaction Optimization · Database Optimization
0 likes · 16 min read
DataFunSummit
Jun 21, 2024 · Big Data

Building a Complete Data System with Apache Arrow: Architecture, Dynamic Schema Modeling, and Practical Tips

This article explains why new data systems are needed, introduces Apache Arrow and its columnar in‑memory format, describes dynamic read‑time modeling, outlines the system’s execution flow, storage and indexing strategies, and shares practical tips and extensions for building scalable big‑data solutions.

Acero · Apache Arrow · Big Data
0 likes · 20 min read
DataFunSummit
Apr 23, 2024 · Big Data

Building a Data System with Apache Arrow: Design, Implementation, and Practical Tips

This article explains why new data systems are needed, introduces Apache Arrow’s columnar in‑memory format and its zero‑copy advantages, describes how to model data at read time, outlines the execution flow with Acero and SQL planning, and shares practical tips and extensions for building robust, dynamic‑schema data platforms.

Acero · Apache Arrow · Big Data
0 likes · 20 min read
Sohu Tech Products
Mar 6, 2024 · Big Data

Building Data Systems with Apache Arrow: Architecture, Memory Format, and Execution

The article explains how Apache Arrow’s columnar, cross‑language in‑memory format enables high‑performance, interoperable data systems—replacing traditional row‑oriented databases—by supporting dynamic schemas, zero‑copy data exchange, efficient indexing, Acero‑based query execution, and Flight/ADBC connectivity, while offering practical guidance and highlighting challenges.

Apache Arrow · Big Data · Columnar Storage
0 likes · 20 min read
DataFunTalk
Feb 28, 2024 · Big Data

Building a Data System with Apache Arrow: Design, Modeling, and Execution

This article explains why new data systems are needed, introduces Apache Arrow and its columnar in‑memory format, describes read‑time modeling and dynamic schema handling, and shows how Arrow can be used to build a complete data processing pipeline with indexing, SQL planning, and zero‑copy data exchange.

Apache Arrow · Big Data · Columnar Storage
0 likes · 20 min read
DataFunTalk
Jan 1, 2024 · Big Data

MaxCompute Semi-Structured Data: Concepts, Solutions, and Benefits

This article explains the nature of semi‑structured data, compares traditional schema‑on‑read and schema‑on‑write approaches, and details MaxCompute's columnar storage solution that balances flexibility, performance, and cost for large‑scale data warehouses.

Big Data · Columnar Storage · Data Warehouse
0 likes · 19 min read
DataFunTalk
Dec 11, 2023 · Databases

Interview with Wu Li on Columnar Storage, JIT Compilation, and Push Mode in Modern Database Systems

The interview with Wu Li, a research engineer at Shanghai Yanhuang Data, explores how columnar storage, JIT compilation, and push-mode processing are reshaping modern database performance, highlighting hardware constraints, software optimizations, and product‑centric goals in the era of big data analytics.

Columnar Storage · JIT Compilation · OLAP
0 likes · 11 min read
Huawei Cloud Developer Alliance
Nov 17, 2023 · Databases

How openGemini’s New Columnar Engine Solves High‑Cardinality Time‑Series Challenges

This article explains why time‑series databases are ideal for massive telemetry data, describes the high‑cardinality problem that degrades performance, and shows how openGemini’s newly introduced columnar engine—combined with sorting and clustering indexes—effectively mitigates those issues while delivering impressive write and query speeds.

Columnar Storage · databases · high-cardinality
0 likes · 7 min read
DataFunSummit
Jul 9, 2023 · Big Data

Data Governance and Application for Behavior Analysis: Modeling Methods, Architecture, and Practical Cases

This article explains how a data‑ecosystem team governs and applies behavior‑analysis data by describing common analysis scenarios, data‑warehouse modeling methods and their pros and cons, the concepts and overall architecture of behavior‑centric analytics, key system components, and several concrete analysis examples such as retention, funnel and path analysis.

Big Data · Columnar Storage · User Segmentation
0 likes · 12 min read
Alibaba Cloud Developer
Mar 16, 2023 · Big Data

How SLS’s Schema‑on‑Read Scanning Boosts Log Analytics Flexibility and Cuts Costs

This article explains the motivation, design, and implementation of Alibaba Cloud's SLS Schema‑on‑Read scanning mode, showing how it enables SQL analysis on raw log data without pre‑built indexes, improves flexibility for evolving schemas, and reduces storage and index costs in various log‑analysis scenarios.

Big Data · Columnar Storage · Cost Optimization
0 likes · 27 min read
ITPUB
Dec 18, 2022 · Databases

Why ClickHouse Is So Fast: Deep Dive into Storage and Compute Engine Optimizations

This article explains how ClickHouse achieves high query performance by leveraging storage‑engine designs such as pre‑sorting, columnar layout, and block‑level compression, and by exploiting a vectorized compute engine while avoiding joins and using built‑in functions.

Big Data · ClickHouse · Columnar Storage
0 likes · 9 min read
Architects' Tech Alliance
Nov 20, 2022 · Databases

Columnar Storage vs Row Storage: Overview, Write/Read Comparison, Pros, Cons, and Use Cases

This article explains the differences between row-based and column-based storage, comparing their write and read performance, outlining advantages and disadvantages, and describing suitable scenarios such as OLAP queries, column families, compression, and indexing, to help choose the appropriate storage model.

Big Data · Columnar Storage · OLAP
0 likes · 10 min read
ByteDance Data Platform
May 30, 2022 · Databases

How UniqueMergeTree Boosts Real-Time Updates in ClickHouse Column Stores

UniqueMergeTree, a new ClickHouse table engine, addresses real‑time data update challenges by combining upsert semantics, unique key enforcement, and efficient delete‑bitmap handling, offering higher query performance at modest write cost, with detailed design, sharding strategies, conflict resolution, and performance evaluation.

ClickHouse · Columnar Storage · Database Engine
0 likes · 14 min read
DataFunSummit
Mar 21, 2022 · Databases

Vectorization in Apache Doris: Design, Implementation, and Future Roadmap

This article explains how Apache Doris adopts CPU‑level vectorization and columnar storage to boost query performance, details the design and current status of its vectorized engine, and outlines future work such as JOIN acceleration, storage‑layer vectorization, import optimization, and extensive SQL function support.

Apache Doris · Columnar Storage · Performance Optimization
0 likes · 21 min read
Efficient Ops
Mar 8, 2022 · Databases

From MongoDB to ClickHouse: Lessons Learned and Performance Gains

This article recounts the author's journey from using MongoDB for front‑end monitoring logs to migrating to ClickHouse, detailing the challenges with large‑scale data, optimization attempts, the fundamental differences between row‑ and column‑oriented databases, and the resulting performance and storage improvements.

Columnar Storage · MongoDB · Node.js
0 likes · 19 min read
DataFunTalk
Feb 27, 2022 · Databases

Vectorization in Apache Doris: Design, Implementation, Current Status, and Future Plans

This article explains how Apache Doris adopts CPU vectorization techniques—such as SIMD, columnar storage, and cache‑friendly designs—to boost query performance, detailing its current vectorized engine architecture, recent benchmarks, ongoing work on JOIN, storage, import, and future enhancements.

Apache Doris · Columnar Storage · Database Performance
0 likes · 22 min read
Tencent Database Technology
Jan 19, 2022 · Databases

Deep Dive into Tencent's Self‑Developed MySQL Kernel TXSQL and Its Architecture

This article provides a comprehensive overview of Tencent's self‑developed MySQL kernel TXSQL, covering its evolution, overall architecture, columnar storage engine, instant DDL capabilities, enterprise‑grade features, high‑availability mechanisms, performance optimizations, and the rigorous development and testing processes behind the product.

Columnar Storage · Performance Optimization · TXSQL
0 likes · 11 min read
Big Data Technology & Architecture
Aug 10, 2021 · Databases

Kudu Overview: Architecture, Features, and Use Cases

Kudu is an open‑source columnar storage engine from Cloudera that combines high‑throughput batch processing with low‑latency random reads, offering features such as C++/Java APIs, Raft‑based replication, flexible consistency, partitioning, and integration with Hadoop, Spark, Impala, and other ecosystem components.

Columnar Storage · Hadoop · Kudu
0 likes · 64 min read
Baidu Geek Talk
Aug 9, 2021 · Databases

BaikalDB Implementation Practice at Tongcheng Yilong: High Availability, HTAP, Performance and Cost Optimization

Tongcheng Yilong’s BaikalDB deployment combines multi‑Raft high availability, HTAP support, and share‑nothing scalability to deliver over 72K TPS OLTP and ten‑fold faster OLAP queries while cutting operational costs up to a hundredfold through dual‑center, columnar storage and cloud‑native elasticity.

BaikalDB · Columnar Storage · HTAP
0 likes · 27 min read
Python Programming Learning Circle
Jul 9, 2021 · Databases

Key Features of ClickHouse: DBMS Capabilities, Columnar Storage, Vectorized Execution, and Distributed Architecture

ClickHouse is an MPP column‑oriented DBMS that combines full DBMS functionality, advanced columnar storage with high compression, SIMD‑based vectorized execution, a rich relational SQL interface, diverse table engines, multi‑master clustering, and flexible sharding and distributed query capabilities, making it exceptionally fast for analytical workloads.

ClickHouse · Columnar Storage · DBMS
0 likes · 21 min read
Big Data Technology Architecture
Jun 17, 2021 · Databases

Key Features of ClickHouse: DBMS Capabilities, Columnar Storage, Vectorized Execution, and Distributed Architecture

ClickHouse is a high‑performance MPP column‑store DBMS that combines complete DBMS functions, column‑oriented storage with aggressive compression, SIMD‑based vectorized execution, flexible table engines, multithreading, distributed processing, a multi‑master architecture, and SQL compatibility to deliver fast online analytical queries on massive data sets.

ClickHouse · Columnar Storage · DBMS
0 likes · 21 min read
ITPUB
Dec 29, 2020 · Databases

How BaikalDB’s Columnar Storage Boosted Real‑Time Analytics at DTCC2020

This article details how the DTCC2020 guest speaker from Tongcheng‑Elong introduced BaikalDB’s distributed columnar storage, covering internal and external motivations, technology comparison, architecture, implementation tricks, performance gains in production, and future hybrid row‑column research directions.

BaikalDB · Columnar Storage · Distributed Systems
0 likes · 12 min read
Programmer DD
Oct 25, 2020 · Databases

Why ClickHouse Beats MySQL for OLAP: Migration, Performance & Pitfalls

This article explains what ClickHouse is, compares column‑store and row‑store databases, shows how to migrate large MySQL tables to ClickHouse, presents performance test results, discusses data synchronization methods, highlights why ClickHouse is fast, and shares common migration pitfalls.

ClickHouse · Columnar Storage · OLAP
0 likes · 7 min read
Tencent Cloud Developer
Oct 20, 2020 · Databases

ClickHouse: Architecture, Core Features, and Limitations for Interactive Analytics

ClickHouse is a PB‑scale, open‑source columnar OLAP database that uses a ZooKeeper‑coordinated sharded cluster, columnar storage, vectorized execution, advanced compression, data‑skipping indexes, and materialized views to deliver high‑performance interactive analytics, yet it requires manual shard management, lacks a mature MPP optimizer, and handles real‑time single‑row writes poorly.

ClickHouse · Columnar Storage · Materialized Views
0 likes · 18 min read
Big Data Technology Architecture
Sep 30, 2020 · Databases

Core Technologies of OLAP Systems: Storage, Computation, Optimizer, and Emerging Trends

This article systematically examines the core technologies of OLAP systems, covering storage models, columnar formats, indexing, distributed storage architectures, query execution steps, optimizer designs, and emerging trends such as real‑time analytics, HTAP, cloud‑native deployment, and hardware acceleration.

Columnar Storage · Distributed Systems · OLAP
0 likes · 23 min read
JD Cloud Developers
Sep 29, 2020 · Databases

Why ClickHouse Powers JD Cloud’s Billion‑Row Queries: Architecture and Performance Secrets

This article explains how JD Cloud’s JCHDB, built on ClickHouse, achieves millisecond‑level queries on billions of rows through columnar storage, distributed multi‑master architecture, SIMD vector engine, sparse indexing, and specialized table engines, and outlines the ideal use cases and deployment details.

Analytical Database · ClickHouse · Columnar Storage
0 likes · 10 min read
Architects Research Society
Sep 1, 2020 · Databases

Understanding SAP HANA’s Combined Technologies: Memory, Columnar Storage, Compression, and Insert‑Only

The article explains SAP HANA’s performance advantages by combining four key technologies—high‑speed memory, columnar storage, data compression, and an insert‑only model—detailing their individual pros and cons, how they complement each other, and the trade‑offs involved in scaling and persistence.

Columnar Storage · In-Memory · Insert-Only
0 likes · 19 min read
Big Data Technology Architecture
May 19, 2020 · Big Data

An Overview of Apache Parquet: Architecture, Features, and Comparison with ORC

Apache Parquet is a language‑agnostic columnar storage format for the Hadoop ecosystem that offers high compression, efficient I/O through column and predicate push‑down, nested‑structure support, and a three‑layer architecture. The article also compares it with ORC and covers tooling for schema inspection.

Apache Hadoop · Columnar Storage · Data Formats
0 likes · 9 min read
Alibaba Cloud Developer
Aug 22, 2019 · Big Data

How AliORC Supercharges MaxCompute: Inside the Next‑Gen Columnar Format

This article explains how Alibaba's MaxCompute platform evolved its storage engine from row‑based CFile to the columnar AliORC format, details the technical innovations such as async prefetch, small I/O elimination, adaptive dictionary encoding, and range‑aligned reads, and compares its performance against Apache ORC and Parquet.

AliORC · Apache ORC · Columnar Storage
0 likes · 20 min read
Big Data Technology & Architecture
Aug 14, 2019 · Big Data

Overview of Apache Druid Architecture and Its Comparison with Other Analytics Systems

This article provides a comprehensive overview of Apache Druid's distributed column‑store architecture, detailing its node types, external dependencies, data flow, and operational mechanisms, and compares Druid's real‑time analytics capabilities with systems such as Impala, Elasticsearch, and Spark.

Apache Druid · Columnar Storage · distributed system
0 likes · 12 min read
360 Tech Engineering
Jul 18, 2019 · Databases

Principles and Practices of Apache Doris: Architecture, Key Technologies, and Real‑World Use Cases

This article presents a comprehensive overview of Apache Doris, covering its positioning as a distributed MPP analytical database, core architecture with FE and BE nodes, key technologies such as vectorized execution and materialized views, integration with Kafka and Elasticsearch, additional features, roadmap, and detailed case studies from Baidu Statistics and Meituan, illustrating its practical deployment and performance characteristics.

Apache Doris · Columnar Storage · Data Warehouse
0 likes · 25 min read
Sohu Tech Products
Dec 12, 2018 · Databases

Optimizing MySQL Performance with Read/Write Splitting, Columnar Storage, and Dynamic Scheduling

The article details a real‑world MySQL performance case where a sudden 100‑fold load increase was mitigated through read/write splitting, replica‑based statistics, limited index tuning, middleware‑driven sharding, and finally a columnar storage layer (Infobright) with scripted dynamic data synchronization, achieving dramatic latency reductions and scalable architecture.

Columnar Storage · Data Warehouse · Infobright
0 likes · 12 min read
dbaplus Community
Dec 9, 2018 · Databases

How Read‑Write Splitting and Columnar Storage Rescued a 100× MySQL Load Spike

A MySQL‑based receipt‑tracking service suffered a sudden 100‑fold load increase, prompting a step‑by‑step optimization that combined read‑write splitting, middleware‑less data routing, columnar storage with Infobright, and dynamic scheduling to dramatically lower CPU/IO pressure and restore performance.

Columnar Storage · Performance Optimization · database scaling
0 likes · 13 min read
Xianyu Technology
Nov 27, 2018 · Big Data

Millisecond-Scale Multi-Dimensional Data Filtering with HybridDB for MySQL

HybridDB for MySQL delivers millisecond‑scale, multi‑dimensional filtering on billions of rows with hundreds of metrics by combining a high‑performance columnar engine, automatic composite indexes, and a fused MPP‑DAG pipeline, turning half‑day push preparation into seconds while supporting full SQL, spatial, and JSON data.

Columnar Storage · HybridDB · OLAP
0 likes · 8 min read
Hulu Beijing
Feb 28, 2018 · Big Data

How Hulu’s Nesto Engine Delivers Near‑Real‑Time OLAP on TB‑Scale Data

This article introduces Hulu's in‑house OLAP engine Nesto, detailing its near‑real‑time data ingestion, nested data model, TB‑level storage using HBase and Parquet, MPP query execution, custom predicate library, and the overall architecture that enables sub‑second ad‑hoc queries for user analytics.

Big Data · Columnar Storage · Distributed Systems
0 likes · 22 min read
Huawei Cloud Developer Alliance
Feb 7, 2017 · Big Data

What’s New in Apache CarbonData 1.0.0? 80+ Features Boost Big Data Performance

Apache CarbonData 1.0.0, now an Apache incubating project, adds over 80 new features and bug fixes—including a new data loading solution, Spark 2.1 integration, update/delete SQL support, adaptive compression for numeric types, B‑Tree LRU cache, V2 format for faster first‑query performance, vectorized reader, bucket‑table joins, off‑heap memory, single‑pass loading, and pre‑generated dictionaries—aimed at delivering faster, more flexible, and efficient columnar storage for big‑data workloads.

Apache CarbonData · Big Data · Columnar Storage
0 likes · 8 min read
Hulu Beijing
Dec 20, 2016 · Big Data

How Hulu Supercharges OLAP Queries with CarbonData: Real‑World Optimizations

This article describes Hulu’s real‑world OLAP query optimization, covering the fundamentals of OLAP, comparisons of row‑ and column‑based storage formats, detailed indexing mechanisms of Parquet, ORC and CarbonData, and the specific schema, shuffle, block size, speculation and GC tuning techniques that enabled CarbonData to dramatically accelerate wide‑table queries on SparkSQL.

Big Data · CarbonData · Columnar Storage
0 likes · 17 min read
dbaplus Community
Dec 16, 2015 · Databases

How DB2 BLU Accelerator Supercharges OLAP with Columnar Storage and SIMD

This article explains IBM DB2 BLU Accelerator’s columnar storage, multi‑level compression, TSN‑based logical rows, SIMD processing, intra‑parallel execution, probability‑based caching, and automatic admin features, showing how these technologies together deliver dramatic I/O and performance gains for analytical workloads.

BLU Accelerator · Columnar Storage · DB2
0 likes · 15 min read

Architectural Overview and Optimization Techniques for SQL‑on‑Hadoop Systems

This article provides a comprehensive analysis of SQL‑on‑Hadoop architectures, comparing runtime‑framework‑based engines like Hive with MPP‑style engines such as Impala, detailing core components, compilation pipelines, optimizer strategies, CPU/IO performance tricks, columnar storage formats, and resource management in modern big‑data query platforms.

Columnar Storage · Query Engine · SQL on Hadoop
0 likes · 22 min read