Databases 30 min read

What Are the Different Types of OLAP and How Do They Impact Performance?

This article provides a comprehensive overview of OLAP systems, classifying them by data volume and modeling approach, comparing MOLAP, ROLAP, HOLAP and HTAP, reviewing popular open‑source products, and detailing architectural, query‑optimization, vectorization, storage and resource‑management techniques that affect analytical warehouse performance.

dbaplus Community
dbaplus Community
dbaplus Community
What Are the Different Types of OLAP and How Do They Impact Performance?

OLAP Types

OLAP systems are classified by data volume (millions to billions of rows) and by modeling approach. The classic models are:

MOLAP – Multidimensional cubes are pre‑computed and stored in optimized multidimensional arrays. Queries are answered by direct array look‑ups, providing sub‑second response for fixed analytical workloads.

ROLAP – Relational model; queries are executed directly on fact and dimension tables using SQL without pre‑computation. It scales well with incremental loads but query latency depends on data size and query complexity.

HOLAP – Hybrid model that combines pre‑computed aggregates for hot queries (MOLAP) with on‑the‑fly relational processing for ad‑hoc queries (ROLAP).

HTAP – Hybrid Transaction/Analytical Processing, an extension of ROLAP that adds limited transactional capabilities.

Open‑Source ROLAP Products

Open‑source ROLAP engines fall into two design families:

Wide‑table models – Single, denormalized tables that give high scan speed but limited SQL features. Examples: Druid , ClickHouse , Elasticsearch , Solr .

Multi‑table (star/snowflake) models – Traditional relational schemas that support full SQL, at the cost of lower raw scan speed. Examples: Greenplum , Presto , Impala .

Performance‑Boosting Techniques for ROLAP

Most production ROLAP systems adopt a Massively Parallel Processing (MPP) architecture. MPP keeps intermediate results in memory buffers and pipelines operators, avoiding the heavy disk I/O of MapReduce.

Impala’s execution flow illustrates a typical MPP workflow:

Client (impala‑shell, JDBC, etc.) sends SQL to a Coordinator node.

The Coordinator parses the SQL and creates a single‑node logical plan.

The logical plan is transformed into a distributed set of Plan Fragments (PF). Each PF contains one or more operators.

PFs are dispatched to Executor nodes, which run them in parallel.

Executors read data from storage (HDFS, HBase, etc.), perform local shuffles, joins, and aggregations.

Partial results are sent back to the Coordinator, which merges them and returns the final result set.

Impala MPP architecture
Impala MPP architecture

Cost‑Based Query Optimization

Two optimization strategies are common:

Rule‑Based Optimization (RBO) – Applies fixed rewrite rules such as predicate push‑down, projection push‑down, constant folding, and hash‑join selection.

Cost‑Based Optimization (CBO) – Uses detailed column statistics (min/max, histograms, distinct counts, null counts) to estimate the execution cost of alternative plans. CBO decides:

Join algorithm: broadcast (small table) vs. partitioned (large tables).

Join order: left‑deep tree (LDT) vs. bushy tree (BYT) based on estimated cardinalities.

CBO join algorithm selection
CBO join algorithm selection

Vectorized Execution and Dynamic Code Generation

The traditional Volcano iterator model processes one tuple at a time, incurring many virtual‑function calls, poor CPU cache utilization, and branch‑prediction penalties.

Vectorized engines address these issues by processing batches of columnar data (e.g., 1024 rows) in a tight loop, often using SIMD instructions. This reduces function‑call overhead and improves cache locality.

Dynamic code generation further eliminates interpreter overhead by emitting native code for the specific query at runtime. Implementations include:

LLVM‑based JIT in Impala (C++).

Reflection‑based code generation in Spark SQL (JVM).

Combined, vectorization and JIT can yield order‑of‑magnitude speedups over Volcano.

Storage and I/O Optimizations

Key techniques for columnar storage systems (Parquet, ORC, etc.) are:

Compression – zlib, Snappy, LZ4. Choice balances CPU cost vs. space savings.

Lightweight encoding – Run‑Length Encoding (RLE) and dictionary encoding reduce the amount of data that must be decoded for many aggregations.

Fine‑grained partitioning – Hash or range partitions prune whole partitions during scan.

Row groups – Logical blocks that store columns together, enabling column‑wise reads while preserving row‑wise access patterns.

Local indexes & rich statistics – Min/max, histograms, row counts per row group allow early pruning.

Runtime filters – Bloom‑filter based filters in Impala and dynamic partition pruning in Spark SQL push join‑side predicates to the scan side, dramatically reducing scanned data.

Columnar storage layout
Columnar storage layout

Cluster Resource Management for Low‑Latency OLAP

When OLAP workloads run on YARN, the overhead of launching an ApplicationMaster and allocating containers can dominate query latency. Common mitigations are:

Long‑lived ApplicationMaster processes that stay resident across queries.

Container reuse pools to avoid repeated allocation.

These techniques keep executors ready, allowing queries to start immediately.

HDFS Short‑Circuit Local Reads (Configuration Example)

Enabling short‑circuit reads lets an OLAP engine read HDFS blocks directly from the local DataNode, bypassing the DataNode RPC path.

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>

Setting dfs.client.read.shortcircuit=true activates the feature; dfs.domain.socket.path points to the Unix domain socket used for local communication.

Summary

The article synthesizes recent research on OLAP architectures, focusing on:

Classification of OLAP models (MOLAP, ROLAP, HOLAP, HTAP).

Key open‑source ROLAP engines and their design trade‑offs.

Performance‑critical techniques: MPP execution, cost‑based optimization, vectorized execution, JIT code generation, storage compression/encoding, runtime filters, and YARN resource tuning.

Understanding and applying these techniques enables sub‑second to low‑second query latencies on data warehouses ranging from hundreds of millions to tens of billions of rows.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

query optimizationData WarehouseHTAPOLAPvectorizationMOLAPROLAPStorage Compression
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.