How DB2 BLU Accelerator Supercharges OLAP with Columnar Storage and SIMD
This article explains IBM DB2 BLU Accelerator’s columnar storage, multi‑level compression, TSN‑based logical rows, SIMD processing, intra‑parallel execution, probability‑based caching, and automatic admin features, showing how these technologies together deliver dramatic I/O and performance gains for analytical workloads.
Overview of DB2 BLU Accelerator
IBM introduced the BLU Accelerator in DB2 10.5 to bring columnar storage, in‑memory dynamic processing, advanced compression, SIMD execution, and parallel vector processing to the DB2 platform, targeting high‑performance OLAP workloads.
Columnar Storage vs. Row Storage
Columnar storage keeps the values of each column together, which suits queries that read only a subset of columns. With row storage, whole rows are read from each page, so a query that needs only a few columns still pays the I/O for all of them. BLU stores each column in its own pages and ties the values of one logical row together through a TSN (tuple sequence number): the values found at the same TSN in every column belong to the same row. A query therefore reads only the pages of the columns it references.
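The layout can be pictured with a minimal sketch, which is not DB2's on-disk format. It assumes a toy table whose column names and values are invented, and it uses the list position as a stand-in for the TSN.

```python
# Hypothetical toy table; column names and values are illustrative only.
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 75},
    {"id": 3, "region": "EU", "amount": 300},
]

# Column-organized layout: one list per column. The list position plays
# the role of the TSN, so position i in every column is the same logical row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_column(cols, name):
    """Aggregate one column without touching the others."""
    return sum(cols[name])


def fetch_row(cols, tsn):
    """Reassemble a logical row by reading position `tsn` from each column."""
    return {name: values[tsn] for name, values in cols.items()}


total = sum_column(columns, "amount")
second_row = fetch_row(columns, 1)
```

The aggregate touches a single list, while rebuilding a full row has to visit every column, which is the trade-off that makes this layout attractive for analytics rather than for row-at-a-time access.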
TSN, Page Map, and Synopsis Table
Each data page contains a TSN map. DB2 maintains a page‑map index and a Synopsis table that records, for each block of rows (typically 1024 TSNs), the TSN range of the block and the minimum and maximum value of each column within it. During query execution, the Synopsis table enables data skipping: blocks whose value ranges cannot satisfy the predicate are filtered out before their pages are read.
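Data skipping of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not DB2's implementation: the function names are hypothetical, the synopsis is a plain list of tuples, and only a single "greater than" predicate is handled.

```python
BLOCK_SIZE = 1024  # TSNs summarized per synopsis entry, as stated above


def build_synopsis(column, block_size=BLOCK_SIZE):
    """Return (first_tsn, last_tsn, min_value, max_value) per block."""
    synopsis = []
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        synopsis.append((start, start + len(block) - 1, min(block), max(block)))
    return synopsis


def scan_greater_than(column, synopsis, threshold):
    """Return matching TSNs and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for first, last, _lo, hi in synopsis:
        if hi <= threshold:
            continue  # no value in this block can satisfy value > threshold
        blocks_read += 1
        matches.extend(t for t in range(first, last + 1) if column[t] > threshold)
    return matches, blocks_read


# Hypothetical, already ordered data (for example a load-time sequence number).
column = list(range(5000))
synopsis = build_synopsis(column)
matches, blocks_read = scan_greater_than(column, synopsis, 4500)
```

With this ordered sample data, four of the five blocks are skipped. Skipping is only as effective as the clustering of the data: if every block spans the full value range, nothing can be ruled out.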
Multi‑Level Compression
When data is loaded with LOAD REPLACE, LOAD REPLACE RESETDICTIONARY, LOAD REPLACE RESETDICTIONARYONLY, or LOAD INSERT, DB2 builds a compression dictionary for each column. Rows added later are compressed with these column‑level dictionaries, and each page can also carry its own page‑level dictionary, giving two layers of compression. The algorithms combine near‑optimal Huffman coding, prefix coding, and delta compression while preserving value order, so predicates can be evaluated directly on compressed values.
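The order‑preserving property is the part that matters for query processing, and it can be shown with a deliberately simplified dictionary. This sketch assigns codes by sort order; it does not implement Huffman, prefix, or delta coding, and the column contents are hypothetical.

```python
def build_dictionary(values):
    """Map each distinct value to an integer code that preserves sort order."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


def encode(values, dictionary):
    """Replace every value with its dictionary code."""
    return [dictionary[v] for v in values]


# Hypothetical column contents.
cities = ["Paris", "Berlin", "Rome", "Berlin", "Paris", "Oslo"]
dictionary = build_dictionary(cities)
codes = encode(cities, dictionary)

# Order is preserved: a < b for values exactly when code(a) < code(b).
order_preserved = all(
    (a < b) == (dictionary[a] < dictionary[b]) for a in cities for b in cities
)
```

Four distinct values need only two bits per code here, which is where the space saving comes from; a frequency‑aware scheme would go further by giving common values shorter codes.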
Query Processing on Compressed Data
The literal in a predicate is encoded with the same dictionary as the column, so the comparison runs on compressed values and only the qualifying rows are decompressed. The engine also uses the Synopsis table to skip TSN ranges that cannot match, which sharply reduces I/O and CPU work. For example, a column‑to‑column predicate x1 > x2 can only be true in a block where max(x1) > min(x2), so blocks that fail that test are skipped; single‑column IN lists are likewise turned into min/max checks.
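A minimal sketch of evaluating a range predicate on codes follows. It assumes an order‑preserving dictionary represented as a sorted list (the position is the code); the names and data are hypothetical and the real engine's encoding is more elaborate.

```python
from bisect import bisect_left

# Hypothetical order-preserving dictionary: position in the sorted list is the code.
sorted_values = ["Berlin", "Oslo", "Paris", "Rome"]
codes = [2, 0, 3, 0, 2, 1]  # encoded column, one code per TSN


def tsns_greater_equal(codes, sorted_values, literal):
    """Evaluate `column >= literal` by comparing codes only."""
    # Smallest code whose value is >= literal; this also works when the
    # literal itself does not appear in the dictionary.
    bound = bisect_left(sorted_values, literal)
    return [tsn for tsn, code in enumerate(codes) if code >= bound]


matching_tsns = tsns_greater_equal(codes, sorted_values, "Oslo")

# Only the qualifying rows are decoded back to their original values.
decoded = [sorted_values[codes[tsn]] for tsn in matching_tsns]
```

The literal is translated once into a code bound, after which the scan compares small integers instead of strings.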
SIMD (Single Instruction Multiple Data) Execution
BLU leverages SIMD instructions to process multiple operands in parallel using 128‑bit registers. Because data is highly compressed, more values fit into a SIMD register, enabling simultaneous arithmetic or predicate evaluation and boosting CPU efficiency.
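The idea of testing many packed values with one operation can be emulated in plain Python. The sketch below is not real SIMD: it treats one Python integer as a 128‑bit register, uses hypothetical helper names, and assumes each code leaves the top bit of its lane free.

```python
REGISTER_BITS = 128  # register width stated above


def pack(codes, lane_bits):
    """Pack small integer codes into one integer, lane 0 in the lowest bits."""
    word = 0
    for i, code in enumerate(codes):
        assert 0 <= code < (1 << (lane_bits - 1)), "top bit of each lane must stay free"
        word |= code << (i * lane_bits)
    return word


def equal_mask(word, literal, lane_bits, lanes):
    """Return one boolean per lane (lane == literal), computed on the whole word."""
    ones = sum(1 << (i * lane_bits) for i in range(lanes))  # 1 in each lane's low bit
    high = ones << (lane_bits - 1)                          # top bit of each lane
    low = high - ones                                       # all bits below the top bit
    diff = word ^ (ones * literal)   # a lane becomes 0 where it equals the literal
    nonzero = (diff + low) & high    # top bit set exactly where the lane is nonzero
    return [
        ((nonzero >> (i * lane_bits + lane_bits - 1)) & 1) == 0
        for i in range(lanes)
    ]


lane_bits = 8
lanes = REGISTER_BITS // lane_bits   # 16 codes per register at 8 bits each
codes = [3, 7, 3, 1] * 4
word = pack(codes, lane_bits)
hits = equal_mask(word, 3, lane_bits, lanes)
```

The XOR, the addition, and the AND each act on all 16 lanes at once. With 32‑bit values only 4 would fit in the same register, which is why narrower compressed codes translate into more work per instruction.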
Automatic Parallelism (Intra‑Parallel)
BLU enables the INTRA_PARALLEL setting, allowing the workload manager to distribute work across CPU cores. Column‑specific processing stays on a single core, eliminating cross‑core data shuffling and improving cache utilization.
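The general pattern of splitting a column scan across workers can be sketched as follows. This is an illustration only: the function names are hypothetical, and CPython threads do not run this arithmetic on several cores at once, so the sketch shows the partitioning and combining steps rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(column, start, stop):
    """Aggregate one contiguous TSN range; each worker reads only its own slice."""
    return sum(column[start:stop])


def parallel_sum(column, workers=4):
    """Split the TSN range into one chunk per worker and combine the partial results."""
    chunk = max(1, -(-len(column) // workers))  # ceiling division, at least 1
    ranges = [(s, min(s + chunk, len(column))) for s in range(0, len(column), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda r: partial_sum(column, *r), ranges)
        return sum(partials)


amounts = list(range(10_000))  # hypothetical column
total = parallel_sum(amounts)
```

Giving each worker a contiguous, non‑overlapping TSN range means no worker needs data owned by another, which mirrors the goal of avoiding cross‑core data movement.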
Probability‑Based Caching
Instead of traditional LRU, BLU uses a probability cache that keeps frequently accessed compressed data in memory. Since 70‑80% of the needed data often fits in memory after compression, the cache reduces unnecessary page evictions and I/O for analytical queries.
Reduced Administrative Overhead
With BLU, secondary indexes and materialized aggregates are generally unnecessary, and DB2 handles REORG and RUNSTATS automatically and removes the need for manual optimizer hints. Users set the registry variable DB2_WORKLOAD=ANALYTICS (with db2set, ideally before the database is created) to activate automatic space reclamation, memory tuning, and workload management, and then simply load data into the BLU‑enabled database.
Shadow Tables for Hybrid Workloads
DB2 provides Shadow Tables and CDC tools to replicate OLTP row tables into corresponding BLU column tables, supporting mixed OLTP/OLAP (OLTAP) scenarios.
Performance Claims
BLU can achieve up to 10× storage savings compared to uncompressed row tables and improve query performance by 6‑124× for typical analytical workloads.
Selected Q&A Highlights
MPP support for BLU is expected to reach GA soon and is already available on Bluemix (e.g., dashDB).
Both Power and x86 (Intel/AMD) platforms are supported.
Compression preserves order, enabling efficient range, >, <, BETWEEN, and join predicates without full decompression.
Recommended hardware: 16 cores per 3 TB raw data, 16 GB memory per core.
Access plans for BLU tables contain a CTQ operator that separates columnar processing from row‑based processing, and data skipping is performed in the executor.
Synopsis tables differ from regular DB2 synopsis tables; BLU’s are called “synopsis tables.”
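The hardware guideline in the list above reduces to simple arithmetic, sketched here. The ratios come from the Q&A; rounding up to whole 3 TB units and the function name are assumptions made for illustration.

```python
import math

CORES_PER_UNIT = 16      # cores recommended per unit of raw data (from the Q&A)
RAW_TB_PER_UNIT = 3      # raw data per unit, in TB (from the Q&A)
GB_MEMORY_PER_CORE = 16  # memory per core, in GB (from the Q&A)


def size_for(raw_tb):
    """Return (cores, memory_gb), rounding up to whole units (an assumption)."""
    units = math.ceil(raw_tb / RAW_TB_PER_UNIT)
    cores = units * CORES_PER_UNIT
    return cores, cores * GB_MEMORY_PER_CORE


cores, memory_gb = size_for(3)
```

For 3 TB of raw data this gives 16 cores and 256 GB of memory.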
dbaplus Community
