How Database Compression Boosts Performance While Cutting Storage Costs

This article examines why storage capacity limits IT systems, explains how database compression reduces disk usage and I/O time, discusses various dictionary‑based compression methods, related operations and commands, and evaluates compression ratios and overall impact on system performance.

21CTO

Mar 15, 2016

Compression Methods

Most relational databases use dictionary‑based compression, extracting repeated information and replacing it with shorter symbols. Patterns and symbols are stored in a dictionary used for both compression and decompression. Variations include manual versus automatic dictionary creation, table‑level versus block‑level dictionaries, and column versus row compression.
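The idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the function names, the `("S", symbol)` / `("R", raw)` tagging, and the sample city values are all assumptions made for the example. It builds one dictionary over a whole column (a table-level, automatically created dictionary in the terms used above).

```python
from collections import Counter

def build_dictionary(values, min_count=2):
    """Map each value seen at least min_count times to a short integer symbol."""
    counts = Counter(values)
    # most_common() sorts by frequency, so frequent values get the smallest symbols.
    frequent = [v for v, c in counts.most_common() if c >= min_count]
    return {value: symbol for symbol, value in enumerate(frequent)}

def compress(values, dictionary):
    """Replace dictionary entries with ("S", symbol); keep other values as ("R", raw)."""
    return [("S", dictionary[v]) if v in dictionary else ("R", v) for v in values]

def decompress(encoded, dictionary):
    """Reverse compress() using the same dictionary."""
    reverse = {symbol: value for value, symbol in dictionary.items()}
    return [reverse[x] if tag == "S" else x for tag, x in encoded]

# Hypothetical column values, chosen only to show repetition.
cities = ["Shanghai", "Beijing", "Shanghai", "Shenzhen", "Beijing", "Shanghai"]
dictionary = build_dictionary(cities)
encoded = compress(cities, dictionary)
restored = decompress(encoded, dictionary)
print(dictionary)
print(encoded)
print(restored == cities)
```

The same dictionary is needed on both sides, which is why the database must store it alongside the data. A block-level variant would simply build one such dictionary per block instead of one per table.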

Compression‑Related Operations

Data query: compressed data is read from disk and decompressed before returning results.

Data update: inserts and updates are compressed before storage; deletes may trigger dictionary updates.

Data load: similar to inserts, data is compressed during loading, sometimes creating dictionaries automatically.

Table reorganization: compresses or decompresses data based on the table's compression status.

Compression ratio evaluation: databases can estimate potential compression gains on uncompressed tables.

Index compression: uses different algorithms, not covered in depth here.

LOB compression: the row and column compression methods described here do not apply to large objects (LOBs).

Log handling: logs must record compression‑related information to ensure consistency.

Backup and recovery: involve appropriate compression and decompression steps.
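The compression ratio evaluation listed above can be approximated with simple byte counting. The following is a rough sketch under stated assumptions: the function name `estimate_savings`, the fixed 2-byte symbol size, and the sample rows are hypothetical, and a real database would use its own estimator rather than this arithmetic.

```python
from collections import Counter

def estimate_savings(rows, symbol_bytes=2):
    """Estimate the fraction of bytes saved if every repeated field value
    were replaced by a fixed-size symbol (assumed size: symbol_bytes)."""
    fields = [field for row in rows for field in row]
    raw_bytes = sum(len(f.encode("utf-8")) for f in fields)
    if raw_bytes == 0:
        return 0.0
    counts = Counter(fields)
    repeated = {value for value, count in counts.items() if count > 1}
    # The dictionary stores each repeated value once, plus its symbol.
    dict_bytes = sum(len(v.encode("utf-8")) + symbol_bytes for v in repeated)
    # Each field is stored either as a symbol or in raw form.
    data_bytes = sum(
        symbol_bytes if f in repeated else len(f.encode("utf-8")) for f in fields
    )
    return 1 - (dict_bytes + data_bytes) / raw_bytes

# Hypothetical uncompressed rows: (name, city).
rows = [("Alice", "Shanghai"), ("Bob", "Shanghai"),
        ("Carol", "Shanghai"), ("Dave", "Shanghai")]
print(f"estimated savings: {estimate_savings(rows):.1%}")
```

Because the estimate only reads the data, it can run against an uncompressed table without changing it, which is the point of evaluating the ratio before committing to a reorganization.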

Compression‑Related Commands

While compression is transparent to users, a few commands control it. For DB2 V9.7:

CREATE TABLE CUSTOMER ( ... ) COMPRESS YES;

Creates a table with compression enabled.

ALTER TABLE CUSTOMER COMPRESS YES;

Enables compression on an existing table.

ALTER TABLE CUSTOMER COMPRESS NO;

Disables compression.

Note: enabling compression does not affect existing rows until a REORG is performed:

REORG TABLE CUSTOMER;

This scans the entire table and compresses all records, which may take considerable time.
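The relationship between the compression flag and a reorganization can be modeled with a toy in-memory table. This is a sketch of the behavior described above, not of DB2 internals: the class `ToyTable`, its method names, and the fixed dictionary are hypothetical, and it assumes all fields are strings so that integer symbols are unambiguous.

```python
class ToyTable:
    """Toy model: a compression flag plus rows tagged as compressed or not."""

    def __init__(self, dictionary):
        self.dictionary = dict(dictionary)   # value -> integer symbol
        self.reverse = {s: v for v, s in self.dictionary.items()}
        self.compress = False                # plays the role of COMPRESS NO
        self.rows = []                       # list of (is_compressed, fields)

    def _encode(self, fields):
        return tuple(self.dictionary.get(f, f) for f in fields)

    def _decode(self, fields):
        return tuple(self.reverse.get(f, f) for f in fields)

    def alter_compress(self, enabled):
        # Only flips the flag; rows already stored are left untouched.
        self.compress = enabled

    def insert(self, fields):
        # Write path: new rows follow the current flag.
        stored = self._encode(fields) if self.compress else tuple(fields)
        self.rows.append((self.compress, stored))

    def select_all(self):
        # Read path: callers always see decompressed rows.
        return [self._decode(f) if c else f for c, f in self.rows]

    def reorg(self):
        # Full scan: rewrite every row so it matches the current flag.
        raw_rows = self.select_all()
        self.rows = []
        for fields in raw_rows:
            self.insert(fields)

    def compressed_row_count(self):
        return sum(1 for is_compressed, _ in self.rows if is_compressed)

table = ToyTable({"Shanghai": 0})
table.insert(("Alice", "Shanghai"))
table.insert(("Bob", "Shanghai"))
table.alter_compress(True)
print(table.compressed_row_count())  # existing rows are still uncompressed
table.reorg()
print(table.compressed_row_count())  # every row has been rewritten
```

The reorganization touches every row, which is why its cost grows with table size while flipping the flag is effectively free.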

Compression Ratio

Compression ratios vary with data redundancy, distribution, and clustering. Highly redundant data compresses well, while low‑redundancy data yields little benefit. Column‑oriented compression can be more efficient for repetitive column values, but may conflict with traditional row storage. Proper indexing and clustering can improve ratios. Tests show compression can reduce database size by 70% or more; in I/O‑intensive workloads, overall performance does not degrade and sometimes improves.
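The effect of redundancy and clustering is easy to observe with a general-purpose compressor. The sketch below uses Python's `zlib` purely as a stand-in; it is not the dictionary scheme a database uses, and the generated "city" values and data sizes are hypothetical. It compares random bytes, repeated values in random order, and the same values sorted so that equal values sit together.

```python
import random
import zlib

def compressed_fraction(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)

rng = random.Random(0)                          # fixed seed for repeatable runs
cities = [f"city_{i:03d}" for i in range(50)]   # hypothetical column values

# Low redundancy: random bytes leave almost nothing to extract.
random_bytes = bytes(rng.getrandbits(8) for _ in range(80_000))
# High redundancy, unclustered: 10,000 values drawn from 50 cities.
shuffled = [rng.choice(cities) for _ in range(10_000)]
# Same values, clustered: equal values are stored next to each other.
clustered = sorted(shuffled)

fractions = {
    "random": compressed_fraction(random_bytes),
    "shuffled": compressed_fraction("\n".join(shuffled).encode()),
    "clustered": compressed_fraction("\n".join(clustered).encode()),
}
for label, fraction in fractions.items():
    print(f"{label}: compressed to {fraction:.1%} of original size")
```

Both repeated-value inputs contain exactly the same values, so any difference between them comes from ordering alone, which is the intuition behind clustering improving the ratio.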

Conclusion

Database compression addresses storage bottlenecks by reducing disk usage and I/O time, often offsetting the extra CPU cost. Selecting appropriate compression methods and understanding their impact on operations helps achieve better performance and cost efficiency.
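The trade-off between I/O savings and CPU cost can be made concrete with a back-of-the-envelope model. Everything here is an assumption for illustration: the function `scan_time_ms`, the page counts, and the per-page timings are hypothetical. Only the 70% size reduction comes from the figure quoted above.

```python
def scan_time_ms(pages, io_ms_per_page, cpu_ms_per_page):
    """Very rough full-scan cost: every page pays one I/O cost and one CPU cost."""
    return pages * (io_ms_per_page + cpu_ms_per_page)

PAGES = 100_000                        # hypothetical uncompressed table size
COMPRESSED_PAGES = PAGES * 30 // 100   # 70% smaller after compression

# I/O-bound case: reading a page costs far more than processing it.
io_bound_plain = scan_time_ms(PAGES, io_ms_per_page=0.10, cpu_ms_per_page=0.01)
io_bound_comp = scan_time_ms(COMPRESSED_PAGES, io_ms_per_page=0.10, cpu_ms_per_page=0.05)

# Cached case: pages are already in memory, so only CPU cost remains.
cached_plain = scan_time_ms(PAGES, io_ms_per_page=0.0, cpu_ms_per_page=0.01)
cached_comp = scan_time_ms(COMPRESSED_PAGES, io_ms_per_page=0.0, cpu_ms_per_page=0.05)

print(f"I/O-bound: {io_bound_plain:.0f} ms plain vs {io_bound_comp:.0f} ms compressed")
print(f"cached:    {cached_plain:.0f} ms plain vs {cached_comp:.0f} ms compressed")
```

Under these made-up numbers compression wins when the scan is I/O-bound and loses when the data is already cached, which is why the benefit depends on the workload.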
