
Taming a Million‑Row Log Table: Real‑World SQL Performance Optimization

A detailed case study describes how a rapidly growing edit‑log feature caused query times to soar to 30 seconds, and walks through the step‑by‑step investigation, identification of a custom function bottleneck, data‑volume analysis, and the eventual implementation of partitioning, mandatory time filters, and composite indexing to restore acceptable performance.

dbaplus Community

Background

The "edit log query" feature worked without issue while the data volume was small. As batch edits increased, the log grew by about 1 million rows per day and reached 60 million rows within two months. Performance concerns surfaced when a user reported queries taking more than 30 seconds.

Initial Investigation

The author’s habit is to read the SQL itself before examining the execution plan. A custom function, TimeZone_Date_Translator, stood out as a likely bottleneck. Removing the function dramatically reduced execution time, confirming the suspicion.
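The source does not say where TimeZone_Date_Translator appeared in the query. One common way such a function hurts is when it wraps the filtered column, so the engine must call it for every row and cannot use an index on that column. The sketch below illustrates that pattern only, using Python's built-in sqlite3 module rather than the author's database; tz_translate, the fixed 8-hour offset, the index name, and the sample data are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

def tz_translate(ts: str) -> str:
    """Hypothetical stand-in for TimeZone_Date_Translator: shift stored time by +8 hours."""
    return (datetime.fromisoformat(ts) + timedelta(hours=8)).isoformat(sep=" ")

conn = sqlite3.connect(":memory:")
conn.create_function("tz_translate", 1, tz_translate)
conn.execute("CREATE TABLE rp_plan_log_t (id INTEGER PRIMARY KEY, operate_time TEXT)")
conn.execute("CREATE INDEX idx_log_time ON rp_plan_log_t (operate_time)")
conn.executemany(
    "INSERT INTO rp_plan_log_t (operate_time) VALUES (?)",
    [((datetime(2024, 1, 1) + timedelta(hours=h)).isoformat(sep=" "),) for h in range(1000)],
)

def plan(sql: str, params: tuple) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

local_cutoff = "2024-02-01 00:00:00"
# Convert the cutoff once, on the parameter side, instead of converting every row.
stored_cutoff = (datetime.fromisoformat(local_cutoff) - timedelta(hours=8)).isoformat(sep=" ")

# Function applied to the column: evaluated per row, index on operate_time not usable for seeking.
wrapped = "SELECT id FROM rp_plan_log_t WHERE tz_translate(operate_time) >= ?"
# Bare column compared with a pre-converted value: the index can be used for a range seek.
direct = "SELECT id FROM rp_plan_log_t WHERE operate_time >= ?"

plan_wrapped = plan(wrapped, (local_cutoff,))
plan_direct = plan(direct, (stored_cutoff,))
rows_wrapped = conn.execute(wrapped, (local_cutoff,)).fetchall()
rows_direct = conn.execute(direct, (stored_cutoff,)).fetchall()

print("wrapped:", plan_wrapped)
print("direct: ", plan_direct)
```

Both queries select the same rows; only the plan differs. Moving the conversion to the bound parameter is one way to keep the time-zone semantics while removing the per-row call, though whether it applies depends on how the original function was actually used.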

Escalation and Root Cause

Further testing with realistic conditions (querying a year’s worth of logs for a specific project) reproduced the 30‑second delay, revealing a result set of over 5 million rows and a base table size approaching 100 million rows. The primary cause was the sheer data volume.

Additional Bottleneck: Subquery for subtitlename

A scalar subquery retrieves subtitlename by joining different tables based on operate_type. When used as a filter, this subquery executes millions of times against large tables, becoming a serious performance hotspot.

Proposed Solutions

Introduce Table Partitioning: Partition RP_PLAN_LOG_T by operate_time on a monthly basis to enable partition pruning.

Make operate_time a Mandatory Filter: Require users to specify a time range, preferably within a single month, to limit scanned partitions.

Create a Composite Index: Add an index on (project_number, operate_time) to support the filtered queries efficiently.
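How the three proposals fit together can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: it uses Python's built-in sqlite3 module, which has no native partitioning, so months_touched only models which monthly partitions a pruning optimizer would have to read. The helper query_logs, the max_months limit, the index name, and the sample data are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta
from typing import List

LOG_QUERY = (
    "SELECT id FROM rp_plan_log_t "
    "WHERE project_number = ? AND operate_time >= ? AND operate_time < ? "
    "ORDER BY operate_time"
)

def months_touched(start: datetime, end: datetime) -> List[str]:
    """Monthly partition keys (YYYY-MM) overlapped by the half-open range [start, end)."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    keys = []
    year, month = start.year, start.month
    while datetime(year, month, 1) < end:
        keys.append(f"{year:04d}-{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return keys

def query_logs(conn, project_number, start, end, max_months=1):
    """Hypothetical helper: refuse to run without a bounded operate_time range."""
    if start is None or end is None:
        raise ValueError("an operate_time range is mandatory")
    months = months_touched(start, end)
    if len(months) > max_months:
        raise ValueError(f"range spans {len(months)} monthly partitions; limit is {max_months}")
    params = (project_number, start.isoformat(sep=" "), end.isoformat(sep=" "))
    return conn.execute(LOG_QUERY, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rp_plan_log_t ("
    "id INTEGER PRIMARY KEY, project_number TEXT, operate_time TEXT)"
)
# Composite index with the equality column first and the range column second.
conn.execute(
    "CREATE INDEX idx_log_project_time ON rp_plan_log_t (project_number, operate_time)"
)
conn.executemany(
    "INSERT INTO rp_plan_log_t (project_number, operate_time) VALUES (?, ?)",
    [
        (p, (datetime(2024, 1, 1) + timedelta(days=d)).isoformat(sep=" "))
        for p in ("P1", "P2")
        for d in range(90)
    ],
)

january = query_logs(conn, "P1", datetime(2024, 1, 1), datetime(2024, 2, 1))
plan = " | ".join(
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN " + LOG_QUERY,
        ("P1", "2024-01-01 00:00:00", "2024-02-01 00:00:00"),
    )
)
print(len(january), "rows;", plan)
```

Putting project_number first lets the index narrow to one project by equality and then range-scan operate_time within it. A year-long request, like the one that reproduced the 30-second delay, would touch twelve monthly partitions and be rejected by the default limit in this sketch.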

Organizational Challenges

Implementation required coordination among developers, business analysts (BA), and DBAs. Responsibilities for partitioning and index creation were debated, leading to a consensus that DBAs would handle partitioning in the next release, while developers would add the composite index and enforce the time filter.

Further Optimizations

To address the subquery bottleneck, the author suggested splitting business_id into separate activity_id and attribute_id fields, allowing direct joins to smaller tables and avoiding costly scalar subqueries.
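The shape of that rewrite might look like the following sketch, again using Python's built-in sqlite3 module. Only RP_PLAN_LOG_T, operate_type, business_id, activity_id, attribute_id, and subtitlename come from the article; the lookup tables activity_t and attribute_t, their name column, the operate_type values, and the sample rows are assumptions. The example shows that the two forms return the same rows, not how fast either one runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity_t  (activity_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attribute_t (attribute_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rp_plan_log_t (
    id           INTEGER PRIMARY KEY,
    operate_type TEXT,
    business_id  INTEGER,  -- original overloaded column
    activity_id  INTEGER,  -- proposed split column
    attribute_id INTEGER   -- proposed split column
);
INSERT INTO activity_t  VALUES (1, 'Kickoff'), (2, 'Review');
INSERT INTO attribute_t VALUES (1, 'Budget'),  (2, 'Owner');
INSERT INTO rp_plan_log_t VALUES
    (1, 'ACTIVITY',  1, 1,    NULL),
    (2, 'ATTRIBUTE', 1, NULL, 1),
    (3, 'ACTIVITY',  2, 2,    NULL),
    (4, 'ATTRIBUTE', 2, NULL, 2);
""")

# Before: business_id means different things depending on operate_type,
# so the name is resolved by a correlated scalar subquery for each log row.
scalar_sql = """
SELECT id, subtitlename FROM (
    SELECT l.id,
           CASE l.operate_type
               WHEN 'ACTIVITY'  THEN (SELECT a.name FROM activity_t a
                                      WHERE a.activity_id = l.business_id)
               WHEN 'ATTRIBUTE' THEN (SELECT b.name FROM attribute_t b
                                      WHERE b.attribute_id = l.business_id)
           END AS subtitlename
    FROM rp_plan_log_t l
)
WHERE subtitlename = ?
ORDER BY id
"""

# After: each split column joins directly to its own lookup table.
join_sql = """
SELECT l.id, COALESCE(a.name, b.name) AS subtitlename
FROM rp_plan_log_t l
LEFT JOIN activity_t  a ON a.activity_id  = l.activity_id
LEFT JOIN attribute_t b ON b.attribute_id = l.attribute_id
WHERE COALESCE(a.name, b.name) = ?
ORDER BY l.id
"""

scalar_rows = conn.execute(scalar_sql, ("Budget",)).fetchall()
join_rows = conn.execute(join_sql, ("Budget",)).fetchall()
print(scalar_rows, join_rows)
```

With dedicated key columns the optimizer sees ordinary joins it can reorder and index, instead of an opaque per-row lookup whose target table changes with operate_type.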

Outcome and Open Questions

After applying function removal, partitioning, mandatory time filters, and the composite index, query performance stabilized below the 5‑second threshold. Remaining questions include whether the data model could have been designed more comprehensively from the start, the true value of logging massive datasets, and how to handle dynamic query conditions without excessive indexing.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
