Avoid These Common SQL Performance Pitfalls for Faster Queries
This guide enumerates frequent misconceptions in SQL performance—from over‑indexing and SELECT * misuse to improper transaction handling and outdated monitoring practices—explains why they hurt efficiency, and provides concrete, version‑aware solutions to optimize queries, schema design, and database operations.
1. Index and Query Pitfalls
Myth: More indexes always improve performance. Problem: Each index adds write overhead and consumes storage, and a large set of overlapping indexes gives the optimizer more candidate plans to cost, which can steer it to a worse one. Solution: Create indexes only on columns frequently used in WHERE, ORDER BY, or JOIN conditions. Regularly drop unused indexes.
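One way to find removal candidates in MySQL is sketched below. It assumes the sys schema is installed (the default since 5.7) and that the server has been running long enough for its counters to be representative. The table `orders` is borrowed from this article's examples and the index name `idx_orders_status` is hypothetical; invisible indexes require MySQL 8.0+.

```sql
-- Indexes with no recorded reads since the server last started.
-- The underlying Performance Schema counters reset on restart.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes;

-- MySQL 8.0+: hide a suspect index from the optimizer before dropping it.
-- idx_orders_status is a hypothetical index name.
ALTER TABLE orders ALTER INDEX idx_orders_status INVISIBLE;

-- If nothing regresses after an observation period, drop it for real.
ALTER TABLE orders DROP INDEX idx_orders_status;
```

Making the index invisible first keeps it maintained on writes, so it can be switched back with ALTER INDEX ... VISIBLE without a rebuild if a query turns out to need it.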
Myth: SELECT * is harmless. Problem: It reads unnecessary columns, increases I/O, and can trigger cache misses when large BLOB / TEXT fields are present. Solution: Explicitly list required columns in the SELECT clause.
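Myth: JOIN table order does not matter. Problem: For inner joins, MySQL's cost-based optimizer normally picks the join order itself, but the written order is binding when STRAIGHT_JOIN is used and constrains the plan for outer joins, and stale statistics can lead the optimizer to drive the join from a large table and scan it in full. Solution: Check the chosen order with EXPLAIN, keep statistics current, and force an order (smaller driving table first, via STRAIGHT_JOIN) only when the optimizer's choice is demonstrably worse.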
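To see which table MySQL actually reads first, and to override that choice when needed, something like the following can be used. The tables `customers` (small) and `orders` (large) and their columns are assumptions made for illustration.

```sql
-- The row order of the EXPLAIN output is the join order the optimizer chose,
-- regardless of the order in which the tables are written.
EXPLAIN
SELECT c.name, o.total
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE c.country = 'DE';

-- STRAIGHT_JOIN forces the written order: customers is read first here.
SELECT STRAIGHT_JOIN c.name, o.total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.country = 'DE';
```

A forced order stays in place even after the data distribution changes, so it is worth rechecking such hints periodically.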
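Myth: OR conditions have no impact on indexes. Problem: An OR across different columns cannot be served by a single composite index; unless each column has its own index (allowing an index merge), the query can fall back to a full-table scan. Solution: Index each column involved, or replace the OR with a UNION of single-condition queries (UNION ALL when the branches cannot return the same row).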
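The rewrite can look roughly like this, assuming a hypothetical table `users(id, email, phone)` where each searched column gets its own single-column index.

```sql
-- One index per column, so each branch can use an index lookup.
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_phone ON users (phone);

-- Original form: depends on the optimizer choosing an index merge.
SELECT id, email, phone
FROM users
WHERE email = 'a@example.com' OR phone = '555-0100';

-- Rewritten form: the second branch excludes rows the first already
-- returned, so UNION ALL produces no duplicates. <=> is MySQL's
-- NULL-safe equality, so rows with a NULL email are still kept.
SELECT id, email, phone
FROM users
WHERE email = 'a@example.com'
UNION ALL
SELECT id, email, phone
FROM users
WHERE phone = '555-0100'
  AND NOT (email <=> 'a@example.com');
```

Comparing EXPLAIN output for both forms shows whether the rewrite is needed at all; on some data the index merge plan is already adequate.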
Myth: OFFSET pagination is efficient for deep pages. Problem: OFFSET must scan and discard preceding rows, making deep pagination very slow. Solution: Use keyset pagination, e.g. <code>SELECT * FROM logs WHERE id > 100000 ORDER BY id LIMIT 10;</code>
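When the sort column is not unique, keyset pagination needs a tie-breaker. The sketch below assumes `logs` has a `created_at` column and a unique `id`; the column names, the index name, and the literal values are illustrative only.

```sql
-- Composite index matching the sort order.
CREATE INDEX idx_logs_created_id ON logs (created_at, id);

-- First page.
SELECT id, created_at, message
FROM logs
ORDER BY created_at, id
LIMIT 10;

-- Next page: pass in created_at and id from the last row of the
-- previous page (example values shown as literals).
SELECT id, created_at, message
FROM logs
WHERE created_at > '2024-05-01 12:00:00'
   OR (created_at = '2024-05-01 12:00:00' AND id > 100000)
ORDER BY created_at, id
LIMIT 10;
```

Without the `id` tie-breaker, rows sharing the boundary timestamp could be skipped or repeated across pages.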
Myth: UNION is always the right choice. Problem: UNION removes duplicates, adding sorting and deduplication cost. Solution: Use UNION ALL when duplicate removal is not required.
Myth: Subqueries are always safe. Problem: Complex subqueries may be executed repeatedly, creating temporary tables and hurting performance. Solution: Rewrite heavy subqueries as JOINs or materialize them with CTEs.
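As an illustration, a correlated subquery in the select list can often be replaced by a single aggregation that is joined once. This uses the `users` and `orders` tables from the article's own examples; treat it as a sketch and confirm the plan with EXPLAIN.

```sql
-- Correlated subquery: may be evaluated once per row of users.
SELECT u.id, u.name,
       (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) AS order_count
FROM users u;

-- Same result: aggregate orders once, then join.
-- COALESCE keeps the count at 0 for users with no orders.
SELECT u.id, u.name, COALESCE(oc.cnt, 0) AS order_count
FROM users u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS cnt
  FROM orders
  GROUP BY user_id
) AS oc ON oc.user_id = u.id;
```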
Myth: LIKE '%text' can use an index. Problem: Leading wildcards prevent index usage, causing full scans. Solution: Use a trailing wildcard (LIKE 'text%') or a full-text search engine such as Elasticsearch.
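Before reaching for an external engine, MySQL's built-in FULLTEXT index may be enough for word-level search. That is an alternative to the Elasticsearch suggestion above, not something the original text proposes. The table `articles(id, title)` is hypothetical.

```sql
-- Prefix search can use an ordinary B-tree index.
CREATE INDEX idx_articles_title ON articles (title);
SELECT id, title FROM articles WHERE title LIKE 'text%';

-- Word search through a FULLTEXT index (supported by InnoDB since MySQL 5.6).
ALTER TABLE articles ADD FULLTEXT INDEX ft_articles_title (title);
SELECT id, title
FROM articles
WHERE MATCH(title) AGAINST('text' IN NATURAL LANGUAGE MODE);
```

FULLTEXT matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for LIKE '%text'.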
Myth: Functions on indexed columns are free. Problem: Applying functions (e.g., YEAR(create_time)) forces row-by-row computation, so the index on the column cannot be used. Solution: Rewrite as a half-open range query, e.g. <code>WHERE create_time >= '2023-01-01' AND create_time < '2024-01-01'</code> (a BETWEEN ending at '2023-12-31' would miss most of the last day on a DATETIME column), or create an expression index.
2. Database Design and Configuration Pitfalls
Myth: Index every column. Problem: Excessive indexes increase write cost and storage. Solution: Index only columns used for filtering, sorting, or joining.
Myth: Always normalize to the highest normal form. Problem: Over-normalization leads to many JOINs and performance penalties; under-normalization can cause data anomalies. Solution: Balance normalization with denormalization based on read/write patterns, duplicating high-frequency fields when needed.
Myth: Partition every large table. Problem: Wrong partition keys cause full‑partition scans and extra metadata overhead. Solution: Partition only tables with clear time‑range or range‑query patterns and ensure queries include the partition key.
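A minimal range-partitioned table in MySQL might look like the following. The table `access_logs`, its columns, and the partition boundaries are assumptions for illustration; MySQL requires the partitioning column to be part of every unique key, which is why it appears in the primary key.

```sql
CREATE TABLE access_logs (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  created_at DATETIME        NOT NULL,
  message    VARCHAR(255),
  PRIMARY KEY (id, created_at)   -- partition column must be in the key
)
PARTITION BY RANGE COLUMNS (created_at) (
  PARTITION p2023 VALUES LESS THAN ('2024-01-01'),
  PARTITION p2024 VALUES LESS THAN ('2025-01-01'),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- The filter includes the partition key, so the "partitions" column of
-- EXPLAIN should list only p2024 rather than every partition.
EXPLAIN
SELECT id, message
FROM access_logs
WHERE created_at >= '2024-03-01' AND created_at < '2024-04-01';
```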
Myth: Ignore hardware characteristics. Problem: SSDs require different tuning than HDDs (e.g., higher innodb_io_capacity, parallel reads). Solution: Adjust buffer pool size, I/O capacity, and parallel read threads according to storage media.
3. Transaction and Lock Pitfalls
Myth: Use the highest isolation level (SERIALIZABLE) by default. Problem: Increases lock contention and reduces concurrency. Solution: Choose the lowest isolation level that satisfies correctness, such as READ COMMITTED, and consider optimistic locking.
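Lowering the isolation level and applying optimistic locking can be sketched as follows. The table `accounts(id, balance, version)` and the literal values are hypothetical; the `version` column is the usual convention for this pattern, not something built into the database.

```sql
-- Applies to subsequent transactions in this session.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Read the row without locking it, remembering its version.
SELECT balance, version FROM accounts WHERE id = 42;
-- Suppose this returned version = 7.

-- Write back only if nobody changed the row in the meantime.
UPDATE accounts
SET balance = balance - 100,
    version = version + 1
WHERE id = 42
  AND version = 7;
-- If the statement reports 0 affected rows, another session won the race:
-- re-read the row and retry in the application.
```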
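Myth: Row-level locks are always optimal. Problem: InnoDB does not escalate row locks, but under REPEATABLE READ, range scans and non-unique index lookups also take gap (next-key) locks, which block inserts into the scanned range and make deadlocks more likely in high-contention workloads. Solution: Keep locking statements on selective indexes, consider READ COMMITTED to avoid most gap locks, and choose lock granularity (row, table, or optimistic) based on workload.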
Myth: Long transactions have no side effects. Problem: They hold locks longer, block other operations, and enlarge undo logs. Solution: Split large transactions into smaller batches and monitor transaction duration.
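Both halves of that advice can be sketched in MySQL. The example assumes autocommit is on and that `logs` has `id` and `created_at` columns; the batch size and the 60-second threshold are arbitrary illustrative values.

```sql
-- Purge in small batches. With autocommit on, each statement is its own
-- short transaction. Rerun until it reports 0 affected rows.
DELETE FROM logs
WHERE created_at < '2023-01-01'
ORDER BY id
LIMIT 1000;

-- Transactions that have been open for more than 60 seconds.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id
FROM information_schema.innodb_trx
WHERE trx_started < NOW() - INTERVAL 60 SECOND
ORDER BY trx_started;
```

The `trx_mysql_thread_id` value corresponds to the connection id shown by SHOW PROCESSLIST, which helps trace a long transaction back to its client.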
Myth: Monitoring slow queries is enough. Problem: Systemic bottlenecks (e.g., connection pool exhaustion, lock waits) are missed. Solution: Build a comprehensive monitoring stack (QPS, TPS, lock wait time, buffer pool hit rate) using tools like Prometheus and Grafana.
Myth: Foreign keys are always the best way to ensure consistency. Problem: They add write overhead and can cause deadlocks under high concurrency. Solution: Use foreign keys for core data integrity; handle non‑critical relationships in the application layer or via asynchronous reconciliation.
4. Database Feature Misuse
Myth: Rely solely on the InnoDB buffer pool for all performance problems. Problem: Cache miss rates rise when data exceeds memory or access is random. Solution: Complement with an external cache (e.g., Redis) for hot data.
Myth: Put all business logic into stored procedures. Problem: Complex procedures are hard to debug, increase CPU/memory pressure, and block other sessions. Solution: Keep stored procedures simple and move complex logic to the application layer.
Myth: Triggers simplify application code. Problem: Implicit execution is difficult to trace and can cause performance cascades. Solution: Avoid complex triggers; implement logic explicitly in transactions.
Myth: Deeply nested views improve readability. Problem: Optimizer may materialize each view, leading to performance degradation. Solution: Flatten view logic or replace with CTEs.
Myth: Temporary tables are free for intermediate results. Problem: Creation and destruction consume resources; large temp tables may spill to disk. Solution: Optimize queries to eliminate temp tables or use the MEMORY engine when appropriate.
5. Operations and Monitoring Pitfalls
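Myth: Only analyze obvious slow queries. Problem: Frequent “healthy-looking” queries can add up to high total CPU usage without any single execution crossing the slow-query threshold. Solution: Regularly aggregate the slow-query log with tools like pt-query-digest, which ranks query patterns by total time, and temporarily lower long_query_time so that fast but frequent queries are captured.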
Myth: Database versions never need upgrades. Problem: Older versions miss optimizer improvements and new features (CTE, window functions). Solution: Test newer releases in staging and upgrade when benefits outweigh migration risk.
Myth: Statistics are always up‑to‑date automatically. Problem: Stale statistics cause poor plan choices. Solution: Run ANALYZE TABLE (or equivalent) on tables with significant data changes, adjusting sample rates for very large tables.
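In MySQL with InnoDB persistent statistics, a refresh and a sampling adjustment might look like this. The table name `orders` comes from the article's examples and the sample size of 64 pages is an arbitrary illustrative value.

```sql
-- Recalculate index statistics for one table.
ANALYZE TABLE orders;

-- Sample more index pages for this table than the server-wide default
-- (innodb_stats_persistent_sample_pages); takes effect on the next ANALYZE.
ALTER TABLE orders STATS_SAMPLE_PAGES = 64;

-- Check when the persistent statistics were last refreshed.
SELECT table_name, n_rows, last_update
FROM mysql.innodb_table_stats
WHERE database_name = DATABASE()
  AND table_name = 'orders';
```

A larger sample gives better estimates at the cost of a slower ANALYZE, so it is usually raised only for tables whose plans are visibly unstable.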
Myth: Monitoring only SQL response time is sufficient. Problem: System‑level metrics (connection pool, lock wait, buffer pool hit rate) are ignored, masking root causes. Solution: Implement full‑stack monitoring and correlate database metrics with application performance.
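The raw counters behind those system-level metrics can be read directly from the server, which is roughly what an exporter feeding Prometheus would scrape. The sketch assumes MySQL 8.0 with the Performance Schema enabled.

```sql
-- Connection usage, row-lock waits, and buffer pool reads in one shot.
SHOW GLOBAL STATUS WHERE Variable_name IN (
  'Threads_connected',
  'Innodb_row_lock_waits',
  'Innodb_row_lock_time',
  'Innodb_buffer_pool_reads',
  'Innodb_buffer_pool_read_requests'
);

-- Compare Threads_connected against the configured ceiling.
SHOW GLOBAL VARIABLES LIKE 'max_connections';

-- Buffer pool hit rate since startup:
-- 1 - (reads that went to disk / logical read requests).
SELECT 1 - (disk.VARIABLE_VALUE / req.VARIABLE_VALUE) AS buffer_pool_hit_rate
FROM performance_schema.global_status AS disk
CROSS JOIN performance_schema.global_status AS req
WHERE disk.VARIABLE_NAME = 'Innodb_buffer_pool_reads'
  AND req.VARIABLE_NAME = 'Innodb_buffer_pool_read_requests';
```

These counters are cumulative since startup, so dashboards normally plot their rate of change rather than the absolute values.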
6. Scenario and Task Fit Pitfalls
Myth: Use the database for compute‑intensive tasks. Problem: CPU is limited; heavy calculations block other queries. Solution: Offload complex calculations to the application layer or dedicated engines (e.g., Spark).
Myth: Same database can serve OLTP and OLAP workloads. Problem: Analytic queries consume resources needed for transactional workloads. Solution: Separate read/write replicas or use a dedicated data warehouse (e.g., ClickHouse).
Myth: HTAP workloads can run on the primary OLTP node. Problem: Mixed workloads cause CPU/memory contention. Solution: Deploy read‑write splitting middleware or create analytic replicas.
Myth: Sharding keys can be chosen arbitrarily. Problem: Poor sharding leads to cross‑node joins and network overhead. Solution: Design sharding keys based on access patterns and consider global secondary indexes.
Myth: Store large files (images, videos) in BLOB columns. Problem: Increases I/O, backup size, and reduces performance. Solution: Store files in object storage (e.g., S3) and keep only references in the database.
Myth: Simulate a message queue with a status table. Problem: Polling creates high load and unreliable delivery. Solution: Use a real message broker such as Kafka or RabbitMQ.
Myth: Keep hot and cold data together. Problem: Scanning irrelevant cold data slows queries. Solution: Archive old data or use partitioning to separate hot and cold segments.
Myth: Run full backups during peak traffic. Problem: Backup tools lock tables or generate heavy I/O, causing service degradation. Solution: Schedule backups in low‑traffic windows and use hot‑backup tools (e.g., Percona XtraBackup).
Myth: Insert rows one by one in loops. Problem: Each round‑trip adds latency and transaction overhead. Solution: Use bulk INSERT statements or batch commits.
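The difference can be sketched with a hypothetical table `events(user_id, action)`. Appropriate batch sizes depend on row width and max_allowed_packet, so the three-row batch here is only illustrative.

```sql
-- Row at a time under autocommit: one round trip and one commit per row.
INSERT INTO events (user_id, action) VALUES (1, 'login');
INSERT INTO events (user_id, action) VALUES (2, 'logout');

-- Multi-row INSERT: one statement, one round trip, one commit.
INSERT INTO events (user_id, action) VALUES
  (1, 'login'),
  (2, 'logout'),
  (3, 'login');

-- Batch commit: when rows must be sent separately, group them into one
-- transaction so the commit cost is paid once per batch.
START TRANSACTION;
INSERT INTO events (user_id, action) VALUES (4, 'login');
INSERT INTO events (user_id, action) VALUES (5, 'logout');
COMMIT;
```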
7. SQL Performance Optimization Principles
Understand the optimizer’s cost model, balance read‑heavy and write‑heavy workloads, avoid relying solely on past experience, focus on high‑impact queries rather than micro‑optimizations, and adopt a full‑stack view that links schema design, query writing, and operational tuning.
8. Modern Database Improvements
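Recent releases close some of these gaps. MySQL 8.0 added CTEs, window functions, and (from 8.0.13) functional key parts, its equivalent of the expression indexes PostgreSQL has long supported. PostgreSQL 12 inlines simple CTEs by default and accepts AS MATERIALIZED to force materialization; PostgreSQL also offers parallel query execution. Keyset pagination is a query pattern rather than an engine feature and works on any version. Features such as MySQL HeatWave’s automatic tiered storage move hot data to memory and warm data to SSD, reducing I/O latency.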
Key Code Examples
-- Keyset pagination (efficient deep paging)
SELECT * FROM orders WHERE id > 100000 ORDER BY id LIMIT 10;

-- Functional (expression) index for year-based queries (MySQL 8.0.13+)
CREATE INDEX idx_hire_year ON employees ((YEAR(hire_date)));
SELECT * FROM employees WHERE YEAR(hire_date) = 2023;

-- CTE materialization to avoid repeated subquery scans
-- (PostgreSQL 12+ inlines a single-use CTE unless MATERIALIZED is given)
WITH order_counts AS MATERIALIZED (
  SELECT user_id, COUNT(*) AS cnt FROM orders GROUP BY user_id
)
SELECT u.name, oc.cnt FROM users u LEFT JOIN order_counts oc ON u.id = oc.user_id;
dbaplus Community
