Boost Query Speed in Million-Row Databases: Proven Optimization Techniques

This article presents practical strategies for improving query performance on large relational databases: index design, SQL statement refinements, Java backend considerations, hardware tuning, and stored‑procedure usage.

Java Backend Technology

Oct 15, 2016

Database Design Tips

A. Create indexes on columns used in WHERE and ORDER BY to avoid full table scans.

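B. Avoid NULL checks in WHERE where possible, since on some engines they can prevent index use; give the column a default value (e.g., 0) and query with WHERE num = 0.

C. Indexes help little on low‑selectivity columns, i.e., columns with many duplicate values (e.g., a gender column split roughly evenly between male and female); the optimizer may ignore such an index and scan the table anyway.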

D. Limit the number of indexes per table (recommended ≤ 6) because each index slows INSERT and UPDATE operations.

E. Minimize updates to indexed columns, especially clustered‑index keys; the clustered index determines the physical row order, so changing a key value forces the engine to relocate rows, which consumes significant resources.

F. Prefer numeric fields over character fields for better comparison performance and lower storage overhead.

G. Use VARCHAR/NVARCHAR instead of fixed‑length CHAR/NCHAR to save space and improve search speed.

H. Prefer table variables over temporary tables when possible; note that table variables have limited indexing (only primary key).

I. Reduce frequent creation and deletion of temporary tables to lower system‑table resource consumption.

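J. Use temporary tables judiciously: they pay off when a data set drawn from a large or frequently used table is referenced repeatedly; for one‑time use, a derived table (a subquery in the FROM clause) is usually the better choice.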

K. For large bulk inserts into a new temporary table, use SELECT INTO instead of CREATE TABLE followed by INSERT to reduce logging.

L. Explicitly remove temporary tables at the end of a stored procedure (TRUNCATE TABLE first, then DROP TABLE) to avoid holding locks on system tables for a long time.
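
Tips A and B can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in engine so that it runs anywhere; SQLite's planner is not SQL Server's or MySQL's, so treat the plan text as illustrative only. The table t and column num follow the article's examples, while the index name idx_t_num and the helper query_plan are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan description lines for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column is the readable detail

conn = sqlite3.connect(":memory:")
# num is NOT NULL with a default, so lookups use "num = 0" rather than "num IS NULL".
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER NOT NULL DEFAULT 0, name TEXT)"
)
conn.executemany(
    "INSERT INTO t (num, name) VALUES (?, ?)",
    [(i % 1000, f"row{i}") for i in range(10_000)],
)
conn.execute("INSERT INTO t (name) VALUES ('no num given')")  # num falls back to 0

sql = "SELECT id FROM t WHERE num = ? ORDER BY id"

plan_before = query_plan(conn, sql, (0,))  # no index on num yet: full scan
conn.execute("CREATE INDEX idx_t_num ON t (num)")  # hypothetical index name
plan_after = query_plan(conn, sql, (0,))   # the planner can now seek on the index

matches = conn.execute(sql, (0,)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows with num = 0:", len(matches))
```

The exact plan wording differs between SQLite versions, so the useful signal is only whether the plan mentions a scan or the index. On SQL Server or MySQL the equivalent check would be the engine's own execution plan or EXPLAIN output.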

SQL Statement Optimizations

A. Avoid != or <> in WHERE clauses; they can prevent index usage.

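B. Avoid OR in WHERE, which can cause the engine to abandon the index; rewrite with UNION ALL or separate queries, e.g., SELECT id FROM t WHERE num = 10 UNION ALL SELECT id FROM t WHERE num = 20.

C. Use BETWEEN for continuous ranges instead of IN, e.g., WHERE num BETWEEN 1 AND 3 rather than WHERE num IN (1, 2, 3).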

D. Patterns like LIKE '%abc%' cause full scans; use anchored patterns or full‑text indexes.

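E. Parameters or local variables in WHERE may lead to scans, because the access plan is chosen at compile time, before the value is known; if necessary, force index usage with a hint, e.g., SELECT id FROM t WITH (INDEX(index_name)) WHERE num = @num.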

F. Do not apply arithmetic to indexed columns in WHERE; move the arithmetic to the constant side, e.g., WHERE num = 100 * 2 instead of WHERE num / 2 = 100.

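G. Avoid functions on indexed columns (e.g., SUBSTRING, DATEDIFF); replace them with range conditions or anchored LIKE patterns, e.g., WHERE name LIKE 'abc%' instead of WHERE SUBSTRING(name, 1, 3) = 'abc'.

H. More generally, do not place functions, arithmetic, or other expressions on the column side (left of =) in WHERE clauses, or the engine may be unable to use the index.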

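I. Eliminate queries that return no rows just to create an empty table structure, such as SELECT col1, col2 INTO #t FROM t WHERE 1 = 0; they consume resources for nothing. Create the table directly with CREATE TABLE #t (...) instead.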

J. Prefer EXISTS over IN for sub‑queries.

K. Avoid SELECT *; list required columns explicitly.

L. Avoid cursors for large data sets (>10,000 rows); rewrite the logic as set‑based statements.

M. Limit result set size sent to clients; assess necessity of large data transfers.

N. Keep transactions short to improve concurrency.

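O. Use LIMIT 1 (MySQL syntax) when only a single row is needed, so the engine can stop at the first match.

P. Avoid ORDER BY RAND() on large tables; it generates a random value for every row and sorts the whole set just to pick a few rows.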

Q. Use ENUM instead of VARCHAR for columns with a small, fixed set of values (e.g., gender, status); MySQL stores ENUM values internally as compact integers.

R. Store IPv4 addresses as INT UNSIGNED rather than VARCHAR(15) to save space and enable range queries; MySQL's INET_ATON() and INET_NTOA() convert between the two forms.
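
Two of the rewrites above, keeping arithmetic off the indexed column and storing IPv4 addresses as integers, can be sketched as follows. This is an illustration under assumptions rather than a benchmark: it uses Python's built-in sqlite3 and ipaddress modules instead of SQL Server or MySQL, and the visits table, the index names, and the helpers query_plan and ipv4_to_int are hypothetical. Note that with integer division, num / 2 = 100 also matches 201, so the strictly equivalent index-friendly form is a range.

```python
import ipaddress
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan description lines for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

def ipv4_to_int(address):
    """Convert dotted-quad IPv4 text to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(address))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_t_num ON t (num)")
conn.executemany("INSERT INTO t (num) VALUES (?)", [(n,) for n in range(1000)])

# Arithmetic on the column: num / 2 must be computed for every row (a scan).
expr_sql = "SELECT num FROM t WHERE num / 2 = 100 ORDER BY num"
# Arithmetic moved to the constant side: the index can be searched directly.
plain_sql = "SELECT num FROM t WHERE num = 100 * 2"
# Exact equivalent of the integer-division predicate, written as a range.
range_sql = "SELECT num FROM t WHERE num BETWEEN 200 AND 201 ORDER BY num"

plan_expr = query_plan(conn, expr_sql)
plan_plain = query_plan(conn, plain_sql)
plan_range = query_plan(conn, range_sql)

rows_expr = [n for (n,) in conn.execute(expr_sql)]
rows_plain = [n for (n,) in conn.execute(plain_sql)]
rows_range = [n for (n,) in conn.execute(range_sql)]

# IPv4 addresses stored as integers make subnet lookups a plain range query.
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, ip INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_visits_ip ON visits (ip)")
conn.executemany(
    "INSERT INTO visits (ip) VALUES (?)",
    [(ipv4_to_int(a),) for a in ("10.0.0.5", "10.0.1.9", "192.168.1.20")],
)
low, high = ipv4_to_int("10.0.0.0"), ipv4_to_int("10.0.0.255")
in_subnet = [
    str(ipaddress.IPv4Address(ip))
    for (ip,) in conn.execute(
        "SELECT ip FROM visits WHERE ip BETWEEN ? AND ? ORDER BY ip", (low, high)
    )
]

print(plan_expr, plan_plain, plan_range, sep="\n")
print(rows_expr, rows_plain, rows_range, in_subnet)
```

In MySQL the integer conversion would normally be done in SQL with INET_ATON() and INET_NTOA(); doing it in application code, as here, is just one option.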

Java Backend Recommendations

A. Minimize object creation.

B. Separate bulk data operations from small‑scale ones; bulk work should not rely on ORM frameworks.

C. Use JDBC for direct database access.

D. Stream data instead of loading entire result sets into memory.

E. Cache frequently accessed data when appropriate.
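
The bulk-insert and streaming recommendations above can be sketched as follows. The article targets Java and JDBC; to stay runnable without a JDBC driver, this sketch uses Python's built-in sqlite3 module, and the orders table and the stream_rows helper are hypothetical. In JDBC the rough counterpart would be iterating a ResultSet with a fetch size set through Statement.setFetchSize, although how drivers honor that setting varies.

```python
import sqlite3

def stream_rows(conn, sql, params=(), batch_size=500):
    """Yield rows one at a time while fetching them in fixed-size batches."""
    cursor = conn.execute(sql, params)
    while True:
        batch = cursor.fetchmany(batch_size)  # at most batch_size rows in memory
        if not batch:
            break
        yield from batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")

# Bulk load through the driver's batch API inside one short transaction,
# instead of creating one mapped object per row.
with conn:
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)",
        [(i % 7,) for i in range(5_000)],
    )

# Aggregate while streaming: only the named columns are selected, and the
# full result set is never materialized as a single list.
total = 0
row_count = 0
for order_id, amount in stream_rows(conn, "SELECT id, amount FROM orders"):
    total += amount
    row_count += 1

print(row_count, total)
```

When the aggregate can be expressed in SQL (here, SUM(amount)), pushing it into the query is usually cheaper still; the loop stands in for processing that has to happen in application code.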

Additional Performance Enhancements

1) Hardware: Upgrade disk and network throughput, increase virtual memory, disable unnecessary services, separate database and application servers, enable multi‑processor usage.

2) Index strategy: Build clustered indexes on integer keys, create covering non‑clustered indexes for frequently queried columns, avoid excessive indexes and large data‑type columns in indexes.

3) Use stored procedures to encapsulate database logic, reduce network traffic, and improve modularity.

4) Optimize application algorithms and query logic; understand that proper indexing is a prerequisite but not sufficient for high performance.
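
The covering-index idea from point 2 can be sketched as follows, again using Python's built-in sqlite3 module as a stand-in. SQLite has no clustered/non-clustered distinction in the SQL Server sense (its INTEGER PRIMARY KEY plays a broadly similar role as the table's storage key), so this only illustrates the principle; the orders table, its columns, and the index names are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan description lines for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status INTEGER NOT NULL, "
    "amount INTEGER NOT NULL, note TEXT)"
)
conn.executemany(
    "INSERT INTO orders (status, amount, note) VALUES (?, ?, ?)",
    [(i % 500, i, "x" * 50) for i in range(5_000)],
)

sql = "SELECT amount FROM orders WHERE status = ?"

# Index on the filter column only: each match still needs a lookup into the
# table to fetch amount.
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
plan_narrow = query_plan(conn, sql, (1,))

# Index containing both the filter column and the selected column: the query
# can be answered from the index alone.
conn.execute("DROP INDEX idx_orders_status")
conn.execute("CREATE INDEX idx_orders_status_amount ON orders (status, amount)")
plan_covering = query_plan(conn, sql, (1,))

print("narrow:  ", plan_narrow)
print("covering:", plan_covering)
```

The wide note column is deliberately left out of the index, in line with the advice to keep large data-type columns out of indexes.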
