
Why Your Spring Boot App Freezes at One Million Records – 5 Proven Techniques to Double Performance

When a Spring Boot application reaches millions of rows, it often suffers from OutOfMemoryErrors, slow queries, and high CPU, but by applying five proven strategies—pagination, streaming, batch processing, indexing, and asynchronous execution—you can halve memory usage and achieve up to ten‑fold speed gains.

LuTiao Programming

Mar 14, 2026

Spring Boot Strategies for Massive Data

When a Spring Boot application loads millions of rows with a single query such as List<Order> orders = orderRepository.findAll();, the JPA provider materializes every record as an entity on the JVM heap. This causes an immediate memory surge, frequent Full GC, long response times, and eventually an OutOfMemoryError. The same pattern also leads to slow database queries, API timeouts, CPU spikes, and service crashes.

1. Pagination – Read Only What You Need

Spring Data provides a mature pagination API. When you request a fixed page size (e.g., 1,000 rows), only that subset is loaded into memory.

// Sort by a unique column (assumed here to be "id") so page boundaries stay stable
Pageable pageable = PageRequest.of(0, 1000, Sort.by("id"));
Page<Order> page = orderRepository.findAll(pageable);
List<Order> orders = page.getContent();

Benefits observed in practice:

Memory usage drops dramatically.

JVM GC pressure is reduced.

Database load becomes predictable.

Response times stabilize.

Typical use cases include REST list endpoints, admin back‑ends, reporting modules, and search result pages.
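When a job has to visit every page rather than serve a single one, the usual pattern is a loop that requests page after page until a short page comes back. The following is a minimal plain-Java sketch of that loop, not Spring Data code: PageWalker and fakeFetch are hypothetical names, and in a real application the fetch function would presumably wrap something like orderRepository.findAll(PageRequest.of(pageIndex, pageSize)).getContent().

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical helper (not a Spring Data class): visits a data source one page at a time.
final class PageWalker {

    // Calls fetchPage(pageIndex, pageSize) until a short or empty page signals the end.
    // Returns the total number of rows handed to the handler.
    static <T> long forEachPage(int pageSize,
                                BiFunction<Integer, Integer, List<T>> fetchPage,
                                Consumer<List<T>> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long total = 0;
        for (int pageIndex = 0; ; pageIndex++) {
            List<T> page = fetchPage.apply(pageIndex, pageSize);
            if (page.isEmpty()) {
                break;
            }
            handler.accept(page);           // only the current page is held here
            total += page.size();
            if (page.size() < pageSize) {    // a short page is the last page
                break;
            }
        }
        return total;
    }

    // In-memory stand-in for a repository call, used only to make the sketch runnable.
    static <T> List<T> fakeFetch(List<T> table, int pageIndex, int pageSize) {
        long start = (long) pageIndex * pageSize;
        if (start >= table.size()) {
            return List.of();
        }
        int end = (int) Math.min(start + pageSize, table.size());
        return table.subList((int) start, end);
    }
}
```

Because each page is released before the next one is fetched, peak memory is bounded by the page size rather than by the table size.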

2. Streaming – Process Rows One‑by‑One

For tasks that must scan the entire table (data migration, full-table analytics, ETL), pagination can still be a bottleneck, because every page is a separate query and deep offsets become progressively slower. Spring Data JPA can return a Stream<User> that lazily fetches rows.

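// src/main/java/com/icoderoad/repository/UserRepository.java
package com.icoderoad.repository;

import java.util.stream.Stream;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;

// Assumes User is a JPA entity with a Long primary key; adjust the import and ID type to match.
public interface UserRepository extends JpaRepository<User, Long> {
    @Query("SELECT u FROM User u")
    Stream<User> streamAllUsers();
}

// Caller: must run inside a transaction (e.g., a method annotated with
// @Transactional(readOnly = true)) so the connection stays open while the stream is consumed.
try (Stream<User> users = repository.streamAllUsers()) {
    users.forEach(this::processUser);
}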

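The stream reads and processes each record sequentially instead of materializing the whole result as a list, which keeps heap usage low. Two caveats apply: Hibernate still tracks every loaded entity in the persistence context unless entities are detached or the context is cleared periodically, and some JDBC drivers buffer the full result set unless a fetch size is configured. Closing the stream is mandatory to avoid connection leaks.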
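The lazy, close-when-done behavior can be tried without a database. The sketch below is a plain-Java stand-in, not the repository itself: StreamingSketch, streamRows, and processAll are hypothetical names, and the generated numbers merely play the role of rows that would come from streamAllUsers().

```java
import java.util.stream.LongStream;
import java.util.stream.Stream;

// Hypothetical stand-in for a streaming repository call.
final class StreamingSketch {

    // Produces ids 1..count on demand; onClose plays the role of releasing the connection.
    static Stream<Long> streamRows(long count, Runnable onClose) {
        return LongStream.rangeClosed(1, count).boxed().onClose(onClose);
    }

    // Consumes the rows one at a time and returns the sum of their ids.
    static long processAll(long count, Runnable onClose) {
        long[] sum = {0};
        // try-with-resources guarantees close() runs even if processing throws
        try (Stream<Long> rows = streamRows(count, onClose)) {
            rows.forEach(id -> sum[0] += id);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        long sum = processAll(1_000_000L, () -> System.out.println("stream closed"));
        System.out.println("sum of ids = " + sum);
    }
}
```

No list of one million elements is ever built here; each value is produced, consumed, and then becomes eligible for garbage collection.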

3. Batch Processing – Reduce SQL Calls

Writing each row with an individual save generates one SQL statement per record. For one million rows this means one million round‑trips, overwhelming the database.

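Hibernate's JDBC batch settings group inserts and updates into batches. Note that Hibernate disables insert batching for entities that use IDENTITY key generation.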

# src/main/resources/application.yml
spring:
  jpa:
    properties:
      hibernate:
        jdbc:
          batch_size: 1000
        order_inserts: true
        order_updates: true

Effect on a 1,000,000‑row load:

Single inserts → 1,000,000 statements, each sent in its own round trip.

Batch inserts (size 1000) → 1,000 batched round trips.

Benchmarks reported by the author show a 10‑fold to 50‑fold throughput increase for bulk import, ETL pipelines, and large‑scale synchronization.
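The round-trip arithmetic and the buffering pattern behind it can be sketched in plain Java. BatchSketch, roundTrips, and writeInBatches are hypothetical names; in a JPA application the flush callback would presumably save the chunk and then flush and clear the persistence context, which is an assumption about usage rather than something the configuration above does by itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper illustrating batched writes.
final class BatchSketch {

    // Number of flushes needed for the given row count: ceil(rows / batchSize).
    static long roundTrips(long rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (rows + batchSize - 1) / batchSize;
    }

    // Buffers rows and hands them to flush in chunks of at most batchSize.
    // Returns how many times flush was called.
    static <T> int writeInBatches(Iterable<T> rows, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> buffer = new ArrayList<>(batchSize);
        int flushes = 0;
        for (T row : rows) {
            buffer.add(row);
            if (buffer.size() == batchSize) {
                flush.accept(new ArrayList<>(buffer)); // hand over a copy, then reuse the buffer
                buffer.clear();
                flushes++;
            }
        }
        if (!buffer.isEmpty()) {                        // final partial batch
            flush.accept(new ArrayList<>(buffer));
            flushes++;
        }
        return flushes;
    }
}
```

With 1,000,000 rows and a batch size of 1,000, roundTrips returns 1,000, which matches the figure quoted above.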

4. Index Optimization – Avoid Full Table Scans

A query such as SELECT * FROM orders WHERE customer_id = 1001; on a 10‑million‑row table without an index forces a full table scan, which is extremely slow.

Creating a B‑Tree index on the filter column changes the execution plan to an indexed lookup:

CREATE INDEX idx_orders_customer_id ON orders(customer_id);

Guidelines: index the columns used in query predicates, sorting, and joins, along with high-frequency filter fields. Avoid over-indexing, because every extra index slows inserts and updates and consumes storage.
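As a rough in-memory analogy only (a database B-Tree is a different structure from Java's TreeMap, and all names below are hypothetical), the difference between a full scan and an indexed lookup looks like this: the scan inspects every row, while the sorted map goes straight to the matching key. The index has to be built and kept up to date, which is the write cost that over-indexing multiplies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical analogy: a sorted map standing in for an index on customer_id.
final class IndexAnalogy {

    static final class Order {
        final long id;
        final long customerId;

        Order(long id, long customerId) {
            this.id = id;
            this.customerId = customerId;
        }
    }

    // Full scan: every row is inspected, like a query with no usable index.
    static List<Order> fullScan(List<Order> table, long customerId) {
        List<Order> hits = new ArrayList<>();
        for (Order order : table) {
            if (order.customerId == customerId) {
                hits.add(order);
            }
        }
        return hits;
    }

    // Building the index costs time and memory up front and on every later insert.
    static TreeMap<Long, List<Order>> buildIndex(List<Order> table) {
        TreeMap<Long, List<Order>> index = new TreeMap<>();
        for (Order order : table) {
            index.computeIfAbsent(order.customerId, key -> new ArrayList<>()).add(order);
        }
        return index;
    }

    // Indexed lookup: a logarithmic search on the key instead of a pass over all rows.
    static List<Order> indexedLookup(TreeMap<Long, List<Order>> index, long customerId) {
        return index.getOrDefault(customerId, List.of());
    }
}
```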

5. Asynchronous Processing – Parallel Execution

Synchronous handling of heavy tasks (order creation, email sending, inventory update, log generation) blocks the API and creates a bottleneck. Spring Boot’s async support enables parallel execution.

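// Enable async support once, on any @Configuration class (AsyncConfig is a hypothetical name)
@Configuration
@EnableAsync
public class AsyncConfig {
}

// src/main/java/com/icoderoad/service/OrderService.java
@Service
public class OrderService {

    // Runs on Spring's task executor; the call must come from another bean to be proxied
    @Async
    public CompletableFuture<Void> processOrder(Order order) {
        // processing logic
        return CompletableFuture.completedFuture(null);
    }
}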

Advantages observed:

Tasks run concurrently, reducing overall latency.

API responses become faster.

Background work can be offloaded to message queues such as Kafka or RabbitMQ for distributed processing and higher fault tolerance.
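Conceptually, an asynchronous call hands the work to a thread pool and returns a future. The plain-JDK sketch below shows that idea with CompletableFuture and a fixed-size pool; AsyncSketch and processAll are hypothetical names, and this is an illustration of the mechanism rather than Spring's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper: runs one task per item on a bounded thread pool.
final class AsyncSketch {

    // Submits every item to the pool, waits for all of them, and returns the completed count.
    static <T> int processAll(List<T> items, int threads, Consumer<T> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (T item : items) {
                futures.add(CompletableFuture.runAsync(() -> {
                    task.accept(item);            // placeholder for the real work
                    completed.incrementAndGet();
                }, pool));
            }
            // join() blocks until every task finishes and rethrows the first failure
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();                      // always release the worker threads
        }
        return completed.get();
    }
}
```

A bounded pool matters here: an unbounded number of threads working on millions of tasks would simply move the overload from the database to the JVM.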

Bonus: Avoid Returning Huge Payloads from APIs

Returning hundreds of thousands of rows in a single response freezes browsers, slows network transfer, and overloads services. Recommended mitigations: pagination, additional filter criteria, response compression, or cursor-based pagination.
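Cursor-based pagination replaces the page number with the last key the client has seen, which corresponds to a query of the form WHERE id > :afterId ORDER BY id LIMIT :limit. The sketch below imitates that query over an in-memory sorted map; CursorPagingSketch and nextPage are hypothetical names, and the map stands in for a table with an indexed id column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical helper: keyset (cursor) pagination over a sorted in-memory table.
final class CursorPagingSketch {

    // Returns up to limit ids strictly greater than afterId, in ascending order.
    // The last id of the returned page is the cursor for the next call.
    static List<Long> nextPage(NavigableMap<Long, String> table, long afterId, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<Long> ids = new ArrayList<>();
        for (Long id : table.tailMap(afterId, false).keySet()) {
            if (ids.size() == limit) {
                break;
            }
            ids.add(id);
        }
        return ids;
    }
}
```

Unlike offset pagination, the cost of fetching a page does not grow with how deep into the data the client already is, because the lookup starts at the cursor instead of skipping rows.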

Real‑World Case: Optimizing 10 Million Transaction Records

Original design:

Loaded all data at once.

Updated each record individually.

Processed synchronously.

Consequences:

Memory usage spiked.

Processing time lasted several hours.

Frequent crashes.

Applied techniques:

Pagination – limited memory per batch.

Streaming – continuous row‑by‑row processing.

Batch – reduced database round trips roughly 1,000-fold (batch size 1,000).

Index – accelerated query lookups.

Async – parallelized work.

Results after optimization:

Processing time dropped from hours to minutes.

Memory consumption reduced by ~80 %.

System stability restored.

Architectural Recommendation

In a scalable Spring Boot architecture, every layer (controller, service, repository) must avoid loading large data sets in a single step. The core principle is: Never process all data at once in any layer.

