Exporting Millions of Records with JPA and MyBatis Using Streaming and CSV in Spring Boot

This article explains how to avoid OutOfMemoryError when exporting massive MySQL datasets: stream the rows with JPA or MyBatis and write each record directly to a CSV file. It provides Spring Boot code examples, a performance comparison, and deployment tips.

Architect's Guide

Dynamic data export is a common requirement in many projects, but loading large tables (hundreds of thousands to millions of rows) into memory can cause OutOfMemoryError. The key principle is to avoid loading the entire result set at once and instead stream data in batches.

Using MySQL's streaming capabilities, data can be fetched row‑by‑row and written directly to a CSV file, which is more suitable than Excel for very large exports.
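
To make the row-by-row idea concrete, here is a minimal plain-JDBC sketch. It assumes MySQL Connector/J, which treats a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE as a request to stream rows one at a time. The class name JdbcStreamingSketch and the export method are hypothetical and not part of the article's code.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, for illustration only
final class JdbcStreamingSketch {
    // Connector/J's streaming signal; its decimal form is -2147483648
    static final int STREAMING_FETCH_SIZE = Integer.MIN_VALUE;

    private JdbcStreamingSketch() {}

    /** Writes every row of the query as one comma-separated line and returns the row count. */
    static long export(Connection conn, String sql, Writer out) throws SQLException, IOException {
        long rows = 0;
        // Forward-only and read-only are required for Connector/J to stream
        try (Statement st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(STREAMING_FETCH_SIZE);
            try (ResultSet rs = st.executeQuery(sql)) {
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    for (int i = 1; i <= columns; i++) {
                        if (i > 1) {
                            out.write(',');
                        }
                        String value = rs.getString(i);
                        // Values are written unescaped to keep the sketch short
                        out.write(value == null ? "" : value);
                    }
                    out.write('\n');
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

Only the current row is held in application memory at any time, which is why the export size no longer drives heap usage.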

JPA Implementation

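Define a repository method that returns a Stream<Todo> and annotate it with a fetch-size hint of Integer.MIN_VALUE, the value MySQL Connector/J interprets as a request to stream rows one at a time:

// HINT_FETCH_SIZE is Hibernate's "org.hibernate.fetchSize" hint, statically imported
// from org.hibernate.jpa.QueryHints (Hibernate 5.x).
@QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "" + Integer.MIN_VALUE))
@Query("select t from Todo t")
Stream<Todo> streamAll();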

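Run the export inside a read‑only transaction, since the stream can only be consumed while the transaction is open, and detach each entity after processing so the persistence context does not retain it: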

@Transactional(readOnly = true)
public void exportTodosCSV(HttpServletResponse response) {
    // text/csv is the registered MIME type for CSV (RFC 4180)
    response.setContentType("text/csv");
    response.setCharacterEncoding("UTF-8");
    response.addHeader("Content-Disposition", "attachment; filename=todos.csv");
    // The stream wraps an open result set; try-with-resources closes it
    try (Stream<Todo> todoStream = todoRepository.streamAll()) {
        PrintWriter out = response.getWriter();
        todoStream.forEach(todo -> {
            out.write(todoToCSV(todo) + "\n");
            // Detach so the persistence context does not accumulate exported entities
            entityManager.detach(todo);
        });
        out.flush();
    } catch (IOException e) {
        throw new RuntimeException("Exception occurred while exporting results", e);
    }
}
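
The todoToCSV helper is not shown in the article. As a hedged sketch of what such a helper could build on, the class below applies RFC 4180-style quoting to individual fields. The names CsvFields, escape, and toLine are hypothetical, and the fields of Todo are unknown here, so the example works on plain strings.

```java
// Hypothetical helper, for illustration only; not the article's todoToCSV
final class CsvFields {
    private CsvFields() {}

    /** Quotes a field only if it contains a comma, quote, CR, or LF; null becomes an empty field. */
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.indexOf(',') >= 0 || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0 || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, then the whole field is wrapped in quotes
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins escaped fields with commas; the caller appends the line terminator. */
    static String toLine(String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(fields[i]));
        }
        return line.toString();
    }
}
```

Without this kind of quoting, a title containing a comma or a line break would shift columns or split one record into two lines in the exported file.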

MyBatis Implementation

Configure a custom ResultHandler and set fetchSize="-2147483648" in the mapper XML to enable streaming:

// Receives each row as MyBatis reads it, instead of collecting rows into a List
public class CustomResultHandler implements ResultHandler<Authors> {
    private final DownloadProcessor downloadProcessor;

    public CustomResultHandler(DownloadProcessor downloadProcessor) {
        this.downloadProcessor = downloadProcessor;
    }

    @Override
    public void handleResult(ResultContext<? extends Authors> resultContext) {
        // Hand the current row straight to the writer; nothing is retained here
        downloadProcessor.processData(resultContext.getResultObject());
    }
}
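
The DownloadProcessor class is referenced but not shown. The sketch below illustrates what a per-row writer of that kind might look like. It is an assumption, not the article's class: it writes to a java.io.Writer rather than an HttpServletResponse so that it runs standalone, and the name RowProcessor and its constructor arguments are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Function;

// Hypothetical stand-in for the article's DownloadProcessor
final class RowProcessor<T> {
    private final Writer out;
    private final Function<T, String> toCsvLine;
    private long count;

    RowProcessor(Writer out, Function<T, String> toCsvLine) {
        this.out = out;
        this.toCsvLine = toCsvLine;
    }

    /** Writes one row as one line; the row is not stored after the call returns. */
    void processData(T row) {
        try {
            out.write(toCsvLine.apply(row));
            out.write('\n');
            count++;
        } catch (IOException e) {
            // A result-handler callback cannot throw checked exceptions, so wrap it
            throw new UncheckedIOException(e);
        }
    }

    long count() {
        return count;
    }
}
```

In a servlet setting, the Writer would typically be the one returned by response.getWriter(), so each row goes to the client as it is read from the database.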

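The mapper interface declares the statement to be streamed:

// Invoked through sqlSessionTemplate.select(...) with the ResultHandler,
// so rows are delivered to the handler rather than collected into the returned List
List<Authors> streamByExample(AuthorsExample example);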

Mapper XML adds the fetchSize attribute to the streaming select:

<select id="streamByExample" fetchSize="-2147483648" ...> ... </select>

Service and Controller

The service provides both streaming and traditional download methods; the streaming version uses the custom ResultHandler to write each record to the response with minimal memory footprint.

@Service
public class AuthorsService {
    // ...
    public void streamDownload(HttpServletResponse response) throws IOException {
        // build params, create CustomResultHandler, invoke sqlSessionTemplate.select(...)
    }
    public void traditionDownload(HttpServletResponse response) throws IOException {
        List<Authors> authors = authorsMapper.selectByExample(new AuthorsExample());
        authors.forEach(new DownloadProcessor(response)::processData);
    }
}

The REST controller exposes two endpoints:

@RestController
@RequestMapping("download")
public class HelloController {
    private final AuthorsService authorsService;
    public HelloController(AuthorsService authorsService) {
        this.authorsService = authorsService;
    }

    @GetMapping("streamDownload")
    public void streamDownload(HttpServletResponse response) throws IOException {
        authorsService.streamDownload(response);
    }

    @GetMapping("traditionDownload")
    public void traditionDownload(HttpServletResponse response) throws IOException {
        authorsService.traditionDownload(response);
    }
}

Performance Comparison

Testing shows the traditional approach peaks at ~2.5 GB memory usage, while the streaming approach stays below 500 MB, reducing memory consumption by about 80% while producing identical CSV files with over 2.7 million rows.

Both methods generate correct output; the streaming solution is recommended for production environments handling large data exports.
