
Exporting Millions of Records with JPA and MyBatis Using Streaming and CSV in Spring Boot

This article explains how to avoid OutOfMemoryError when exporting massive MySQL datasets by streaming rows with JPA or MyBatis and writing each record directly to a CSV file. It includes Spring Boot code examples and a memory-usage comparison.

Architect's Guide

Dynamic data export is a common requirement, but loading a large table (hundreds of thousands to millions of rows) into memory can cause an OutOfMemoryError. The key principle is to never hold the entire result set in memory at once: read the rows as a stream and write each one out as it arrives.

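MySQL Connector/J can stream a result set row by row instead of buffering it in full: a forward-only, read-only statement whose fetch size is Integer.MIN_VALUE returns rows one at a time, so each row can be written straight to a CSV file. CSV is also better suited than Excel to very large exports, since an .xlsx worksheet is capped at 1,048,576 rows.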
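At the plain JDBC level, the streaming behaviour described above can be sketched as follows. This is a minimal illustration rather than the article's code: the class name JdbcStreamingSketch and the export method are hypothetical, and values are written without CSV escaping to keep the sketch short.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of the original article.
final class JdbcStreamingSketch {

    /** Writes every row of the query to out as one comma-separated line and returns the row count. */
    static long export(Connection conn, String sql, Writer out) throws SQLException, IOException {
        try (Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // With MySQL Connector/J, this fetch size on a forward-only, read-only
            // statement asks the driver to stream rows instead of buffering them all.
            stmt.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int columns = rs.getMetaData().getColumnCount();
                long rows = 0;
                while (rs.next()) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 1; i <= columns; i++) {
                        if (i > 1) {
                            line.append(',');
                        }
                        String value = rs.getString(i);
                        line.append(value == null ? "" : value); // NULL becomes an empty field
                    }
                    out.write(line.append('\n').toString());
                    rows++;
                }
                out.flush();
                return rows;
            }
        }
    }
}
```

Only one row is held by the application at a time; the JPA query hint and the MyBatis fetchSize attribute used in the following sections pass the same fetch size down to the driver.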

JPA Implementation

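Define a repository method that returns a Stream<Todo> and annotate it with a fetch-size query hint of Integer.MIN_VALUE (HINT_FETCH_SIZE is a static import of Hibernate's "org.hibernate.fetchSize" hint name):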

@QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "" + Integer.MIN_VALUE))
@Query("select t from Todo t")
Stream<Todo> streamAll();
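A Stream returned by a repository method like this keeps its underlying resources open until the stream is closed, which is why the export code wraps it in try-with-resources. The following JDK-only sketch illustrates that point; the class StreamCloseSketch is hypothetical, and the onClose hook merely stands in for releasing a database cursor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical demonstration class, not part of the original article.
final class StreamCloseSketch {

    /** Stand-in for a repository stream: the close hook plays the role of releasing the cursor. */
    static Stream<String> openRows(AtomicBoolean released) {
        return Stream.of("row-1", "row-2", "row-3").onClose(() -> released.set(true));
    }

    /** Consumes the rows inside try-with-resources so the close hook always runs. */
    static List<String> consume(AtomicBoolean released) {
        List<String> seen = new ArrayList<>();
        try (Stream<String> rows = openRows(released)) {
            rows.forEach(seen::add);
        }
        return seen;
    }
}
```

A terminal operation such as forEach does not close the stream by itself, so calling openRows(flag).forEach(...) without try-with-resources leaves the hook unrun.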

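Run the export method inside a read-only transaction, because the stream can only be consumed while the transaction is open, and detach each entity after writing it so the persistence context does not retain every row: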

@Transactional(readOnly = true)
public void exportTodosCSV(HttpServletResponse response) {
    response.addHeader("Content-Type", "text/csv");
    response.addHeader("Content-Disposition", "attachment; filename=todos.csv");
    response.setCharacterEncoding("UTF-8");
    try (Stream<Todo> todoStream = todoRepository.streamAll()) {
        PrintWriter out = response.getWriter();
        todoStream.forEach(todo -> {
            out.write(todoToCSV(todo) + "\n");
            entityManager.detach(todo);
        });
        out.flush();
    } catch (IOException e) {
        throw new RuntimeException("Exception occurred while exporting results", e);
    }
}
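The todoToCSV helper called above is not shown in the article. One possible shape is sketched below, assuming a Todo with id, title and done fields (these fields, and the class name TodoCsvSketch, are hypothetical). Fields that contain a comma, a double quote or a line break are wrapped in quotes, with embedded quotes doubled, as RFC 4180 describes.

```java
// Hypothetical sketch, not the article's implementation.
final class TodoCsvSketch {

    /** Stand-in for the article's Todo entity; the fields are assumptions. */
    static final class Todo {
        final long id;
        final String title;
        final boolean done;

        Todo(long id, String title, boolean done) {
            this.id = id;
            this.title = title;
            this.done = done;
        }
    }

    /** Quotes a field only when it contains a delimiter, a quote or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Renders one entity as a single CSV line without the trailing newline. */
    static String todoToCSV(Todo todo) {
        return String.join(",", String.valueOf(todo.id), escape(todo.title), String.valueOf(todo.done));
    }
}
```

Without this kind of escaping, a single title containing a comma would shift every later column in that row.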

MyBatis Implementation

Implement a custom ResultHandler and set fetchSize="-2147483648" (the value of Integer.MIN_VALUE) on the select statement in the mapper XML to enable streaming:

public class CustomResultHandler implements ResultHandler<Authors> {
    private final DownloadProcessor downloadProcessor;
    public CustomResultHandler(DownloadProcessor downloadProcessor) { this.downloadProcessor = downloadProcessor; }
    @Override
    public void handleResult(ResultContext<? extends Authors> resultContext) {
        // Called once per row as MyBatis reads the result set
        Authors authors = resultContext.getResultObject();
        downloadProcessor.processData(authors);
    }
}
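The DownloadProcessor that the handler delegates to is not shown in the article. A minimal sketch of what such a class might do follows, assuming it writes one CSV line per record. To stay self-contained it targets a plain Writer rather than HttpServletResponse, and the Authors fields are hypothetical; field values are written unescaped for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch of a per-row processor, not the article's class.
final class DownloadProcessorSketch {

    /** Stand-in for the article's Authors model; the fields are assumptions. */
    static final class Authors {
        final long id;
        final String firstName;
        final String email;

        Authors(long id, String firstName, String email) {
            this.id = id;
            this.firstName = firstName;
            this.email = email;
        }
    }

    private final Writer out;
    private long written;

    DownloadProcessorSketch(Writer out) {
        this.out = out;
    }

    /** Writes one record as a CSV line; intended to be called once per row. */
    void processData(Authors authors) {
        try {
            out.write(authors.id + "," + authors.firstName + "," + authors.email + "\n");
            // Flush every 1000 rows so output reaches the client instead of piling up in a buffer
            if (++written % 1000 == 0) {
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long written() {
        return written;
    }
}
```

Because the processor keeps no reference to a record after writing it, each row becomes eligible for garbage collection as soon as the handler returns.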

The mapper interface declares the method behind the streaming statement:

List<Authors> streamByExample(AuthorsExample example); // rows are streamed from MySQL when this statement is run with a ResultHandler

Mapper XML adds the fetchSize attribute to the streaming select:

...

Service and Controller

The service provides both streaming and traditional download methods; the streaming version uses the custom ResultHandler to write each record to the response with minimal memory footprint.

@Service
public class AuthorsService {
    // ...
    public void streamDownload(HttpServletResponse response) throws IOException {
        // build params, create CustomResultHandler, invoke sqlSessionTemplate.select(...)
    }
    public void traditionDownload(HttpServletResponse response) throws IOException {
        List<Authors> authors = authorsMapper.selectByExample(new AuthorsExample());
        authors.forEach(new DownloadProcessor(response)::processData);
    }
}

The REST controller exposes two endpoints:

@RestController
@RequestMapping("download")
public class HelloController {
    private final AuthorsService authorsService;
    public HelloController(AuthorsService authorsService) { this.authorsService = authorsService; }
    @GetMapping("streamDownload")
    public void streamDownload(HttpServletResponse response) throws IOException { authorsService.streamDownload(response); }
    @GetMapping("traditionDownload")
    public void traditionDownload(HttpServletResponse response) throws IOException { authorsService.traditionDownload(response); }
}

Performance Comparison

In testing, exporting a table of over 2.7 million rows peaked at about 2.5 GB of memory with the traditional approach and stayed below 500 MB with streaming, a reduction of about 80%. Both approaches produced identical CSV files.

The streaming solution is recommended for production environments that handle large data exports.
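The roughly 80% figure follows directly from the two reported peaks: (2.5 GB - 0.5 GB) / 2.5 GB. The sketch below shows that arithmetic together with a rough way to sample heap usage from inside the JVM. The class and method names are hypothetical, and the article does not say how its own measurements were taken, so the Runtime-based probe is only an assumption about one way to reproduce them.

```java
// Hypothetical helper for reproducing the comparison, not part of the original article.
final class MemoryComparisonSketch {

    /** Approximate heap currently in use by this JVM, in bytes. */
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Percentage reduction from a baseline peak to an improved peak (same unit for both). */
    static double reductionPercent(double baselinePeak, double improvedPeak) {
        return (baselinePeak - improvedPeak) / baselinePeak * 100.0;
    }
}
```

Sampling usedHeapBytes() periodically during each export and keeping the maximum gives a peak that can be passed to reductionPercent; a profiler or GC logs would give a more reliable reading.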
