Exporting Millions of Records with JPA and MyBatis Using Streaming and CSV in Spring Boot

This article explains how to avoid OutOfMemoryError when exporting massive MySQL datasets by streaming rows with JPA or MyBatis and writing each record directly to a CSV file. It provides Spring Boot code examples, performance comparisons, and deployment tips.

Architect's Guide

May 5, 2023

Dynamic data export is a common requirement in many projects, but loading a large table (hundreds of thousands to millions of rows) into memory can cause an OutOfMemoryError. The key principle is never to hold the entire result set in memory: fetch rows incrementally and write each one out as soon as it arrives.

With the streaming result sets of MySQL Connector/J, data can be fetched row by row and written directly to a CSV file. CSV is more suitable than Excel for very large exports, since a single Excel worksheet is limited to 1,048,576 rows.
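To make the mechanism concrete, here is a minimal plain-JDBC sketch of what the JPA and MyBatis settings in the following sections configure underneath. It relies on documented Connector/J behavior: a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE is streamed row by row. The class and method names (JdbcStreamingSketch, prepareStreaming, copyRows) are illustrative assumptions, not part of the original project, and CSV escaping is left out for brevity.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative sketch only; names are hypothetical and not from the article's code.
final class JdbcStreamingSketch {

    /** Prepares a statement that MySQL Connector/J reads row by row instead of buffering. */
    static PreparedStatement prepareStreaming(Connection conn, String sql) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        // Integer.MIN_VALUE is the Connector/J signal for a streaming result set.
        ps.setFetchSize(Integer.MIN_VALUE);
        return ps;
    }

    /** Writes each row as one comma-separated line (no escaping) and returns the row count. */
    static long copyRows(ResultSet rs, int columnCount, PrintWriter out) throws SQLException {
        long rows = 0;
        while (rs.next()) {
            for (int col = 1; col <= columnCount; col++) {
                if (col > 1) {
                    out.print(',');
                }
                String value = rs.getString(col);
                out.print(value == null ? "" : value); // SQL NULL becomes an empty field
            }
            out.print('\n');
            rows++;
        }
        out.flush();
        return rows;
    }
}
```

A caller would open the statement and result set in a try-with-resources block and pass the result set to copyRows. While a streaming result set is open, Connector/J does not allow other statements on the same connection, so the export should use a dedicated connection and read the result set to the end or close it.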

JPA Implementation

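Define a repository method that returns a Stream<Todo> and set the fetch-size query hint to Integer.MIN_VALUE, which tells MySQL Connector/J to stream the result set row by row instead of buffering it:

// HINT_FETCH_SIZE is Hibernate's "org.hibernate.fetchSize" hint, used here via a static import.
// Hint values are strings, hence the concatenation with "".
@QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "" + Integer.MIN_VALUE))
@Query("select t from Todo t")
Stream<Todo> streamAll();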

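The stream must be consumed inside a transaction and closed when done, so the export method is marked @Transactional(readOnly = true) and wraps the stream in try-with-resources. Detaching each entity after it is written keeps the persistence context from accumulating every row: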

@Transactional(readOnly = true)
public void exportTodosCSV(HttpServletResponse response) {
    // text/csv is the registered media type for CSV
    response.setContentType("text/csv");
    response.setCharacterEncoding("UTF-8");
    response.addHeader("Content-Disposition", "attachment; filename=todos.csv");
    try (Stream<Todo> todoStream = todoRepository.streamAll()) {
        PrintWriter out = response.getWriter();
        todoStream.forEach(todo -> {
            out.write(todoToCSV(todo) + "\n");
            // Evict the entity from the persistence context so it can be garbage-collected
            entityManager.detach(todo);
        });
        out.flush();
    } catch (IOException e) {
        throw new RuntimeException("Exception occurred while exporting results", e);
    }
}
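The todoToCSV helper is not shown in the source. The following is a minimal sketch of what such a helper might look like, assuming a Todo with hypothetical id, title, and completed fields (the real entity's fields are unknown). It quotes fields that contain commas, double quotes, or line breaks, following the common RFC 4180 convention, so that free-text columns do not break the row layout.

```java
import java.util.StringJoiner;

// Illustrative sketch only; the article's real Todo entity and todoToCSV are not shown.
final class TodoCsvSketch {

    // Hypothetical stand-in for the Todo entity.
    static final class Todo {
        final long id;
        final String title;
        final boolean completed;

        Todo(long id, String title, boolean completed) {
            this.id = id;
            this.title = title;
            this.completed = completed;
        }
    }

    /** Quotes a field when it contains a comma, quote, or line break; doubles embedded quotes. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Renders one Todo as a single CSV row without a trailing newline. */
    static String todoToCSV(Todo todo) {
        StringJoiner row = new StringJoiner(",");
        row.add(Long.toString(todo.id));
        row.add(escape(todo.title));
        row.add(Boolean.toString(todo.completed));
        return row.toString();
    }
}
```

Because the helper returns a plain String per row, it adds no state between rows and fits the one-record-at-a-time export loop.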

MyBatis Implementation

Implement a custom ResultHandler that processes one row at a time, and set fetchSize="-2147483648" (Integer.MIN_VALUE) on the select in the mapper XML to enable streaming:

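import org.apache.ibatis.session.ResultContext;
import org.apache.ibatis.session.ResultHandler;

// MyBatis calls handleResult once per row, so no list of results is ever built.
public class CustomResultHandler implements ResultHandler<Authors> {
    private final DownloadProcessor downloadProcessor;

    public CustomResultHandler(DownloadProcessor downloadProcessor) {
        this.downloadProcessor = downloadProcessor;
    }

    @Override
    public void handleResult(ResultContext<? extends Authors> resultContext) {
        // The typed handler removes the need for a cast
        Authors authors = resultContext.getResultObject();
        downloadProcessor.processData(authors);
    }
}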
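The DownloadProcessor that the handler delegates to is not shown in the source. As a rough sketch of the idea, the class below writes one CSV line per record and flushes periodically so rows reach the client while the query is still running. Everything here is an assumption for illustration: the class name DownloadProcessorSketch, the Authors fields, and the flush interval. It targets a plain java.io.Writer so it can run on its own; the article's version is constructed from an HttpServletResponse, whose getWriter() could be passed in the same way.

```java
import java.io.PrintWriter;
import java.io.Writer;

// Illustrative sketch only; not the article's DownloadProcessor.
final class DownloadProcessorSketch {

    // Hypothetical stand-in for the Authors model; the real fields are not shown.
    static final class Authors {
        final int id;
        final String firstName;
        final String lastName;

        Authors(int id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    private final PrintWriter out;
    private final int flushEvery;
    private long rowsWritten;

    DownloadProcessorSketch(Writer target, int flushEvery) {
        if (flushEvery <= 0) {
            throw new IllegalArgumentException("flushEvery must be positive");
        }
        this.out = new PrintWriter(target);
        this.flushEvery = flushEvery;
    }

    /** Writes one record as a CSV line; assumes fields contain no commas or quotes. */
    void processData(Authors author) {
        out.print(author.id);
        out.print(',');
        out.print(author.firstName);
        out.print(',');
        out.print(author.lastName);
        out.print('\n');
        // Flush in chunks rather than per row to limit the number of small writes
        if (++rowsWritten % flushEvery == 0) {
            out.flush();
        }
    }

    /** Flushes whatever remains after the last record. */
    void finish() {
        out.flush();
    }

    long getRowsWritten() {
        return rowsWritten;
    }
}
```

Only the current record and the writer's buffer are held at any time, which is what keeps the memory footprint flat regardless of row count.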

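The mapper interface declares the statement used for streaming. It does not return a stream: when the statement is invoked through sqlSessionTemplate.select(...) with a ResultHandler, rows are passed to the handler one at a time instead of being collected into the declared List:

List<Authors> streamByExample(AuthorsExample example); // statement id used for the streaming select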

Mapper XML adds the fetchSize attribute to the streaming select:

<select id="streamByExample" fetchSize="-2147483648" ...> ... </select>

Service and Controller

The service provides both streaming and traditional download methods; the streaming version uses the custom ResultHandler to write each record to the response with minimal memory footprint.

@Service
public class AuthorsService {
    // ...
    public void streamDownload(HttpServletResponse response) throws IOException {
        // build params, create CustomResultHandler, invoke sqlSessionTemplate.select(...)
    }
    public void traditionDownload(HttpServletResponse response) throws IOException {
        List<Authors> authors = authorsMapper.selectByExample(new AuthorsExample());
        authors.forEach(new DownloadProcessor(response)::processData);
    }
}

The REST controller exposes two endpoints:

@RestController
@RequestMapping("download")
public class HelloController {
    private final AuthorsService authorsService;
    // Constructor injection; a final field must be assigned in the constructor
    public HelloController(AuthorsService authorsService) {
        this.authorsService = authorsService;
    }
    @GetMapping("streamDownload")
    public void streamDownload(HttpServletResponse response) throws IOException {
        authorsService.streamDownload(response);
    }
    @GetMapping("traditionDownload")
    public void traditionDownload(HttpServletResponse response) throws IOException {
        authorsService.traditionDownload(response);
    }
}

Performance Comparison

In testing, the traditional approach peaked at about 2.5 GB of memory, while the streaming approach stayed below 500 MB, a reduction of about 80%. Both produced identical CSV files with over 2.7 million rows.

Since the output is the same, the streaming solution is recommended for production environments that handle large data exports.
