Million‑Scale Data Export with JPA and MyBatis in Spring Boot

This article explains how to export tens of millions of rows from MySQL in Spring Boot by streaming data with JPA or MyBatis, which avoids OutOfMemoryError, and by writing the output as CSV. It includes code examples, a performance comparison, and tips for generating test data.

Architect

Dynamic data export is a common requirement in many projects; the naive approach loads all rows from MySQL into memory and writes them to an Excel or CSV file, which quickly leads to OutOfMemoryError when the dataset reaches hundreds of thousands or millions of records.

To prevent OOM, the key principle is to avoid loading the full result set into memory and instead stream rows from the database, writing each row directly to the output file and discarding it from the session.
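The principle can be sketched in plain Java, independent of JPA or MyBatis. The class and method names below (StreamingExport, export) are illustrative assumptions, not part of the article's code. Each row is pulled from a lazy Stream, written to the output, and then becomes garbage, so no list of rows is ever built.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical helper: writes a lazy stream of rows to a Writer, one line per row.
final class StreamingExport {
    private StreamingExport() {}

    // Returns the number of rows written. Nothing is collected into a list.
    static <T> long export(Stream<T> rows, Function<? super T, String> toLine, Writer out) {
        long[] count = {0};
        // try-with-resources closes the stream, which matters when it wraps a DB cursor
        try (Stream<T> stream = rows) {
            stream.forEach(row -> {
                try {
                    out.write(toLine.apply(row));
                    out.write('\n');
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                count[0]++;
            });
        }
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count[0];
    }
}
```

In a real export the Stream would come from the database and the Writer from the HTTP response; here both are left as parameters so the sketch stays self-contained.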

Because CSV files handle large row counts better than Excel (a worksheet in Excel 2007 and later is capped at 1,048,576 rows), the article recommends using CSV for million‑level exports.

JPA implementation

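Define a repository method that returns a Stream<Todo> and annotate it with @Query plus a @QueryHints entry that sets HINT_FETCH_SIZE to Integer.MIN_VALUE. With MySQL, that fetch size tells the JDBC driver to stream rows one at a time instead of buffering the whole result set. The code that consumes the stream must run inside a transaction, so mark the calling method @Transactional(readOnly = true), and detach each entity from the EntityManager after processing so the persistence context does not accumulate rows.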

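// HINT_FETCH_SIZE is assumed to be a static import of org.hibernate.jpa.QueryHints.HINT_FETCH_SIZE
@QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "" + Integer.MIN_VALUE))
@Query(value = "select t from Todo t")
Stream<Todo> streamAll();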

The controller endpoint streams the CSV response:

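@RestController
@RequestMapping("/download")
public class HelloController {
    // todoRepository and entityManager are injected fields (declarations not shown in the article)

    // The stream must be consumed while a read-only transaction is open.
    @Transactional(readOnly = true)
    @GetMapping("streamDownload")
    public void streamDownload(HttpServletResponse response) throws IOException {
        response.setContentType("text/csv");
        response.setCharacterEncoding("UTF-8");
        response.addHeader("Content-Disposition", "attachment; filename=todos.csv");
        // try-with-resources closes the stream and releases the underlying cursor
        try (Stream<Todo> todoStream = todoRepository.streamAll()) {
            PrintWriter out = response.getWriter();
            todoStream.forEach(todo -> {
                out.write(todoToCSV(todo)); // todoToCSV is a helper not shown in the article
                out.write("\n");
                // evict the entity so the persistence context does not grow with every row
                entityManager.detach(todo);
            });
            out.flush();
        }
    }
}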
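
The controller calls todoToCSV, which the article does not show. A minimal sketch of the escaping such a helper would need, assuming RFC 4180 style quoting, might look like the following. CsvLine and its methods are hypothetical names, and the fields of Todo are unknown, so the example works on plain strings.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical CSV helper; the article's todoToCSV implementation is not shown.
final class CsvLine {
    private CsvLine() {}

    // Quote a field only if it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled. A null field becomes an empty field.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join already-stringified column values into one CSV line (no trailing newline).
    static String join(List<String> fields) {
        return fields.stream().map(CsvLine::escape).collect(Collectors.joining(","));
    }
}
```

A todoToCSV method could then convert each property of the entity to a string and pass the list to CsvLine.join. Without escaping, a comma or line break inside a text column would shift or split columns in the exported file.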

MyBatis implementation

MyBatis requires a custom ResultHandler and the fetchSize="-2147483648" attribute (the value of Integer.MIN_VALUE) on the select statement to enable streaming.

// Receives rows one at a time from MyBatis instead of a fully materialized List.
public class CustomResultHandler implements ResultHandler<Authors> {
    private final DownloadProcessor downloadProcessor;
    public CustomResultHandler(DownloadProcessor downloadProcessor) {
        this.downloadProcessor = downloadProcessor;
    }
    @Override
    public void handleResult(ResultContext<? extends Authors> resultContext) {
        // The typed handler removes the need for an explicit cast
        Authors authors = resultContext.getResultObject();
        downloadProcessor.processData(authors);
    }
}

The mapper defines both a traditional list method and a streaming method:

List<Authors> selectByExample(AuthorsExample example);
List<Authors> streamByExample(AuthorsExample example); // fetchSize="-2147483648"

The service layer provides two download methods: streamDownload (low memory) and traditionDownload (high memory). The streaming version uses sqlSessionTemplate.select(..., customResultHandler) to process rows one‑by‑one.
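
The article does not show DownloadProcessor. As a rough sketch of what a row-at-a-time processor could do, the hypothetical RowWriter below writes each row to a Writer as soon as it arrives and flushes every few rows, so buffered output does not build up either. The class name, the flushEvery parameter, and the generic row type are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Function;

// Hypothetical stand-in for the article's DownloadProcessor (its real code is not shown).
final class RowWriter<T> {
    private final Writer out;
    private final Function<? super T, String> toCsvLine;
    private final int flushEvery;
    private long rows;

    RowWriter(Writer out, Function<? super T, String> toCsvLine, int flushEvery) {
        if (flushEvery <= 0) {
            throw new IllegalArgumentException("flushEvery must be positive");
        }
        this.out = out;
        this.toCsvLine = toCsvLine;
        this.flushEvery = flushEvery;
    }

    // Called once per row, for example from a ResultHandler.
    void processData(T row) {
        try {
            out.write(toCsvLine.apply(row));
            out.write('\n');
            rows++;
            if (rows % flushEvery == 0) {
                out.flush(); // push data to the client periodically
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long rowCount() {
        return rows;
    }
}
```

A result handler would call processData for every row it receives. The caller should flush the Writer once more after the query finishes so the last partial batch is sent.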

Performance testing shows the traditional approach peaks at ~2.5 GB RAM, while the streaming approach stays below 500 MB, an 80 % reduction.

For testing, a large dataset (≈2.7 M rows) can be generated via stored procedures or downloaded from the provided Baidu Cloud link.
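
The article generates data with stored procedures, which are not shown here. As a different, client-side option, the sketch below builds multi-row INSERT statements in Java. The column list (id, title) is a hypothetical schema chosen only for illustration and needs to be adapted to the real table.

```java
// Hypothetical generator of multi-row INSERT statements for test data.
// Assumes a table with columns (id, title); this is not the article's schema.
final class TestDataSql {
    private TestDataSql() {}

    // Builds one INSERT covering ids firstId .. firstId + count - 1.
    static String insertBatch(String table, long firstId, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO ")
                .append(table)
                .append(" (id, title) VALUES ");
        for (int i = 0; i < count; i++) {
            long id = firstId + i;
            if (i > 0) {
                sql.append(", ");
            }
            // Values are generated here, not user input, so plain concatenation is acceptable
            sql.append('(').append(id).append(", 'title-").append(id).append("')");
        }
        return sql.append(';').toString();
    }
}
```

Calling insertBatch in a loop with a batch size of around a thousand rows and executing each statement keeps the number of round trips low when filling a table with millions of rows.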

Overall, the article demonstrates a practical, memory‑efficient way to export massive MySQL tables using Spring Boot, JPA, and MyBatis.
