Master MyBatis Streaming Queries: Reduce Memory Usage with Cursors

This article explains MyBatis streaming queries, introduces the Cursor interface and its key methods, shows how to implement streaming with code examples, discusses suitable application scenarios, and outlines important considerations for efficient, low‑memory data processing in Java backend systems.

Java High-Performance Architecture

Preface

MyBatis streaming queries are a lesser‑known but highly effective technique for handling large result sets without loading all data into memory. This article uncovers the concept, the Cursor API, implementation details, practical scenarios, and usage precautions.

Environment Configuration

JDK version: 1.8

IDE: IntelliJ IDEA 2020.1

Spring Boot: 2.3.9.RELEASE

mybatis-spring-boot-starter: 2.1.4

What Is MyBatis Streaming Query?

When MyBatis executes a query, it can return an iterator (Cursor) instead of a full collection, allowing the application to fetch rows one by one and avoid excessive memory consumption.

Cursor Interface

The org.apache.ibatis.cursor.Cursor interface defines three essential methods:

isOpen() : Checks whether the cursor is currently open.

isConsumed() : Determines whether all rows have been read.

getCurrentIndex() : Returns the zero-based index of the row most recently fetched; -1 means no row has been fetched yet.

```java
import java.io.Closeable;

public interface Cursor<T> extends Closeable, Iterable<T> {
    // Returns true if the cursor has started fetching data from the database
    boolean isOpen();

    // Returns true when all rows matching the SQL have been consumed
    boolean isConsumed();

    // Returns the index of the current row; the first row is 0,
    // and -1 means no row has been fetched yet
    int getCurrentIndex();
}
```
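To make the three status methods concrete, here is a minimal sketch that mimics the Cursor contract over an in-memory list. `RowCursor` and `InMemoryCursor` are hypothetical names used only for illustration; they are not part of MyBatis, and the real implementation reads from a JDBC result set rather than a list. The state transitions are a simplification of how the status methods are described above.

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the Cursor contract; not the MyBatis interface itself. */
interface RowCursor<T> extends Closeable, Iterable<T> {
    boolean isOpen();

    boolean isConsumed();

    int getCurrentIndex();

    @Override
    void close(); // narrowed for this sketch: no checked exception
}

/** Serves rows from a list so the status methods can be observed without a database. */
class InMemoryCursor<T> implements RowCursor<T> {
    private enum State { CREATED, OPEN, CONSUMED, CLOSED }

    private final Iterator<T> source;
    private State state = State.CREATED;
    private int currentIndex = -1; // stays -1 until the first row is fetched

    InMemoryCursor(List<T> rows) {
        this.source = rows.iterator();
    }

    @Override
    public boolean isOpen() {
        return state == State.OPEN;
    }

    @Override
    public boolean isConsumed() {
        return state == State.CONSUMED;
    }

    @Override
    public int getCurrentIndex() {
        return currentIndex;
    }

    @Override
    public void close() {
        // A fully consumed cursor keeps reporting isConsumed() == true.
        if (state != State.CONSUMED) {
            state = State.CLOSED;
        }
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                if (state == State.CLOSED || state == State.CONSUMED) {
                    return false;
                }
                state = State.OPEN; // fetching has started
                if (!source.hasNext()) {
                    state = State.CONSUMED;
                    return false;
                }
                return true;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                currentIndex++;
                return source.next();
            }
        };
    }
}
```

Reading two rows from `new InMemoryCursor<>(Arrays.asList("a", "b"))` moves `getCurrentIndex()` from -1 to 0 and then to 1, and `isConsumed()` becomes true only once the iterator reports that no rows remain.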

Code Implementation

A Cursor fetches rows lazily, so the SqlSession that created it must stay open until iteration finishes. The transaction is committed only after all rows have been processed.

1. DAO definition returning a Cursor:

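```java
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.cursor.Cursor;

@Mapper
public interface PersonDao {
    // Returns a lazily read cursor instead of a fully loaded List
    Cursor<Person> selectByCursor();

    // Total number of rows in sys_person
    Integer queryCount();
}
```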

2. Mapper XML:

```xml
<select id="selectByCursor" resultMap="personMap">
    SELECT * FROM sys_person ORDER BY id DESC
</select>
<select id="queryCount" resultType="java.lang.Integer">
    SELECT COUNT(*) FROM sys_person
</select>
```

3. Service layer that opens its own SqlSession inside a worker thread, so the session stays open for the whole iteration, and processes the cursor in batches of 1,000 rows:

```java
import java.util.ArrayList;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.apache.ibatis.cursor.Cursor;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class PersonServiceImpl implements IPersonService {
    private static final int BATCH_SIZE = 1000;

    @Autowired
    private SqlSessionFactory sqlSessionFactory;

    @Override
    public void getOneByAsync() throws InterruptedException {
        new Thread(() -> {
            log.info("----Opening sqlSession");
            // The session must stay open for as long as the cursor is being read.
            SqlSession sqlSession = sqlSessionFactory.openSession();
            try {
                PersonDao mapper = sqlSession.getMapper(PersonDao.class);
                // Count before opening the cursor: some drivers (for example MySQL
                // Connector/J in streaming mode) reject a second query on the same
                // connection while a streaming result set is still open.
                Integer total = mapper.queryCount();
                log.info("----Rows to process: {}", total);
                Cursor<Person> cursor = mapper.selectByCursor();
                List<Person> personList = new ArrayList<>(BATCH_SIZE);
                int batch = 0;
                for (Person person : cursor) {
                    personList.add(person);
                    if (personList.size() == BATCH_SIZE) {
                        processBatch(++batch, personList);
                    }
                }
                // Flush the final, possibly partial batch.
                if (!personList.isEmpty()) {
                    processBatch(++batch, personList);
                }
                if (cursor.isConsumed()) {
                    log.info("----All rows matched by the query have been consumed");
                }
                sqlSession.commit();
                log.info("----Transaction committed");
            } catch (Exception e) {
                log.error("----Streaming query failed, rolling back", e);
                sqlSession.rollback();
            } finally {
                sqlSession.close();
                log.info("----sqlSession closed");
            }
        }).start();
    }

    private void processBatch(int batch, List<Person> personList) throws InterruptedException {
        log.info("----{}. Fetched {} rows from the cursor, processing", batch, personList.size());
        Thread.sleep(1000); // simulate processing
        log.info("----{}. Batch processed", batch);
        personList.clear();
    }
}
```
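The batching step does not depend on MyBatis, so it can be isolated and exercised against any `Iterable`. The sketch below is one possible way to do that; `Batches.forEachBatch` is a hypothetical helper, not a MyBatis API. It flushes when the buffer fills and once more after the loop, so the final partial batch is handled without comparing the cursor index against a row count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: groups any Iterable (a MyBatis Cursor is one) into batches. */
final class Batches {
    private Batches() {
    }

    /**
     * Passes rows to the handler in lists of at most batchSize elements.
     * Every batch is a fresh list, so the handler may keep it or hand it off.
     *
     * @return the number of batches delivered
     */
    static <T> int forEachBatch(Iterable<T> rows, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> buffer = new ArrayList<>(batchSize);
        int batches = 0;
        for (T row : rows) {
            buffer.add(row);
            if (buffer.size() == batchSize) {
                handler.accept(buffer);
                buffer = new ArrayList<>(batchSize);
                batches++;
            }
        }
        // Deliver whatever is left over after the last full batch.
        if (!buffer.isEmpty()) {
            handler.accept(buffer);
            batches++;
        }
        return batches;
    }
}
```

Inside an open session, a call could look like `Batches.forEachBatch(cursor, 1000, batch -> save(batch))`, where `save` stands for whatever per-batch processing the application needs.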

Application Scenarios

Streaming queries are a good fit for very large result sets, such as generating a payroll report for 500,000 employees, where loading every row at once can exhaust the heap or trigger long garbage-collection pauses. Reading the data in manageable batches keeps memory usage low, and the batches themselves can be processed in parallel.
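One way to parallelize the processing is to keep a single reader thread on the cursor and hand completed batches to a small worker pool. The sketch below assumes that arrangement; `ParallelBatchRunner` is a hypothetical class, and the limit of two in-flight batches per worker is an arbitrary choice for illustration. Only the reader touches the cursor, because a SqlSession is not thread-safe, and the semaphore stops the reader from queuing batches faster than the workers can drain them, which would bring the memory problem back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: one reader, several workers, a bounded number of in-flight batches. */
final class ParallelBatchRunner {
    private ParallelBatchRunner() {
    }

    static <T> void run(Iterable<T> rows, int batchSize, int workers, Consumer<List<T>> handler) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // Caps the batches held in memory; the reader blocks once all permits are taken.
        Semaphore inFlight = new Semaphore(workers * 2);
        try {
            List<T> buffer = new ArrayList<>(batchSize);
            for (T row : rows) {
                buffer.add(row);
                if (buffer.size() == batchSize) {
                    submit(pool, inFlight, buffer, handler);
                    buffer = new ArrayList<>(batchSize);
                }
            }
            if (!buffer.isEmpty()) {
                submit(pool, inFlight, buffer, handler);
            }
        } finally {
            pool.shutdown(); // batches already submitted still run to completion
        }
        awaitWorkers(pool);
    }

    private static <T> void submit(ExecutorService pool, Semaphore inFlight,
                                   List<T> batch, Consumer<List<T>> handler) {
        inFlight.acquireUninterruptibly();
        pool.execute(() -> {
            try {
                handler.accept(batch);
            } finally {
                inFlight.release();
            }
        });
    }

    private static void awaitWorkers(ExecutorService pool) {
        try {
            // Generous upper bound for the sketch; tune it for real workloads.
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }
}
```

The handler runs on worker threads, so it should be thread-safe, and batches may complete in a different order from the one in which they were read.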

Precautions

The primary goal of MyBatis streaming queries is to prevent out-of-memory errors by fetching rows incrementally. Developers must keep the SqlSession open while iterating and manage transaction commit and rollback themselves. The iteration also holds a database connection for its entire duration, so slow per-row processing keeps that connection busy for longer; handing batches to worker threads is one way to shorten that window.
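The session-lifetime rule is easy to break by returning a cursor from a method that has already closed its session. The sketch below simulates that pitfall without a database; `FakeSession` and `SessionScopeDemo` are hypothetical stand-ins, not MyBatis classes, and the exception type is only an assumption about how a closed session might surface. The point it illustrates is that iteration belongs inside the same scope that opens and closes the session.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a session that owns a lazily read result. */
final class FakeSession implements AutoCloseable {
    private final List<String> table;
    private boolean closed;

    FakeSession(List<String> table) {
        this.table = table;
    }

    /** Rows are produced lazily and only while the session is still open. */
    Iterable<String> selectByCursor() {
        return () -> new Iterator<String>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                ensureOpen();
                return position < table.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return table.get(position++);
            }
        };
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("session closed before the cursor was fully read");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class SessionScopeDemo {
    private SessionScopeDemo() {
    }

    /** Correct: the rows are read inside the try-with-resources block. */
    static List<String> readInsideScope(List<String> table) {
        List<String> seen = new ArrayList<>();
        try (FakeSession session = new FakeSession(table)) {
            for (String row : session.selectByCursor()) {
                seen.add(row);
            }
        }
        return seen;
    }

    /** Incorrect: the lazy result escapes the scope and is read after close(). */
    static Iterable<String> leakCursor(List<String> table) {
        try (FakeSession session = new FakeSession(table)) {
            return session.selectByCursor();
        }
    }
}
```

Iterating the result of `leakCursor` fails on the first `hasNext()` call, while `readInsideScope` returns every row.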
