How to Build Lightweight Batch Jobs with Spring Batch: A Practical Guide
This article explains the need for lightweight batch processing, outlines a layered architecture and robustness strategies, and demonstrates how Spring Batch implements these concepts with clear interfaces, job management, and support for ignore, retry, and restart mechanisms.
In daily development we often need data reports, statistical analysis, and scheduled tasks. Hadoop can handle these but is heavyweight, so we look for lightweight batch solutions.
How do we keep batch processing lightweight? Start with the design concepts.
Lightweight Batch Basic Architecture
At the highest abstraction, batch processing is a flow: read data, process data, write data, interacting with various storage media.
Abstract Process of Batch Architecture
Like ordinary applications, we first define component responsibilities using layering.
The layered structure consists of three main tiers: infrastructure layer, core processing layer, and application development layer.
The infrastructure layer provides generic read, write, and processing services, encapsulating operations on different media. The core processing layer handles execution, task abstraction, and control. The application development layer contains business code.
With this layering we model the processing object as a Job. A Job contains one or more Steps; each Step interacts with external media and produces results.
Batch processing works on collections of data (Batch). Readers operate per item, processors transform or filter, and writers usually handle batches.
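The read, process, write flow described above can be sketched in plain Java. This is a minimal illustration under our own assumptions, not code from any framework: the class ChunkLoop, its run method, and the chunkSize parameter are hypothetical names, and a processor that returns null is assumed to mean "filter this item out".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper, not part of any framework: a bare read-process-write loop.
class ChunkLoop {

    /**
     * Reads items one at a time, transforms or filters each one, and hands
     * the results to the writer in chunks of at most chunkSize items.
     * A processor that returns null drops the item (filtering).
     * Returns the number of items written.
     */
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<O> chunk = new ArrayList<>(chunkSize);
        int written = 0;
        while (reader.hasNext()) {
            O out = processor.apply(reader.next()); // per-item read and process
            if (out != null) {
                chunk.add(out);
            }
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // copy so the writer owns its list
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) { // flush the last, possibly smaller, chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<List<Integer>> sink = new ArrayList<>();
        // Parse each line, drop empty lines, write two items at a time.
        Function<String, Integer> parse = s -> s.isEmpty() ? null : Integer.valueOf(s);
        int n = run(List.of("1", "", "2", "3").iterator(), parse, sink::add, 2);
        System.out.println(n + " items written in chunks: " + sink);
    }
}
```

Reading stays per item while writing is batched, which is the asymmetry the paragraph above describes: the writer is called once per chunk rather than once per record.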
Robustness of Batch Processing
Any step may fail, so a batch system needs robustness mechanisms that let a job complete with as little manual intervention as possible.
Three common robustness strategies are ignore, retry, and restart.
Ignore means skipping records that raise non‑critical exceptions, such as number format errors. Retry handles transient failures, such as network errors or database lock contention, with a limited number of attempts. Restart applies to errors that retrying cannot fix, such as business‑level errors: the job stops, the code is fixed, and the job is run again.
A mature batch architecture combines these strategies and can switch dynamically, e.g., retry first, then restart after repeated failures.
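One way to combine the three strategies is sketched below in plain Java. Everything here is an assumption made for illustration: the class FaultTolerantRunner, the TransientException type, and the maxAttempts and skipLimit parameters are hypothetical, and NumberFormatException stands in for any exception a real job would classify as ignorable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; a sketch of the policy, not a framework API.
class FaultTolerantRunner {

    /** Stand-in for a transient failure such as a network timeout or lock wait. */
    static class TransientException extends RuntimeException {
        TransientException(String message) { super(message); }
    }

    /** Outcome of a run: processed results plus the items that were skipped. */
    static class Result<I, O> {
        final List<O> processed = new ArrayList<>();
        final List<I> skipped = new ArrayList<>();
    }

    static <I, O> Result<I, O> run(List<I> items, Function<I, O> step,
                                   int maxAttempts, int skipLimit) {
        Result<I, O> result = new Result<>();
        for (I item : items) {
            try {
                result.processed.add(withRetry(item, step, maxAttempts));
            } catch (NumberFormatException e) {
                // Ignore: a non-critical bad record is skipped, up to a limit.
                result.skipped.add(item);
                if (result.skipped.size() > skipLimit) {
                    throw new IllegalStateException("skip limit exceeded", e);
                }
            }
            // Any other exception, including an exhausted retry, propagates:
            // the job stops here and waits for a fix and a restart.
        }
        return result;
    }

    /** Retry: repeats the step on transient failures, at most maxAttempts times. */
    static <I, O> O withRetry(I item, Function<I, O> step, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return step.apply(item);
            } catch (TransientException e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted, escalate to the caller
                }
            }
        }
    }
}
```

The escalation order is the design choice worth noting: cheap recovery (retry) is tried first, skipping is reserved for exceptions known to be harmless, and anything else stops the run so it can be restarted after a fix.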
Lightweight Batch Framework: Spring Batch
Among lightweight batch frameworks, Spring Batch offers a complete solution built on Spring and Java.
Spring Batch implements the basic architecture, supports robustness, and provides built‑in readers and writers for files, databases, messaging middleware, and external services, as well as transformation and filtering.
It also supports common scenarios such as scheduled jobs, sequential dependent tasks, partial processing, transactional batch, and message integration.
Key interfaces correspond to the three steps:
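public interface ItemReader<T> {
    // Returns the next item, or null when the input is exhausted.
    T read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException;
}
public interface ItemProcessor<I, O> {
    // Returns the transformed item, or null to filter the item out.
    O process(I item) throws Exception;
}
public interface ItemWriter<T> {
    // Signature as of Spring Batch 4; Spring Batch 5 takes Chunk<? extends T> instead of List.
    void write(List<? extends T> items) throws Exception;
}
ItemReader and ItemWriter handle the various data sources; ItemProcessor performs transformation or filtering and can be customized.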
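To make the three roles concrete, here is a small sketch of one implementation of each. The interfaces are redeclared locally, simplified to "throws Exception", so that the example compiles without Spring Batch on the classpath; in a real project they would come from the framework. The classes InMemoryReader, AmountProcessor, CollectingWriter, and InterfaceDemo are hypothetical names of our own.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Local, simplified copies of the interfaces so the sketch is self-contained.
interface ItemReader<T> { T read() throws Exception; }
interface ItemProcessor<I, O> { O process(I item) throws Exception; }
interface ItemWriter<T> { void write(List<? extends T> items) throws Exception; }

/** Hypothetical reader over an in-memory list; null signals end of input. */
class InMemoryReader<T> implements ItemReader<T> {
    private final Iterator<T> source;
    InMemoryReader(List<T> items) { this.source = items.iterator(); }
    @Override public T read() { return source.hasNext() ? source.next() : null; }
}

/** Hypothetical processor: parses an amount and filters out non-positive values. */
class AmountProcessor implements ItemProcessor<String, Long> {
    @Override public Long process(String line) {
        long amount = Long.parseLong(line.trim());
        return amount > 0 ? Long.valueOf(amount) : null; // null means "drop this item"
    }
}

/** Hypothetical writer that collects what it receives, standing in for a file or table. */
class CollectingWriter<T> implements ItemWriter<T> {
    final List<T> written = new ArrayList<>();
    @Override public void write(List<? extends T> items) { written.addAll(items); }
}

class InterfaceDemo {
    /** Wires the three pieces together by hand, treating the input as a single chunk. */
    static List<Long> run(List<String> lines) {
        ItemReader<String> reader = new InMemoryReader<>(lines);
        ItemProcessor<String, Long> processor = new AmountProcessor();
        CollectingWriter<Long> writer = new CollectingWriter<>();
        try {
            List<Long> chunk = new ArrayList<>();
            for (String line = reader.read(); line != null; line = reader.read()) {
                Long out = processor.process(line);
                if (out != null) {
                    chunk.add(out);
                }
            }
            writer.write(chunk);
        } catch (Exception e) {
            throw new IllegalStateException("step failed", e);
        }
        return writer.written;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(" 100 ", "0", "250")));
    }
}
```

In an actual Spring Batch job the framework, not hand-written code like InterfaceDemo.run, drives this loop and decides where chunk boundaries fall.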
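For robustness, Spring Batch supports Skip, Retry, and Restart. Restart is possible because the framework separates a Job from its runs: a Job is defined once, each logical run of it with a given set of parameters is a JobInstance, and each attempt to execute that instance is a JobExecution. A failed JobInstance can therefore have several JobExecutions. This metadata is stored in a Job Repository, which can be in‑memory or JDBC‑backed.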
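The relationship between one definition and many executions can be illustrated with a toy in-memory model. This is not Spring Batch's implementation: ToyJobRepository, launch, and executionsOf are hypothetical names, and the rule that a completed instance cannot be launched again is modeled here as a plain IllegalStateException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model with hypothetical names; a real job repository records far more state.
class ToyJobRepository {
    enum Status { COMPLETED, FAILED }

    // One entry per logical run (job name + parameters); the value is its attempt history.
    private final Map<String, List<Status>> executionsByInstance = new HashMap<>();

    /** Runs the task as a new execution of the instance identified by name and params. */
    Status launch(String jobName, String params, Runnable task) {
        String instanceKey = jobName + "?" + params;
        List<Status> history =
                executionsByInstance.computeIfAbsent(instanceKey, k -> new ArrayList<>());
        if (history.contains(Status.COMPLETED)) {
            throw new IllegalStateException("instance already completed: " + instanceKey);
        }
        Status status;
        try {
            task.run();
            status = Status.COMPLETED;
        } catch (RuntimeException e) {
            status = Status.FAILED; // the instance stays restartable
        }
        history.add(status);
        return status;
    }

    /** Returns a copy of the attempt history for one instance, oldest first. */
    List<Status> executionsOf(String jobName, String params) {
        List<Status> history = executionsByInstance.get(jobName + "?" + params);
        return history == null ? new ArrayList<>() : new ArrayList<>(history);
    }
}
```

Launching the same job name and parameters after a failure adds a second execution to the same instance, which is the restart case; changing the parameters creates a new instance with its own history.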
In summary, batch processing consists of reading, processing, and writing massive data sets, requiring automation, robustness, reliability, and performance. Spring Batch provides a lightweight, open‑source framework that aligns with these design principles.
Conclusion
We analyzed lightweight batch processing from architecture to robustness and demonstrated how Spring Batch implements these concepts.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.