Master Spring Batch: From Basics to Advanced Architecture and Best Practices
This article introduces Spring Batch, a lightweight Java batch-processing framework. It explains the architecture and core concepts (Job, Step, ItemReader, ItemProcessor, ItemWriter, chunk handling, and skip strategies) and gives practical guidelines for building reliable, high-throughput batch jobs while avoiding common pitfalls such as memory exhaustion.
Spring Batch Overview
Spring Batch is a lightweight, comprehensive batch processing framework provided by Spring, designed for enterprise applications that require high-volume, automated, and reliable data processing without user interaction. Typical scenarios include:
Automated processing of large data sets based on time‑driven events (e.g., month‑end calculations).
Periodic execution of complex business rules on massive data (e.g., insurance calculations).
Integration of data from internal and external systems, formatting, validation, and transactional writing.
Spring Batch offers reusable features such as transaction management, job restart, skip logic, and resource management, and it supports high‑throughput processing through partitioning and optimization techniques.
Spring Batch Architecture
A typical batch job reads records from a database, file, or queue, processes them, and writes the results back. The following diagram illustrates the overall flow.
The framework persists job metadata (JobInstance, JobExecution, StepExecution) in tables like batch_job_execution.
Core Concepts
Job
A Job represents the entire batch process. It is an interface with methods such as getName(), isRestartable(), and execute(JobExecution). Implementations include SimpleJob and FlowJob.
public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}
JobInstance
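A JobInstance is one logical run of a Job, distinguished from other runs by its parameters. It exposes getInstanceId() and getJobName().
// Interface shape from JSR-352 (javax.batch.runtime.JobInstance);
// Spring Batch's own JobInstance is a concrete class.
public interface JobInstance {
    long getInstanceId();
    String getJobName();
}
JobParameters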
JobParameters hold the parameters that launch a JobInstance, allowing each run (e.g., daily EndOfDay job) to be uniquely identified.
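The relationship between a job name, its parameters, and the resulting instance can be sketched in plain Java. This is a framework-free model, not the Spring Batch API: the class names, the in-memory map standing in for the job repository, and the schedule.date parameter name are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Framework-free model (not the Spring Batch API): a job instance is
// identified by the job name plus the parameters it was launched with.
final class JobInstanceIdentityDemo {

    // Hypothetical key type used only for this illustration.
    static final class InstanceKey {
        final String jobName;
        final Map<String, String> parameters;

        InstanceKey(String jobName, Map<String, String> parameters) {
            this.jobName = jobName;
            this.parameters = new HashMap<>(parameters); // defensive copy
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof InstanceKey)) {
                return false;
            }
            InstanceKey that = (InstanceKey) other;
            return jobName.equals(that.jobName) && parameters.equals(that.parameters);
        }

        @Override
        public int hashCode() {
            return Objects.hash(jobName, parameters);
        }
    }

    // In-memory stand-in for the job repository: one id per distinct key.
    private final Map<InstanceKey, Long> instances = new HashMap<>();
    private long nextId = 1;

    // Returns the existing instance id for this name and parameters, or creates one.
    long instanceIdFor(String jobName, Map<String, String> parameters) {
        InstanceKey key = new InstanceKey(jobName, parameters);
        Long existing = instances.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        instances.put(key, id);
        return id;
    }

    // Builds a one-entry parameter map; "schedule.date" is a hypothetical name.
    static Map<String, String> params(String date) {
        Map<String, String> p = new HashMap<>();
        p.put("schedule.date", date);
        return p;
    }

    public static void main(String[] args) {
        JobInstanceIdentityDemo repo = new JobInstanceIdentityDemo();
        long first = repo.instanceIdFor("endOfDay", params("2024-01-01"));
        long again = repo.instanceIdFor("endOfDay", params("2024-01-01"));
        long nextDay = repo.instanceIdFor("endOfDay", params("2024-01-02"));
        System.out.println("same date, same instance: " + (first == again));
        System.out.println("new date, same instance: " + (first == nextDay));
    }
}
```

Launching the daily job with a new date yields a new instance, while repeating the same date resolves to the instance that already exists.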
JobExecution
JobExecution represents a single attempt to run a Job. It provides execution ID, job name, batch status, start/end times, exit status, and the associated JobParameters.
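// Interface shape from JSR-352 (javax.batch.runtime.JobExecution);
// Spring Batch's own JobExecution is a concrete class with a similar API.
public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}
Step and StepExecution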
A Step is a distinct phase of a Job; StepExecution records each execution of a Step, including its ExecutionContext.
ExecutionContext
ExecutionContext stores key‑value pairs for a StepExecution or JobExecution, useful for restart data.
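How stored key-value pairs make a restart possible can be sketched without the framework. The example below is a plain-Java model, not the Spring Batch API: a Map stands in for the ExecutionContext, and the class name and the reader.position key are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Framework-free model (not the Spring Batch API): a reader that stores its
// position in a key-value context so that a restarted run can resume from it.
final class RestartableReaderDemo {

    static final String POSITION_KEY = "reader.position"; // hypothetical key name

    private final List<String> source;
    private int position;

    // Opens at the position saved in the context, or at 0 on a fresh run.
    RestartableReaderDemo(List<String> source, Map<String, Object> context) {
        this.source = source;
        Object saved = context.get(POSITION_KEY);
        this.position = saved == null ? 0 : (Integer) saved;
    }

    // Returns the next item, or null when the input is exhausted.
    String read() {
        return position < source.size() ? source.get(position++) : null;
    }

    // Saves the current position into the context.
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d");
        Map<String, Object> context = new HashMap<>();

        // First run: read two items, save the state, then stop as if the job failed.
        RestartableReaderDemo firstRun = new RestartableReaderDemo(input, context);
        firstRun.read();
        firstRun.read();
        firstRun.update(context);

        // Restart: a new reader resumes from the saved position.
        RestartableReaderDemo restart = new RestartableReaderDemo(input, context);
        List<String> remaining = new ArrayList<>();
        for (String item = restart.read(); item != null; item = restart.read()) {
            remaining.add(item);
        }
        System.out.println("read after restart: " + remaining);
    }
}
```

Because the context outlives the failed run, the restarted reader skips the items that were already handled instead of starting from the beginning.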
JobRepository and JobLauncher
JobRepository persists metadata for Jobs and Steps. JobLauncher starts a Job with given parameters.
public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
ItemReader, ItemProcessor, ItemWriter
ItemReader reads input data, ItemProcessor applies business logic, and ItemWriter writes output. Spring Batch provides implementations such as JdbcPagingItemReader and JdbcCursorItemReader.
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    // Values bound to the named parameters of the paging query
    Map<String, Object> params = new HashMap<>();
    params.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(params)
            .rowMapper(customerCreditMapper())
            .pageSize(1000) // rows fetched per page
            .build();
}
Chunk Processing
Spring Batch processes records in chunks. Items are read and processed one at a time; once the number of items reaches the chunk size, the whole chunk is written and the transaction is committed.
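The read, process, and write-per-chunk cycle can be sketched as a simple loop. This is a framework-free model, not the Spring Batch implementation: the class and method names are hypothetical, an Iterator stands in for the reader, and each returned inner list stands for one write followed by one commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Framework-free model (not the Spring Batch API) of chunk-oriented processing:
// read and process items one at a time, then write each full chunk in one go.
final class ChunkLoopDemo {

    // Each inner list of the result stands for one write call and one commit.
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        List<List<O>> committedChunks = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next())); // read one item, process it
            if (chunk.size() == chunkSize) {           // chunk is full: write and commit
                committedChunks.add(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {                        // final, possibly smaller chunk
            committedChunks.add(chunk);
        }
        return committedChunks;
    }

    public static void main(String[] args) {
        Iterator<Integer> reader = Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator();
        List<List<Integer>> chunks = run(reader, n -> n * 10, 3);
        System.out.println("commits: " + chunks.size());
        System.out.println("chunks: " + chunks);
    }
}
```

A larger chunk size means fewer commits but more work lost and redone when a chunk fails, so the value is usually tuned against real data volumes.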
Skip Strategy and Failure Handling
skipLimit defines how many items a Step may skip before it fails. skip specifies which exception types may be skipped, while noSkip marks exception types that must always cause the Step to fail.
Note: If skipLimit is not set, the default is 0.
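The effect of a skip limit can be illustrated with a small plain-Java model. It is not the Spring Batch implementation: the class name is hypothetical, NumberFormatException plays the role of a skippable exception, and IllegalStateException stands in for the failure raised once the limit is used up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Framework-free model (not the Spring Batch API) of a skip limit:
// skippable failures are tolerated until the limit is used up.
final class SkipLimitDemo {

    // Parses each line; NumberFormatException is treated as the skippable exception.
    static List<Integer> parseAll(List<String> lines, int skipLimit) {
        List<Integer> parsed = new ArrayList<>();
        int skipCount = 0;
        for (String line : lines) {
            try {
                parsed.add(Integer.parseInt(line));
            } catch (NumberFormatException e) {
                if (skipCount >= skipLimit) {
                    // Limit exhausted: fail instead of skipping again.
                    throw new IllegalStateException("Skip limit of " + skipLimit + " exceeded", e);
                }
                skipCount++; // bad record is skipped and processing continues
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1", "x", "2", "y", "3");

        // Two bad lines fit within a limit of 2, so the run completes.
        System.out.println("parsed: " + parseAll(lines, 2));

        // With a limit of 1, the second bad line fails the run.
        try {
            parseAll(lines, 1);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model a limit of 0 means the first bad record already fails the run, which is why a step that should tolerate bad input needs an explicit limit.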
Batch Processing Guidelines
Simplify job logic and keep processing close to the data.
Minimize I/O and use in‑memory operations where possible.
Avoid redundant processing; allocate sufficient memory at startup.
Assume worst‑case data integrity and implement checksums.
Perform stress testing with realistic data volumes.
Plan backup strategies for both databases and files.
Disabling Automatic Job Startup
Set spring.batch.job.enabled=false in application.properties to prevent jobs from running on application start.
Handling Out‑of‑Memory Errors
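When a reader loads all records at once, the JVM may run out of heap memory. Solutions include switching to a paging reader (such as JdbcPagingItemReader) so that only one page of records is held in memory at a time, or increasing the memory available to the service.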
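Why paging keeps memory use bounded can be shown with a framework-free sketch. This is not how JdbcPagingItemReader is implemented: the class name is hypothetical, a List stands in for the database table, and fetchPage() stands in for a query that returns a single page of rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Framework-free model (not the Spring Batch API) of a paging reader:
// items are handed out one by one, but fetched one page at a time.
final class PagingReaderDemo {

    private final List<Integer> table;   // stand-in for a database table
    private final int pageSize;
    private int nextOffset = 0;
    private int pagesFetched = 0;
    private Iterator<Integer> currentPage = Collections.emptyIterator();

    PagingReaderDemo(List<Integer> table, int pageSize) {
        this.table = table;
        this.pageSize = pageSize;
    }

    // Stand-in for a query that returns one page of rows.
    private List<Integer> fetchPage() {
        int end = Math.min(nextOffset + pageSize, table.size());
        List<Integer> page = new ArrayList<>(table.subList(nextOffset, end));
        nextOffset = end;
        pagesFetched++;
        return page;
    }

    // Returns the next item, or null when no rows are left.
    Integer read() {
        if (!currentPage.hasNext()) {
            if (nextOffset >= table.size()) {
                return null;
            }
            currentPage = fetchPage().iterator(); // only this page is held in memory
        }
        return currentPage.next();
    }

    int getPagesFetched() {
        return pagesFetched;
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            table.add(i);
        }
        PagingReaderDemo reader = new PagingReaderDemo(table, 4);
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        System.out.println("items read: " + count + ", pages fetched: " + reader.getPagesFetched());
    }
}
```

The caller still sees a simple item-by-item read, while the number of rows in memory never exceeds the page size.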
