Master Spring Batch: Core Concepts, Architecture, and Practical Tips
This article provides a comprehensive guide to Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, chunk processing, skip strategies, configuration tips, and common memory issues, all illustrated with code examples and diagrams.
Spring Batch Overview
Spring Batch is a lightweight, comprehensive batch‑processing framework built on Spring. It is intended for enterprise applications that must process large volumes of data without user interaction, such as end‑of‑month calculations, insurance benefit determinations, or daily transaction processing.
Architecture
A typical batch job reads records from a database, file, or queue, processes them, and writes the results back. Spring Batch persists job metadata (job definitions, executions, step executions) in tables such as batch_job_execution, batch_job_instance, and batch_step_execution.
Core Concepts
Job
A Job is the top-level container that groups one or more Step objects. The Job interface exposes the job's name, whether it is restartable, an execute method, and its parameter incrementer and validator.
public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}
JobInstance
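A JobInstance represents a logical run of a job, identified by the job name and its identifying parameters. It exposes a unique instance ID and the job name. The signature below follows the JSR-352 (javax.batch.runtime) interface; in Spring Batch itself, JobInstance is a class that carries the same information.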
public interface JobInstance {
    long getInstanceId();
    String getJobName();
}
JobParameters
Parameters supplied at launch (e.g., a date) differentiate multiple instances of the same job definition. They are stored as a map of key‑value pairs and are used to locate or create a JobInstance.
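The lookup-or-create behaviour described above can be sketched in plain Java. This is a conceptual model only, not the Spring Batch API: the class InstanceRegistry, its method findOrCreate, and the job name and parameter values are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model (not the Spring Batch API): a job instance is looked up by
// job name plus parameters, and created only if no match exists yet.
final class InstanceRegistry {
    private final Map<String, Long> instanceIds = new LinkedHashMap<>();
    private long nextId = 1;

    // Returns the existing instance ID for this name/parameter combination,
    // or assigns a new ID the first time the combination is seen.
    long findOrCreate(String jobName, Map<String, String> parameters) {
        // A TreeMap gives the same key regardless of parameter insertion order.
        String key = jobName + new TreeMap<>(parameters);
        Long existing = instanceIds.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        instanceIds.put(key, id);
        return id;
    }

    public static void main(String[] args) {
        InstanceRegistry registry = new InstanceRegistry();
        long first = registry.findOrCreate("endOfDay", Map.of("runDate", "2024-01-01"));
        long again = registry.findOrCreate("endOfDay", Map.of("runDate", "2024-01-01"));
        long next = registry.findOrCreate("endOfDay", Map.of("runDate", "2024-01-02"));
        System.out.println(first == again); // same parameters, same instance
        System.out.println(first == next);  // different date, different instance
    }
}
```

Launching the same job definition with a new date therefore yields a new instance, while repeating an earlier date resolves to the instance that already exists.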
JobExecution
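A JobExecution records a single attempt to run a JobInstance. It stores the status, start and end timestamps, exit status and the parameters used. As with JobInstance, the signature below follows the JSR-352 interface; Spring Batch's own JobExecution is a class holding the same information.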
public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}
The BatchStatus enum includes STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, and ABANDONED.
public enum BatchStatus {STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED}
Step and StepExecution
A Step encapsulates a distinct phase of a job. Each time a Step runs, a new StepExecution is created; it records read, write, commit and rollback counts, start and end times, and holds an ExecutionContext for state that must survive a restart.
ExecutionContext
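An ExecutionContext is a key-value store attached to a StepExecution or JobExecution. The framework persists it, so state saved there (for example, a reader's position) is available again when a failed job is restarted.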
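How saved state lets a restart pick up where the previous run stopped can be sketched as follows. This is a plain-Java model under stated assumptions, not the Spring Batch API: a HashMap stands in for the persisted context, and RestartableReader, saveState and the key "reader.position" are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model (not the Spring Batch API): a reader stores its position in
// a key-value context so that a restarted run resumes instead of starting over.
final class RestartableReader {
    static final String POSITION_KEY = "reader.position"; // hypothetical key name

    private final List<String> items;
    private int position;

    RestartableReader(List<String> items, Map<String, Object> context) {
        this.items = items;
        // On a restart the saved position is present; on a first run it is not.
        this.position = (Integer) context.getOrDefault(POSITION_KEY, 0);
    }

    // Returns the next item, or null when the input is exhausted.
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // Called at a commit point to record how far the reader has got.
    void saveState(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c", "d");
        Map<String, Object> context = new HashMap<>(); // stands in for the persisted context

        RestartableReader firstRun = new RestartableReader(input, context);
        firstRun.read();
        firstRun.read();
        firstRun.saveState(context); // state is saved, then the run "fails"

        RestartableReader restarted = new RestartableReader(input, context);
        System.out.println(restarted.read()); // resumes at the third item
    }
}
```

In the framework, the two contexts are obtained from the execution objects: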
ExecutionContext stepEc = stepExecution.getExecutionContext();
ExecutionContext jobEc = jobExecution.getExecutionContext();
JobRepository and JobLauncher
JobRepository persists jobs, steps and executions. JobLauncher starts a job with the given parameters.
public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
ItemReader, ItemProcessor, ItemWriter
These abstractions handle input, transformation and output within a step. Spring Batch ships many implementations, such as JdbcPagingItemReader, JdbcCursorItemReader and FlatFileItemReader.
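@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource,
                                                       PagingQueryProvider queryProvider) {
    // Values for the named parameters referenced by the paging query.
    Map<String, Object> params = new HashMap<>();
    params.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(params)
            // customerCreditMapper() is assumed to return a RowMapper<CustomerCredit> defined elsewhere.
            .rowMapper(customerCreditMapper())
            // Fetch 1000 rows per page instead of loading the whole result set.
            .pageSize(1000)
            .build();
}
Chunk Processing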
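Spring Batch groups items into chunks (e.g., size 10). Items are read one at a time and processed; once the number of items reaches the commit interval, the whole chunk is written and the transaction is committed. Committing per chunk rather than per item reduces I/O and transaction overhead.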
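The read, process, write cycle can be illustrated with a small plain-Java model. It is a sketch of the idea rather than the framework's implementation: ChunkLoop and run are hypothetical names, an Iterator stands in for the reader, and appending a list to the result stands in for one write call followed by a commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-Java model (not the Spring Batch API) of a chunk-oriented step:
// read up to chunkSize items, process each one, then write them together.
final class ChunkLoop {
    // Returns the chunks in the order they were "written"; each inner list
    // stands in for one write call inside one transaction.
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            // 1. Read items one at a time until the chunk is full or input ends.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // 2. Process each item of the chunk.
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // 3. Write the whole chunk at once; a commit would happen here.
            written.add(processed);
        }
        return written;
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 4, 5);
        System.out.println(run(input.iterator(), n -> n * 10, 2)); // [[10, 20], [30, 40], [50]]
    }
}
```

With five items and a chunk size of two, the model performs three writes, the last one holding the single leftover item.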
Skip Strategy and Failure Handling
On a fault-tolerant step (enabled with faultTolerant()), configure skipLimit(), skip() and noSkip() to control which exceptions may be skipped and how many skips are allowed before the step fails.
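The effect of a skip limit can be modelled in plain Java. The sketch below does not use the Spring Batch builders; SkipDemo and parseAll are hypothetical names, and parsing strings to integers is only a stand-in for real item processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain-Java model (not the Spring Batch API) of skip handling: skippable
// exceptions are tolerated up to a limit, anything else fails immediately.
final class SkipDemo {
    static List<Integer> parseAll(List<String> lines,
                                  Set<Class<? extends RuntimeException>> skippable,
                                  int skipLimit) {
        List<Integer> parsed = new ArrayList<>();
        int skipped = 0;
        for (String line : lines) {
            try {
                parsed.add(Integer.parseInt(line));
            } catch (RuntimeException e) {
                boolean canSkip = skippable.stream().anyMatch(type -> type.isInstance(e));
                if (!canSkip || skipped >= skipLimit) {
                    throw e; // not skippable, or limit exhausted: the step fails
                }
                skipped++;   // tolerated: drop the bad record and carry on
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Set<Class<? extends RuntimeException>> skippable = Set.of(NumberFormatException.class);
        System.out.println(parseAll(List.of("1", "x", "3"), skippable, 1)); // [1, 3]
    }
}
```

With a limit of one, a single malformed record is dropped and processing continues; a second malformed record would rethrow the exception and fail the run.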
Preventing Automatic Job Startup
Set spring.batch.job.enabled=false in application.properties to stop jobs from running on application start.
spring.batch.job.enabled=false
Memory Exhaustion
If a reader loads all records at once, the JVM may run out of heap memory. Use a paging reader (e.g., JdbcPagingItemReader) or increase the JVM heap size.
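Why paging bounds memory use can be seen in a small plain-Java model. It is only a sketch, not JdbcPagingItemReader itself: PagedReader, fetchPage and pagesFetched are hypothetical names, and a range of integers stands in for the rows of a table.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Plain-Java model (not the Spring Batch API) of paged reading: only one page
// of records is held in memory at a time, however large the source is.
final class PagedReader {
    private final int totalRows; // size of the simulated table
    private final int pageSize;
    private List<Integer> page = List.of();
    private int indexInPage = 0;
    private int nextOffset = 0;
    int pagesFetched = 0;

    PagedReader(int totalRows, int pageSize) {
        this.totalRows = totalRows;
        this.pageSize = pageSize;
    }

    // Stands in for a query that returns one page of rows.
    private List<Integer> fetchPage(int offset) {
        int end = Math.min(offset + pageSize, totalRows);
        pagesFetched++;
        return IntStream.range(offset, end).boxed().collect(Collectors.toList());
    }

    // Returns the next record, or null when the source is exhausted.
    Integer read() {
        if (indexInPage == page.size()) {
            if (nextOffset >= totalRows) {
                return null;
            }
            page = fetchPage(nextOffset); // replaces the previous page
            nextOffset += page.size();
            indexInPage = 0;
        }
        return page.get(indexInPage++);
    }

    public static void main(String[] args) {
        PagedReader reader = new PagedReader(2500, 1000);
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        System.out.println(count + " rows in " + reader.pagesFetched + " pages");
    }
}
```

The reader never holds more than pageSize records, so heap use depends on the page size rather than on the total number of rows.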
Best Practices
Design a simple, maintainable batch architecture.
Keep processing close to the data source to reduce I/O.
Cache data when possible to minimize repeated reads.
Avoid duplicate work; aggregate results during the initial processing.
Allocate sufficient heap memory at startup.
Assume worst‑case data integrity; add validation and checksums.
Perform stress testing with realistic data volumes.
Plan and test backup strategies for both databases and files.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
