Master Spring Batch: Core Concepts, Architecture, and Best Practices
This article provides a comprehensive overview of Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, execution contexts, chunk processing, skip strategies, and practical tips for configuration and memory management.
Spring Batch Overview
Spring Batch is a lightweight, comprehensive batch processing framework from the Spring ecosystem, designed for building the robust batch applications that enterprise systems rely on for daily operations. It handles large‑scale data processing without user interaction, supports complex business rules, and integrates data from internal and external systems.
Spring Batch Architecture Overview
A typical batch application reads a large number of records from a database, file, or queue, processes the data, and writes the results back.
In Spring Batch, a Job is composed of one or more Steps. Each Step can define its own ItemReader, ItemProcessor, and ItemWriter. Jobs are launched via a JobLauncher, and metadata about each run is stored in a JobRepository.
Core Concepts of Spring Batch
What is a Job
A Job represents the entire batch process and is the top‑level abstraction. It contains one or more Steps and can be configured with listeners, restart policies, and parameters.
/**
 * Batch domain object representing a job. Job is an explicit abstraction
 * representing the configuration of a job specified by a developer.
 */
public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}
A simple implementation is SimpleJob, which provides default behavior.
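@Bean
public Job footballJob() {
    // A plain sequence of steps is built directly; end() is only needed
    // when the job is defined as a flow (for example after on(...).to(...)).
    return this.jobBuilderFactory.get("footballJob")
            .start(playerLoad())
            .next(gameLoad())
            .next(playerSummarization())
            .build();
}
What is a JobInstance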
A JobInstance uniquely identifies a job definition with a specific set of parameters.
public interface JobInstance {
    /** Get unique id for this JobInstance. */
    long getInstanceId();
    /** Get job name. */
    String getJobName();
}
What are JobParameters
JobParameters hold the values used to launch a job, allowing one JobInstance to be distinguished from another (e.g., by run date).
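The relationship between a job, its parameters, and a JobInstance can be sketched in plain Java. This is a simplified model rather than Spring Batch code: `JobKey` and `InstanceRegistry` are hypothetical names, and the real framework keeps this bookkeeping in the JobRepository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model: a job instance is identified by job name plus parameters. */
final class JobKey {
    final String jobName;
    final Map<String, String> parameters;

    JobKey(String jobName, Map<String, String> parameters) {
        this.jobName = jobName;
        // Defensive copy so later changes to the caller's map do not alter the key.
        this.parameters = new TreeMap<>(parameters);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof JobKey)) return false;
        JobKey other = (JobKey) o;
        return jobName.equals(other.jobName) && parameters.equals(other.parameters);
    }

    @Override
    public int hashCode() {
        return 31 * jobName.hashCode() + parameters.hashCode();
    }
}

/** Hypothetical registry that hands out one instance id per distinct JobKey. */
final class InstanceRegistry {
    private final Map<JobKey, Long> instances = new HashMap<>();
    private long nextId = 1;

    long instanceIdFor(String jobName, Map<String, String> parameters) {
        JobKey key = new JobKey(jobName, parameters);
        Long existing = instances.get(key);
        if (existing != null) {
            return existing; // same name and parameters: same instance
        }
        long id = nextId++;
        instances.put(key, id);
        return id;
    }
}
```

Launching the same job name twice with the same run date maps to one instance id, while changing the run date produces a new one.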
What is a JobExecution
A JobExecution represents a single attempt to run a job, containing status, start/end times, and the associated JobParameters.
// JSR-352 view (javax.batch.runtime.JobExecution); Spring Batch's own
// JobExecution is a class that exposes a similar set of accessors.
public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}
The batch status enum includes STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, and ABANDONED.
What is a Step
A Step encapsulates an independent phase of a batch job. Each Step can have its own reader, processor, and writer.
What is a StepExecution
A StepExecution records the runtime details of a Step, including its status, commit count, and timestamps.
What is an ExecutionContext
An ExecutionContext stores key‑value pairs for a Step or Job, enabling data sharing and restartability.
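How a key‑value context supports restartability can be illustrated with a small plain-Java model. The sketch below uses a `Map<String, Object>` as a stand-in for the context; `CheckpointingReader`, `RestartDemo`, and the key name `reader.position` are hypothetical and only illustrate the idea of saving progress and resuming from it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical reader that checkpoints its position in a key-value context. */
final class CheckpointingReader {
    static final String POSITION_KEY = "reader.position"; // illustrative key name

    private final List<String> source;
    private int position;

    CheckpointingReader(List<String> source, Map<String, Object> context) {
        this.source = source;
        // On restart, resume from the saved position; otherwise start at 0.
        Object saved = context.get(POSITION_KEY);
        this.position = saved == null ? 0 : (Integer) saved;
    }

    /** Returns the next item, or null when the source is exhausted. */
    String read() {
        return position < source.size() ? source.get(position++) : null;
    }

    /** Saves the current position so that a later run can resume from it. */
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }
}

final class RestartDemo {
    /** Reads at most maxItems, checkpointing after each one. */
    static List<String> run(List<String> source, Map<String, Object> context, int maxItems) {
        CheckpointingReader reader = new CheckpointingReader(source, context);
        List<String> seen = new ArrayList<>();
        String item;
        while (seen.size() < maxItems && (item = reader.read()) != null) {
            seen.add(item);
            reader.update(context);
        }
        return seen;
    }
}
```

If a first run stops after two of four items, a second run that receives the same context continues with the third item instead of starting over.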
Each scope has its own context, obtained from the corresponding execution:
ExecutionContext ecStep = stepExecution.getExecutionContext();
ExecutionContext ecJob = jobExecution.getExecutionContext();
What is a JobRepository
The JobRepository persists batch metadata (JobInstances, JobExecutions, and StepExecutions) and provides CRUD operations on it for the rest of the batch infrastructure.
What is a JobLauncher
The JobLauncher starts a Job with given parameters.
public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
What is an ItemReader
An ItemReader abstracts data input for a Step. Spring Batch offers many implementations such as JdbcPagingItemReader and JdbcCursorItemReader.
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    // Values bound to the named parameters used by the paging query
    Map<String, Object> parameterValues = new HashMap<>();
    parameterValues.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(parameterValues)
            .rowMapper(customerCreditMapper())
            .pageSize(1000) // rows fetched per page
            .build();
}
What is an ItemWriter
An ItemWriter abstracts data output. It can write one record at a time or a chunk of records.
What is an ItemProcessor
An ItemProcessor applies business logic between reading and writing; returning null filters the item out, so it is never passed to the ItemWriter.
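The null-return convention can be shown with a plain-Java sketch. It uses `java.util.function.Function` as a stand-in for the processor; `FilteringDemo` and `parsePositive` are hypothetical names, not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class FilteringDemo {
    /** Hypothetical processor: parses positive amounts, returns null to filter the rest. */
    static Integer parsePositive(String raw) {
        int value = Integer.parseInt(raw.trim());
        return value > 0 ? value : null; // null means "do not pass to the writer"
    }

    /** Applies the processor to each item and keeps only non-null results. */
    static <I, O> List<O> process(List<I> items, Function<I, O> processor) {
        List<O> output = new ArrayList<>();
        for (I item : items) {
            O result = processor.apply(item);
            if (result != null) {
                output.add(result);
            }
        }
        return output;
    }
}
```

Filtering this way is a normal outcome, not an error: the item is simply left out of the output and no exception is involved.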
Chunk Processing
Chunk processing groups a configurable number of items before committing them as a single transaction, improving performance.
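The read, process, and write cycle behind chunking can be sketched without the framework. The following is a minimal model, assuming an `Iterator` as the reader, a `Function` as the processor, and a `Consumer` of lists as the writer; `ChunkLoop` is a hypothetical name, and each call to the writer stands in for one transaction commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

final class ChunkLoop {
    /**
     * Reads items one at a time, processes each, and hands them to the writer
     * in groups of at most chunkSize. Returns the number of chunks written.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // one "commit" per full chunk
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer)); // final, partially filled chunk
            chunks++;
        }
        return chunks;
    }
}
```

With five input items and a chunk size of two, the writer is called three times: two full chunks and one final chunk holding the remaining item.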
Skip Strategy and Failure Handling
Skip policies allow a Step to ignore a limited number of exceptions. skipLimit() sets the maximum number of skips, skip() defines which exceptions can be skipped, and noSkip() excludes exceptions from being skipped.
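The interplay of a skip limit with skippable and non-skippable exceptions can be modeled in a few lines of plain Java. This is a sketch of the behavior, not the framework's implementation: `SkipDemo`, `processWithSkips`, and `TooManySkipsException` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class SkipDemo {
    /** Hypothetical exception raised when more items fail than the limit allows. */
    static final class TooManySkipsException extends RuntimeException {
        TooManySkipsException(int limit, Throwable cause) {
            super("skip limit of " + limit + " exceeded", cause);
        }
    }

    /**
     * Processes every item. Exceptions of the skippable type are counted and
     * the item is dropped; any other exception propagates immediately.
     */
    static <I, O> List<O> processWithSkips(List<I> items, Function<I, O> processor,
                                           Class<? extends RuntimeException> skippable,
                                           int skipLimit) {
        List<O> output = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            try {
                output.add(processor.apply(item));
            } catch (RuntimeException e) {
                if (!skippable.isInstance(e)) {
                    throw e; // not skippable: fail right away
                }
                if (++skipped > skipLimit) {
                    throw new TooManySkipsException(skipLimit, e);
                }
            }
        }
        return output;
    }
}
```

Parsing the inputs "1", "x", "3", "y" with `NumberFormatException` as the skippable type succeeds with a limit of two, and fails on the second bad record with a limit of one.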
Batch Processing Guidelines
Design the batch architecture to minimize complexity.
Keep data processing close to storage to reduce I/O.
Maximize in‑memory operations and limit unnecessary I/O.
Analyze SQL statements to avoid redundant scans and missing indexes.
Avoid duplicate work; aggregate data during the initial processing phase.
Allocate sufficient memory at startup to prevent runtime reallocations.
Assume worst‑case data integrity; add validation and checksums.
Conduct performance testing with realistic data volumes.
Plan and test backup strategies for both databases and files.
How to Prevent Job Auto‑Start
By default, Spring Boot runs all Job beans found in the application context on startup. To disable this behavior, add the following property:
spring.batch.job.enabled=false
Out‑of‑Memory When Reading Data
If a job reads all records at once without paging, the JVM may run out of heap memory, resulting in a "Resource exhaustion event". Solutions include implementing a paging ItemReader or increasing the JVM heap size.
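The paging approach can be sketched in plain Java to show why it bounds memory use. The model below keeps at most one page in memory at a time; `PagingReader` is a hypothetical class, and the `pageFetcher` function stands in for a query that takes an offset and a limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical reader that holds at most one page of records in memory. */
final class PagingReader<T> {
    private final BiFunction<Integer, Integer, List<T>> pageFetcher; // (offset, limit) -> rows
    private final int pageSize;
    private List<T> page = new ArrayList<>();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;

    PagingReader(BiFunction<Integer, Integer, List<T>> pageFetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.pageFetcher = pageFetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next record, or null once the source has no more rows. */
    T read() {
        if (indexInPage >= page.size()) {
            if (exhausted) {
                return null;
            }
            page = pageFetcher.apply(offset, pageSize); // replaces the previous page
            offset += page.size();
            indexInPage = 0;
            if (page.size() < pageSize) {
                exhausted = true; // a short page is the last one
            }
            if (page.isEmpty()) {
                return null;
            }
        }
        return page.get(indexInPage++);
    }
}
```

Memory use is then governed by the page size rather than by the total number of records, which is the same trade-off the `pageSize` setting controls in a paging ItemReader.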
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
