Master Spring Batch: Core Concepts, Architecture, and Best Practices
This article provides a comprehensive overview of Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, chunk processing, skip strategies, practical guidelines, and how to control job execution in a Spring application.
Spring Batch Overview
Spring Batch is a lightweight, comprehensive batch processing framework provided by Spring, designed for enterprise applications that require automated, high‑volume data processing without user interaction.
Automates large‑scale data handling, often time‑based (e.g., month‑end calculations, notifications).
Repeats complex business rules on massive datasets (e.g., insurance benefit calculations).
Integrates data from internal and external systems, formatting and validating it before persisting.
Spring Batch offers reusable features essential for large data volumes, including record tracking, transaction management, job statistics, restartability, skip logic, and resource management. It also provides advanced services such as partitioning for high‑throughput jobs.
Spring Batch Architecture
A typical batch job reads records from a database, file, or queue, processes them, and writes the results back.
The overall architecture consists of a Job composed of one or more Steps. A chunk-oriented step defines its own ItemReader, ItemProcessor, and ItemWriter. Execution metadata for jobs and steps is persisted in a JobRepository, and jobs are launched via a JobLauncher.
Core Concepts
Job
A Job represents the entire batch process. It is an interface with methods such as getName(), isRestartable(), and execute(JobExecution).
public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}
Jobs can be simple (SimpleJob) or flow-based. A typical Java configuration example:
@Bean
public Job footballJob() {
    // Steps run sequentially in the order they are declared.
    return jobBuilderFactory.get("footballJob")
            .start(playerLoad())
            .next(gameLoad())
            .next(playerSummarization())
            .build();
}
JobInstance
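A JobInstance identifies a logical run of a job: the job name combined with a specific set of identifying parameters. The interface below follows the JSR-352 (javax.batch.runtime) shape; in Spring Batch itself, JobInstance is a class.
public interface JobInstance {
    long getInstanceId();
    String getJobName();
}
JobParameters
JobParameters are used to distinguish different instances of the same job, for example by run date.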
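One way to picture how parameters separate instances is a toy registry keyed by job name plus parameters. The sketch below is plain Java, not Spring Batch code; the class JobInstanceSketch, the method instanceIdFor, and the runDate parameter name are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: maps (job name + parameters) to a stable instance id. */
class JobInstanceSketch {
    private final Map<String, Long> instanceIds = new HashMap<>();
    private long nextId = 1;

    /** Returns the same id for the same job name and parameters, a new id otherwise. */
    long instanceIdFor(String jobName, Map<String, String> parameters) {
        // TreeMap sorts the keys so that parameter order does not affect the lookup key.
        String key = jobName + "|" + new TreeMap<>(parameters);
        Long existing = instanceIds.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        instanceIds.put(key, id);
        return id;
    }
}
```

In this model, asking twice for footballJob with runDate=2024-01-31 returns the same instance id (two attempts at one logical run), while runDate=2024-02-01 produces a new one.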
JobExecution
JobExecution represents a single attempt to run a job, containing its status, start/end times, and the associated JobParameters. The simplified interface below mirrors the JSR-352 (javax.batch.runtime) contract; Spring Batch's own JobExecution is a class that exposes comparable information.
public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}
The BatchStatus enum includes STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED.
Step and StepExecution
A Step encapsulates a distinct phase of a job. Each step has a corresponding StepExecution that records its runtime details.
ExecutionContext
Both JobExecution and StepExecution have an ExecutionContext for storing key‑value data needed during processing.
ExecutionContext ecStep = stepExecution.getExecutionContext();
ExecutionContext ecJob = jobExecution.getExecutionContext();
JobRepository and JobLauncher
JobRepository persists metadata about jobs, steps, and their executions. JobLauncher starts a job with the given parameters.
public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
ItemReader, ItemWriter, ItemProcessor
ItemReader reads input data, ItemProcessor applies business logic, and ItemWriter writes the output. Spring Batch provides many implementations, e.g., JdbcPagingItemReader and JdbcCursorItemReader.
// CustomerCredit and customerCreditMapper() are assumed to be defined elsewhere in the configuration.
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    // Named parameter values bound into the paging query
    Map<String, Object> params = new HashMap<>();
    params.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(params)
            .rowMapper(customerCreditMapper())
            .pageSize(1000) // rows fetched per page, so the full result set is never held in memory
            .build();
}
Chunk Processing
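Chunk-oriented processing reads and processes items one at a time, collects them into a group (the chunk), and writes and commits the whole chunk in a single transaction once the configured chunk size is reached. Committing per chunk rather than per item reduces transaction overhead.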
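The read, process, and write phases can be sketched in plain Java. This is a toy model under stated assumptions, not Spring Batch's implementation: ChunkSketch and run are hypothetical names, an Iterator stands in for the ItemReader, a Function for the ItemProcessor, and each inner list in the result stands for one write call followed by one commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Toy model of a chunk-oriented step; not Spring Batch's implementation. */
class ChunkSketch {

    /**
     * Reads up to chunkSize items, processes them, then "writes" them as one group.
     * Each inner list in the result stands for one write call plus one commit.
     */
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<O>> committedChunks = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: pull items one at a time until the chunk is full or input ends.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: apply the business logic to each item.
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // Write phase: the whole chunk is handed over at once, then committed.
            committedChunks.add(processed);
        }
        return committedChunks;
    }
}
```

With seven input items and a chunk size of 3, this model produces three commits containing 3, 3, and 1 items; the last chunk is smaller because the input ran out.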
Skip Strategy and Failure Handling
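Skip settings control which exceptions can be tolerated during step execution: skip names exception types that may be skipped, noSkip names types that must always fail the step, and skipLimit caps the total number of skipped items. These options apply to fault-tolerant steps.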
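The interplay of a skippable exception, a non-skippable one, and the skip limit can be illustrated with a small plain-Java model. It is a sketch, not Spring Batch code: SkipSketch, Result, and run are hypothetical names, NumberFormatException plays the role of a skippable exception, and any other exception plays the non-skippable role.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of skip handling; not Spring Batch's implementation. */
class SkipSketch {

    /** Outcome of a run: what was written and how many records were skipped. */
    static final class Result {
        final List<Integer> written;
        final int skipCount;

        Result(List<Integer> written, int skipCount) {
            this.written = written;
            this.skipCount = skipCount;
        }
    }

    static Result run(List<String> records, int skipLimit) {
        List<Integer> written = new ArrayList<>();
        int skipCount = 0;
        for (String record : records) {
            try {
                written.add(Integer.parseInt(record.trim()));
            } catch (NumberFormatException e) {
                // Skippable error: tolerate it until the limit is used up.
                if (skipCount >= skipLimit) {
                    throw new IllegalStateException("Skip limit of " + skipLimit + " exceeded", e);
                }
                skipCount++;
            }
            // Any other exception (for example a NullPointerException from a null
            // record) is not caught here, so it fails the run immediately.
        }
        return new Result(written, skipCount);
    }
}
```

In this model a skip limit of 2 lets two malformed records pass and fails the run on the third, while a null record fails the run regardless of how much of the limit remains.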
Batch Operation Guidelines
Keep batch architecture simple; avoid overly complex logic in a single job.
Process data close to where it is stored to reduce I/O.
Minimize resource usage, especially I/O, by performing as much work in memory as possible.
Analyze SQL to avoid unnecessary scans and ensure proper indexing.
Allocate sufficient memory at job start to prevent runtime reallocations.
Assume worst‑case data integrity; add validation and checksums.
Perform stress testing with realistic data volumes.
Plan and test backup strategies for both databases and files.
Preventing Automatic Job Startup
Spring Boot runs all configured jobs on application startup by default. Set spring.batch.job.enabled=false in application.properties to disable this; jobs can then be started explicitly through a JobLauncher.
Handling Memory Exhaustion
If a reader loads all records at once, the JVM may run out of heap memory. Resolve by paging the reader or increasing the service’s memory allocation.
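The paging approach can be sketched in plain Java, in the spirit of the pageSize setting on the JdbcPagingItemReader shown earlier. This is an illustrative model only: PagingReaderSketch is a hypothetical class, and pageLoader is a hypothetical stand-in for a paged query that takes an offset and a limit and returns at most that many rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/** Toy paging reader: holds at most one page of rows in memory at a time. */
class PagingReaderSketch<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> pageLoader; // (offset, limit) -> rows
    private final int pageSize;
    private List<T> page = new ArrayList<>();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;

    PagingReaderSketch(BiFunction<Integer, Integer, List<T>> pageLoader, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.pageLoader = pageLoader;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // Current page is used up: replace it with the next one.
        page = pageLoader.apply(offset, pageSize);
        offset += page.size();
        indexInPage = 0;
        // A short (or empty) page means there is nothing left to load afterwards.
        if (page.size() < pageSize) {
            exhausted = true;
        }
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(indexInPage++);
    }
}
```

Because each new page replaces the previous one, memory use is bounded by the page size rather than by the total number of records, at the cost of one load call per page.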
Source: blog.csdn.net/topdeveloperr/article/details/84337956
