
Master Spring Batch: From Basics to Advanced Architecture and Best Practices

This article introduces Spring Batch, a lightweight Java batch‑processing framework, explains its architecture and core concepts such as Job, Step, ItemReader/Writer/Processor, chunk handling, skip strategies, and provides practical guidelines for building reliable, high‑throughput batch jobs while avoiding common pitfalls like memory exhaustion.

Java High-Performance Architecture

Aug 1, 2022

Spring Batch Overview

Spring Batch is a lightweight, comprehensive batch processing framework provided by Spring, designed for enterprise applications that require high‑volume, automated, and reliable data processing without user interaction. Typical use cases include:

Automated processing of large data sets based on time‑driven events (e.g., month‑end calculations).

Periodic execution of complex business rules on massive data (e.g., insurance calculations).

Integration of data from internal and external systems, formatting, validation, and transactional writing.

Spring Batch offers reusable features such as transaction management, job restart, skip logic, and resource management, and it supports high‑throughput processing through partitioning and optimization techniques.

Spring Batch Architecture

A typical batch job reads records from a database, file, or queue, processes them, and writes the results back. The following diagram illustrates the overall flow.

The framework persists job metadata (JobInstance, JobExecution, StepExecution) in tables like batch_job_execution.

Core Concepts

Job

A Job represents the entire batch process. It is an interface with methods such as getName(), isRestartable(), and execute(JobExecution). Implementations include SimpleJob and FlowJob.

public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}

JobInstance

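A JobInstance identifies one logical run of a Job and is distinguished from other runs of the same Job by its identifying JobParameters. It exposes getInstanceId() and getJobName(). The interface shown here matches the JSR‑352 type javax.batch.runtime.JobInstance; in Spring Batch 4.x, org.springframework.batch.core.JobInstance is a concrete class that implements it.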

public interface JobInstance {
    long getInstanceId();
    String getJobName();
}

JobParameters

JobParameters hold the parameters that launch a JobInstance, allowing each run (e.g., daily EndOfDay job) to be uniquely identified.
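The relationship between a Job, its parameters, and a JobInstance can be illustrated with a small plain-Java model. This is only a sketch of the idea, not Spring Batch code: the class JobInstanceRegistry and the method instanceIdFor are hypothetical names, and the real framework keeps this mapping in the JobRepository tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model (not Spring Batch API): job name + parameters identify one instance. */
class JobInstanceRegistry {
    private final Map<String, Long> instances = new HashMap<>();
    private long nextId = 1;

    /** Returns the id already assigned to this job name and parameter set, or allocates a new one. */
    long instanceIdFor(String jobName, Map<String, String> parameters) {
        // A TreeMap yields the same key text regardless of the order parameters were added in.
        String key = jobName + "|" + new TreeMap<>(parameters);
        Long existing = instances.get(key);
        if (existing != null) {
            return existing; // same job + same parameters: same logical run
        }
        long id = nextId++;
        instances.put(key, id);
        return id;
    }
}
```

Calling instanceIdFor twice for an "endOfDay" job with the same date parameter yields the same id, while a different date yields a new one, which is how each daily run stays uniquely identifiable.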

JobExecution

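JobExecution represents a single attempt to run a Job. It provides the execution ID, job name, batch status, start/end times, exit status, and the associated job parameters. The simplified interface shown here follows the JSR‑352 type javax.batch.runtime.JobExecution; Spring Batch's own org.springframework.batch.core.JobExecution is a class with similar accessors, such as getStatus() and getJobParameters().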

public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}

Step and StepExecution

A Step is a distinct phase of a Job; StepExecution records each execution of a Step, including its ExecutionContext.

ExecutionContext

ExecutionContext stores key‑value pairs for a StepExecution or JobExecution, useful for restart data.
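To make the restart use case concrete, here is a minimal plain-Java sketch of a reader that saves its position into a key-value context and resumes from it. It is an illustration under assumptions, not the framework's implementation: a java.util.Map stands in for ExecutionContext, and RestartableListReader and the key "reader.position" are hypothetical names.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a reader that resumes from a position stored in a key-value context. */
class RestartableListReader {
    static final String POSITION_KEY = "reader.position"; // illustrative key name

    private final List<String> items;
    private int position;

    RestartableListReader(List<String> items, Map<String, Object> context) {
        this.items = items;
        // On a restart the context already holds the last saved position.
        Object saved = context.get(POSITION_KEY);
        this.position = (saved == null) ? 0 : (Integer) saved;
    }

    /** Returns the next item, or null when the input is exhausted. */
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    /** Saves the current position so a later run can continue from here. */
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }
}
```

If a run reads two items, calls update, and then fails, a new reader built with the same context continues with the third item instead of starting over.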

JobRepository and JobLauncher

JobRepository persists metadata for Jobs and Steps. JobLauncher starts a Job with given parameters.

public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}

ItemReader, ItemProcessor, ItemWriter

ItemReader reads input data, ItemProcessor applies business logic, and ItemWriter writes output. Spring Batch provides implementations such as JdbcPagingItemReader and JdbcCursorItemReader.

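import java.util.HashMap;
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.PagingQueryProvider;
import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
import org.springframework.context.annotation.Bean;

// Declared inside a @Configuration class. customerCreditMapper() is assumed to be
// a RowMapper<CustomerCredit> bean method defined elsewhere in the same class.
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    // Named parameters referenced by the paging query
    Map<String, Object> params = new HashMap<>();
    params.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
        .name("creditReader")
        .dataSource(dataSource)
        .queryProvider(queryProvider)
        .parameterValues(params)
        .rowMapper(customerCreditMapper())
        .pageSize(1000) // rows fetched per page
        .build();
}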

Chunk Processing

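Spring Batch processes records in chunks. Items are read and processed one at a time; once the number of items read reaches the chunk size (the commit interval), the whole chunk is handed to the ItemWriter and the transaction is committed.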
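The chunk loop can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not Spring Batch's actual step implementation: ChunkLoopSketch is a hypothetical class, an Iterator stands in for the ItemReader, a Function for the ItemProcessor, and a Consumer for the ItemWriter, and the "commit" is just a counter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch of chunk-oriented processing: read N items, process them, write once, commit. */
class ChunkLoopSketch {
    /** Runs the loop and returns the number of chunks committed. */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int commits = 0;
        while (true) {
            // Read up to chunkSize items.
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            if (inputs.isEmpty()) {
                break; // input exhausted
            }
            // Process each item; a null result means the item is filtered out.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                O output = processor.apply(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // Write the whole chunk in one call, then count one "commit".
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
            }
            commits++;
        }
        return commits;
    }
}
```

With five input items and a chunk size of 2, the writer is called three times (two full chunks and one partial chunk), so a failure in the third chunk would not roll back the first two.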

Skip Strategy and Failure Handling

skipLimit defines how many exceptions a Step may skip before failing. skip specifies which exception types can be skipped, while noSkip marks exception types that must cause failure.

Note: If skipLimit is not set, the default is 0.
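The interaction between a skip limit and a skippable exception type can be sketched in plain Java. This is an illustrative model only, not the framework's fault-tolerant step: SkipSketch and processWithSkips are hypothetical names, and the sketch handles a single skippable type for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of skip handling: tolerate up to skipLimit failures of one exception type. */
class SkipSketch {
    static <I, O> List<O> processWithSkips(List<I> items, Function<I, O> processor,
                                           Class<? extends RuntimeException> skippable,
                                           int skipLimit) {
        List<O> results = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            try {
                results.add(processor.apply(item));
            } catch (RuntimeException e) {
                // Fail if the exception is not skippable or the limit is already used up.
                if (!skippable.isInstance(e) || skipped >= skipLimit) {
                    throw e;
                }
                skipped++; // otherwise drop this item and continue
            }
        }
        return results;
    }
}
```

Parsing "1", "x", "3" as integers with NumberFormatException marked skippable and a limit of 1 yields 1 and 3; a second bad record would exceed the limit and fail the run. With a limit of 0, the first bad record already fails it.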

Batch Processing Guidelines

Simplify job logic and keep processing close to the data.

Minimize I/O and use in‑memory operations where possible.

Avoid redundant processing; allocate sufficient memory at startup.

Assume worst‑case data integrity and implement checksums.

Perform stress testing with realistic data volumes.

Plan backup strategies for both databases and files.
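As one possible reading of the checksum guideline above, a record-level check could look like the sketch below. It uses the JDK's java.util.zip.CRC32; the class name RecordChecksum and the idea of checksumming a delimited record line are assumptions made for illustration, not something the guidelines prescribe.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper: CRC-32 checksum over a record's text to detect corruption in transit. */
class RecordChecksum {
    /** Computes the CRC-32 of the record encoded as UTF-8. */
    static long crc32(String record) {
        CRC32 crc = new CRC32();
        crc.update(record.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Returns true when the record still matches the checksum computed by the producer. */
    static boolean matches(String record, long expectedChecksum) {
        return crc32(record) == expectedChecksum;
    }
}
```

A producer would store the checksum alongside each record (or per file), and the batch job would call matches before processing and reject or skip records that fail the check.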

Disabling Automatic Job Startup

Set spring.batch.job.enabled=false in application.properties to prevent jobs from running on application start.

Handling Out‑of‑Memory Errors

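When a reader loads all records at once, the JVM may run out of heap memory. Solutions include switching to a paging reader (such as JdbcPagingItemReader), so that only one page of records is held in memory at a time, or increasing the memory available to the service.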
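The paging approach can be sketched in plain Java to show why it bounds memory use. This is a simplified model, not how JdbcPagingItemReader is implemented: PagingReaderSketch is a hypothetical class, the page fetcher is an assumed (offset, limit) callback standing in for a database query, and offset-based paging is used here only for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch: a reader that holds at most one page of rows in memory. */
class PagingReaderSketch<T> {
    private final BiFunction<Integer, Integer, List<T>> pageFetcher; // (offset, limit) -> rows
    private final int pageSize;
    private List<T> page = new ArrayList<>();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    private int pagesFetched = 0;

    PagingReaderSketch(BiFunction<Integer, Integer, List<T>> pageFetcher, int pageSize) {
        this.pageFetcher = pageFetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next row, fetching a new page when the current one is used up; null at the end. */
    T read() {
        if (indexInPage >= page.size()) {
            if (exhausted) {
                return null;
            }
            page = pageFetcher.apply(offset, pageSize); // the previous page becomes garbage
            pagesFetched++;
            offset += page.size();
            indexInPage = 0;
            if (page.size() < pageSize) {
                exhausted = true; // a short page means there is no more data
            }
            if (page.isEmpty()) {
                return null;
            }
        }
        return page.get(indexInPage++);
    }

    int pagesFetched() {
        return pagesFetched;
    }
}
```

However large the underlying table is, the reader never holds more than pageSize rows at once, in contrast to a reader that materializes the full result set before returning the first item.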
