Master Spring Batch: Core Concepts, Architecture, and Best Practices

This article provides a comprehensive overview of Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, chunk processing, skip strategies, practical guidelines, and how to control job execution in a Spring application.

Java High-Performance Architecture

Spring Batch Overview

Spring Batch is a lightweight, comprehensive batch processing framework provided by Spring, designed for enterprise applications that require automated, high‑volume data processing without user interaction.

Automates large‑scale data handling, often time‑based (e.g., month‑end calculations, notifications).

Repeats complex business rules on massive datasets (e.g., insurance benefit calculations).

Integrates data from internal and external systems, formatting and validating it before persisting.

Spring Batch offers reusable features essential for large data volumes, including record tracking, transaction management, job statistics, restartability, skip logic, and resource management. It also provides advanced services such as partitioning for high‑throughput jobs.

Spring Batch Architecture

A typical batch job reads records from a database, file, or queue, processes them, and writes the results back.

The overall architecture consists of a Job composed of multiple Steps. Each chunk‑oriented step defines its own ItemReader, ItemProcessor, and ItemWriter. Metadata about jobs and their executions is persisted in a JobRepository, and jobs are launched via a JobLauncher.

Core Concepts

Job

A Job represents the entire batch process. It is an interface with methods such as getName(), isRestartable(), and execute(JobExecution).

public interface Job {
    String getName();
    boolean isRestartable();
    void execute(JobExecution execution);
    JobParametersIncrementer getJobParametersIncrementer();
    JobParametersValidator getJobParametersValidator();
}

Jobs can be simple (SimpleJob) or flow‑based. A typical Java configuration example:

@Bean
public Job footballJob() {
    // Steps run in sequence; a plain step sequence is finished with build() directly.
    return jobBuilderFactory.get("footballJob")
        .start(playerLoad())
        .next(gameLoad())
        .next(playerSummarization())
        .build();
}

JobInstance

A JobInstance identifies a logical execution of a job with a specific set of parameters.

public interface JobInstance {
    long getInstanceId();
    String getJobName();
}

JobParameters

JobParameters are used to distinguish different instances of the same job, such as a run date.
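The relationship between a job name, its parameters, and a job instance can be modelled in plain Java. The following is a minimal sketch, not Spring Batch code: the class JobInstanceRegistry and its method instanceFor are hypothetical names, and it assumes that every parameter is identifying.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Simplified model: a job instance is identified by job name plus parameters. */
class JobInstanceRegistry {
    private final Map<String, Long> instanceIds = new HashMap<>();
    private long nextId = 1;

    /** Returns the existing instance id for this name/parameter pair, or allocates a new one. */
    long instanceFor(String jobName, Map<String, String> parameters) {
        // Sorting the parameters makes the key independent of insertion order.
        String key = jobName + new TreeMap<String, String>(parameters).toString();
        Long existing = instanceIds.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        instanceIds.put(key, id);
        return id;
    }
}
```

Calling instanceFor twice with the same job name and the same run date yields the same id, while a different run date yields a new one.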

JobExecution

JobExecution represents a single attempt to run a job, containing status, start/end times, and the associated JobParameters.

public interface JobExecution {
    long getExecutionId();
    String getJobName();
    BatchStatus getBatchStatus();
    Date getStartTime();
    Date getEndTime();
    String getExitStatus();
    Date getCreateTime();
    Date getLastUpdatedTime();
    Properties getJobParameters();
}

The BatchStatus enum includes STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED.

Step and StepExecution

A Step encapsulates a distinct phase of a job. Each step has a corresponding StepExecution that records its runtime details.

ExecutionContext

Both JobExecution and StepExecution have an ExecutionContext for storing key‑value data needed during processing.

ExecutionContext ecStep = stepExecution.getExecutionContext();
ExecutionContext ecJob = jobExecution.getExecutionContext();
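One common use of this key‑value store is remembering how far a reader has progressed, so that a restarted execution can continue from that point. The sketch below models the idea in plain Java under stated assumptions: a Map stands in for the ExecutionContext, and the class ResumableListReader and the key "reader.position" are hypothetical names.

```java
import java.util.List;
import java.util.Map;

/** Simplified restartable reader; the Map plays the role of an ExecutionContext. */
class ResumableListReader {
    static final String POSITION_KEY = "reader.position"; // hypothetical key name

    private final List<String> items;
    private int position;

    ResumableListReader(List<String> items, Map<String, Object> context) {
        this.items = items;
        // Resume from the saved position if an earlier execution stored one.
        Object saved = context.get(POSITION_KEY);
        this.position = (saved == null) ? 0 : (Integer) saved;
    }

    /** Returns the next item, or null when the input is exhausted. */
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    /** Records progress so that a later execution can pick up here. */
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }
}
```

A second reader built over the same data and the same context map starts at the item after the last recorded position.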

JobRepository and JobLauncher

JobRepository persists metadata about jobs, steps, and their executions. JobLauncher starts a job with the given parameters.

public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
        throws JobExecutionAlreadyRunningException, JobRestartException,
               JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}

ItemReader, ItemWriter, ItemProcessor

ItemReader reads input data, ItemProcessor applies business logic, and ItemWriter writes the output. Spring Batch provides many implementations, e.g., JdbcPagingItemReader and JdbcCursorItemReader.

@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    Map<String, Object> params = new HashMap<>();
    params.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
        .name("creditReader")
        .dataSource(dataSource)
        .queryProvider(queryProvider)
        .parameterValues(params)
        .rowMapper(customerCreditMapper())
        .pageSize(1000)
        .build();
}

Chunk Processing

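Chunk‑oriented processing reads and processes items one at a time and collects them into a chunk. Once the number of items reaches the configured chunk size, the whole chunk is passed to the ItemWriter and the transaction is committed. Committing once per chunk rather than once per item reduces transaction overhead.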
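The read, process, and write cycle can be sketched in plain Java. This is a simplified model, not the framework's implementation: the class ChunkLoop is a hypothetical name, an Iterator stands in for the reader, and each call to the writer stands in for one transaction commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class ChunkLoop {
    /**
     * Reads and processes items one at a time and hands each full chunk to the writer.
     * Returns the number of writer calls, which stands in for the number of commits.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int commits = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // one write per full chunk
                chunk.clear();
                commits++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk)); // flush the final, partial chunk
            commits++;
        }
        return commits;
    }
}
```

With ten input items and a chunk size of three, the writer is called four times, receiving chunks of 3, 3, 3, and 1 items.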

Skip Strategy and Failure Handling

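Configure skipLimit, skip, and noSkip on a fault‑tolerant step to control which exceptions can be ignored during step execution: skip names the exception types that may be skipped, noSkip names types that must always fail the step, and skipLimit caps the total number of skipped items.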
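The effect of a skip limit can be illustrated with a small plain Java model. This is a sketch of the idea only, not Spring Batch's skip policy: the class SkipLoop and the exception SkipLimitExceeded are hypothetical names, and only processing failures are considered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class SkipLoop {
    /** Thrown when more items fail than the skip limit allows. */
    static class SkipLimitExceeded extends RuntimeException {
        SkipLimitExceeded(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Processes every item, skipping failures of the listed types up to skipLimit. */
    static <I, O> List<O> run(List<I> input, Function<I, O> processor,
                              Set<Class<? extends RuntimeException>> skippable,
                              int skipLimit) {
        List<O> output = new ArrayList<>();
        int skipped = 0;
        for (I item : input) {
            try {
                output.add(processor.apply(item));
            } catch (RuntimeException e) {
                boolean canSkip = skippable.stream().anyMatch(type -> type.isInstance(e));
                if (!canSkip) {
                    throw e; // not a skippable type: fail immediately
                }
                skipped++;
                if (skipped > skipLimit) {
                    throw new SkipLimitExceeded("skip limit of " + skipLimit + " exceeded", e);
                }
            }
        }
        return output;
    }
}
```

Parsing the inputs "1", "x", "3" with NumberFormatException marked as skippable and a limit of 1 produces the output 1 and 3. A second bad record would exceed the limit and fail the run.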

Batch Operation Guidelines

Keep batch architecture simple; avoid overly complex logic in a single job.

Process data close to where it is stored to reduce I/O.

Minimize resource usage, especially I/O, by performing as much work in memory as possible.

Analyze SQL to avoid unnecessary scans and ensure proper indexing.

Allocate sufficient memory at job start to prevent runtime reallocations.

Assume worst‑case data integrity; add validation and checksums.

Perform stress testing with realistic data volumes.

Plan and test backup strategies for both databases and files.

Preventing Automatic Job Startup

Set spring.batch.job.enabled=false in application.properties to stop jobs from running on application startup.
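With automatic startup disabled, a job can be started on demand through the JobLauncher interface shown earlier. The following is a sketch under stated assumptions: it needs Spring Boot with Spring Batch on the classpath, the class name ManualJobRunner and the parameter key "requestedAt" are hypothetical, and "footballJob" refers to the job bean from the earlier configuration example.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Component;

@Component
public class ManualJobRunner {

    private final JobLauncher jobLauncher;
    private final Job job;

    public ManualJobRunner(JobLauncher jobLauncher, @Qualifier("footballJob") Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    /** Starts the job once; call this from a scheduler, controller, or similar trigger. */
    public JobExecution runNow() throws Exception {
        // A changing parameter value makes each call create a new JobInstance.
        JobParameters parameters = new JobParametersBuilder()
            .addLong("requestedAt", System.currentTimeMillis())
            .toJobParameters();
        return jobLauncher.run(job, parameters);
    }
}
```

Using a timestamp as a parameter is one way to get a fresh instance on every call. A job that should run only once per business date would pass that date instead.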

Handling Memory Exhaustion

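If a reader loads all records at once, the JVM may run out of heap memory. Resolve this by using a paging reader (such as JdbcPagingItemReader), so that only one page of records is held in memory at a time, or by increasing the service’s memory allocation.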
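The paging approach can be sketched in plain Java. This is a simplified model rather than the JdbcPagingItemReader implementation: the class PagingReader is a hypothetical name, and the page fetcher is assumed to be a function of (offset, limit) that returns the matching rows, for example by running a query with LIMIT and OFFSET.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.BiFunction;

/** Simplified paging reader: buffers at most one page of records at a time. */
class PagingReader<T> {
    private final BiFunction<Integer, Integer, List<T>> pageFetcher; // (offset, limit) -> rows
    private final int pageSize;
    private final Deque<T> buffer = new ArrayDeque<>();
    private int offset = 0;
    private boolean exhausted = false;

    PagingReader(BiFunction<Integer, Integer, List<T>> pageFetcher, int pageSize) {
        this.pageFetcher = pageFetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next record, or null once the source has no more rows. */
    T read() {
        if (buffer.isEmpty() && !exhausted) {
            List<T> page = pageFetcher.apply(offset, pageSize);
            buffer.addAll(page);
            offset += page.size();
            exhausted = page.size() < pageSize; // a short page means the end was reached
        }
        return buffer.poll(); // null when nothing is left
    }
}
```

Memory use is then bounded by the page size rather than by the size of the whole result set.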

Source: blog.csdn.net/topdeveloperr/article/details/84337956
