Spring Batch Basics: Building Efficient Spring Boot Batch Jobs

This article explains why naive for‑loop DB operations fail on large data sets, introduces Spring Batch's chunk, transaction, retry, skip, and restart features, and walks through Spring Boot configuration, Tasklet and Chunk job samples, database and CSV readers/writers, and manual or scheduled job triggering.

Java Tech Workshop

Problems with naive for‑loop batch implementations

Loading millions of rows and writing them back one at a time inside a single for loop causes several problems:

Out‑of‑memory (OOM) errors, because the whole data set is held in memory at once.

Very high I/O, because each record is written individually.

No checkpoint: an exception midway loses all progress and the run has to start over.

No retry or skip mechanism, so a single dirty record fails the whole task.

No persisted task status, which makes monitoring and reruns difficult.

Spring Batch core concepts

Batch refers to large‑volume, offline, non‑real‑time, repeatable data tasks such as nightly reconciliation, CSV/Excel import‑export, data migration, archiving, and bulk messaging.

Spring Batch provides a four‑layer model: Job → Step → Execution logic → Context. A Job contains one or more sequential Steps. Steps can be implemented as:

Tasklet: a single action for simple jobs (e.g., table truncation).

Chunk: the read‑process‑write pattern for massive data handling.

Chunk processing follows a three‑phase flow: ItemReader → ItemProcessor → ItemWriter. Each chunk reads N items one at a time, processes them, writes the results as one batch, and commits a single transaction. Memory use is bounded by one chunk, and each chunk succeeds or rolls back as a unit.
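The read, process, write cycle can be modelled in plain Java. The sketch below is an illustration only: `ChunkLoopSketch` and `run` are hypothetical names, an `Iterator` stands in for the reader and a `Function` for the processor, and nothing here is Spring Batch API. It returns how many items reach the writer in each chunk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-Java model of a chunk-oriented step; none of these names are Spring Batch API.
final class ChunkLoopSketch {

    /**
     * Reads up to chunkSize items, processes each one (null means "filter out"),
     * hands the survivors to the writer as one batch, and repeats until the source
     * is exhausted. Returns the batch size seen by the writer for every chunk.
     */
    static <I, O> List<Integer> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> writtenPerChunk = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: pull at most chunkSize items into memory.
            List<I> chunk = new ArrayList<>(chunkSize);
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(reader.next());
            }
            // Process phase: transform items and drop the ones mapped to null.
            List<O> output = new ArrayList<>();
            for (I item : chunk) {
                O result = processor.apply(item);
                if (result != null) {
                    output.add(result);
                }
            }
            // Write phase: one batched write; a real step would commit the transaction here.
            writtenPerChunk.add(output.size());
        }
        return writtenPerChunk;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            source.add(i);
        }
        // Keep even numbers only, in chunks of 10.
        System.out.println(run(source.iterator(), n -> n % 2 == 0 ? n : null, 10)); // [5, 5, 2]
    }
}
```

Only one chunk of items is held in memory at a time, which is the property the chunk model relies on.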

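When schema initialization is enabled, Spring Batch creates its metadata tables (six BATCH_* tables, plus three sequence tables on databases without native sequences such as MySQL) to record job name, job parameters, status, start/end time, progress, and failure point. This metadata is what enables restart, protection against duplicate runs, and monitoring.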

Environment setup & configuration

Maven core dependencies

<!-- Spring Boot core -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter</artifactId>
</dependency>

<!-- Spring Batch core -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

<!-- JDBC for metadata persistence -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>

<!-- MySQL driver (runtime) -->
<dependency>
  <groupId>com.mysql</groupId>
  <artifactId>mysql-connector-j</artifactId>
  <scope>runtime</scope>
</dependency>

<!-- Druid connection pool -->
<dependency>
  <groupId>com.alibaba</groupId>
  <artifactId>druid-spring-boot-starter</artifactId>
  <version>1.2.16</version>
</dependency>

<!-- CSV utilities -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-csv</artifactId>
  <version>1.10.0</version>
</dependency>

<!-- Lombok (optional) -->
<dependency>
  <groupId>org.projectlombok</groupId>
  <artifactId>lombok</artifactId>
  <optional>true</optional>
</dependency>

application.yml configuration

spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/batch_db?useUnicode=true&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: root

  batch:
    job:
      enabled: false   # disable auto‑run, allow manual or scheduled launch
    jdbc:
      initialize-schema: always   # create the metadata tables on startup

Main application class

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// With Spring Boot 3 / Spring Batch 5 (the API used throughout this article), do not add
// @EnableBatchProcessing: it makes Boot's batch auto-configuration back off, so the
// spring.batch.* properties above (including schema initialization) would no longer apply.
@SpringBootApplication
public class BatchApplication {
    public static void main(String[] args) {
        SpringApplication.run(BatchApplication.class, args);
    }
}

Tasklet – simple one‑step job

Custom Tasklet

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.stereotype.Component;

@Component
public class CleanTempDataTask implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        System.out.println("[Tasklet] Execute temporary data cleanup and cache refresh");
        // business logic: truncate temp tables, delete expired data, refresh configs, etc.
        return RepeatStatus.FINISHED;
    }
}

Job and Step assembly

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class TaskletJobConfig {
    @Autowired
    private CleanTempDataTask cleanTempDataTask;

    @Bean
    public Step cleanDataStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
        return new StepBuilder("clean-data-step", jobRepository)
                .tasklet(cleanTempDataTask, transactionManager)
                .build();
    }

    @Bean
    public Job cleanDataJob(JobRepository jobRepository, Step cleanDataStep) {
        return new JobBuilder("clean-data-job", jobRepository)
                .start(cleanDataStep)
                .build();
    }
}

Chunk – standard batch processing

Domain model

CREATE TABLE user_info (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(32),
    age INT,
    email VARCHAR(64),
    status TINYINT
);

import lombok.Data;

@Data
public class UserInfo {
    private Long id;
    private String username;
    private Integer age;
    private String email;
    private Integer status;
}

ItemReader – in‑memory example

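import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.ArrayList;
import java.util.List;

@Configuration
public class UserReaderConfig {
    // Step scope builds a fresh reader for each step execution; a singleton
    // ListItemReader would already be drained when the job is launched a second time.
    @Bean
    @StepScope
    public ItemReader<UserInfo> userInfoReader() {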
        List<UserInfo> list = new ArrayList<>();
        for (int i = 1; i <= 50; i++) {
            UserInfo user = new UserInfo();
            user.setUsername("user_" + i);
            user.setAge(20 + i % 10);
            user.setEmail("user" + i + "@qq.com");
            user.setStatus(1);
            list.add(user);
        }
        return new ListItemReader<>(list);
    }
}

ItemProcessor – filtering & transformation

import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
public class UserInfoProcessor implements ItemProcessor<UserInfo, UserInfo> {
    @Override
    public UserInfo process(UserInfo user) {
        // Filter: discard records with age > 25
        if (user.getAge() > 25) {
            return null;
        }
        // Transform
        user.setUsername(user.getUsername().toUpperCase());
        user.setStatus(2);
        return user;
    }
}

ItemWriter – console output (placeholder for DB batch write)

import org.springframework.batch.item.Chunk;
import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component
public class UserInfoWriter implements ItemWriter<UserInfo> {
    // Spring Batch 5 passes a Chunk to write(); the List-based signature belongs to 4.x.
    @Override
    public void write(Chunk<? extends UserInfo> chunk) {
        System.out.println("[Batch write] Items count: " + chunk.size());
        chunk.forEach(System.out::println);
        // Real implementation would batch insert/update the DB
    }
}

Chunk job assembly

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class UserChunkJobConfig {
    @Autowired
    private ItemReader<UserInfo> userInfoReader;
    @Autowired
    private UserInfoProcessor userInfoProcessor;
    @Autowired
    private UserInfoWriter userInfoWriter;

    @Bean
    public Step userChunkStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
        return new StepBuilder("user-chunk-step", jobRepository)
                .<UserInfo, UserInfo>chunk(10, transactionManager)
                .reader(userInfoReader)
                .processor(userInfoProcessor)
                .writer(userInfoWriter)
                .faultTolerant()
                .retry(Exception.class).retryLimit(3)
                .skip(Exception.class).skipLimit(100)
                .build();
    }

    @Bean
    public Job userChunkJob(JobRepository jobRepository, Step userChunkStep) {
        return new JobBuilder("user-chunk-job", jobRepository)
                .start(userChunkStep)
                .build();
    }
}

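chunk(10, transactionManager) means each transaction reads and processes ten items and writes the surviving items in one batch. Ten keeps the demo output readable; a larger chunk size reduces commit overhead at the cost of more memory per chunk and more rework when a chunk rolls back.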
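As a sanity check on what the sample job should print, the arithmetic can be reproduced without Spring. The sketch below assumes the reader data and filter rule shown earlier (50 users, age 20 + i % 10, records with age > 25 discarded) and that no exception occurs; `SampleJobWriteCounts` is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone arithmetic check for the sample job; it does not use Spring Batch.
final class SampleJobWriteCounts {

    /** Mirrors the sample reader: user i gets age 20 + i % 10. */
    static int ageOf(int i) {
        return 20 + i % 10;
    }

    /** Counts, per chunk of chunkSize reads, how many users pass the age <= 25 filter. */
    static List<Integer> writtenPerChunk(int totalUsers, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> counts = new ArrayList<>();
        for (int start = 1; start <= totalUsers; start += chunkSize) {
            int end = Math.min(start + chunkSize - 1, totalUsers);
            int kept = 0;
            for (int i = start; i <= end; i++) {
                if (ageOf(i) <= 25) {
                    kept++;
                }
            }
            counts.add(kept);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(writtenPerChunk(50, 10)); // [6, 6, 6, 6, 6]
    }
}
```

Under those assumptions the writer is invoked five times with six items each, 30 of the 50 users in total, because filtered items count toward the chunk size when read but never reach the writer.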

Fault tolerance – retry, skip, and restart

Exception retry configuration

.faultTolerant()
.retry(Exception.class)
.retryLimit(3)

Skip dirty data configuration

.skip(Exception.class)
.skipLimit(100)

Complete fault‑tolerant step definition

@Bean
public Step userChunkStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
    return new StepBuilder("user-chunk-step", jobRepository)
            .<UserInfo, UserInfo>chunk(10, transactionManager)
            .reader(userInfoReader)
            .processor(userInfoProcessor)
            .writer(userInfoWriter)
            .faultTolerant()
            .retry(Exception.class).retryLimit(3)
            .skip(Exception.class).skipLimit(100)
            .build();
}
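The interaction between the retry limit and the skip limit can be illustrated with a simplified per-item model. This is not how Spring Batch implements it (the framework rolls back and re-processes the chunk); the sketch only assumes that the retry limit counts total attempts per item and that an item which keeps failing is skipped while the skip budget lasts. `RetrySkipSketch` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Simplified per-item model of retry-then-skip; not the Spring Batch implementation.
final class RetrySkipSketch {

    /** Items that were processed successfully and items that were skipped. */
    static final class Result<I, O> {
        final List<O> written = new ArrayList<>();
        final List<I> skipped = new ArrayList<>();
    }

    static <I, O> Result<I, O> run(List<I> items, Function<I, O> processor,
                                   int maxAttempts, int skipLimit) {
        Result<I, O> result = new Result<>();
        for (I item : items) {
            RuntimeException lastFailure = null;
            boolean done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    result.written.add(processor.apply(item));
                    done = true;
                } catch (RuntimeException e) {
                    lastFailure = e; // a transient error may succeed on a later attempt
                }
            }
            if (!done) {
                if (result.skipped.size() >= skipLimit) {
                    // Skip budget exhausted: the whole step fails.
                    throw new IllegalStateException("Skip limit exceeded", lastFailure);
                }
                result.skipped.add(item); // treat as dirty data and move on
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // -2 stands in for a dirty record that fails on every attempt.
        Result<Integer, Integer> result = run(List.of(1, -2, 3), n -> {
            if (n < 0) {
                throw new IllegalArgumentException("dirty record: " + n);
            }
            return n * 10;
        }, 3, 100);
        System.out.println("written=" + result.written + " skipped=" + result.skipped);
        // written=[10, 30] skipped=[-2]
    }
}
```

Retrying and skipping on Exception.class, as the step above does, treats every failure the same way; narrowing retry to transient exceptions and skip to data-quality exceptions is usually easier to reason about.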

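Spring Batch persists the step's progress (for example, the reader's position) in the execution context at each chunk commit. When a failed job is relaunched with the same identifying job parameters, it resumes after the last committed chunk instead of from the beginning, which improves stability for massive tasks. This requires a reader that saves its state, as JdbcCursorItemReader and FlatFileItemReader do; the in‑memory ListItemReader shown earlier does not.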
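The checkpoint idea can be sketched in plain Java. In this illustration a `Map` stands in for the persisted execution context, and `RestartSketch`, `POSITION_KEY`, and the `failAtItem` parameter are hypothetical names used to simulate a failure; none of it is Spring Batch API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of chunk-level checkpointing; the map stands in for the persisted context.
final class RestartSketch {

    static final String POSITION_KEY = "reader.position"; // hypothetical key name

    /**
     * Writes items to the sink chunk by chunk, starting from the position saved in the
     * context. The position is saved only after a chunk completes, so a failure inside
     * a chunk leaves the checkpoint at the start of that chunk.
     */
    static void run(List<Integer> items, int chunkSize, int failAtItem,
                    Map<String, Integer> context, List<Integer> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int position = context.getOrDefault(POSITION_KEY, 0);
        while (position < items.size()) {
            int end = Math.min(position + chunkSize, items.size());
            List<Integer> chunk = items.subList(position, end);
            if (chunk.contains(failAtItem)) {
                // Simulated failure: the chunk rolls back and nothing from it is written.
                throw new IllegalStateException("Failed on item " + failAtItem);
            }
            sink.addAll(chunk);
            position = end;
            context.put(POSITION_KEY, position); // checkpoint after the commit
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 30; i++) {
            items.add(i);
        }
        Map<String, Integer> context = new HashMap<>();
        List<Integer> sink = new ArrayList<>();
        try {
            run(items, 10, 25, context, sink); // first attempt fails inside the third chunk
        } catch (IllegalStateException e) {
            System.out.println("failed, checkpoint=" + context.get(POSITION_KEY)); // 20
        }
        run(items, 10, -1, context, sink);     // restart resumes at the checkpoint
        System.out.println("written=" + sink.size()); // 30
    }
}
```

The second call writes only items 21 to 30, so nothing committed before the failure is written twice.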

Database batch read/write

JdbcCursorItemReader – streamed DB read

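import javax.sql.DataSource;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;

// Place this bean method in a @Configuration class.
@Bean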
public ItemReader<UserInfo> dbUserReader(DataSource dataSource) {
    String sql = "select id,username,age,email,status from user_info";
    return new JdbcCursorItemReaderBuilder<UserInfo>()
            .dataSource(dataSource)
            .sql(sql)
            .rowMapper((rs, rowNum) -> {
                UserInfo user = new UserInfo();
                user.setId(rs.getLong("id"));
                user.setUsername(rs.getString("username"));
                user.setAge(rs.getInt("age"));
                user.setEmail(rs.getString("email"));
                user.setStatus(rs.getInt("status"));
                return user;
            })
            .name("user-db-reader")
            .build();
}

JdbcBatchItemWriter – bulk DB write

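import javax.sql.DataSource;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;

// Place this bean method in a @Configuration class.
@Bean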
public ItemWriter<UserInfo> dbUserWriter(DataSource dataSource) {
    String sql = "insert into user_info(username,age,email,status) values (?,?,?,?)";
    return new JdbcBatchItemWriterBuilder<UserInfo>()
            .dataSource(dataSource)
            .sql(sql)
            .itemPreparedStatementSetter((item, ps) -> {
                ps.setString(1, item.getUsername());
                ps.setInt(2, item.getAge());
                ps.setString(3, item.getEmail());
                ps.setInt(4, item.getStatus());
            })
            .build();
}

CSV file import/export

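FlatFileItemReader – CSV import

import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.core.io.FileSystemResource;

@Bean
public FlatFileItemReader<UserInfo> csvReader() {
    return new FlatFileItemReaderBuilder<UserInfo>()
            .name("user-csv-reader")   // required because the reader saves its state for restarts
            .resource(new FileSystemResource("data/user.csv"))
            .delimited()
            .names("username", "age", "email")
            // The builder assembles the line mapper from the tokenizer above and this mapper;
            // also passing an empty DefaultLineMapper would override both.
            .fieldSetMapper(fieldSet -> {
                UserInfo user = new UserInfo();
                user.setUsername(fieldSet.readString("username"));
                user.setAge(fieldSet.readInt("age"));
                user.setEmail(fieldSet.readString("email"));
                return user;
            })
            .build();
}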

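FlatFileItemWriter – CSV export

import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
import org.springframework.core.io.FileSystemResource;

@Bean
public FlatFileItemWriter<UserInfo> csvWriter() {
    return new FlatFileItemWriterBuilder<UserInfo>()
            .name("user-csv-writer")   // required because the writer saves its state for restarts
            .resource(new FileSystemResource("output/user_out.csv"))
            .delimited()
            .names("id", "username", "age", "email", "status")   // bean properties, in column order
            .build();
}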

Job triggering methods

Manual REST endpoint

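import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Requires spring-boot-starter-web on the classpath.
@RestController
@RequestMapping("/batch")
public class BatchController {
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job userChunkJob;

    @GetMapping("/run")
    public String runBatch() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();
        // The default launcher runs synchronously, so the status reflects the finished run.
        JobExecution execution = jobLauncher.run(userChunkJob, params);
        return "Job status: " + execution.getStatus();
    }
}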

Scheduled execution

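import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component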
@EnableScheduling
public class BatchSchedule {
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job userChunkJob;

    // Executes daily at 02:00
    @Scheduled(cron = "0 0 2 * * ?")
    public void scheduleRun() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addLong("scheduleTime", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(userChunkJob, params);
    }
}

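Adding a time‑based parameter gives every launch a new job instance, so the same job can run repeatedly. The trade‑off is that a relaunch never matches a failed instance; to resume a failed run, launch the job again with that run's original parameters.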
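Why the parameter matters can be shown with a toy model of job-instance identity: an instance is keyed by job name plus parameters, and an instance that has already completed is not accepted again. `JobInstanceSketch` is a hypothetical class and returns false where the real framework would raise an exception; it is not the JobRepository implementation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of job-instance identity; not the JobRepository implementation.
final class JobInstanceSketch {

    private final Set<String> completedInstances = new HashSet<>();

    /** An instance is identified by the job name plus its parameters, sorted by key. */
    static String instanceKey(String jobName, Map<String, Object> params) {
        return jobName + new TreeMap<String, Object>(params);
    }

    /** Returns true if the launch is accepted, false if that instance already completed. */
    boolean launch(String jobName, Map<String, Object> params) {
        // Every accepted launch is assumed to complete successfully in this model.
        return completedInstances.add(instanceKey(jobName, params));
    }

    public static void main(String[] args) {
        JobInstanceSketch repo = new JobInstanceSketch();
        System.out.println(repo.launch("user-chunk-job", Map.of("time", 1L))); // true
        System.out.println(repo.launch("user-chunk-job", Map.of("time", 1L))); // false
        System.out.println(repo.launch("user-chunk-job", Map.of("time", 2L))); // true
    }
}
```

A timestamp makes every key distinct, which is why each REST call or scheduled trigger is accepted as a new instance.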
