Integrating Tess4J OCR into a Spring Boot Application
This guide explains how to set up a Spring Boot project, add the Tess4J dependency, configure language data, implement an OCR service and REST controller, and test both local file uploads and remote image URLs for text recognition.
This article demonstrates how to integrate Tess4J, a Java wrapper for the Tesseract OCR engine, into a Spring Boot application to recognize text from both local and remote images.
Background : OCR is increasingly used for data entry and automation. Tess4J provides a powerful Java interface to the Tesseract engine, and embedding it in Spring Boot enables quick, elegant text extraction.
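Environment Setup : Ensure JDK 1.8 or later and Maven are installed, and pick a Spring Boot release that matches your JDK (Spring Boot 2.x runs on JDK 1.8, while Spring Boot 3.x requires JDK 17 or later). Tess4J 4.x or newer is pulled in as a Maven dependency, so it does not need to be installed separately.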
Adding the Dependency :
<dependencies>
    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>4.5.4</version>
    </dependency>
    <!-- other dependencies -->
</dependencies>
Download the appropriate tessdata language packs from the official repository or a shared cloud link and place them where the application can access them.
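Before wiring up the service, it can help to confirm that the language pack is actually where the application expects it. The following is a minimal sketch, assuming the usual Tesseract naming convention of `<language>.traineddata`; the class name `TessdataCheck` and the placeholder path are hypothetical and not part of Tess4J.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: checks that a language pack is present in a tessdata directory.
class TessdataCheck {

    // Returns true if <tessdataDir>/<language>.traineddata exists as a regular file.
    static boolean hasLanguage(Path tessdataDir, String language) {
        return Files.isRegularFile(tessdataDir.resolve(language + ".traineddata"));
    }

    public static void main(String[] args) {
        // The default path is a placeholder; pass the real tessdata directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "your/tessdata/path");
        System.out.println("chi_sim available: " + hasLanguage(dir, "chi_sim"));
    }
}
```

Running a check like this at startup gives a clear message when the file is missing, rather than a failure on the first OCR request.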
Creating the OCR Service :
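import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.springframework.stereotype.Service;

@Service
public class OcrService {
    public String recognizeText(File imageFile) throws TesseractException {
        Tesseract tesseract = new Tesseract();
        // Directory that contains the *.traineddata language files
        tesseract.setDatapath("your/tessdata/path");
        // Simplified Chinese; requires chi_sim.traineddata in the tessdata directory
        tesseract.setLanguage("chi_sim");
        return tesseract.doOCR(imageFile);
    }

    public String recognizeTextFromUrl(String imageUrl) throws Exception {
        // Unique temp file so concurrent requests do not overwrite each other
        Path tempFile = Files.createTempFile("ocr-", ".jpg");
        try (InputStream in = new URL(imageUrl).openStream()) {
            Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
            return recognizeText(tempFile.toFile());
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}
The recognizeText method processes a local file, while recognizeTextFromUrl downloads a remote image to a temporary file before performing OCR.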
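Because the URL method fetches whatever address the caller supplies, it may be worth rejecting anything that is not a plain web URL before downloading. The sketch below is one possible guard under that assumption; `ImageUrlValidator` and `isAllowed` are hypothetical names and are not part of the original service.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical guard: accepts only absolute http/https URLs that name a host.
class ImageUrlValidator {

    static boolean isAllowed(String imageUrl) {
        if (imageUrl == null) {
            return false;
        }
        try {
            URI uri = new URI(imageUrl);
            String scheme = uri.getScheme();
            boolean web = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return web && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Malformed input is treated as not allowed.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("https://example.com/scan.png")); // true
        System.out.println(isAllowed("file:///etc/passwd"));           // false
    }
}
```

A check like this could run at the top of recognizeTextFromUrl. It only filters schemes and does not by itself stop requests to internal hosts.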
Building the REST Controller :
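import java.io.File;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/ocr")
public class OcrController {
    private final OcrService ocrService;

    public OcrController(OcrService ocrService) {
        this.ocrService = ocrService;
    }

    @PostMapping("/upload")
    public ResponseEntity<String> uploadImage(@RequestParam("file") MultipartFile file) {
        try {
            // Keep only the base name so a crafted filename cannot escape the temp directory
            String name = new File(file.getOriginalFilename()).getName();
            File convFile = new File(System.getProperty("java.io.tmpdir"), name);
            file.transferTo(convFile);
            String result = ocrService.recognizeText(convFile);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("Recognition error: " + e.getMessage());
        }
    }

    @GetMapping("/recognize-url")
    public ResponseEntity<String> recognizeFromUrl(@RequestParam("imageUrl") String imageUrl) {
        try {
            String result = ocrService.recognizeTextFromUrl(imageUrl);
            return ResponseEntity.ok(result);
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.badRequest().body("URL recognition error: " + e.getMessage());
        }
    }
}
The controller exposes two endpoints: POST /api/ocr/upload for local file uploads and GET /api/ocr/recognize-url for processing images from a URL.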
Testing : Use tools like Postman or curl to POST an image file to the upload endpoint and GET the URL endpoint with an image URL parameter. Screenshots in the original article illustrate successful local and remote tests.
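When calling the URL endpoint, the image address has to be URL-encoded so that its own `:` and `/` characters survive as a query parameter value. The sketch below shows one way to build the request address. The base address `http://localhost:8080` assumes Spring Boot's default port, and `OcrRequestUrls` is a hypothetical helper name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the GET address for the recognize-url endpoint.
class OcrRequestUrls {

    static String recognizeUrl(String baseUrl, String imageUrl) {
        try {
            // Encode the image address so it is safe to use as a query parameter value.
            return baseUrl + "/api/ocr/recognize-url?imageUrl=" + URLEncoder.encode(imageUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base address and sample image URL, for illustration only.
        System.out.println(recognizeUrl("http://localhost:8080", "https://example.com/scan.png"));
        // prints http://localhost:8080/api/ocr/recognize-url?imageUrl=https%3A%2F%2Fexample.com%2Fscan.png
    }
}
```

The resulting address can be pasted into Postman or passed to curl as-is.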
Conclusion : After following these steps you have a functional Spring Boot service capable of OCR on both local and remote images. Adjust the language packs and configuration as needed for multilingual scenarios, and remember that OCR accuracy can be further improved with better preprocessing.
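As one illustration of the preprocessing mentioned above, the following sketch converts an image to black and white using a fixed brightness threshold. It is a simple example under stated assumptions rather than a tuned pipeline: the class name `OcrPreprocessor` and the threshold value are hypothetical, and real inputs may need adaptive thresholding, scaling, or deskewing instead.

```java
import java.awt.image.BufferedImage;

// Hypothetical preprocessing helper: grayscale conversion plus a fixed threshold.
class OcrPreprocessor {

    // Pixels with luminance >= threshold become white, all others black.
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the common luminance weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage sample = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        sample.setRGB(0, 0, 0xC8C8C8); // light gray
        sample.setRGB(1, 0, 0x323232); // dark gray
        BufferedImage result = binarize(sample, 128);
        System.out.println(Integer.toHexString(result.getRGB(0, 0) & 0xFFFFFF)); // ffffff
        System.out.println(Integer.toHexString(result.getRGB(1, 0) & 0xFFFFFF)); // 0
    }
}
```

Tess4J also accepts a BufferedImage in doOCR, so the result can be passed to the engine without writing it back to disk.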