Convert Word (.docx) to PDF in Spring Boot with docx4j
This guide walks you through a fully open‑source, pure‑Java solution for converting uploaded .docx files to PDF in a Spring Boot application, covering library selection, Maven dependencies, a reusable utility class, controller implementation, and handling Chinese font issues on Windows and Linux.
When a user uploads a Word (.docx) file and expects the backend to generate a PDF for download or preview, there are many options, ranging from paid libraries such as Aspose to heavyweight tools such as LibreOffice. This article presents a fully open-source, lightweight approach using the pure-Java library docx4j, which integrates smoothly with Spring Boot.
1. Solution Comparison
Apache POI + iText: Open source, no external dependencies, medium style fidelity, simple deployment, but poor support for complex formats.
docx4j: Open source, no external dependencies, high style fidelity, moderate deployment complexity, recommended for pure Java projects.
LibreOffice + JODConverter: Open source but requires installing LibreOffice, very high fidelity, higher deployment complexity.
Aspose.Words: Commercial, no external dependencies, highest fidelity, but requires a paid license.
2. Add Maven Dependencies
<dependencies>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>2.0.9</version>
    </dependency>
</dependencies>
3. Core Utility Class: DocxToPdfUtil
Create DocxToPdfUtil.java under the utils package:
package com.donglin.utils;

import org.docx4j.Docx4J;
import org.docx4j.fonts.*;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import java.io.*;

public class DocxToPdfUtil {

    /**
     * Convert a .docx file to PDF.
     *
     * @param docxPath input file path
     * @param pdfPath  output file path
     * @throws IllegalStateException if loading or conversion fails
     */
    public static void convert(String docxPath, String pdfPath) {
        try {
            // 1. Load the Word document
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(docxPath));

            // 2. Configure the font mapper so Chinese text is not garbled
            Mapper fontMapper = new IdentityPlusMapper();
            PhysicalFonts.discoverPhysicalFonts();
            PhysicalFont simsun = PhysicalFonts.get("SimSun");
            // The custom mappings are applied only when SimSun is installed on the host
            if (simsun != null) {
                fontMapper.put("SimSun", simsun);
                fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
                fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
                fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft YaHei"));
                fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
                fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
                fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
                fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
                fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
                fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
                fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
                fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
                fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
                fontMapper.put("等线", PhysicalFonts.get("SimSun"));
                fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
                fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
                fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
                fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
                fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
                fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
                fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
                fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
                fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
                fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
                fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
                // Register SimSun under the PMingLiU and 新細明體 names to fix garbled characters in titles
                PhysicalFonts.put("PMingLiU", PhysicalFonts.get("SimSun"));
                PhysicalFonts.put("新細明體", PhysicalFonts.get("SimSun"));
                wordMLPackage.setFontMapper(fontMapper);
            }

            // 3. Write PDF to output stream
            try (FileOutputStream os = new FileOutputStream(pdfPath)) {
                Docx4J.toPDF(wordMLPackage, os);
            }
            System.out.println("✅ PDF generated: " + pdfPath);
        } catch (Exception e) {
            // Propagate the failure so callers do not try to read a PDF that was never written
            throw new IllegalStateException("Conversion failed: " + e.getMessage(), e);
        }
    }
}
4. Controller Example
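The endpoint in this section receives a file path from the request and derives the PDF path from it. As a minimal sketch, assuming all documents live under a single base directory, the helper below resolves a requested name inside that directory, rejects anything that escapes it, and swaps only the trailing .docx extension. The class name PdfPaths, its methods, and the base-directory convention are hypothetical and not part of the original project.

```java
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical helper: path handling for a docx-to-PDF endpoint. */
public final class PdfPaths {

    private PdfPaths() {
    }

    /**
     * Resolves a client-supplied name inside baseDir and rejects names
     * that would escape it (for example "../secret.docx" or an absolute path).
     */
    public static Path resolveInside(Path baseDir, String requested) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path candidate = base.resolve(requested).normalize();
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + requested);
        }
        return candidate;
    }

    /** Replaces only a trailing ".docx" (case-insensitive) with ".pdf". */
    public static Path toPdfPath(Path docx) {
        String name = docx.getFileName().toString();
        if (!name.toLowerCase(Locale.ROOT).endsWith(".docx")) {
            throw new IllegalArgumentException("Not a .docx file: " + name);
        }
        String stem = name.substring(0, name.length() - ".docx".length());
        return docx.resolveSibling(stem + ".pdf");
    }
}
```

For a file named report.docx inside a directory called my.docx.files, toPdfPath returns report.pdf in that same directory, whereas a plain String.replace(".docx", ".pdf") on the full path would also rewrite the directory name.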
Add a conversion endpoint in the controller package. Note that it takes the server-side path of an existing .docx file as a request parameter rather than the upload itself:
import com.donglin.utils.DocxToPdfUtil;
import org.springframework.web.bind.annotation.*;
// On Spring Boot 3.x, import jakarta.servlet.http.HttpServletResponse instead
import javax.servlet.http.HttpServletResponse;
import java.io.*;

@RestController
@RequestMapping("/convert")
public class FileController {

    @GetMapping("/convertToPdf")
    public void convertToPdf(@RequestParam String filePath, HttpServletResponse response) throws Exception {
        // 1. Verify file existence
        File inputFile = new File(filePath);
        if (!inputFile.exists()) {
            throw new RuntimeException("File not found: " + filePath);
        }
        // 2. Define temporary PDF path, swapping only the trailing .docx extension
        String pdfPath = filePath.replaceAll("(?i)\\.docx$", "") + ".pdf";
        File pdfFile = new File(pdfPath);
        try {
            // 3. Perform conversion
            DocxToPdfUtil.convert(filePath, pdfPath);
            // 4. Return PDF as download
            response.setContentType("application/pdf");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + pdfFile.getName() + "\"");
            try (FileInputStream fis = new FileInputStream(pdfFile);
                 OutputStream os = response.getOutputStream()) {
                fis.transferTo(os); // InputStream.transferTo requires Java 9+
                os.flush();
            }
        } finally {
            // Clean up the temporary file even if conversion or streaming fails
            pdfFile.delete();
        }
    }
}
5. Windows Chinese‑Font Garbling Fix
Map common Chinese fonts in the utility class (see the fontMapper.put(...) calls above) to ensure characters render correctly on Windows.
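The mapping idea can be illustrated without docx4j. The sketch below is a simplified model, not docx4j's API: given the font names a document asks for, a list of candidate physical font names for each, and the set of fonts actually installed, it picks the first installed candidate and otherwise falls back to a default font. Every name in it (FontFallback, resolve, the parameters) is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of mapping document fonts to installed fonts. */
public final class FontFallback {

    private FontFallback() {
    }

    /**
     * For each document font name, picks the first candidate that is installed.
     * If no candidate is installed, the fallback is used when it is installed;
     * otherwise the font is left out so the caller can report it as missing.
     */
    public static Map<String, String> resolve(Map<String, List<String>> candidates,
                                              Set<String> installed,
                                              String fallback) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : candidates.entrySet()) {
            String chosen = null;
            for (String name : entry.getValue()) {
                if (installed.contains(name)) {
                    chosen = name;
                    break;
                }
            }
            if (chosen == null && installed.contains(fallback)) {
                chosen = fallback;
            }
            if (chosen != null) {
                resolved.put(entry.getKey(), chosen);
            }
        }
        return resolved;
    }
}
```

If only SimSun is installed and it is also the fallback, every requested font resolves to SimSun, which mirrors how the utility class maps 等线 to SimSun. In the utility class, the equivalent guard would be to call fontMapper.put(...) only when PhysicalFonts.get(...) returns a non-null font.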
6. Linux Chinese‑Font Garbling Fix
Install Windows fonts on the Linux host:
sudo mkdir -p /usr/share/fonts/win_font
# Copy *.ttf files from Windows (C:\Windows\Fonts) to the directory above
cd /usr/share/fonts/win_font
sudo mkfontscale # generate font scale file
sudo mkfontdir # generate font directory index
sudo fc-cache -fv # refresh font cache
Verify installation:
fc-list :lang=zh
7. Summary
Use the open-source docx4j library; neither Microsoft Office nor LibreOffice needs to be installed.
All required Maven coordinates are listed above.
The DocxToPdfUtil class handles document loading, font mapping, and PDF generation.
A simple Spring Boot controller exposes a GET endpoint for conversion.
Chinese font mapping, together with installing the fonts on Linux hosts, resolves garbled output on both Windows and Linux.
The solution is lightweight and easy to deploy because the conversion runs entirely inside the JVM.
Java Architect Handbook
Focused on Java interview questions and practical article sharing, covering algorithms, databases, Spring Boot, microservices, high concurrency, JVM, Docker containers, and ELK-related knowledge. Looking forward to progressing together with you.
