Convert Word (.docx) to PDF in Spring Boot with docx4j – A Complete Guide
Learn how to convert .docx files to PDF in a Spring Boot application using the pure‑Java docx4j library. The guide covers a comparison of solutions, Maven dependencies, a utility class, a controller implementation, and fixes for garbled Chinese text on Windows and Linux.
Solution Comparison
Apache POI + iText – Open source, no external dependencies, medium style fidelity, low deployment complexity (⭐). Poor support for complex formats.
docx4j – Open source, pure Java, high style fidelity, moderate deployment complexity (⭐⭐). Recommended for Spring Boot integration.
LibreOffice + JODConverter – Open source, requires a LibreOffice installation on the server, very high style fidelity, high deployment complexity (⭐⭐⭐).
Aspose.Words – Commercial (non‑open source), no external dependencies, highest style fidelity, low deployment complexity (⭐). Requires a paid license.
Adding Maven Dependencies
Include the following artifacts in pom.xml (version 11.4.8 for docx4j and 2.0.9 for SLF4J):
<dependencies>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>11.4.8</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>2.0.9</version>
    </dependency>
</dependencies>
Core Utility Class: DocxToPdfUtil
Create DocxToPdfUtil.java in a utils package. The class loads a .docx file, configures a font mapper that covers the common Chinese font names so Chinese text does not come out garbled, and writes the PDF using Docx4J.toPDF.
package com.donglin.utils;

import org.docx4j.Docx4J;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import java.io.File;
import java.io.FileOutputStream;

public class DocxToPdfUtil {

    /**
     * Convert a DOCX file to PDF.
     *
     * @param docxPath input file path
     * @param pdfPath  output file path
     * @throws IllegalStateException if loading or conversion fails
     */
    public static void convert(String docxPath, String pdfPath) {
        try {
            // 1. Load the Word document
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(docxPath));

            // 2. Configure font mapping to prevent garbled Chinese characters
            Mapper fontMapper = new IdentityPlusMapper();
            PhysicalFonts.discoverPhysicalFonts();
            // PhysicalFonts.get returns null for a font that is not installed,
            // so the mappings are registered only when SimSun was found.
            PhysicalFont simsun = PhysicalFonts.get("SimSun");
            if (simsun != null) {
                fontMapper.put("SimSun", simsun);
                // Common Chinese fonts mapping (Chinese display name -> physical font)
                fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
                fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
                fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft YaHei"));
                fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
                fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
                fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
                fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
                fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
                fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
                fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
                fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
                fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
                fontMapper.put("等线", PhysicalFonts.get("SimSun"));
                fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
                fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
                fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
                fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
                fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
                fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
                fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
                fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
                fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
                fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
                fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
                // Fix specific garbled cases by substituting SimSun
                PhysicalFonts.put("PMingLiU", PhysicalFonts.get("SimSun"));
                PhysicalFonts.put("新細明體", PhysicalFonts.get("SimSun"));
                wordMLPackage.setFontMapper(fontMapper);
            }

            // 3. Write PDF output
            try (FileOutputStream os = new FileOutputStream(pdfPath)) {
                Docx4J.toPDF(wordMLPackage, os);
            }
            System.out.println("✅ PDF generated successfully: " + pdfPath);
        } catch (Exception e) {
            System.err.println("❌ Conversion failed: " + e.getMessage());
            // Rethrow so callers do not go on to read a PDF that was never written
            throw new IllegalStateException("Failed to convert " + docxPath + " to PDF", e);
        }
    }
}
Controller Example
A simple Spring Boot REST controller that receives a file path, invokes the utility, and streams the generated PDF back to the client.
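Since the endpoint takes its file path straight from the request, it may be worth validating that path and deriving the PDF name carefully before converting. The following is a minimal sketch using only the JDK, not part of docx4j or Spring: the class name PdfPaths and the idea of a fixed base directory are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical helper: validates a requested .docx path and derives the PDF path. */
public final class PdfPaths {

    private static final String DOCX = ".docx";

    private PdfPaths() {}

    /**
     * Resolves the requested file against baseDir and rejects anything that
     * escapes that directory or does not end in ".docx" (case-insensitive).
     */
    public static Path resolveDocx(Path baseDir, String requested) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path candidate = base.resolve(requested).normalize();
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + requested);
        }
        Path fileName = candidate.getFileName();
        if (fileName == null || !fileName.toString().toLowerCase(Locale.ROOT).endsWith(DOCX)) {
            throw new IllegalArgumentException("Not a .docx file: " + requested);
        }
        return candidate;
    }

    /** Swaps only the trailing ".docx" for ".pdf"; directory names are left untouched. */
    public static Path toPdfPath(Path docx) {
        String name = docx.getFileName().toString();
        if (!name.toLowerCase(Locale.ROOT).endsWith(DOCX)) {
            throw new IllegalArgumentException("Not a .docx file: " + docx);
        }
        String stem = name.substring(0, name.length() - DOCX.length());
        return docx.resolveSibling(stem + ".pdf");
    }
}
```

Replacing only the trailing extension matters because a plain String.replace(".docx", ".pdf") also rewrites any ".docx" that appears earlier in the path, and leaves the path unchanged when the extension is missing, in which case a later cleanup step would delete the input file. The controller itself follows.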
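import com.donglin.utils.DocxToPdfUtil;
import org.springframework.web.bind.annotation.*;

// On Spring Boot 3+, import jakarta.servlet.http.HttpServletResponse instead
import javax.servlet.http.HttpServletResponse;
import java.io.File;
import java.io.FileInputStream;
import java.io.OutputStream;

@RestController
@RequestMapping("/convert")
public class FileController {

    // Demo only: a path taken straight from the request lets a caller read any
    // file the server can access, so restrict it to a known directory in real use.
    @GetMapping("/convertToPdf")
    public void convertToPdf(@RequestParam String filePath, HttpServletResponse response) throws Exception {
        // Verify the input exists and is a .docx file
        File inputFile = new File(filePath);
        if (!inputFile.isFile() || !filePath.toLowerCase().endsWith(".docx")) {
            throw new RuntimeException("DOCX file not found: " + filePath);
        }

        // Define temporary PDF path by swapping only the trailing extension
        // (replace(".docx", ".pdf") would also rewrite matches earlier in the path)
        String pdfPath = filePath.substring(0, filePath.length() - ".docx".length()) + ".pdf";
        File pdfFile = new File(pdfPath);

        try {
            // Perform conversion
            DocxToPdfUtil.convert(filePath, pdfPath);

            // Return PDF as download
            response.setContentType("application/pdf");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + pdfFile.getName() + "\"");
            try (FileInputStream fis = new FileInputStream(pdfFile);
                 OutputStream os = response.getOutputStream()) {
                fis.transferTo(os); // InputStream.transferTo requires Java 9+
                os.flush();
            }
        } finally {
            // Clean up the temporary PDF even if streaming fails
            pdfFile.delete();
        }
    }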
}
Test with a GET request, e.g.:
GET http://localhost:8080/convert/convertToPdf?filePath=E:/ai/report.docx
Windows Chinese Garbled‑Text Fix
The fontMapper defined in DocxToPdfUtil maps a wide range of Chinese fonts (SimSun, LiSu, Microsoft YaHei, etc.) to their physical counterparts, preventing garbled characters when the application runs on Windows.
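The mapping only helps when the target fonts are physically installed, which is why the utility checks simsun != null before registering anything. As a rough pre-flight check, one could confirm that the expected font files exist in the font directory before starting the service. This is a minimal sketch using only the JDK: the class name FontFileCheck is hypothetical, and file names such as simsun.ttc or simhei.ttf are assumptions about a typical Windows font set that should be adjusted to the fonts your documents use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: reports which expected font files are absent from a directory. */
public final class FontFileCheck {

    private FontFileCheck() {}

    /**
     * Returns the entries of expectedFiles that have no matching regular file
     * directly inside fontDir. File names are compared case-insensitively.
     */
    public static List<String> findMissing(Path fontDir, List<String> expectedFiles) throws IOException {
        Set<String> present;
        // Files.list must be closed, hence the try-with-resources
        try (Stream<Path> entries = Files.list(fontDir)) {
            present = entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString().toLowerCase(Locale.ROOT))
                    .collect(Collectors.toSet());
        }
        List<String> missing = new ArrayList<>();
        for (String expected : expectedFiles) {
            if (!present.contains(expected.toLowerCase(Locale.ROOT))) {
                missing.add(expected);
            }
        }
        return missing;
    }
}
```

A call such as FontFileCheck.findMissing(Path.of("C:\\Windows\\Fonts"), List.of("simsun.ttc", "simhei.ttf")) returns the names that were not found, and an empty list when everything is in place. The check only looks at file names, so it does not prove that docx4j can load the font, only that the file is there to be discovered.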
Linux Chinese Garbled‑Text Fix
On Linux, install Windows TrueType fonts and refresh the font cache so that docx4j can locate the same font families used in the document.
sudo mkdir -p /usr/share/fonts/win_font
# Copy the font files (*.ttf and *.ttc; SimSun ships as simsun.ttc) from a Windows installation (C:\Windows\Fonts) to the directory above
cd /usr/share/fonts/win_font
sudo mkfontscale # generate font scale file
sudo mkfontdir # generate font directory index
sudo fc-cache -fv # refresh font cache
Verify the fonts are recognized:
fc-list :lang=zh
Summary
Use the open‑source library docx4j (Apache 2.0) for .docx → .pdf conversion.
No need to install Microsoft Office or LibreOffice.
Conversion preserves images, tables, headers/footers, and common Chinese styles.
Lightweight deployment and high performance, suitable for Spring Boot services.
