Convert Word to HTML with a Single Spring Boot Line for Online Preview

This article walks through building a Spring Boot 3.5.0 service that uses Apache POI and XDocReport to load a DOCX file, extract images, convert it to HTML, and expose a controller endpoint for instant online preview without external plugins.

Spring Full-Stack Practical Cases

Online preview of Word documents is a common requirement for internal technical docs, reports, or user‑uploaded contracts, but traditional plugins often cause format loss or compatibility issues.

The solution combines Spring Boot 3.5.0 with Apache POI 5.5.1 and XDocReport 2.2.0. Required Maven dependencies are:

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>5.5.1</version>
</dependency>
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-scratchpad</artifactId>
  <version>5.5.1</version>
</dependency>
<dependency>
  <groupId>fr.opensagres.xdocreport</groupId>
  <artifactId>fr.opensagres.poi.xwpf.converter.xhtml</artifactId>
  <version>2.2.0</version>
</dependency>

The loadDocxFromPath method loads a DOCX file, checks that the file exists, verifies that it contains paragraphs or tables, and throws clear exceptions for missing or empty documents:

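import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public XWPFDocument loadDocxFromPath(String path) {
  try {
    Path file = Paths.get(path);
    if (!Files.exists(file)) {
      throw new FileNotFoundException("File does not exist: %s".formatted(path));
    }
    XWPFDocument document;
    // POI loads the package into memory from the stream, so the stream can be closed right away.
    try (InputStream in = Files.newInputStream(file)) {
      document = new XWPFDocument(in);
    }
    boolean hasParagraphs = !document.getParagraphs().isEmpty();
    boolean hasTables = !document.getTables().isEmpty();
    if (!hasParagraphs && !hasTables) {
      document.close();
      throw new IllegalArgumentException("Document is empty: %s".formatted(path));
    }
    // The caller owns the returned document and is responsible for closing it.
    return document;
  } catch (IOException ex) {
    // FileNotFoundException is an IOException, so a missing file is also reported here.
    throw new UncheckedIOException("Unable to load document: %s".formatted(path), ex);
  }
}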

HTML conversion options are configured with XHTMLOptions and an ImageManager so that extracted images are stored in an images sub‑directory next to the generated HTML:

public XHTMLOptions configureHtmlOptions(Path outputDir) {
  XHTMLOptions options = XHTMLOptions.create();
  options.setImageManager(new ImageManager(outputDir.toFile(), "images"));
  return options;
}

The actual conversion is performed by XHTMLConverter:

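public void convertDocxToHtml(String docxPath) throws IOException {
  // Resolve to an absolute path so getParent() is never null for a bare file name.
  Path input = Paths.get(docxPath).toAbsolutePath();
  // Swap the last extension for .html and write the result next to the source file.
  String htmlFileName = input.getFileName().toString().replaceFirst("\\.[^.]+$", "") + ".html";
  Path output = input.resolveSibling(htmlFileName);
  // Both the document and the output stream are closed when the block exits.
  try (XWPFDocument document = loadDocxFromPath(input.toString());
       OutputStream out = Files.newOutputStream(output)) {
    XHTMLConverter.getInstance().convert(document, out, configureHtmlOptions(output.getParent()));
  }
}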
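
The output name derivation above can be tried in isolation. The following is a minimal sketch using only the JDK; the class name HtmlTargetSketch and the method name htmlTargetFor are hypothetical and do not appear in the original code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class HtmlTargetSketch {

  // Strips the last extension (if any), appends ".html",
  // and places the result in the same directory as the input.
  static Path htmlTargetFor(Path input) {
    String name = input.getFileName().toString();
    String htmlName = name.replaceFirst("\\.[^.]+$", "") + ".html";
    return input.resolveSibling(htmlName);
  }

  public static void main(String[] args) {
    // Only the final extension is replaced; earlier dots are kept.
    System.out.println(htmlTargetFor(Paths.get("docs", "report.v2.docx")));
    // A name without an extension simply gets ".html" appended.
    System.out.println(htmlTargetFor(Paths.get("docs", "README")));
  }
}
```

Note that a bare relative name such as report.docx has no parent directory, so Path.getParent() returns null for it unless the path is first made absolute.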

Spring Boot integration requires exposing the local directory as static resources and adding a controller that checks whether the HTML file already exists; if not, it triggers the conversion and then returns the filename for preview:

spring:
  mvc:
    static-path-pattern: /word/**
    view:
      suffix: .html
  web:
    resources:
      static-locations: file:///E:/wordtohtml

@Controller
@RequestMapping("/word")
public class WordToHtmlController {

  @Value("${pack.app.rootPath:E:/wordtohtml}")
  private String rootPath;

  @GetMapping("/preview")
  public String preview(String filename, HttpServletResponse response) throws Exception {
    response.setContentType("text/html;charset=UTF-8");
    Path htmlPath = Paths.get("%s/%s.html".formatted(rootPath, filename));
    if (!Files.exists(htmlPath)) {
      WordToHtml converter = new WordToHtml();
      Path docx = Paths.get("%s/%s.docx".formatted(rootPath, filename));
      converter.convertDocxToHtml(docx.toString());
    }
    return filename;
  }
}
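
Because the filename parameter comes straight from the request and is concatenated into a file system path, a value such as ../secret could point outside the preview directory. One possible guard is sketched below, assuming the service should only ever touch files under rootPath. The class SafePreviewPathSketch and the method resolveUnderRoot are hypothetical names, not part of the original article.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class SafePreviewPathSketch {

  // Resolves filename + extension under root and rejects anything
  // that ends up outside root after normalization.
  static Path resolveUnderRoot(Path root, String filename, String extension) {
    if (filename == null || filename.isBlank()) {
      throw new IllegalArgumentException("filename must not be blank");
    }
    Path normalizedRoot = root.toAbsolutePath().normalize();
    Path candidate = normalizedRoot.resolve(filename + extension).normalize();
    if (!candidate.startsWith(normalizedRoot)) {
      throw new IllegalArgumentException("filename escapes the preview root: " + filename);
    }
    return candidate;
  }

  public static void main(String[] args) {
    Path root = Paths.get("wordtohtml");
    System.out.println(resolveUnderRoot(root, "report", ".docx"));
    try {
      resolveUnderRoot(root, "../secret", ".docx");
    } catch (IllegalArgumentException ex) {
      System.out.println("rejected: " + ex.getMessage());
    }
  }
}
```

In the controller, both htmlPath and docx could be obtained this way, for example resolveUnderRoot(Paths.get(rootPath), filename, ".html"), instead of formatting the path string directly.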

Testing is done by placing a DOCX file (e.g., 技术架构.docx) under E:/wordtohtml and invoking the converter:

WordToHtml converter = new WordToHtml();
Path docx = Paths.get("E:/wordtohtml/技术架构.docx");
converter.convertDocxToHtml(docx.toString());
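
To exercise the preview endpoint from a browser or an HTTP client instead, a non-ASCII file name has to be percent-encoded in the query string. The sketch below builds such a URL with the JDK only. The base URL http://localhost:8080 is an assumption (Spring Boot's default port, which may differ in a given deployment), and PreviewUrlSketch and previewUrl are hypothetical names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class PreviewUrlSketch {

  // Builds the /word/preview URL for a file name given without its extension.
  static String previewUrl(String baseUrl, String filename) {
    // URLEncoder emits "+" for spaces; "%20" is used here instead.
    String encoded = URLEncoder.encode(filename, StandardCharsets.UTF_8).replace("+", "%20");
    return baseUrl + "/word/preview?filename=" + encoded;
  }

  public static void main(String[] args) {
    // Assumed base URL; adjust host and port to the actual deployment.
    System.out.println(previewUrl("http://localhost:8080", "技术架构"));
  }
}
```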

Calling the preview endpoint generates an HTML file that keeps most of the original formatting; extracted images are written to the images sub-directory and referenced from the page, so the result can be viewed directly in the browser. The article covers the complete three-step workflow (dependency setup, conversion logic, and Spring Boot integration) needed for Word-to-HTML online preview.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Spring Full-Stack Practical Cases

Full-stack Java development with Vue 2/3 front-end suite; hands-on examples and source code analysis for Spring, Spring Boot 2/3, and Spring Cloud.
