Comprehensive Survey and Implementation Guide for File Preview Solutions
This article presents an extensive survey of file preview options—including commercial services, front‑end libraries, and server‑side converters—detailing their advantages, limitations, implementation steps, and code examples for handling DOCX, PPTX, XLSX, and PDF formats in web applications.
When faced with the need to preview documents, the author investigated various solutions and categorized them into paid services, front‑end implementations, and back‑end conversions.
1. Commercial Preview Services
Options such as Microsoft Office Viewer, Google Drive Viewer, Alibaba Cloud IMM, XDOC, Office Web 365, and WPS Open Platform are listed, with notes on usage, limitations (e.g., file size limits, animation support), and pricing.
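To illustrate how little integration the hosted viewers need, the Microsoft and Google viewers are commonly embedded by passing a publicly reachable, URL-encoded file address in a query string. The following is a minimal sketch of that pattern. The helper name `buildViewerUrl` and the example file address are hypothetical, and the endpoints shown are the widely used public ones, which may change or be rate-limited.

```typescript
type HostedViewer = 'microsoft' | 'google'

// Hypothetical helper: builds an iframe-ready URL for a hosted viewer.
// The file must be reachable from the public internet, because the
// viewer's servers (not the user's browser) fetch it.
function buildViewerUrl(viewer: HostedViewer, fileUrl: string): string {
  const encoded = encodeURIComponent(fileUrl)
  switch (viewer) {
    case 'microsoft':
      return `https://view.officeapps.live.com/op/embed.aspx?src=${encoded}`
    case 'google':
      return `https://docs.google.com/viewer?url=${encoded}&embedded=true`
  }
}

// Example address only; replace with a real, publicly accessible file.
const demoUrl = buildViewerUrl('microsoft', 'https://example.com/files/report.docx')
```

The resulting string can be used as the `src` of an `<iframe>`. Because the file has to be publicly downloadable, this approach is unsuitable for confidential documents.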
2. Front‑End Preview Solutions
PPTX Preview
The only open‑source project found is github.com/g21589/PPTX2HTML, which is outdated. The author therefore proposes parsing PPTX files directly, following the Office Open XML standard.
import JSZip from 'jszip'

// Load the PPTX data (a PPTX file is a ZIP archive of XML parts).
const zip = await JSZip.loadAsync(pptxData)
const filesInfo = await getContentTypes(zip)

// readXmlFile is a helper (not shown here) that reads a part from the
// archive and parses its XML into a JSON-like object.
async function getContentTypes(zip: JSZip) {
  const ContentTypesJson = await readXmlFile(zip, '[Content_Types].xml')
  const subObj = ContentTypesJson['Types']['Override']
  const slidesLocArray: string[] = []
  const slideLayoutsLocArray: string[] = []
  for (let i = 0; i < subObj.length; i++) {
    switch (subObj[i]['attrs']['ContentType']) {
      case 'application/vnd.openxmlformats-officedocument.presentationml.slide+xml':
        // Drop the leading "/" so the part name matches the ZIP entry path.
        slidesLocArray.push(subObj[i]['attrs']['PartName'].slice(1))
        break
      case 'application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml':
        slideLayoutsLocArray.push(subObj[i]['attrs']['PartName'].slice(1))
        break
      default:
    }
  }
  return { slides: slidesLocArray, slideLayouts: slideLayoutsLocArray }
}
Further steps include parsing [Content_Types].xml, extracting slide information, loading themes, and rendering slides onto a canvas.
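Some of the later steps depend on small conventions of the package format. The sketch below shows three of them under stated assumptions: ordering slide part names numerically, locating a part's relationships file, and converting EMU lengths to pixels for canvas drawing. All three function names are hypothetical and are not part of the author's code. Sorting by file number is only an approximation, since the authoritative slide order is the slide ID list in ppt/presentation.xml.

```typescript
// 914400 EMU per inch divided by 96 CSS pixels per inch.
const EMU_PER_PIXEL = 9525

// Converts an Office Open XML length in EMU to CSS pixels at 96 DPI.
function emuToPx(emu: number): number {
  return emu / EMU_PER_PIXEL
}

// Orders part names such as "ppt/slides/slide10.xml" by slide number,
// because a plain string sort would place slide10 before slide2.
function sortSlideParts(parts: string[]): string[] {
  const index = (p: string): number => Number(/(\d+)\.xml$/.exec(p)?.[1] ?? 0)
  return [...parts].sort((a, b) => index(a) - index(b))
}

// Packaging convention: the relationships of "dir/name.xml" are stored
// in "dir/_rels/name.xml.rels".
function relsPathFor(part: string): string {
  const slash = part.lastIndexOf('/')
  return `${part.slice(0, slash + 1)}_rels/${part.slice(slash + 1)}.rels`
}
```

A slide's relationships file is where its layout, images, and other linked parts are listed, so `relsPathFor` would typically be applied to each entry returned in `slides` before the slide's shapes are resolved.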
PDF Preview
Browsers can display PDFs directly via <iframe src="viewFileUrl"/>, but for a consistent UI the author recommends using PDF.js.
import * as pdfjs from 'pdfjs-dist'
import type { PDFDocumentProxy } from 'pdfjs-dist'
// This worker entry path matches pdfjs-dist 2.x/3.x; other versions ship the worker differently.
import * as pdfjsWorker from 'pdfjs-dist/build/pdf.worker.entry'

class PdfPreview {
  private pdfDoc: PDFDocumentProxy | undefined
  pageNumber: number = 1
  total: number = 0
  dom: HTMLElement
  pdf: string | ArrayBuffer

  constructor(pdf: string | ArrayBuffer, dom: HTMLElement | undefined) {
    this.pdf = pdf
    this.dom = dom ? dom : document.body
  }

  // Loads the document and renders every page in order.
  async pdfPreview() {
    pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker
    const doc = await pdfjs.getDocument(this.pdf).promise
    this.pdfDoc = doc
    this.total = doc.numPages
    for (let i = 1; i <= this.total; i++) {
      await this.getPdfPage(i)
    }
  }

  // Renders one page into a new canvas appended to the container.
  private async getPdfPage(number: number) {
    if (!this.pdfDoc) {
      throw { success: false, data: null, message: 'pdfDoc is undefined' }
    }
    const page = await this.pdfDoc.getPage(number)
    // getViewport needs an explicit scale; 1 keeps the page's native size.
    const viewport = page.getViewport({ scale: 1 })
    const canvas = document.createElement('canvas')
    this.dom.appendChild(canvas)
    const context = canvas.getContext('2d')
    if (!context) {
      throw { success: false, data: null, message: '2d context is unavailable' }
    }
    canvas.width = viewport.width
    canvas.height = viewport.height
    canvas.style.width = Math.floor(viewport.width) + 'px'
    canvas.style.height = Math.floor(viewport.height) + 'px'
    // The viewport already carries the PDF-to-canvas transform (including
    // the vertical flip), so no extra transform is passed to render().
    // Wait for rendering to finish before reporting success.
    await page.render({ canvasContext: context, viewport }).promise
    return { success: true, data: page }
  }
}
DOCX Preview
The docx-preview npm package is used to render DOCX files to HTML.
import { renderAsync } from 'docx-preview'

export const renderDocx = async (options) => {
  const { bodyContainer, styleContainer, buffer, docxOptions = {} } = options
  const defaultOptions = { className: 'docx', ignoreLastRenderedPageBreak: false }
  const configuration = Object.assign({}, defaultOptions, docxOptions)
  if (bodyContainer) {
    return renderAsync(buffer, bodyContainer, styleContainer, configuration)
  } else {
    // No container supplied: render into a new <div> appended to the page.
    const contain = document.createElement('div')
    document.body.appendChild(contain)
    return renderAsync(buffer, contain, styleContainer, configuration)
  }
}
XLSX Preview
The @vue-office/excel package provides Vue 2/3 components for rendering Excel files.
3. Server‑Side Preview Solutions
OpenOffice Conversion
The following Java code uses JODConverter to start an OpenOffice service and convert documents to PDF.
package org.example;

import org.artofsolving.jodconverter.OfficeDocumentConverter;
import org.artofsolving.jodconverter.office.DefaultOfficeManagerConfiguration;
import org.artofsolving.jodconverter.office.OfficeManager;
import java.io.File;

public class OfficeUtil {
    private static OfficeManager officeManager;
    private static int[] port = {8100};

    public static void startService() {
        DefaultOfficeManagerConfiguration configuration = new DefaultOfficeManagerConfiguration();
        try {
            System.out.println("Starting the office conversion service...");
            // Adjust this path to the local OpenOffice installation.
            configuration.setOfficeHome("C:\\Program Files (x86)\\OpenOffice 4");
            configuration.setPortNumbers(port);
            // Task execution timeout: 30 minutes.
            configuration.setTaskExecutionTimeout(1000L * 60 * 30);
            // Task queue timeout: 24 hours.
            configuration.setTaskQueueTimeout(1000L * 60 * 60 * 24);
            officeManager = configuration.buildOfficeManager();
            officeManager.start();
            System.out.println("Office conversion service started.");
        } catch (Exception e) {
            System.out.println("Failed to start the office conversion service. Details: " + e);
        }
    }

    public static void stopService() {
        System.out.println("Stopping the office conversion service...");
        if (officeManager != null) {
            officeManager.stop();
        }
        System.out.println("Office conversion service stopped.");
    }

    public static void convertToPDF(String inputFile, String outputFile) {
        startService();
        try {
            System.out.println("Converting document: " + inputFile + " --> " + outputFile);
            OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
            converter.convert(new File(inputFile), new File(outputFile));
        } finally {
            // Stop the service even if the conversion throws.
            stopService();
        }
    }

    public static void main(String[] args) {
        convertToPDF("/Users/koolearn/Desktop/asdf.docx", "/Users/koolearn/Desktop/adsf.pdf");
    }
}
kkFileView
To build and run the Java‑based kkFileView service, install Java, Maven, and LibreOffice, then build the project and start the server.
brew install java
brew install maven
export JAVA_HOME=$(/usr/libexec/java_home)
source ~/.zshrc
brew install libreoffice
mvn clean install -DskipTests
After launching, the web UI allows uploading files for preview.
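Besides uploading through the web UI, a running kkFileView instance is typically linked to from an application by URL. The sketch below assumes the `onlinePreview` endpoint with a Base64-encoded, URL-encoded `url` parameter and the default port 8012, as described in the project's README for recent versions; older releases may expect a different encoding, so check the deployed version. The helper names `base64Utf8` and `buildKkPreviewUrl`, and the example addresses, are hypothetical.

```typescript
// Base64-encodes a string as UTF-8 so non-ASCII file names survive.
function base64Utf8(text: string): string {
  const bytes = new TextEncoder().encode(text)
  let binary = ''
  bytes.forEach((b) => {
    binary += String.fromCharCode(b)
  })
  return btoa(binary)
}

// Hypothetical helper: builds a preview link for a kkFileView server.
// serverBase is assumed to have no trailing slash.
function buildKkPreviewUrl(serverBase: string, fileUrl: string): string {
  return `${serverBase}/onlinePreview?url=${encodeURIComponent(base64Utf8(fileUrl))}`
}

// Example addresses only; the file must be reachable from the kkFileView server.
const previewLink = buildKkPreviewUrl('http://127.0.0.1:8012', 'http://127.0.0.1:8080/file/test.txt')
```

The link can be opened in a new window or used as an iframe source. Unlike the hosted viewers, the file only needs to be reachable from the kkFileView server itself, which keeps private documents inside the internal network.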
OnlyOffice
OnlyOffice provides both open‑source and enterprise editions for document preview and collaborative editing.
4. Summary
For public, non‑confidential files, Microsoft’s online viewer is recommended.
Where security and stability requirements are high and budget is available, Alibaba Cloud IMM is a viable option.
Server‑side converters (OpenOffice, kkFileView, OnlyOffice) offer the most complete preview capabilities.
If no budget or server is available, front‑end libraries enable zero‑cost client‑side rendering.
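The recommendations above can be read as a simple decision procedure. The sketch below encodes one possible reading of them. The type names, field names, and the order in which the conditions are checked are assumptions made for illustration and are not prescribed by the survey.

```typescript
interface PreviewRequirements {
  filesArePublic: boolean // public, non-confidential files
  hasBudget: boolean      // a paid service is acceptable
  canRunServer: boolean   // a conversion server can be deployed
}

type PreviewStrategy =
  | 'microsoft-online-viewer'
  | 'alibaba-cloud-imm'
  | 'server-side-converter'
  | 'front-end-library'

// Hypothetical helper mirroring the summary's four recommendations.
function choosePreviewStrategy(req: PreviewRequirements): PreviewStrategy {
  if (req.filesArePublic) return 'microsoft-online-viewer'
  if (req.hasBudget) return 'alibaba-cloud-imm'
  if (req.canRunServer) return 'server-side-converter'
  return 'front-end-library'
}

const choice = choosePreviewStrategy({ filesArePublic: false, hasBudget: false, canRunServer: true })
```

In practice the choice also depends on which formats must be supported, since the front‑end libraries covered here differ considerably in fidelity between DOCX, XLSX, PPTX, and PDF.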