How to Generate Paginated PDFs from HTML with html2canvas and jsPDF

This article explains how to convert HTML content—including text, images, and tables—into PDF files using html2canvas and jsPDF, and details a dynamic pagination technique to prevent content truncation across pages, while also covering performance and export considerations.

Goodme Frontend Team

Introduction

As contracts become essential for protecting the rights of partners, investors, and suppliers, generating PDF contracts online can greatly reduce communication and time costs.

Typical contract documents contain headers, footers, and multiple pages, requiring proper pagination of the generated PDF.

Requirement Analysis

The PDF generation scenario can be divided into two main steps:

Convert the required text, images, tables, etc., into a PDF.

For lengthy content, calculate pagination points to avoid content being split across pages.

We will start with PDF generation.

PDF Generation

Solution Selection

Two mainstream solutions exist:

Use only the built‑in jsPDF API.

Combine html2canvas with jsPDF.

This article adopts the second solution for its flexibility, broader CSS support, easier handling of complex layouts, cross‑browser consistency, and performance, as detailed below.

Advantages of html2canvas + jsPDF

Flexibility & Compatibility

html2canvas can render most HTML content (text, images, styled elements) to a canvas that closely matches what the browser displays, although it does not support every CSS property.

jsPDF alone has limited CSS handling.

Complex Layout Handling

Canvas rendering allows precise control of multi‑column layouts, tables, and images.

Pagination becomes straightforward by slicing the canvas into separate images for each PDF page.

Cross‑Browser Support

html2canvas draws onto a standard canvas element, which all modern browsers support, so complex styles are captured fairly consistently across browsers.

Performance

Rendering to canvas first leaves jsPDF with only images to embed, which reduces its workload and can speed up PDF creation.

Below is a simplified implementation:

Declare a ref to capture the element.

import { useRef } from 'react';

// Ref that will point at the DOM node to capture.
const contentRef = useRef(null);
...
{/* PDF HTML content area */}
<div ref={contentRef} className="pdf-reviewer">
  ...
</div>

Render the element with html2canvas to obtain canvas data.

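import html2canvas from 'html2canvas';

// The DOM node attached to the ref declared above.
const element = contentRef.current;
const canvas = await html2canvas(element, {
  allowTaint: true,
  scale: window.devicePixelRatio * 2, // render at a higher resolution for sharper output
  useCORS: true, // load cross-origin images via CORS
  windowHeight: element.scrollHeight, // capture the full scrollable height
});
const canvasWidth = canvas.width;
const canvasHeight = canvas.height;
// Export the bitmap first; toDataURL throws if a cross-origin image has tainted the canvas.
const canvasData = canvas.toDataURL('image/jpeg', 1.0);
// Then clear the canvas to release its pixel data.
const context = canvas.getContext('2d');
context.clearRect(0, 0, canvasWidth, canvasHeight);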

Draw the canvas image onto the PDF using jsPDF.

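import { jsPDF } from 'jspdf';
const pdf = new jsPDF({ unit: 'pt', format: 'a4', orientation: 'p' });
// Pass the exported data URL: the canvas itself was cleared in the previous step.
pdf.addImage(canvasData, 'JPEG', x, y, width, height);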
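
The snippet above leaves x, y, width, and height undefined. A minimal sketch of one way to derive them, assuming the image should fill the full page width and keep its aspect ratio. The helper name fitToWidth is hypothetical and not part of either library.

```javascript
// Hypothetical helper: scale a canvas so it spans `targetWidth` PDF units
// while preserving the aspect ratio.
function fitToWidth(canvasWidth, canvasHeight, targetWidth) {
  // Conversion factor from canvas pixels to PDF units.
  const rate = targetWidth / canvasWidth;
  return { rate, width: targetWidth, height: canvasHeight * rate };
}

// Example usage in the browser (assumes `pdf` and `canvas` from the steps above):
//   const pageWidth = pdf.internal.pageSize.getWidth(); // about 595.28 pt for A4
//   const { width, height } = fitToWidth(canvas.width, canvas.height, pageWidth);
//   pdf.addImage(canvasData, 'JPEG', 0, 0, width, height);
```

The same rate can later convert other canvas-pixel measurements into PDF units.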

Open and preview the PDF.

const pdfBlob = pdf.output('blob');
const pdfUrl = URL.createObjectURL(pdfBlob);
window.open(pdfUrl);

With only this basic step, content that straddles a page boundary is cut in two: a line of text can be sliced at the seam between two PDF pages, which is unacceptable for a contract.

Analysis

The cause is that html2canvas produces a single tall canvas image. When that image is placed into the PDF, it is divided at fixed page-height intervals with no regard for the content that lies on each cut line.

Therefore, we must adjust the content before inserting the canvas so that elements avoid the pagination point.

PDF Pagination

Solution Selection

Manual Pagination: Adjust document styles by hand to avoid page breaks. Suitable for static content but hard to maintain.

Dynamic Pagination: Calculate pagination points based on HTML content and PDF page size, ensuring elements never cross a page boundary. This approach requires minimal maintenance.

The article proceeds with the dynamic pagination solution.

Dynamic Pagination

Understanding the PDF generation flow is essential for calculating heights and positions.

PDF Generation Process

Obtain the root DOM element.

Use html2canvas to render the element to a canvas and get the total height.

Calculate header and footer heights.

Determine the actual content height per page after subtracting header/footer and spacing.

Iterate over element nodes, using the per‑page height to compute total pages and pagination points, storing them in a collection.

For each pagination point, slice the canvas data and add the image to the PDF.

Add headers and footers.
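
The slicing step in the flow above can be sketched as follows. This is an illustration under assumptions, not the team's implementation: the names toPageRanges and sliceCanvas are hypothetical, and the pagination points are assumed to be integer offsets already expressed in canvas pixels.

```javascript
// Turn pagination points into { start, end } ranges, one per PDF page.
function toPageRanges(points, totalHeight) {
  const starts = [0, ...points];
  return starts.map((start, i) => ({
    start,
    end: i + 1 < starts.length ? starts[i + 1] : totalHeight,
  }));
}

// Copy one vertical band of the source canvas into a new canvas and
// return it as a JPEG data URL (browser only).
function sliceCanvas(source, startY, endY) {
  const slice = document.createElement('canvas');
  slice.width = source.width;
  slice.height = endY - startY;
  const ctx = slice.getContext('2d');
  // JPEG has no alpha channel, so paint a white background first.
  ctx.fillStyle = '#fff';
  ctx.fillRect(0, 0, slice.width, slice.height);
  ctx.drawImage(
    source,
    0, startY, source.width, slice.height, // source rectangle
    0, 0, slice.width, slice.height        // destination rectangle
  );
  return slice.toDataURL('image/jpeg', 1.0);
}
```

Each range then maps to one page: slice the band, add it with pdf.addImage, and call pdf.addPage() before the next one.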

The difficulty lies in distributing content across pages without truncation, which varies by element type.

Pagination Point

By using the actual content height per page and the total height, we can compute pagination points where the canvas should be split.

Ordinary Elements

If the element’s bottom edge (its top offset plus its height), measured from the previous pagination point, exceeds the page’s usable height, the element’s top becomes a pagination point.
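
A minimal sketch of this rule, assuming each element has already been measured into a { top, height } box in the same units as the usable page height. The function name computeBreakPoints and the fallback for elements taller than a page are assumptions added for illustration.

```javascript
// Hypothetical helper: derive pagination points from measured element boxes.
// `boxes` must be sorted by `top`; all values share one unit.
function computeBreakPoints(boxes, pageHeight) {
  const points = [];
  for (const { top, height } of boxes) {
    let pageStart = points.length > 0 ? points[points.length - 1] : 0;
    const bottom = top + height;
    // The element fits on the current page: nothing to do.
    if (bottom - pageStart <= pageHeight) continue;
    // Otherwise start a new page at the element's top.
    if (top > pageStart) {
      points.push(top);
      pageStart = top;
    }
    // Assumed fallback: an element taller than one page is cut at fixed
    // page-height intervals, since it cannot be kept whole.
    while (bottom - pageStart > pageHeight) {
      pageStart += pageHeight;
      points.push(pageStart);
    }
  }
  return points;
}
```

With three 300-unit blocks stacked on an 800-unit page, the third block would cross the boundary, so the break moves up to its top at 600.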

Tables

Table rows often have specific class names (e.g., "ant-table-row"). When such a class is detected, pagination is performed at the row level rather than descending into child nodes.
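
One way to express "stop at the row" is a traversal that treats any element carrying the row class as an indivisible unit. The sketch below is an assumption about how such a walk could look, not the article's code: collectUnits is a hypothetical name, and text nodes are ignored here.

```javascript
// Hypothetical helper: collect the elements that should be kept whole.
// An element is a unit if it carries `atomicClass` or has no element children.
function collectUnits(root, atomicClass = 'ant-table-row', out = []) {
  for (const child of root.children) {
    const isAtomic = child.classList.contains(atomicClass);
    if (isAtomic || child.children.length === 0) {
      out.push(child); // keep whole; do not descend into its cells
    } else {
      collectUnits(child, atomicClass, out);
    }
  }
  return out;
}
```

The returned elements can then be measured and checked against the page height one by one.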

Text

Text nodes require deeper handling because a single text element may span multiple lines. The algorithm checks the node type, obtains the parent’s height and line height, calculates the distance from the current page bottom, and determines the pagination point accordingly.

// Text node: snap the page break to a whole number of lines.
if (one.nodeType === Node.TEXT_NODE) {
  const parent = one.parentNode;
  const top = Math.max(0, rate * getBaseElementTop(parent));
  const elementHeight = rate * parent.offsetHeight;
  const style = window.getComputedStyle(parent);
  // lineHeight can be the keyword "normal"; fall back to an approximate 1.2 × font size.
  const lineHeightValue = parseFloat(style.lineHeight) || parseFloat(style.fontSize) * 1.2;
  const lineHeight = lineHeightValue * rate;
  const previousPoint = pages.length > 0 ? pages[pages.length - 1] : 0;
  // The text block extends past the bottom of the current page.
  if (top + elementHeight - previousPoint > originalPageHeight) {
    // Space left on the current page below the top of the text block.
    const currentRemainHeight = previousPoint + originalPageHeight - top;
    // Drop the partial line so the break falls between two lines.
    const remainder = currentRemainHeight % lineHeight;
    pages.push(previousPoint + originalPageHeight - remainder);
  }
}
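
To make the arithmetic concrete, the core of the calculation can be isolated into a pure function. The name snapBreakToLine is hypothetical; the formula is the one used in the snippet above.

```javascript
// Hypothetical helper: move a page break up to the nearest line boundary.
// All arguments share one unit (for example, scaled pixels).
function snapBreakToLine(previousPoint, pageHeight, top, lineHeight) {
  // Space left on the current page below the top of the text block.
  const remaining = previousPoint + pageHeight - top;
  // Height of the partial line that would otherwise be cut.
  const remainder = remaining % lineHeight;
  return previousPoint + pageHeight - remainder;
}

// A text block starts at 700 on a page that ends at 800, with 24-unit lines.
// 100 units remain, which holds four full lines (96), so the break is at 796.
const breakPoint = snapBreakToLine(0, 800, 700, 24);
```

When the remaining space is an exact multiple of the line height, the remainder is zero and the break stays at the page bottom.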

Manual Marking of Pagination Points

In some cases you may want to force a new page. Assign a specific class name to the element where a page break should occur; the algorithm will treat the element’s top as a pagination point.

Additional Considerations

Export Issues

When the PDF is large, opening it through a data URL can fail to load. Output the PDF as a Blob and open an object URL instead. If only the data URI string is available, convert it to a Blob first.

// Option 1 (preferred): let jsPDF produce the Blob directly.
const pdfBlob = obj.getPDF().output('blob');
const pdfUrl = URL.createObjectURL(pdfBlob);
window.open(pdfUrl);

// Option 2: convert a base64 data URI string to a Blob by hand.
const dataURLtoBlob = (dataurl) => {
  const [header, base64] = dataurl.split(',');
  const mime = header.match(/:(.*?);/)[1];
  // atob accepts the padded base64 payload as is; no trimming is needed.
  const bstr = atob(base64);
  let n = bstr.length;
  const u8arr = new Uint8Array(n);
  while (n--) { u8arr[n] = bstr.charCodeAt(n); }
  return new Blob([u8arr], { type: mime });
};
const blob = dataURLtoBlob(obj.getPDF().output('datauristring'));
const blobUrl = URL.createObjectURL(blob);
window.open(blobUrl);

Content Too Long

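Browsers cap the size of a canvas, and the cap varies by browser, version, and device (for example, a total area of roughly 16384×16384 pixels in Chrome, about 11164×11164 in Firefox, and 4096×4096 in Safari on iOS). Exceeding the limit can cause blank pages or crashes. Because the capture scale multiplies both dimensions, long documents reach the limit quickly. The workaround is to detect the failure and re-render the element as several shorter segments, each of which is added to the PDF separately.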

// Render the whole element at once; fall back to segmented rendering if that fails.
// `width` is the target image width in PDF units.
async function toCanvasAll(element, width) {
  const canvas = await html2canvas(element, {
    allowTaint: true,
    scale: window.devicePixelRatio * 2,
    useCORS: true,
    windowHeight: element.scrollHeight,
  });
  const canvasWidth = canvas.width;
  const canvasHeight = canvas.height;
  // Conversion factor from canvas pixels to PDF units.
  const rate = width / canvasWidth;
  const height = rate * canvasHeight;
  const canvasData = canvas.toDataURL('image/jpeg', 1.0);
  const context = canvas.getContext('2d');
  context.clearRect(0, 0, canvasWidth, canvasHeight);
  // An empty result ('data:,') typically means the canvas exceeded the browser's size limit.
  if (canvasData === 'data:,') {
    const canvasDataArr = await toCanvasSplit(element, rate);
    return { totalHeight: height, data: canvasDataArr.sort((a, b) => a.index - b.index) };
  }
  return { totalHeight: height, data: [{ width, height, index: 0, data: canvasData, start: 0, end: height }] };
}

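// Render the element in `parts` vertical segments; on failure, retry with one more segment.
async function toCanvasSplit(element, rate, parts = 2) {
  // Cumulative end offset of each segment, as the loop below expects.
  const yOffsets = distributeEvenlySimple(element.scrollHeight, parts);
  let res;
  try {
    const arr = [];
    for (let index = 0; index < yOffsets.length; index++) {
      const previous = yOffsets[index - 1] || 0;
      // Capture only the band between the previous offset and the current one.
      const canvas = await html2canvas(element, {
        allowTaint: true,
        scale: window.devicePixelRatio * 2,
        useCORS: true,
        y: previous,
        height: yOffsets[index] - previous,
      });
      const width = rate * canvas.width;
      const height = rate * canvas.height;
      const canvasData = canvas.toDataURL('image/jpeg', 1.0);
      // Still too large: abort this attempt and retry with more segments.
      if (canvasData === 'data:,') { throw new Error('canvasData is empty'); }
      const context = canvas.getContext('2d');
      context.clearRect(0, 0, canvas.width, canvas.height);
      // Each segment starts where the previous one ended, in PDF units.
      const start = arr[index - 1]?.end || 0;
      arr.push({ width, height, index, data: canvasData, start, end: start + height });
    }
    res = arr;
  } catch (e) {
    console.warn('error', e);
    // Safety cap (arbitrary, not part of the original logic) so a persistent error cannot recurse forever.
    if (parts >= 50) throw e;
    res = await toCanvasSplit(element, rate, parts + 1);
  }
  return res;
}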
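
toCanvasSplit calls distributeEvenlySimple, which is not defined in this article. Judging from how its result is used (each entry is treated as the end offset of one segment), a hypothetical version might look like the sketch below. It assumes an integer total, which holds for element.scrollHeight.

```javascript
// Hypothetical implementation: split `total` into `parts` nearly equal
// segments and return the cumulative end offset of each segment.
function distributeEvenlySimple(total, parts) {
  const base = Math.floor(total / parts);
  const extra = total % parts; // the first `extra` segments get one more unit
  const offsets = [];
  let end = 0;
  for (let i = 0; i < parts; i++) {
    end += base + (i < extra ? 1 : 0);
    offsets.push(end);
  }
  return offsets;
}
```

The last offset always equals the total, so the segments cover the element without gaps or overlap.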

Pending Optimizations

Long Generation Time: The footer is rendered again for every page because page numbers are dynamic, which adds significant latency. A possible improvement is to render all footers once, slice them, and reuse the images.

Hiding HTML Elements During Generation: Making the source element invisible can prevent it from being rendered. Instead, set the element’s visibility to hidden and opacity to 0 while keeping it in the layout, or move it off‑screen with a large negative margin, ensuring the page height remains unchanged.

Conclusion

This exploration demonstrates how to generate PDFs from HTML with dynamic pagination, handling complex layouts, performance constraints, and browser limitations. The techniques can be extended to more sophisticated business scenarios, encouraging further discussion and improvement.
