Master Web Automation with Puppeteer: PDFs, Testing, and Performance Tracing
This article introduces Puppeteer, a Node library for controlling Chrome/Chromium, covers its installation and key APIs, and demonstrates practical use cases such as generating PDFs, automating UI tests under mobile emulation, and capturing performance traces for web pages.
What Is Puppeteer?
Puppeteer is an official Google‑maintained Node library that provides a high‑level API to control Chrome or Chromium browsers, both with a visible UI and in headless mode. It enables developers to automate tasks that would normally require manual interaction with a browser.
Key Features
Capture web page snapshots and export them as PDFs or images.
Crawl Single‑Page Applications (SPAs) and generate pre‑rendered content, i.e. server‑side rendering (SSR).
Create automated test cases for form submission, UI interactions, keyboard input, etc.
Record site timelines to analyze performance.
Installation
Install Puppeteer like any other npm package:
npm i --save puppeteer
If the Chromium binary fails to download (common in restricted network environments), set an environment variable before installing to skip the download, then install Chromium manually. In the Windows command prompt:
set PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1
On macOS or Linux, use export PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 instead. After downloading Chromium, configure the executable path in the launch options:
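const puppeteer = require('puppeteer');

(async () => {
  // Point Puppeteer at the manually installed Chromium (macOS path shown).
  const browser = await puppeteer.launch({
    executablePath: '/Applications/Chromium.app/Contents/MacOS/Chromium'
  });
  // ... use the browser here ...
  await browser.close();
})();
Common API Syntax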
puppeteer.launch() – Starts a browser instance (returns a Promise).
browser.newPage() – Opens a new page (tab) in the browser (returns a Promise).
page.setViewport({width, height, deviceScaleFactor, isMobile, hasTouch, isLandscape}) – Configures the viewport dimensions and device characteristics.
page.goto(url, options) – Navigates to a URL. The URL should include the scheme, e.g. https://.
page.waitFor(timeout) – Pauses execution for the specified number of milliseconds. This method was deprecated and removed in later Puppeteer releases, so check the API of the version you have installed.
page.waitForSelector(selector) – Waits until an element matching the selector appears in the DOM.
page.$(selector) – Returns the first element matching the selector (or null).
page.$$(selector) – Returns an array of all matching elements.
page.$eval(selector, pageFunction, ...args) – Runs pageFunction in the page with the first matched element as its argument.
page.$$eval(selector, pageFunction, ...args) – Runs pageFunction in the page with the array of matched elements as its argument.
browser.close() – Closes all pages and terminates Chromium.
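To show how a few of these calls combine, here is a minimal sketch rather than a definitive pattern. The helper name collectLinks is hypothetical (it is not part of Puppeteer), and the function assumes it receives a Puppeteer page that has already navigated to a document.

```javascript
// Hypothetical helper (not part of Puppeteer): wait for a selector, then
// read the text and URL of every matching link.
// Assumes `page` is a Puppeteer Page that has already navigated somewhere.
async function collectLinks(page, selector) {
  // Resolves once at least one matching element is in the DOM.
  await page.waitForSelector(selector);

  // The callback runs inside the browser and receives the matched elements
  // as an array; only serializable values can be returned to Node.
  return page.$$eval(selector, (anchors) =>
    anchors.map((a) => ({ name: a.textContent.trim(), href: a.href }))
  );
}
```

Inside an async script it could be called as const links = await collectLinks(page, '#sidebar ol li a'); waiting on the selector is usually more reliable than a fixed delay because it resumes as soon as the element exists.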
Example 1: Capture Webpage as PDF
The script below visits the ECMAScript 6 tutorial site, extracts navigation links, and saves each linked page as a separate PDF file.
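const URL = 'http://es6.ruanyifeng.com';
const puppeteer = require('puppeteer');
const fs = require('fs');

// Create the output directory; { recursive: true } avoids an error if it already exists.
fs.mkdirSync('es6-pdf', { recursive: true });

// Fixed delay that works in any Puppeteer version (page.waitFor was removed in later releases).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(URL);
  await sleep(5000); // give the sidebar time to render

  // Collect the text and target URL of every sidebar link.
  const aTags = await page.evaluate(() => {
    const elems = [...document.querySelectorAll('#sidebar ol li a')];
    return elems.map(a => ({ href: a.href.trim(), name: a.text }));
  });

  // Save the page that is already open under the first link's name.
  await page.pdf({ path: `./es6-pdf/0.${aTags[0].name}.pdf` });

  // Open each remaining link in its own tab and save it as a PDF.
  for (let i = 1; i < aTags.length; i++) {
    const a = aTags[i];
    console.log('Saving:', a.name);
    const newPage = await browser.newPage();
    await newPage.goto(a.href);
    await sleep(5000);
    await newPage.pdf({ path: `./es6-pdf/${i}.${a.name}.pdf` });
    await newPage.close();
  }
  await browser.close();
})();
Running the script produces a series of PDF files, one for each section of the tutorial.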
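The script above uses link text directly as a file name, which can fail if a title contains characters such as / or :. One possible guard is sketched here; toPdfPath is a hypothetical helper, and the set of replaced characters is an assumption that may need adjusting for your platform.

```javascript
const path = require('path');

// Hypothetical helper: build a safe PDF path from a link title.
// The character class covers path separators and characters that are
// reserved in Windows file names; adjust it for your platform.
function toPdfPath(dir, index, name) {
  const safeName = name
    .trim()
    .replace(/[\\/:*?"<>|]/g, '_') // replace reserved characters
    .replace(/\s+/g, ' ');          // collapse runs of whitespace
  return path.join(dir, `${index}.${safeName}.pdf`);
}
```

In the loop, the path option could then be written as path: toPdfPath('es6-pdf', i, a.name).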
Example 2: Automated UI Testing on Mobile
This example demonstrates how to emulate an iPhone X, navigate to an automotive website, interact with UI elements, and capture screenshots at each step.
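const URL = 'https://m.autohome.com.cn/';
const puppeteer = require('puppeteer');
const fs = require('fs');

// Device presets: newer releases export KnownDevices, earlier ones puppeteer.devices;
// the oldest releases used require('puppeteer/DeviceDescriptors') instead.
const devices = puppeteer.KnownDevices || puppeteer.devices;
const iPhone = devices['iPhone X'];

fs.mkdirSync('screenshot', { recursive: true }); // no error if the directory already exists

// Fixed delay that works in any Puppeteer version (page.waitFor was removed in later releases).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

(async () => {
  // headless: false opens a visible browser window so the run can be watched.
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  // Emulate before navigating so the page loads with the iPhone X viewport and user agent.
  await page.emulate(iPhone);
  await page.goto(URL);
  await sleep(3000);
  await page.screenshot({ path: 'screenshot/1.png' });
  await page.tap('body > section.wrapper > ...'); // select a car (selector truncated in the source)
  await sleep(3000);
  await page.screenshot({ path: 'screenshot/2.png' });
  // Fill inquiry form; delay is the pause between keystrokes in milliseconds.
  await page.type('#userName', '测试人员', { delay: 200 }); // "测试人员" means "tester"
  await page.type('#userPhone', '13333333333', { delay: 200 });
  await page.screenshot({ path: 'screenshot/5.png' });
  await browser.close();
})();
The script records each interaction as a screenshot, providing a visual audit trail for the test flow.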
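Numbering screenshot files by hand becomes error-prone as a test flow grows. As one possible approach, the sketch below wraps page.screenshot in a counter; createStepRecorder is a hypothetical helper name, and it assumes the target directory already exists.

```javascript
const path = require('path');

// Hypothetical helper: returns a function that saves consecutively numbered
// screenshots (1.png, 2.png, ...) into `dir`.
// Assumes `page` is a Puppeteer Page and that `dir` already exists.
function createStepRecorder(page, dir) {
  let step = 0;
  return async function recordStep() {
    step += 1;
    const file = path.join(dir, `${step}.png`);
    await page.screenshot({ path: file });
    return file; // useful for logging which file was written
  };
}
```

A test script would create the recorder once with const recordStep = createStepRecorder(page, 'screenshot'); and then call await recordStep(); after each interaction.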
Example 3: Performance Tracing
To analyze page load performance under mobile emulation, the script starts a Chrome tracing session, navigates to the target URL, stops tracing, and saves the trace file for inspection in Chrome DevTools.
const URL = 'https://m.autohome.com.cn/';
const puppeteer = require('puppeteer');
const fs = require('fs');

// Device presets: newer releases export KnownDevices, earlier ones puppeteer.devices;
// the oldest releases used require('puppeteer/DeviceDescriptors') instead.
const devices = puppeteer.KnownDevices || puppeteer.devices;
const iPhone = devices['iPhone X'];

fs.mkdirSync('performance', { recursive: true }); // no error if the directory already exists

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.emulate(iPhone);
  // Everything between start() and stop() is recorded into the trace file.
  await page.tracing.start({ path: 'performance/trace.json' });
  await page.goto(URL);
  await page.tracing.stop();
  await browser.close();
})();
Open performance/trace.json in Chrome’s Performance panel to view a flame chart of resource loading, scripting, and rendering phases.
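The trace can also be inspected from Node for a quick summary. The sketch below assumes the file follows the Chrome Trace Event Format: either a top-level traceEvents array or a bare array of events, in which complete events (ph equal to 'X') carry a dur field in microseconds. summarizeTrace is a hypothetical helper name, not a Puppeteer API.

```javascript
// Hypothetical helper: total duration per event name, in milliseconds.
// Assumes the Chrome Trace Event Format, where "complete" events have
// ph === 'X' and a `dur` field measured in microseconds.
function summarizeTrace(trace, topN = 5) {
  const events = Array.isArray(trace) ? trace : trace.traceEvents || [];
  const totals = new Map();
  for (const event of events) {
    if (event.ph !== 'X' || typeof event.dur !== 'number') continue;
    totals.set(event.name, (totals.get(event.name) || 0) + event.dur / 1000);
  }
  // Sort by total time, longest first, and keep the top entries.
  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN)
    .map(([name, ms]) => ({ name, ms }));
}

// Usage, once the tracing script has written the file:
// const fs = require('fs');
// const trace = JSON.parse(fs.readFileSync('performance/trace.json', 'utf8'));
// console.table(summarizeTrace(trace));
```

This gives only a coarse ranking of where time went; the Performance panel remains the better tool for examining individual frames and call stacks.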
Conclusion
Puppeteer offers a versatile set of APIs for web scraping, UI testing, PDF generation, and performance analysis. The examples above cover only a fraction of its capabilities; developers are encouraged to explore the official repository for advanced scenarios such as network interception, custom authentication flows, and parallel page handling.
References
Wikipedia – Web crawler
Puppeteer official GitHub repository