Creative Front‑End Anti‑Crawling Tricks Every Developer Should Know

This article surveys front‑end anti‑crawling techniques, including font‑face obfuscation, background‑image sprites, pseudo‑elements, and iframe loading. It shows how developers can make data extraction harder for bots while acknowledging that no method is foolproof.

Tencent IMWeb Frontend Team

Jul 13, 2017

1. Introduction

We usually want a web page to have clean structure and clear content so that search engines can index it. Some data, however, should not be easy to harvest, such as e‑commerce revenue figures or exam questions. That tension is where crawlers and anti‑crawler measures come in.

2. Common Anti‑Crawler Strategies

No anti‑crawler strategy is perfect. Most back‑end approaches try to tell human visitors from bots, using checks such as:

User‑Agent and Referer checks

Account and Cookie verification

CAPTCHA

IP rate limiting

Crawlers can mimic humans using headless browsers, OCR for CAPTCHAs, or purchased proxy IPs, so 100% protection is impossible.
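
As an illustration of the last item in the list, IP rate limiting can be sketched as a sliding window kept per client address. This is a minimal sketch, not taken from any of the sites discussed here; the class name, the limit, and the window length are all assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over budget: reject, or escalate to a CAPTCHA
        hits.append(now)
        return True
```

A crawler that rotates through purchased proxy IPs gets a fresh budget per address, which is exactly why this check alone is not enough.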

3. Front‑End Anti‑Crawler Techniques

3.1 Font‑Face Obfuscation

Example: Maoyan movies – numeric data is rendered via a custom font and Unicode mapping, requiring the crawler to download and decode the font. The font URL changes on each refresh, increasing difficulty.
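
The idea can be sketched from the server's side as a per‑request mapping from digits to private‑use code points. This is a hedged illustration, not Maoyan's actual implementation: the function names are hypothetical, and generating the matching font file (whose glyphs at those code points would draw the real digits) is assumed to happen elsewhere.

```python
import html
import random
from typing import Dict

PUA_START = 0xE000  # first code point of the Unicode Private Use Area


def make_mapping(rng: random.Random) -> Dict[str, str]:
    """Give each digit 0-9 a distinct, randomly chosen private-use character."""
    code_points = rng.sample(range(PUA_START, PUA_START + 0x1000), 10)
    return {str(digit): chr(cp) for digit, cp in enumerate(code_points)}


def obfuscate(text: str, mapping: Dict[str, str]) -> str:
    """Replace every digit with the HTML entity of its private-use character."""
    return "".join(
        f"&#x{ord(mapping[ch]):x};" if ch in mapping else ch for ch in text
    )


def decode(markup: str, mapping: Dict[str, str]) -> str:
    """What a crawler has to do after recovering the mapping from the font."""
    reverse = {char: digit for digit, char in mapping.items()}
    return "".join(reverse.get(ch, ch) for ch in html.unescape(markup))
```

Because `make_mapping` is called again for every response, a mapping recovered from one page load is useless for the next one.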

3.2 Background‑Image Sprites

Example: Meituan – numbers are displayed as background‑positioned images, with different offsets for each digit.
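
A rough model of the sprite approach follows. It assumes a single sprite image with the digits 0 to 9 laid out left to right in equal cells; the cell width, the class name, and the function names are illustrative, not Meituan's.

```python
import re

DIGIT_WIDTH = 12  # assumed width in pixels of one digit cell in the sprite


def render_number(number: str) -> str:
    """Emit one empty span per digit; only the background offset encodes it."""
    spans = []
    for ch in number:
        offset = -int(ch) * DIGIT_WIDTH  # shift the sprite left to expose digit `ch`
        spans.append(
            f'<span class="digit" style="background-position: {offset}px 0"></span>'
        )
    return "".join(spans)


def read_number(markup: str) -> str:
    """Crawler side: turn the offsets back into digits."""
    offsets = re.findall(r"background-position: (-?\d+)px", markup)
    return "".join(str(-int(offset) // DIGIT_WIDTH) for offset in offsets)
```

The markup contains no digit characters in text nodes at all, so a scraper has to know the sprite layout before the offsets mean anything.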

3.3 Hidden Characters

Some public‑account articles insert random characters and hide them with CSS, making simple text extraction harder.
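
A minimal sketch of this trick, assuming the junk characters are wrapped in spans that share one class the stylesheet hides. The class name `x9` and all function names are hypothetical.

```python
import random
import string
from html.parser import HTMLParser

HIDDEN_CLASS = "x9"  # hypothetical; the stylesheet would set .x9 { display: none }


def add_decoys(text: str, rng: random.Random, rate: float = 0.3) -> str:
    """After each character, maybe insert a hidden span holding a junk letter."""
    out = []
    for ch in text:
        out.append(ch)
        if rng.random() < rate:
            junk = rng.choice(string.ascii_lowercase)
            out.append(f'<span class="{HIDDEN_CLASS}">{junk}</span>')
    return "".join(out)


class VisibleText(HTMLParser):
    """Collect text while skipping everything inside hidden elements."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if self._hidden_depth or dict(attrs).get("class") == HIDDEN_CLASS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self.parts.append(data)


def visible_text(markup: str) -> str:
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```

A scraper that simply strips tags keeps the junk letters; only one that knows which class is hidden recovers the original text.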

3.4 Pseudo‑Element Content

Example: Autohome – critical information is placed in CSS pseudo‑elements, forcing the crawler to parse CSS to retrieve it.
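
To make the mechanism concrete, the sketch below moves words out of the HTML and into `::before` rules, then shows the lookup a crawler would need to perform. The `kw` class prefix and both function names are assumptions for illustration, and the words are assumed to contain no double quotes.

```python
import re
from typing import List, Tuple


def hide_in_pseudo_elements(words: List[str]) -> Tuple[str, str]:
    """Return (markup, css): the markup has empty spans, the CSS carries the text."""
    spans, rules = [], []
    for i, word in enumerate(words):
        cls = f"kw{i}"  # hypothetical class name
        spans.append(f'<span class="{cls}"></span>')
        rules.append(f'.{cls}::before {{ content: "{word}"; }}')
    return "".join(spans), "\n".join(rules)


def resolve(markup: str, css: str) -> str:
    """Crawler side: parse the CSS and substitute each span's generated content."""
    content = dict(
        re.findall(r'\.([\w-]+)::before \{ content: "([^"]*)"; \}', css)
    )
    return re.sub(
        r'<span class="([\w-]+)"></span>',
        lambda match: content.get(match.group(1), ""),
        markup,
    )
```

Generated content is not part of the DOM text, so anything that reads only text nodes of the HTML sees empty spans.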

3.5 Element Position Overlap

Example: Qunar – the price is first rendered with several i tags, some of which deliberately show wrong digits; two absolutely positioned b tags are then laid over those wrong digits, so the correct price appears only after visual rendering.
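
The overlay idea can be modelled as painting: whichever digit is drawn last at a position is the one the reader sees. The sketch below is a simplification under stated assumptions: every digit cell has the same width, and each overlay is described only by its left offset in pixels and the digit it shows. `CELL_WIDTH` and the function name are hypothetical.

```python
from typing import Iterable, Tuple

CELL_WIDTH = 16  # assumed width in pixels of one digit cell


def visible_price(base_digits: str, overlays: Iterable[Tuple[int, str]]) -> str:
    """Apply (left_px, digit) overlays on top of the base row of digits."""
    cells = list(base_digits)
    for left_px, digit in overlays:
        cells[left_px // CELL_WIDTH] = digit  # the overlay hides the base digit
    return "".join(cells)
```

For example, `visible_price("7459", [(0, "1"), (32, "2")])` covers the first and third cells, so the base row is only a decoy for anyone reading the markup in source order.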

3.6 Iframe Asynchronous Loading

Example: NetEase Cloud Music – the initial HTML contains only an empty iframe (src="about:blank"); JavaScript later injects the full page into the iframe, requiring the crawler to execute scripts or intercept the network request.

3.7 Split‑Node Digits

Example: Proxy‑IP listings – the IP address is split into separate DOM nodes with decoy numbers inserted between them, confusing simple scrapers.
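
A small sketch of the split‑node pattern, assuming the decoys are hidden with an inline `display:none` style. Real listings may hide them through classes or other rules instead, and the function names here are made up.

```python
import re
from typing import List

HIDDEN = re.compile(r'<span style="display:none">.*?</span>')
TAG = re.compile(r"<[^>]+>")


def scramble_ip(ip: str, decoys: List[str]) -> str:
    """Put each octet in its own span, preceded by a hidden decoy number."""
    pieces = []
    for i, octet in enumerate(ip.split(".")):
        decoy = decoys[i % len(decoys)]
        pieces.append(
            f'<span style="display:none">{decoy}</span><span>{octet}</span>'
        )
    return ".".join(pieces)


def naive_text(markup: str) -> str:
    """A simple scraper: strip the tags and keep all text."""
    return TAG.sub("", markup)


def visible_ip(markup: str) -> str:
    """A style-aware scraper: drop hidden nodes first, then strip the tags."""
    return TAG.sub("", HIDDEN.sub("", markup))
```

With the decoys `["25", "8"]`, the address `10.0.3.7` reads as `2510.80.253.87` to the naive scraper, a string that is silently wrong rather than obviously broken.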

3.8 Character‑Set Replacement

Example: Qunar mobile – the HTML contains "3211", but the page redefines the character set so that the glyphs for 1 and 3 are swapped and the reader sees "1233". A crawler that reads the markup gets the wrong number.
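
The "3211" versus "1233" example is consistent with a plain substitution in which 1 and 3 trade places. The sketch below assumes exactly that swap; a real page could remap any subset of digits, and the names are illustrative.

```python
# Translation table for the assumed swap: "1" <-> "3", everything else unchanged.
SWAP = str.maketrans("13", "31")


def to_markup(displayed: str) -> str:
    """Server side: write the digits that will render as `displayed`."""
    return displayed.translate(SWAP)


def to_displayed(markup_digits: str) -> str:
    """Crawler side: undo the swap (the same table, since a swap is its own inverse)."""
    return markup_digits.translate(SWAP)
```

The table itself lives in the font or character‑set definition, so the crawler has to extract it from there before the markup can be trusted.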
