Tagged articles

anti‑crawling

23 articles · Page 1 of 1

Jul 1, 2025 · Operations

How to Scrape Weibo Data with Python: Complete Guide & Code

This tutorial walks through using Python to crawl Weibo, covering environment setup, three login methods, data extraction functions for user info, posts and comments, anti‑crawling strategies, storage to CSV or MySQL, a full example script, and legal considerations.

Selenium · Web Scraping · Weibo

0 likes · 12 min read

Architect

Dec 29, 2023 · Industry Insights

How Bilibili Built a Scalable Anti‑Crawling System: Architecture, Data Flow, and Real‑World Impact

The article details Bilibili's comprehensive anti‑crawling solution, covering the problem background, a two‑layer detection framework integrated with APIGW and GAIA, risk perception, strategy iteration, verification mechanisms, quantitative results, and future improvement directions, all illustrated with concrete examples and performance numbers.

API Security · Bilibili · Operations

0 likes · 23 min read

High Availability Architecture

Dec 20, 2023 · Information Security

API Anti‑Crawling and Security Architecture: Risk Detection, Strategy, and Effectiveness at Bilibili

This article details Bilibili's comprehensive anti‑crawling system, covering the background of API abuse, the data‑flow framework, risk perception, strategy iteration, verification mechanisms, gateway signing design, and the measurable impact on normal and special‑case interfaces.

Bilibili · Risk Detection · Verification

0 likes · 19 min read

Bilibili Tech

Dec 19, 2023 · Information Security

API Anti-Crawling Architecture and Effectiveness at Bilibili

Bilibili combats API abuse by deploying a two‑layer anti‑crawling system—gateway‑side signature verification and a GAIA risk‑control engine integrated into APIGW—that unifies device data, applies flexible rule packages, triggers diverse human challenges, and has already blocked billions of malicious requests with over 85% recall while preventing service outages.

API Security · Bilibili · Traffic analysis

0 likes · 22 min read

Architecture Digest

Sep 24, 2022 · Information Security

Web Crawling and Anti‑Crawling Techniques: Principles, Implementation, and Countermeasures

This article explains the technical principles and implementation steps of web crawlers, introduces common crawling frameworks, provides a Python example for extracting app store rankings, and then details various anti‑crawling methods such as CSS offset, image camouflage, custom fonts, dynamic rendering, captchas, request signing, and honeypots, followed by counter‑strategies for each.

Python · Scrapy · anti‑crawling

0 likes · 24 min read

vivo Internet Technology

Sep 14, 2022 · Information Security

Web Crawling, Anti‑Crawling, and Anti‑Anti‑Crawling Techniques: Principles, Frameworks, and Code Examples

The article explains web‑crawling basics with Python and Scrapy examples, then surveys common anti‑crawling defenses such as CSS offsets, image camouflage, custom fonts, dynamic rendering, captchas, request signatures, and honeypots. It closes with anti‑anti‑crawling countermeasures, including CSS‑offset reversal, font decoding, headless‑browser rendering, and YOLOv5‑based captcha cracking, while stressing legal compliance.

Python · Scrapy · anti‑crawling

0 likes · 25 min read

Python Programming Learning Circle

Apr 8, 2021 · Information Security

Multiple Anti‑Crawling Measures and Best Practices for Web Scraping

The article outlines several anti‑crawling techniques—including IP restrictions, User‑Agent validation, CAPTCHAs, AJAX loading, noscript tags, and cookie checks—while also offering practical advice for writing ethical, efficient, and robust web crawlers.

IP blocking · Scraping · User-Agent

0 likes · 6 min read

Python Programming Learning Circle

Jun 3, 2020 · Information Security

Anti‑Crawling Techniques: Server‑Side and Client‑Side Detection Strategies

The article examines why web content needs protection, explains common server‑side header checks, describes client‑side JavaScript fingerprinting and headless‑browser detection methods, and outlines practical anti‑crawling measures such as CAPTCHAs and robots.txt, highlighting the ongoing cat‑and‑mouse game between crawlers and defenders.

HTTP header inspection · anti‑crawling · captcha

0 likes · 12 min read

58 Tech

May 13, 2020 · Information Security

Dynamic Signature Strategies for API Security: Attack and Defense Techniques

This article explores the cat‑and‑mouse battle between crawlers and API endpoints, detailing how dynamic signatures, token‑based authentication, time‑bound hashes, rate‑limiting, and code obfuscation can be used to defend against scraping while also showing how attackers can reverse‑engineer and bypass these defenses.

API Security · anti‑crawling · code-obfuscation

0 likes · 12 min read

Python Programming Learning Circle

Feb 10, 2020 · Information Security

Common Anti‑Crawling Techniques and Countermeasures for Python Web Scrapers

The article outlines typical anti‑crawling measures such as browser detection, captchas, login requirements, JavaScript obfuscation, and behavior‑based blocks, and provides practical counter‑strategies including header spoofing, captcha solving, session/token handling, JS emulation, and human‑like request pacing.

Automation · JavaScript · Session

0 likes · 6 min read

58 Tech

May 8, 2019 · Information Security

Overview of Web Crawling, Anti‑Crawling Techniques, and 58 Anti‑Crawling System

This article introduces the fundamentals of web crawlers, typical crawling methods, and a comprehensive set of anti‑crawling strategies—including IP control, browser and device simulation, CAPTCHA cracking, and traffic analysis—while detailing the architecture and capabilities of the 58 anti‑crawling platform.

Traffic analysis · anti‑crawling · bot detection

0 likes · 17 min read

Sohu Tech Products

Dec 5, 2018 · Backend Development

Overview of Web Crawler Types and the Architecture of the Mole Crawler System

This article explains the evolution and classification of web crawlers, describes the design and components of the Mole distributed crawler—including scheduler, fetcher, processor, rate‑limiting, URL deduplication, and Elasticsearch storage optimization—and outlines common anti‑anti‑crawling strategies.

Elasticsearch · Web Crawler · anti‑crawling

0 likes · 12 min read

Qunar Tech Salon

Jul 27, 2018 · Information Security

Design and Features of an Anti‑Crawling Platform for Large‑Scale Services

The article describes the goals, architecture, core functions, and key characteristics of a comprehensive anti‑crawling platform that systematizes strategy management, data cleaning, monitoring, and rapid response to protect APIs and improve data reliability for large‑scale online services.

Platform · anti‑crawling · backend

0 likes · 10 min read

Qunar Tech Salon

Jul 26, 2018 · Information Security

Understanding Anti‑Crawling: Definitions, Current Landscape, Classifications, and Strategic Insights

The article explains anti‑crawling concepts, current challenges, classification of techniques (client‑side, middle‑layer, server‑side, real‑time vs. non‑real‑time), and argues for a systematic, platform‑driven approach to continuously adapt strategies against evolving web scrapers.

Platform · Strategy · anti‑crawling

0 likes · 8 min read

Qunar Tech Salon

Jul 25, 2018 · Information Security

Understanding Web Crawlers: Definitions, Types, Traffic, and Harm

This article introduces web crawlers, classifies them by technology and intent, presents statistics on crawler traffic across industries and regions, and analyzes the various harms they cause, laying the groundwork for future discussions on anti‑crawling strategies.

Traffic analysis · anti‑crawling · crawler classification

0 likes · 10 min read

MaGe Linux Operations

Dec 5, 2017 · Information Security

How to Defend Your Website Against Web Crawlers: Techniques & Tools

This article explores why web content needs protection, explains common server‑side and client‑side anti‑crawling methods—including User‑Agent checks, token cookies, headless‑browser detection, fingerprinting, captchas, and robots.txt—and offers practical guidance for raising the cost of unauthorized scraping.

Browser Fingerprinting · Headless Browser · anti‑crawling

0 likes · 12 min read

Tencent IMWeb Frontend Team

Jul 13, 2017 · Frontend Development

Creative Front‑End Anti‑Crawling Tricks Every Developer Should Know

This article explores a variety of front‑end anti‑crawling techniques—from font‑face obfuscation and background‑image sprites to pseudo‑elements and iframe loading—illustrating how developers can make data extraction harder for bots while acknowledging that no method is foolproof.

JavaScript · anti‑crawling · frontend

0 likes · 6 min read

21CTO

Jun 24, 2017 · Information Security

Why 95% of Web Traffic Is Bots: Inside the Crawling Arms Race

The article explores the hidden, high‑traffic world of web crawlers and anti‑crawling measures, revealing why most online requests are bots, how companies decide to crawl or block, the technical and organizational challenges involved, and what the future may hold for this perpetual cat‑and‑mouse game.

Industry · anti‑crawling · backend

0 likes · 22 min read

Qunar Tech Salon

Jun 22, 2017 · Information Security

The Dark Side of Web Crawling and Anti‑Crawling: Industry Realities and Technical Challenges

This article explores the often hidden and contentious world of web crawling and anti‑crawling, detailing industry motivations, the massive proportion of bot traffic, the technical arms race between scrapers and defenders, and the broader impact on developers, companies, and security practices.

JavaScript · Python · anti‑crawling

0 likes · 21 min read

MaGe Linux Operations

Jun 3, 2017 · Information Security

The Dark Side of Web Crawling: Industry Secrets, Technical Battles, and Future Trends

This article explores the hidden, often unglamorous world of web crawling and anti‑crawling, detailing why companies need these technologies, the massive traffic they generate, the technical arms race between crawlers and defenders, and the evolving strategies and challenges that shape the industry today.

anti‑crawling · e-commerce · information security

0 likes · 21 min read

Ctrip Technology

May 22, 2017 · Information Security

The Dark Side of Web Crawling and Anti‑Crawling: Industry Realities and Technical Strategies

This article examines the hidden, often unglamorous world of web crawling and anti‑crawling, revealing why companies deploy aggressive scraping and defensive measures, the technical arms race between crawlers and defenders, the impact on engineers' careers, and future trends in this contested space.

Data Scraping · anti‑crawling · information security

0 likes · 21 min read

Ctrip Technology

Jun 30, 2016 · Information Security

Anti‑Crawling Strategies and System Design: Insights from Ctrip Hotel R&D

This article shares practical anti‑crawling concepts, classifications of crawlers, design principles, traditional and JavaScript‑based countermeasures, and operational trade‑offs, illustrating how Ctrip's hotel R&D team balances commercial protection with technical feasibility.

System Design · Traffic Management · anti‑crawling

0 likes · 15 min read

21CTO

Mar 22, 2016 · Information Security

How to Outsmart AI-Powered Web Scrapers: Two Powerful Anti‑Crawling Tricks

Web crawlers, especially AI‑driven ones, threaten site performance and data ownership, so this article reviews common anti‑scraping methods—from IP and header analysis to behavior detection—and reveals two unconventional defenses: data poisoning and a deposit‑based access model that penalize malicious bots.

AI · Data Protection · Web Scraping

0 likes · 5 min read
