Tagged articles
23 articles
Architect
Dec 29, 2023 · Industry Insights

How Bilibili Built a Scalable Anti‑Crawling System: Architecture, Data Flow, and Real‑World Impact

The article details Bilibili's comprehensive anti‑crawling solution, covering the problem background, a two‑layer detection framework integrated with APIGW and GAIA, risk perception, strategy iteration, verification mechanisms, quantitative results, and future improvement directions, all illustrated with concrete examples and performance numbers.

API Security · Bilibili · Operations
0 likes · 23 min read
High Availability Architecture
Dec 20, 2023 · Information Security

API Anti‑Crawling and Security Architecture: Risk Detection, Strategy, and Effectiveness at Bilibili

This article details Bilibili's comprehensive anti‑crawling system, covering the background of API abuse, the data‑flow framework, risk perception, strategy iteration, verification mechanisms, gateway signing design, and the measurable impact on normal and special‑case interfaces.

Bilibili · Risk Detection · anti‑crawling
0 likes · 19 min read
Bilibili Tech
Dec 19, 2023 · Information Security

API Anti-Crawling Architecture and Effectiveness at Bilibili

Bilibili combats API abuse by deploying a two‑layer anti‑crawling system—gateway‑side signature verification and a GAIA risk‑control engine integrated into APIGW—that unifies device data, applies flexible rule packages, triggers diverse human challenges, and has already blocked billions of malicious requests with over 85% recall while preventing service outages.

API Security · Bilibili · Traffic analysis
0 likes · 22 min read
Architecture Digest
Sep 24, 2022 · Information Security

Web Crawling and Anti‑Crawling Techniques: Principles, Implementation, and Countermeasures

This article explains the technical principles and implementation steps of web crawlers, introduces common crawling frameworks, provides a Python example for extracting app store rankings, and then details various anti‑crawling methods such as CSS offset, image camouflage, custom fonts, dynamic rendering, captchas, request signing, and honeypots, followed by counter‑strategies for each.

Python · Scrapy · Web Crawling
0 likes · 24 min read
vivo Internet Technology
Sep 14, 2022 · Information Security

Web Crawling, Anti‑Crawling, and Anti‑Anti‑Crawling Techniques: Principles, Frameworks, and Code Examples

The article explains web‑crawling basics with Python and Scrapy examples, surveys common anti‑crawling defenses such as CSS offsets, image camouflage, custom fonts, dynamic rendering, captchas, request signatures, and honeypots, and finally presents anti‑anti‑crawling countermeasures, including CSS‑offset reversal, font decoding, headless‑browser rendering, and YOLOv5‑based captcha cracking, while stressing legal compliance.

Captcha · Python · Scrapy
0 likes · 25 min read
Python Programming Learning Circle
Jun 3, 2020 · Information Security

Anti‑Crawling Techniques: Server‑Side and Client‑Side Detection Strategies

The article examines why web content needs protection, explains common server‑side header checks, describes client‑side JavaScript fingerprinting and headless‑browser detection methods, and outlines practical anti‑crawling measures such as CAPTCHAs and robots.txt, highlighting the ongoing cat‑and‑mouse game between crawlers and defenders.

Captcha · HTTP header inspection · Web Crawling
0 likes · 12 min read
58 Tech
May 13, 2020 · Information Security

Dynamic Signature Strategies for API Security: Attack and Defense Techniques

This article explores the cat‑and‑mouse battle between crawlers and API endpoints, detailing how dynamic signatures, token‑based authentication, time‑bound hashes, rate‑limiting, and code obfuscation can be used to defend against scraping while also showing how attackers can reverse‑engineer and bypass these defenses.

API Security · anti‑crawling · code obfuscation
0 likes · 12 min read
Python Programming Learning Circle
Feb 10, 2020 · Information Security

Common Anti‑Crawling Techniques and Countermeasures for Python Web Scrapers

The article outlines typical anti‑crawling measures such as browser detection, captchas, login requirements, JavaScript obfuscation, and behavior‑based blocks, and provides practical counter‑strategies including header spoofing, captcha solving, session/token handling, JS emulation, and human‑like request pacing.

Automation · JavaScript · Session
0 likes · 6 min read
58 Tech
May 8, 2019 · Information Security

Overview of Web Crawling, Anti‑Crawling Techniques, and 58 Anti‑Crawling System

This article introduces the fundamentals of web crawlers, typical crawling methods, and a comprehensive set of anti‑crawling strategies—including IP control, browser and device simulation, CAPTCHA cracking, and traffic analysis—while detailing the architecture and capabilities of the 58 anti‑crawling platform.

Traffic analysis · Web Crawling · anti‑crawling
0 likes · 17 min read
Sohu Tech Products
Dec 5, 2018 · Backend Development

Overview of Web Crawler Types and the Architecture of the Mole Crawler System

This article explains the evolution and classification of web crawlers, describes the design and components of the Mole distributed crawler—including scheduler, fetcher, processor, rate‑limiting, URL deduplication, and Elasticsearch storage optimization—and outlines common anti‑anti‑crawling strategies.

Elasticsearch · Web Crawler · anti‑crawling
0 likes · 12 min read
Qunar Tech Salon
Jul 27, 2018 · Information Security

Design and Features of an Anti‑Crawling Platform for Large‑Scale Services

The article describes the goals, architecture, core functions, and key characteristics of a comprehensive anti‑crawling platform that systematizes strategy management, data cleaning, monitoring, and rapid response to protect APIs and improve data reliability for large‑scale online services.

Backend · Security · anti‑crawling
0 likes · 10 min read
Qunar Tech Salon
Jul 26, 2018 · Information Security

Understanding Anti‑Crawling: Definitions, Current Landscape, Classifications, and Strategic Insights

The article explains anti‑crawling concepts, current challenges, classification of techniques (client‑side, middle‑layer, server‑side, real‑time vs. non‑real‑time), and argues for a systematic, platform‑driven approach to continuously adapt strategies against evolving web scrapers.

Web Security · anti‑crawling · platform
0 likes · 8 min read
Qunar Tech Salon
Jul 25, 2018 · Information Security

Understanding Web Crawlers: Definitions, Types, Traffic, and Harm

This article introduces web crawlers, classifies them by technology and intent, presents statistics on crawler traffic across industries and regions, and analyzes the various harms they cause, laying the groundwork for future discussions on anti‑crawling strategies.

Traffic analysis · Web Crawling · anti‑crawling
0 likes · 10 min read
MaGe Linux Operations
Dec 5, 2017 · Information Security

How to Defend Your Website Against Web Crawlers: Techniques & Tools

This article explores why web content needs protection, explains common server‑side and client‑side anti‑crawling methods—including User‑Agent checks, token cookies, headless‑browser detection, fingerprinting, captchas, and robots.txt—and offers practical guidance for raising the cost of unauthorized scraping.

Browser Fingerprinting · Captcha · Headless Browser
0 likes · 12 min read
21CTO
Jun 24, 2017 · Information Security

Why 95% of Web Traffic Is Bots: Inside the Crawling Arms Race

The article explores the hidden, high‑traffic world of web crawlers and anti‑crawling measures, revealing why most online requests are bots, how companies decide to crawl or block, the technical and organizational challenges involved, and what the future may hold for this perpetual cat‑and‑mouse game.

Backend · Web Crawling · anti‑crawling
0 likes · 22 min read
MaGe Linux Operations
Jun 3, 2017 · Information Security

The Dark Side of Web Crawling: Industry Secrets, Technical Battles, and Future Trends

This article explores the hidden, often unglamorous world of web crawling and anti‑crawling, detailing why companies need these technologies, the massive traffic they generate, the technical arms race between crawlers and defenders, and the evolving strategies and challenges that shape the industry today.

Web Crawling · anti‑crawling · e‑commerce
0 likes · 21 min read
Ctrip Technology
May 22, 2017 · Information Security

The Dark Side of Web Crawling and Anti‑Crawling: Industry Realities and Technical Strategies

This article examines the hidden, often unglamorous world of web crawling and anti‑crawling, revealing why companies deploy aggressive scraping and defensive measures, the technical arms race between crawlers and defenders, the impact on engineers' careers, and future trends in this contested space.

Web Crawling · anti‑crawling · data-scraping
0 likes · 21 min read
Ctrip Technology
Jun 30, 2016 · Information Security

Anti‑Crawling Strategies and System Design: Insights from Ctrip Hotel R&D

This article shares practical anti‑crawling concepts, classifications of crawlers, design principles, traditional and JavaScript‑based countermeasures, and operational trade‑offs, illustrating how Ctrip's hotel R&D team balances commercial protection with technical feasibility.

Backend · System Design · Web Security
0 likes · 15 min read
21CTO
Mar 22, 2016 · Information Security

How to Outsmart AI-Powered Web Scrapers: Two Powerful Anti‑Crawling Tricks

Web crawlers, especially AI‑driven ones, threaten site performance and data ownership, so this article reviews common anti‑scraping methods—from IP and header analysis to behavior detection—and reveals two unconventional defenses: data poisoning and a deposit‑based access model that penalize malicious bots.

AI · Data Protection · Web Scraping
0 likes · 5 min read