Information Security · 19 min read

API Anti‑Crawling and Security Architecture: Risk Detection, Strategy, and Effectiveness at Bilibili

This article details Bilibili's comprehensive anti‑crawling system, covering the background of API abuse, the data‑flow framework, risk perception, strategy iteration, verification mechanisms, gateway signing design, and the measurable impact on normal and special‑case interfaces.

High Availability Architecture

1. Background of API anti‑crawling – API abuse threatens platform resources, user privacy, and business operations. Bilibili identifies vulnerable interfaces such as video info, user info, comments, live‑stream messages, and activity data.

2. Anti‑crawling data‑flow framework – Traffic first passes through the API gateway (APIGW), where a signature‑verification component blocks obviously malicious calls; the remaining traffic flows to the GAIA risk engine for feature‑based anomaly detection, which triggers front‑end verification (CAPTCHA, login prompts) when needed.
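The gateway‑then‑risk‑engine pipeline above can be sketched as a simple decision chain. This is a minimal illustration, not Bilibili's implementation: `verify_signature`, `gaia_risk_score`, and the 0.8 threshold are placeholder assumptions standing in for the real APIGW component and GAIA engine.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    path: str
    params: dict = field(default_factory=dict)
    signature: str = ""
    device_id: str = ""

# Hypothetical stand-ins for the gateway signature check and the GAIA
# risk engine; names, logic, and thresholds are illustrative only.
def verify_signature(req: Request) -> bool:
    return req.signature != ""            # placeholder: real check validates the signed payload

def gaia_risk_score(req: Request) -> float:
    return 0.9 if req.device_id == "" else 0.1   # placeholder feature-based scoring

def handle(req: Request) -> str:
    if not verify_signature(req):
        return "block"                    # obvious abuse is stopped at the APIGW layer
    if gaia_risk_score(req) > 0.8:
        return "verify"                   # push a CAPTCHA or login prompt to the front end
    return "pass"                         # forward to the business service
```

The key design point is the ordering: cheap signature validation filters the bulk of malicious traffic before the more expensive risk scoring runs.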

2.1 Data integration with risk engine – Initially, each service reported data individually, which was labor‑intensive. Integrating risk reporting into APIGW enabled unified, code‑free onboarding, raising efficiency from 2‑5 interfaces per week to over 10 per day.

2.2 Risk perception and strategy iteration – Near‑real‑time monitoring detects traffic spikes and feature anomalies (e.g., abnormal UA or device ratios). Strategies include frequency limits, abnormal aggregation, and parameter‑value checks, with reusable rule groups applied automatically to new interfaces.
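The frequency‑limit strategy mentioned above can be illustrated with a sliding‑window counter keyed by device or user. This is a generic sketch of the technique, not the platform's actual rule engine; the class name and limits are assumptions.

```python
from collections import defaultdict, deque
import time

class FrequencyLimiter:
    """Sliding-window rate limit per key (e.g. device ID). Illustrative only."""

    def __init__(self, max_calls: int, window_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = defaultdict(deque)   # key -> timestamps of recent calls

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.calls[key]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] > self.window_s:
            q.popleft()
        if len(q) >= self.max_calls:
            return False                  # over the limit: flag as abnormal
        q.append(now)
        return True
```

A reusable rule group, as described in the article, would bundle several such checks (frequency, parameter‑value validation, aggregation anomalies) so new interfaces inherit them without custom code.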

2.3 Abnormal traffic handling – Multiple mitigation methods (toast rejection, data poisoning, various CAPTCHA types, SMS, login dialogs) are deployed based on risk level, achieving ~99% coverage.
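Risk‑tiered mitigation like this is naturally expressed as a dispatch table. The mapping below is a hypothetical example of the pattern; the article names the mitigation types but not which risk level each is bound to.

```python
from enum import IntEnum

class RiskLevel(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

# Illustrative level-to-mitigation mapping; the article lists toast
# rejection, data poisoning, CAPTCHAs, SMS, and login dialogs as the
# available actions, but the exact assignment here is an assumption.
MITIGATIONS = {
    RiskLevel.LOW: "pass",
    RiskLevel.MEDIUM: "captcha",
    RiskLevel.HIGH: "sms_verify",
    RiskLevel.CRITICAL: "toast_reject",
}

def mitigate(level: RiskLevel) -> str:
    return MITIGATIONS[level]
```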

2.4 Gateway signature component – A mixed‑key signing scheme encrypts request parameters; the gateway validates signatures, reports to risk engine, and can block suspicious calls. The architecture comprises a signing SDK, web gateway, business gateway, risk platform, and front‑end.

2.4.1 Signing process – Key generation, distribution, obfuscation, signature construction, and gateway verification occur in five steps, ensuring only legitimate requests pass.
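A mixed‑key signing flow of this shape can be sketched with HMAC. Bilibili's actual key generation and obfuscation are not public, so the scheme below is an assumption: an app secret and a gateway‑issued session key are combined into one HMAC key, and the gateway recomputes the signature over canonically ordered parameters.

```python
import hashlib
import hmac
from urllib.parse import urlencode

def build_signature(params: dict, app_secret: bytes, session_key: bytes) -> str:
    """Client side: sign the request parameters (illustrative scheme)."""
    canonical = urlencode(sorted(params.items()))        # deterministic param ordering
    mixed_key = hashlib.sha256(app_secret + session_key).digest()  # "mixed" key derivation
    return hmac.new(mixed_key, canonical.encode(), hashlib.sha256).hexdigest()

def gateway_verify(params: dict, signature: str,
                   app_secret: bytes, session_key: bytes) -> bool:
    """Gateway side: recompute and compare in constant time."""
    expected = build_signature(params, app_secret, session_key)
    return hmac.compare_digest(expected, signature)
```

Sorting parameters before signing means client and gateway agree on the byte string regardless of transmission order; any tampering with a parameter value invalidates the signature.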

3. Effectiveness of anti‑crawling – Quantitative metrics show billions of abnormal requests blocked daily, recall rates above 85%, and no service outages due to crawlers in Q3 2023. Special interfaces (live‑stream connections, gold‑seed exchanges, follow actions) also saw significant reductions in malicious activity.

4. Summary and future outlook – The project delivered fast onboarding, timely risk perception, layered mitigation, and reproducible results. Future work includes lightweight engine deployment, advanced crawler behavior modeling, and AI‑driven risk identification.

Tags: gateway, API security, risk mitigation, Bilibili, anti-crawling, risk detection, verification
Written by High Availability Architecture, the official account for High Availability Architecture.