Implementing a Spring Boot Anti‑Crawler Filter with kk‑anti‑reptile

This article explains how to integrate the kk‑anti‑reptile anti‑crawler component into a Spring Boot application, covering system requirements, filter workflow, rule configuration, Maven setup, Redis and Apollo settings, as well as front‑end handling of the 509 response and captcha verification.

Programmer DD

System Requirements

Based on Spring Boot (both 1.x and 2.x are supported)

Requires Redis

Workflow

kk‑anti‑reptile registers a Filter that follows the Servlet specification. Spring Boot’s extension points instantiate the filter and wrap it in a FilterRegistrationBean, which registers it with the Servlet container.

Inside the filter, a chain‑of‑responsibility pattern applies the filtering rules one after another. When a request fails a rule, the filter returns HTTP status 509 and serves a captcha page; after the correct captcha is entered, the rule chain is reset.
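The rule chain described above can be sketched in plain Java. This is a minimal illustration, not the component's actual code: SimpleRequest, Rule, and RuleChain are hypothetical names, the request is reduced to an IP and a User‑Agent string, and the captcha page and chain reset are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical request model; the real filter works on the servlet request.
class SimpleRequest {
    final String ip;
    final String userAgent;

    SimpleRequest(String ip, String userAgent) {
        this.ip = ip;
        this.userAgent = userAgent;
    }
}

// Hypothetical rule contract: return true when the request should be blocked.
interface Rule {
    boolean blocks(SimpleRequest request);
}

// Walks the rules in order and stops at the first one that blocks.
class RuleChain {
    private final List<Rule> rules;

    RuleChain(List<Rule> rules) {
        this.rules = new ArrayList<>(rules);
    }

    // Returns the status the filter would send: 509 if any rule blocks, 200 otherwise.
    int statusFor(SimpleRequest request) {
        for (Rule rule : rules) {
            if (rule.blocks(request)) {
                return 509;
            }
        }
        return 200;
    }
}
```

Because each rule only answers "block or pass", new rules can be added to the list without touching the chain itself.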

IP Rule

The IP rule counts requests within a time window; if the count exceeds the configured maximum, the request is blocked. The window length, maximum request count, and IP whitelist are configurable.
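To make the windowed counting concrete, here is a small in‑memory sketch. It assumes a fixed window measured in milliseconds and keeps counters in a HashMap, whereas the component stores its state in Redis; IpWindowRule and its parameters are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// In-memory stand-in for the IP rule (the real component relies on Redis).
class IpWindowRule {
    private final long windowMillis;      // window length, assumed to be in milliseconds
    private final int maxRequests;        // maximum requests allowed per window
    private final Set<String> whitelist;  // IPs that are never blocked
    // Per IP: [0] = window start time, [1] = requests counted in that window.
    private final Map<String, long[]> counters = new HashMap<>();

    IpWindowRule(long windowMillis, int maxRequests, Set<String> whitelist) {
        this.windowMillis = windowMillis;
        this.maxRequests = maxRequests;
        this.whitelist = new HashSet<>(whitelist);
    }

    // Returns true when this request exceeds the limit for the current window.
    synchronized boolean blocks(String ip, long nowMillis) {
        if (whitelist.contains(ip)) {
            return false;
        }
        long[] state = counters.get(ip);
        if (state == null || nowMillis - state[0] >= windowMillis) {
            // First request from this IP, or the previous window has expired.
            state = new long[] {nowMillis, 0};
            counters.put(ip, state);
        }
        state[1]++;
        return state[1] > maxRequests;
    }
}
```

Passing the current time as an argument keeps the sketch easy to test; a production rule would read the clock itself and would need shared storage so that all application instances see the same counters.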

User‑Agent Rule

The UA rule inspects the User‑Agent header to extract OS, device, and browser information, allowing configurable filtering based on these dimensions.
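As a rough illustration of filtering by client category, the sketch below classifies a User‑Agent string with simple substring checks. This is an assumption made for brevity: the component may parse the header quite differently, and ClientKind and UserAgentRule are hypothetical names.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical client categories, loosely mirroring the configurable dimensions.
enum ClientKind { MOBILE, LINUX, PC, UNKNOWN }

class UserAgentRule {
    private final Set<ClientKind> allowed;

    UserAgentRule(Set<ClientKind> allowed) {
        this.allowed = new HashSet<>(allowed);
    }

    // Naive substring-based classification, for illustration only.
    static ClientKind classify(String userAgent) {
        if (userAgent == null) {
            return ClientKind.UNKNOWN;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        // Check mobile first: Android User-Agents also contain "Linux".
        if (ua.contains("android") || ua.contains("iphone") || ua.contains("mobile")) {
            return ClientKind.MOBILE;
        }
        if (ua.contains("linux")) {
            return ClientKind.LINUX;
        }
        if (ua.contains("windows") || ua.contains("macintosh")) {
            return ClientKind.PC;
        }
        return ClientKind.UNKNOWN;
    }

    // Returns true when the client's category is not in the allowed set.
    boolean blocks(String userAgent) {
        return !allowed.contains(classify(userAgent));
    }
}
```

A rule built with only PC and MOBILE in the allowed set would block a desktop Linux client and a command‑line tool such as curl, while letting common desktop and phone browsers through.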

Captcha Handling

When a request is blocked, the component generates a captcha (Chinese characters, alphanumeric characters, or simple arithmetic) rendered as either a static image or an animated GIF, which raises the cost of automated solving.
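The arithmetic variant is the easiest to picture. The following sketch generates a question and checks an answer; it is an assumption about how such a captcha could work, not the component's implementation, and it skips the image or GIF rendering entirely. ArithmeticCaptcha is a hypothetical name.

```java
import java.util.Random;

// Text-only arithmetic captcha; rendering to an image is out of scope here.
class ArithmeticCaptcha {
    final String question;
    final int answer;

    private ArithmeticCaptcha(String question, int answer) {
        this.question = question;
        this.answer = answer;
    }

    // Builds an addition or subtraction question with operands in 1..10.
    static ArithmeticCaptcha generate(Random random) {
        int a = random.nextInt(10) + 1;
        int b = random.nextInt(10) + 1;
        boolean add = random.nextBoolean();
        if (!add && a < b) {
            // Swap so that a subtraction never yields a negative result.
            int tmp = a;
            a = b;
            b = tmp;
        }
        int result = add ? a + b : a - b;
        String text = a + (add ? " + " : " - ") + b + " = ?";
        return new ArithmeticCaptcha(text, result);
    }

    // Returns true when the user's input parses to the expected answer.
    boolean matches(String userInput) {
        if (userInput == null) {
            return false;
        }
        try {
            return Integer.parseInt(userInput.trim()) == answer;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A real deployment would use java.security.SecureRandom rather than a seeded Random and would store the expected answer on the server side, never in the page sent to the client.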

Integration Guide

Add the Maven dependency:

<dependency>
    <groupId>cn.keking.project</groupId>
    <artifactId>kk-anti-reptile</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>

Enable the component in application.properties (or bootstrap.properties when using Apollo):

anti.reptile.manager.enabled=true

Front‑end code must intercept HTTP 509 responses, open a new window with the captcha HTML, and inject the backend baseUrl parameter. Example using Axios:

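import axios from 'axios';
import {baseUrl} from './config';

axios.interceptors.response.use(
  response => response,
  error => {
    // error.response is undefined for network errors and timeouts
    if (error.response && error.response.status === 509) {
      // The 509 response body is the captcha page HTML from the backend
      const html = error.response.data;
      const verifyWindow = window.open('', '_blank', 'height=400,width=560');
      // window.open returns null when the browser blocks the popup
      if (verifyWindow) {
        verifyWindow.document.write(html);
        verifyWindow.document.getElementById('baseUrl').value = baseUrl;
      }
    }
    return Promise.reject(error);
  }
);

export default axios;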

Important Notes

When using Apollo, the Apollo client must have bootstrap enabled; add apollo.bootstrap.enabled=true to the configuration.

A Redisson connection is required. If the application already uses Redisson, the component auto‑detects the existing client; otherwise, provide Redis connection settings, for example:

spring.redisson.address=redis://192.168.1.204:6379
spring.redisson.password=xxx

Configuration Overview

All settings are prefixed with anti.reptile.manager. Key options include:

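enabled: enable or disable the anti‑crawler plugin (default true)

include-urls: comma‑separated list of URLs to protect

ip-rule.enabled, ip-rule.expiration-time, ip-rule.request-max-size, ip-rule.ignore-ip: switch the IP rule on or off and set its time window length, maximum request count per window, and IP whitelist

ua-rule.enabled, ua-rule.allowed-linux, ua-rule.allowed-mobile, ua-rule.allowed-pc, ua-rule.allowed-iot, ua-rule.allowed-proxy: switch the UA rule on or off and choose which client categories (Linux, mobile, PC, IoT, proxy) are allowed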

These options allow fine‑grained control over which clients are permitted and how request rates are limited.
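To show how the prefixed keys fit together, the sketch below reads two of them from properties text using only the JDK. In a Spring Boot application the binding is done by the framework, so this is purely illustrative; AntiReptileSettings is a hypothetical class, and the URLs in the usage note are made‑up values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical holder for two of the settings, read with plain java.util.Properties.
class AntiReptileSettings {
    static final String PREFIX = "anti.reptile.manager.";

    final boolean enabled;
    final List<String> includeUrls;

    AntiReptileSettings(Properties props) {
        // "enabled" defaults to true, as stated in the option list.
        enabled = Boolean.parseBoolean(props.getProperty(PREFIX + "enabled", "true"));
        includeUrls = splitCsv(props.getProperty(PREFIX + "include-urls", ""));
    }

    // Parses properties-file text, e.g. the contents of application.properties.
    static AntiReptileSettings fromText(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new AntiReptileSettings(props);
    }

    // Splits a comma-separated value, trimming blanks and dropping empty entries.
    private static List<String> splitCsv(String value) {
        List<String> out = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }
}
```

For example, AntiReptileSettings.fromText("anti.reptile.manager.include-urls=/client/list,/user/list") yields a settings object that is enabled and protects the two listed paths.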
