Automating Arithmetic Captcha Solving with Python, Requests, pytesseract, and Selenium

This guide explains how to programmatically download arithmetic captcha images, use OCR to extract and compute the expression, and automatically click the correct image on a website by combining Python requests, pytesseract, and Selenium for web automation.

Test Development Learning Exchange

Jul 24, 2023

The article describes a method to bypass arithmetic captchas used by a telecom operator by programmatically downloading captcha images, recognizing the expression, calculating the result, and automatically clicking the matching image.

Step 1 – Download captcha images: The requests library fetches several captcha images and saves them to a local directory. Example code:

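import os
import requests

def download_captcha_images(base_url, num_images, save_path):
    """Download num_images captcha images from base_url and save them under save_path."""
    os.makedirs(save_path, exist_ok=True)  # Create the target directory if it is missing
    for i in range(1, num_images + 1):
        captcha_url = f"{base_url}/{i}.png"
        file_path = os.path.join(save_path, f"captcha{i}.png")
        try:
            # A timeout keeps the script from hanging on an unresponsive server
            response = requests.get(captcha_url, stream=True, timeout=10)
            if response.status_code == 200:
                with open(file_path, 'wb') as f:
                    for chunk in response.iter_content(1024):
                        f.write(chunk)
                print(f"Captcha image {i} saved successfully.")
            else:
                print(f"Could not fetch captcha image {i}; status code: {response.status_code}")
        except requests.exceptions.RequestException as e:
            print(f"Request for captcha image {i} raised an exception: {e}")

# Example: fetch several captcha images and save them locally
base_url = "https://example.com/captcha"  # Replace with the real base URL of the captcha images
num_images = 5  # Assume five captcha images are needed
save_path = "captcha_images"  # Output directory; adjust as needed

download_captcha_images(base_url, num_images, save_path)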

Step 2 – Recognize and compute the arithmetic expression: The pytesseract OCR library extracts the text from a captcha image, the two operands are parsed out of that text, and their sum is calculated. Example code:

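import re

from PIL import Image
import pytesseract

def recognize_captcha(image_path):
    try:
        text = pytesseract.image_to_string(Image.open(image_path))
        # Assumes the captcha has the form "x + y = ?"; the regex ignores the
        # trailing "= ?" and any whitespace that OCR adds around the operands
        match = re.search(r"(\d+)\s*\+\s*(\d+)", text)
        if match is None:
            raise ValueError(f"no 'x + y' expression found in OCR text: {text!r}")
        x, y = map(int, match.groups())
        return x + y
    except Exception as e:
        print("An exception occurred:", e)
        return None

# Example usage
captcha_image = "captcha.png"  # Replace with the real captcha image path
result = recognize_captcha(captcha_image)
if result is not None:
    print(f"The captcha expression evaluates to: {result}")
else:
    print("Could not recognize the captcha; check that the image is valid.")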
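The parsing step above only handles addition. If a captcha also uses subtraction or multiplication (an assumption; the source only shows the "x + y = ?" form), the text returned by OCR could be evaluated with a small operator table. This is a minimal sketch: `evaluate_expression`, `_OPERATORS`, and `_EXPRESSION` are hypothetical names, and the set of accepted operator characters is a guess at what OCR might emit.

```python
import re

# Hypothetical operator table; the source article only demonstrates addition.
_OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "x": lambda a, b: a * b,  # OCR often reads a multiplication sign as the letter x
    "×": lambda a, b: a * b,
}

# Two integers separated by one operator character, with optional whitespace.
_EXPRESSION = re.compile(r"(\d+)\s*([+\-*x×])\s*(\d+)", re.IGNORECASE)

def evaluate_expression(text):
    """Return the value of the first 'a op b' expression in text, or None."""
    match = _EXPRESSION.search(text)
    if match is None:
        return None
    left, op, right = match.groups()
    return _OPERATORS[op.lower()](int(left), int(right))

print(evaluate_expression("3 + 4 = ?"))  # 7
```

Returning `None` instead of raising keeps the caller's "could not recognize" branch unchanged. Division is left out on purpose, since the expected rounding behavior would be site-specific.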

Step 3 – Automate verification and clicking with Selenium: Selenium drives a browser to load the target page, takes a screenshot of the captcha element, runs the OCR routine on it, compares the computed result with a target value (10 in the example), and clicks the submit button when they match. Example code:

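import re

from selenium import webdriver
from selenium.webdriver.common.by import By
from PIL import Image
import pytesseract

def recognize_captcha(image_path):
    try:
        text = pytesseract.image_to_string(Image.open(image_path))
        # Assumes the captcha has the form "x + y = ?"
        match = re.search(r"(\d+)\s*\+\s*(\d+)", text)
        if match is None:
            raise ValueError(f"no 'x + y' expression found in OCR text: {text!r}")
        x, y = map(int, match.groups())
        return x + y
    except Exception as e:
        print("An exception occurred:", e)
        return None

def main():
    url = "https://example.com"  # Replace with the real site URL
    driver = webdriver.Chrome()  # Requires a ChromeDriver matching the installed Chrome version
    try:
        driver.get(url)
        for i in range(1, 6):  # Assume there are 5 captcha images
            # find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH
            captcha_element = driver.find_element(By.XPATH, '//*[@id="captcha_image"]')
            captcha_image = captcha_element.screenshot_as_png
            captcha_image_path = f"captcha{i}.png"
            with open(captcha_image_path, "wb") as f:
                f.write(captcha_image)
            captcha_result = recognize_captcha(captcha_image_path)
            if captcha_result is not None:
                print(f"Captcha {i} evaluates to: {captcha_result}")
                if captcha_result == 10:
                    submit_button = driver.find_element(By.XPATH, '//*[@id="submit_button"]')
                    submit_button.click()
                    print(f"Captcha {i} meets the condition; click performed.")
            else:
                print(f"Captcha {i} could not be recognized; check that the image is valid.")
    finally:
        driver.quit()  # Always close the browser, even if a step fails

if __name__ == "__main__":
    main()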
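The loop above hard-codes the target value 10 inside the browser logic. One way to keep the decision separate from the Selenium calls is a small pure function that takes the recognized results and the target and reports which image matches. This is only a sketch; `pick_matching_captcha` and its parameters are hypothetical names that do not appear in the original article.

```python
def pick_matching_captcha(results, target):
    """Return the 1-based index of the first result equal to target, or None.

    results holds one entry per captcha image, in page order. Entries that
    could not be recognized are None and are skipped.
    """
    for index, value in enumerate(results, start=1):
        if value is not None and value == target:
            return index
    return None

# Example: the third image is the first one whose expression evaluates to 10.
print(pick_matching_captcha([7, None, 10, 10], 10))  # 3
```

Because the function does not touch the browser, it can be checked without launching Chrome, and the index it returns can then be used to build the locator for the element to click.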

Additional notes: expect the target site to apply anti-scraping measures, install the browser driver that matches your browser version, respect the website's terms of service, and consider more advanced image processing or deep-learning techniques for captchas that plain OCR cannot read.
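As one illustration of the image-processing idea mentioned above, a common first step before OCR is binarization: every grayscale pixel becomes pure black or pure white so that faint background noise disappears. The sketch below works on a plain nested list of grayscale values so that it runs without extra packages; `binarize` and the default threshold of 128 are assumptions for illustration, not values from the original article.

```python
def binarize(pixels, threshold=128):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white).

    pixels is a list of rows, each row a list of integers. Values at or
    above threshold become white; everything else becomes black.
    """
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]

# Example: a 2x2 image with two dark and two light pixels.
print(binarize([[12, 200], [130, 90]]))  # [[0, 255], [255, 0]]
```

With Pillow, which the article already uses, a comparable result is usually obtained by calling `Image.convert("L")` to get a grayscale image and then `Image.point` with a thresholding function before passing the image to pytesseract. The right threshold depends on the captcha's colors and would need to be tuned per site.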
