Automating Arithmetic Captcha Solving with Python, Requests, pytesseract, and Selenium
This guide explains how to programmatically download arithmetic captcha images, use OCR to extract and compute the expression, and automatically click the correct image on a website by combining Python requests, pytesseract, and Selenium for web automation.
The article describes a method to bypass arithmetic captchas used by a telecom operator by programmatically downloading captcha images, recognizing the expression, calculating the result, and automatically clicking the matching image.
Step 1 – Download captcha images: The requests library fetches several captcha images and saves them locally. Example code:
import os
import requests

def download_captcha_images(base_url, num_images, save_path):
    # Create the target directory first so open() does not fail on a missing folder
    os.makedirs(save_path, exist_ok=True)
    for i in range(1, num_images + 1):
        captcha_url = f"{base_url}/{i}.png"
        file_path = os.path.join(save_path, f"captcha{i}.png")
        try:
            # A timeout keeps an unresponsive server from blocking the loop
            response = requests.get(captcha_url, stream=True, timeout=10)
            if response.status_code == 200:
                with open(file_path, 'wb') as f:
                    for chunk in response.iter_content(1024):
                        f.write(chunk)
                print(f"Captcha image {i} saved successfully!")
            else:
                print(f"Could not fetch captcha image {i}, status code: {response.status_code}")
        except requests.exceptions.RequestException as e:
            print(f"Request for captcha image {i} raised an exception: {e}")

# Example: fetch several captcha images and save them locally
base_url = "https://example.com/captcha"  # replace with the real base URL of the captcha images
num_images = 5  # assume 5 captcha images are needed
save_path = "captcha_images"  # save directory; adjust as needed
download_captcha_images(base_url, num_images, save_path)

Step 2 – Recognize and compute the arithmetic expression: The pytesseract OCR library extracts the text from a captcha image, the two numbers are parsed, and their sum is calculated. Example code:
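import re
from PIL import Image
import pytesseract

def recognize_captcha(image_path):
    try:
        text = pytesseract.image_to_string(Image.open(image_path))
        # Assume the captcha has the form "x + y = ?".
        # A plain text.split('+') would leave "= ?" attached to the second
        # operand and make int() fail, so pick out the two numbers with a regex.
        match = re.search(r"(\d+)\s*\+\s*(\d+)", text)
        if match is None:
            print("No 'x + y' expression found in OCR output:", repr(text))
            return None
        x, y = map(int, match.groups())
        return x + y
    except Exception as e:
        print("An exception occurred:", e)
        return None

# Example usage
captcha_image = "captcha.png"  # replace with the real captcha image path
result = recognize_captcha(captcha_image)
if result is not None:
    print(f"The captcha expression evaluates to: {result}")
else:
    print("Could not recognize the captcha; check that the image is valid.")

Step 3 – Automate verification and clicking with Selenium: Selenium drives a browser to load the target page, captures each captcha image, runs the OCR routine, compares the computed result with a condition, and performs a click when the condition is met. Example code: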
import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from PIL import Image
import pytesseract

def recognize_captcha(image_path):
    try:
        text = pytesseract.image_to_string(Image.open(image_path))
        # Extract the two operands of "x + y = ?" and ignore the trailing "= ?"
        match = re.search(r"(\d+)\s*\+\s*(\d+)", text)
        if match is None:
            return None
        x, y = map(int, match.groups())
        return x + y
    except Exception as e:
        print("An exception occurred:", e)
        return None

def main():
    url = "https://example.com"  # replace with the real site URL
    driver = webdriver.Chrome()  # requires a matching ChromeDriver version
    try:
        driver.get(url)
        for i in range(1, 6):  # assume there are 5 captcha images
            # The find_element_by_* helpers were removed in current Selenium 4
            # releases; find_element(By.XPATH, ...) is the supported form.
            captcha_element = driver.find_element(By.XPATH, '//*[@id="captcha_image"]')
            captcha_image = captcha_element.screenshot_as_png
            captcha_image_path = f"captcha{i}.png"
            with open(captcha_image_path, "wb") as f:
                f.write(captcha_image)
            captcha_result = recognize_captcha(captcha_image_path)
            if captcha_result is not None:
                print(f"Captcha {i} evaluates to: {captcha_result}")
                if captcha_result == 10:  # example condition; replace with the real target value
                    submit_button = driver.find_element(By.XPATH, '//*[@id="submit_button"]')
                    submit_button.click()
                    print(f"Captcha {i} meets the condition; click performed.")
            else:
                print(f"Captcha {i} could not be recognized; check that the image is valid.")
    finally:
        # Close the browser even if a lookup or click raises
        driver.quit()

if __name__ == "__main__":
    main()

Additional notes remind readers to handle possible anti-scraping measures, install required drivers, respect website terms of service, and consider more advanced image-processing or deep-learning techniques for complex captchas.
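The examples above only handle addition and expect fairly clean OCR text. As a small step toward more robust recognition, the parsing can be made more tolerant before reaching for heavier image-processing techniques. The sketch below is an illustration, not part of the original method: the function name `evaluate_captcha_text`, the set of operator symbols, and the choice of integer division are all assumptions, and it uses only the standard library so it can be tried without Tesseract installed.

```python
import operator
import re

# Operator symbols that OCR might return, mapped to integer operations.
# This set is an assumption; extend it to match the captchas actually seen.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "x": operator.mul,
    "X": operator.mul,
    "×": operator.mul,
    "/": operator.floordiv,
    "÷": operator.floordiv,
}

# Two integers separated by one operator symbol. Anything after the second
# integer (for example "= ?") is ignored.
EXPRESSION = re.compile(r"(\d+)\s*([+\-*xX×/÷])\s*(\d+)")


def evaluate_captcha_text(text):
    """Return the value of the first 'a op b' expression in text, or None."""
    match = EXPRESSION.search(text)
    if match is None:
        return None
    left, symbol, right = match.groups()
    if symbol in "/÷" and int(right) == 0:
        return None  # a misread digit must not raise ZeroDivisionError
    return OPERATORS[symbol](int(left), int(right))


if __name__ == "__main__":
    for sample in ["3 + 7 = ?", "12-5=?", "4 x 6 =", "no digits here"]:
        print(repr(sample), "->", evaluate_captcha_text(sample))
```

Passing the string returned by `pytesseract.image_to_string` to this function would replace the addition-only parsing inside `recognize_captcha`. Returning `None` on any unparseable input keeps the calling code's existing "could not recognize" branch unchanged.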
Test Development Learning Exchange