Solving Image Captchas in Selenium Automation with Python and OCR

This tutorial demonstrates how to use Python's urllib to download captcha images, apply pytesseract OCR for text extraction, and integrate the result into Selenium scripts to automate the entry of image captchas during web testing.

Test Development Learning Exchange

Jan 4, 2024

Selenium makes browser test automation convenient, but image captchas are designed to require a human and therefore interrupt otherwise unattended test runs. This article shows how to work around that limitation with Python scripts that download the captcha image, recognize its text with OCR, and type the result into the page automatically.

First, use Python's built‑in urllib.request module to fetch the captcha image and save it locally:

import urllib.request


def download_image(url):
    image_name = "./captcha.png"  # define local path and filename
    urllib.request.urlretrieve(url, image_name)  # download the captcha image
    return image_name
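Note that urlretrieve takes no timeout argument, so a stalled server can hang the test. The following is a minimal sketch of a more defensive variant, assuming a 10-second timeout and a non-empty response body are reasonable checks for your environment. The `image_path` and `timeout` parameters are additions for illustration and are not part of the original function.

```python
import urllib.request
from pathlib import Path


def download_image(url, image_path="./captcha.png", timeout=10):
    """Fetch the captcha image at `url` and save it to `image_path`."""
    # urlopen accepts a timeout in seconds, unlike urlretrieve
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not data:
        # An empty body cannot be a valid image, so fail early
        raise ValueError(f"empty response from {url}")
    Path(image_path).write_bytes(data)
    return image_path
```

It is called the same way as the simpler version, for example `download_image("http://www.example.com/getCaptcha")`.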

After obtaining the image, use the pytesseract library, a Python wrapper around the Tesseract OCR engine, to extract the characters. The OCR code below also needs Pillow to load and preprocess the image. Install the dependencies:

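# Install pytesseract and Pillow (the leading "!" is Jupyter notebook syntax; omit it in a regular shell)
!pip install pytesseract pillow
# Install the Tesseract OCR engine on Debian/Ubuntu (Windows users must download the installer manually)
!apt-get install -y tesseract-ocr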
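Because pytesseract only wraps the Tesseract executable, it helps to confirm that the executable is reachable before running any OCR. Here is a small sketch using only the standard library; `find_tesseract` is a hypothetical helper name, and the check assumes the binary is called `tesseract` and is expected on PATH.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract not found on PATH; install it before calling pytesseract")
    else:
        print(f"Tesseract found at {path}")
```

On Windows, where the installer may not add Tesseract to PATH, pytesseract can instead be pointed at the executable by setting `pytesseract.pytesseract.tesseract_cmd` to its full path.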

Then apply OCR to the saved image:

import pytesseract
from PIL import Image


def handle_image(image_path):
    image = Image.open(image_path)  # open the captcha image
    image = image.convert('L')      # convert to grayscale
    threshold = 127                 # binarization threshold (0-255)
    # binarize: pixels darker than the threshold become black, the rest white
    image = image.point(lambda x: 0 if x < threshold else 255)
    result = pytesseract.image_to_string(image)  # extract text
    return result.strip()  # drop the surrounding whitespace OCR output usually carries
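Raw OCR output often contains stray spaces, punctuation, or line breaks, and a trailing newline typed into a form field can even submit it prematurely. The sketch below filters the text before it is used. It assumes the captcha consists only of ASCII letters and digits; `clean_captcha_text` and its `expected_length` parameter are hypothetical names introduced for illustration.

```python
import re


def clean_captcha_text(raw_text, expected_length=None):
    """Keep only ASCII letters and digits; return None if the length looks wrong."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw_text)
    if expected_length is not None and len(cleaned) != expected_length:
        return None  # signal the caller to refresh the captcha and retry
    return cleaned
```

For a site whose captchas are always four characters long, `clean_captcha_text(handle_image(path), expected_length=4)` returns either a four-character string or `None`, which the test can treat as a cue to request a new captcha.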

Finally, feed the recognized text back into the web page using Selenium:

from selenium import webdriver
from selenium.webdriver.common.by import By

url = "http://www.example.com"  # test page URL
driver = webdriver.Chrome()     # launch Chrome
driver.get(url)                 # open the page
image_url = "http://www.example.com/getCaptcha"  # captcha image URL
# Download and process the captcha
captcha = handle_image(download_image(image_url))
# Input the captcha via Selenium (find_element_by_xpath was removed in Selenium 4.3)
captcha_input = driver.find_element(By.XPATH, "//input[@id='captcha']")
captcha_input.send_keys(captcha)
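One caveat with this flow: urllib requests the captcha outside the browser session, so if the site ties the captcha to a session cookie, the downloaded image may not match the one the page expects. Here is a sketch of one possible workaround, assuming cookie-based sessions. It copies the browser's cookies into the download request, relying on Selenium's `driver.get_cookies()`, which returns a list of dicts with `name` and `value` keys. `build_cookie_header` and `download_with_cookies` are hypothetical helpers, not part of the original article.

```python
import urllib.request


def build_cookie_header(cookies):
    """Turn Selenium-style cookie dicts into a Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def download_with_cookies(url, cookies, image_path="./captcha.png", timeout=10):
    """Download `url` while sending the given cookies, then save the body."""
    request = urllib.request.Request(url)
    if cookies:
        request.add_header("Cookie", build_cookie_header(cookies))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
    with open(image_path, "wb") as image_file:
        image_file.write(data)
    return image_path
```

In the script above this would be called as `download_with_cookies(image_url, driver.get_cookies())`. If the server issues a fresh captcha on every request to the image URL, the result still will not match; in that case capturing the displayed image element directly, for example with Selenium's `WebElement.screenshot(filename)`, may be a better fit.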

Taken together, these three steps offer a practical way to automate image captcha handling in Selenium tests. Recognition accuracy depends on the captcha's style, so different sites may require customized preprocessing or recognition strategies.
