
Running Scrapy Spiders via Command Line, CrawlerProcess, and CrawlerRunner

This guide explains how to execute Scrapy spiders from the command line and from within Python scripts using CrawlerProcess or CrawlerRunner, and how to manage multiple spiders in one project efficiently. It covers configuration steps, execution methods, and practical observations about middleware behavior.


1. Running a Spider from the Command Line

Create a spider file (e.g., baidu.py) and run it from the command line using either of the two approaches shown in the original screenshots.
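The screenshots are not reproduced here, but the two standard command-line invocations for a spider named `baidu` defined in `baidu.py` would look like this (assuming you are inside a Scrapy project for the first form):

```shell
# Inside a Scrapy project, run the spider by its name:
scrapy crawl baidu

# Or run the spider file directly, without a project:
scrapy runspider baidu.py
```

`scrapy crawl` requires the project's settings, while `scrapy runspider` works on a standalone file.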

2. Running a Spider Inside a Python File

Three methods are demonstrated:

• cmdline.execute – the simplest way to launch a single spider.

• CrawlerProcess – allows running a spider programmatically.

• CrawlerRunner – provides more control over the crawling process.

3. Running Multiple Spiders in One Project

Attempting to run multiple spiders with cmdline.execute fails because cmdline.execute calls sys.exit() once the first crawl finishes, so the process exits before any subsequent spider can start.

Two better alternatives are presented:

• Using CrawlerProcess to start several spiders concurrently, though middleware is initialized only once and requests are sent almost simultaneously, which may cause interference.

• Using CrawlerRunner to run spiders sequentially; middleware is still loaded once, but the sequential execution reduces interference and is recommended by the official Scrapy documentation.

Conclusion

The cmdline.execute method offers the simplest configuration for running a single spider repeatedly, while CrawlerProcess and CrawlerRunner provide more flexible solutions for handling multiple spiders with considerations for middleware behavior.

Python · backend development · Scrapy · web crawling · CrawlerProcess · CrawlerRunner
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
