Scrapy Spider Architect

Generates Scrapy spider classes with CSS/XPath selectors, item pipelines, and middleware configurations for structured web scraping. Includes Scrapy-Splash integration for JavaScript-rendered content.

Research & Scraping | Custom Agents | Security Reviewed
Tool match: scrapy | ⭐ 61.3k GitHub stars | BSD-3-Clause license
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill scrapy-spider-architect
Works best when you want a reusable capability, not another fragile one-off prompt.
At a glance
Author
scrapy
Last updated
Mar 24, 2026
Quick brief

The Scrapy Spider Architect skill generates production-ready Scrapy spider classes for structured web data extraction. It creates CrawlSpider and Spider subclasses with optimized CSS and XPath selectors, and it configures request callbacks, pagination handling, and link extraction rules.

How it works

What this skill actually does

The skill scaffolds complete Scrapy projects including items.py with Field definitions, pipelines.py for data cleaning and storage (MongoDB, PostgreSQL, Elasticsearch), and settings.py with tuned concurrency, download delays, and AutoThrottle configuration. It generates middleware for proxy rotation, user-agent randomization, and retry policies.
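
To make the settings, middleware, and pipeline pieces concrete, here is a small sketch of what they might look like. The setting names are real Scrapy settings, but the values are illustrative rather than tuned recommendations. The `myproject` module paths, the class names, the user-agent strings, and the `price` field are hypothetical. Storage backends and proxy rotation are left out for brevity.

```python
import random

# --- settings.py fragment (illustrative values, not recommendations) ---
CONCURRENT_REQUESTS = 16
DOWNLOAD_DELAY = 0.5                 # seconds between requests to the same site
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 30.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 2.0
RETRY_TIMES = 3
DOWNLOADER_MIDDLEWARES = {
    # Disable the built-in user-agent middleware in favour of the custom one.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "myproject.middlewares.RandomUserAgentMiddleware": 400,  # hypothetical path
}
ITEM_PIPELINES = {
    "myproject.pipelines.CleanPricePipeline": 300,  # hypothetical path
}

# --- middlewares.py: user-agent randomization (hypothetical class) ---
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
]


class RandomUserAgentMiddleware:
    def __init__(self, user_agents=USER_AGENTS, rng=random):
        self.user_agents = list(user_agents)
        self.rng = rng

    def process_request(self, request, spider):
        # Pick a user agent per request; returning None lets Scrapy
        # continue processing the request normally.
        request.headers["User-Agent"] = self.rng.choice(self.user_agents)
        return None


# --- pipelines.py: data cleaning (hypothetical class, dict-like items assumed) ---
class CleanPricePipeline:
    def process_item(self, item, spider):
        # Normalize strings such as " $1,299.50 " to a float; empty -> None.
        raw = str(item.get("price", "")).strip()
        item["price"] = float(raw.lstrip("$").replace(",", "")) if raw else None
        return item
```

Both classes are plain Python objects that Scrapy discovers through the `DOWNLOADER_MIDDLEWARES` and `ITEM_PIPELINES` settings, so they can be unit-tested without starting a crawl.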

Advanced features include Scrapy-Splash integration for JavaScript-rendered single-page applications, Scrapy-Playwright for headless browser automation, and ItemLoader configurations with input/output processors for field normalization. The skill handles authentication flows (form login, cookie management, OAuth tokens), generates feed exporters for JSON Lines, CSV, and XML formats, and creates Scrapy contracts for automated spider testing.