
Puppeteer Web Scraper

Headless Chrome scraping via Puppeteer with automatic cookie handling, JavaScript rendering, and Cheerio-based DOM extraction. Handles infinite scroll and lazy-loaded content.

Research & Scraping Cursor Security Reviewed
Tool match: puppeteer · ⭐ 94.1k GitHub stars · ⬇ 40.2M/wk npm · Apache-2.0 license
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill puppeteer-web-scraper
Works best when you want a reusable capability, not another fragile one-off prompt.
At a glance
Author: Puppeteer
Last updated: Mar 24, 2026
Quick brief

This skill uses Puppeteer to launch headless Chromium instances for scraping JavaScript-heavy websites whose content plain HTTP clients cannot render. It manages browser contexts, dismisses cookie consent dialogs, and waits for dynamic content to finish rendering before extraction.

How it works

What this skill actually does

Once the page has loaded, the extraction pipeline queries the rendered DOM with Cheerio and supports CSS selectors and XPath expressions. Built-in strategies handle infinite-scroll pages by watching for DOM mutations and waiting for the network to go idle.
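
The infinite-scroll strategy can be pictured with the TypeScript sketch below. It is not the skill's actual code: it polls the document height instead of observing DOM mutations, and `ScrollDriver`, `hasStabilized`, and `scrollUntilStable` are hypothetical names introduced for illustration. With Puppeteer, the driver methods would typically wrap `page.evaluate` calls and a network-idle wait such as `page.waitForNetworkIdle()`.

```typescript
// Hypothetical seam between the scroll loop and the browser.
// With Puppeteer these would usually wrap:
//   getHeight      -> page.evaluate(() => document.body.scrollHeight)
//   scrollToBottom -> page.evaluate(() => window.scrollTo(0, document.body.scrollHeight))
//   settle         -> page.waitForNetworkIdle()
interface ScrollDriver {
  getHeight(): Promise<number>;
  scrollToBottom(): Promise<void>;
  settle(): Promise<void>;
}

// True once the last `stableRounds + 1` height samples are identical,
// i.e. the page did not grow during the last `stableRounds` scrolls.
function hasStabilized(heights: number[], stableRounds: number): boolean {
  if (heights.length < stableRounds + 1) return false;
  const tail = heights.slice(-(stableRounds + 1));
  return tail.every((h) => h === tail[0]);
}

// Scrolls until the height stops changing or `maxScrolls` is reached.
// Returns the number of scrolls performed.
async function scrollUntilStable(
  driver: ScrollDriver,
  stableRounds: number = 2,
  maxScrolls: number = 50,
): Promise<number> {
  const heights: number[] = [await driver.getHeight()];
  let scrolls = 0;
  while (scrolls < maxScrolls && !hasStabilized(heights, stableRounds)) {
    await driver.scrollToBottom();
    await driver.settle();
    heights.push(await driver.getHeight());
    scrolls += 1;
  }
  return scrolls;
}
```

The `maxScrolls` cap matters in practice: some feeds never stop growing, so an unbounded loop would never hand the page over to extraction.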

Features include proxy rotation via a configurable proxy pool, user-agent randomization from a curated list of real browser strings, and viewport emulation for responsive sites. The skill captures screenshots for debugging and exports data as JSON-LD or CSV, or streams it into a PostgreSQL database via pg-copy-streams for high-throughput ingestion.
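
As a rough illustration of proxy rotation and user-agent randomization, the TypeScript sketch below picks a proxy round-robin and a user agent at random for each session. It shows one plausible approach, not the skill's own implementation; `SessionProfile`, `createProfileRotator`, and the default viewport size are assumptions made for the example. The only external facts it relies on are Chromium's `--proxy-server` flag and the Puppeteer calls named in the comments.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Hypothetical bundle of per-session settings.
interface SessionProfile {
  launchArgs: string[]; // pass as `args` to puppeteer.launch()
  userAgent: string;    // pass to page.setUserAgent()
  viewport: Viewport;   // pass to page.setViewport()
}

// Returns a function that yields a fresh profile on every call.
// Proxies rotate round-robin; user agents are drawn at random.
// `random` is injectable so the behaviour can be made deterministic in tests.
function createProfileRotator(
  proxies: string[],
  userAgents: string[],
  random: () => number = Math.random,
): (viewport?: Viewport) => SessionProfile {
  if (proxies.length === 0 || userAgents.length === 0) {
    throw new Error("proxy and user-agent pools must not be empty");
  }
  let cursor = 0;
  return (viewport: Viewport = { width: 1366, height: 768 }): SessionProfile => {
    const proxy = proxies[cursor % proxies.length];
    cursor += 1;
    const userAgent = userAgents[Math.floor(random() * userAgents.length)];
    return {
      launchArgs: [`--proxy-server=${proxy}`],
      userAgent,
      viewport,
    };
  };
}
```

Because the proxy is applied as a launch flag in this sketch, switching proxies means starting a new browser per profile; a pool that rotates per page or per context would need a different mechanism.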