Skill Detail
Benchmark browser agents on repeatable Playwright web tasks with Bananalyzer
Run a repeatable evaluation suite for browser agents against static web task snapshots instead of judging them from demos or one-off tests.
Browser Automation
Multi-Framework
Security Reviewed
327 GitHub stars
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer
Works best when you want a reusable capability, not another fragile one-off prompt.
At a glance
Tools required
Python environment, Playwright browser runtime, pytest-based test execution, a custom AgentRunner implementation, example web task snapshots
Install & setup
Install the project dependencies, create a test file that implements the `AgentRunner` interface, then run `bananalyze` against that file or test directory to execute the evaluation suite.
Author
Reworkd
Publisher
Organization
Last updated
Apr 22, 2026
Quick brief
Use Bananalyzer when a team needs to evaluate a browser agent on repeatable web tasks backed by stored page snapshots and explicit expected outputs. Reach for it, rather than running Playwright or the browser agent directly, when the goal is benchmarking and comparison rather than completing a task. Its scope is browser-agent evaluation on replayable web-task datasets; it is not a general-purpose browser automation framework or a product listing.
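
The evaluation pattern described above (replay a stored snapshot, run the agent, compare its output with an explicit expected value) can be sketched in plain Python. This is a minimal illustration using only the standard library, not Bananalyzer's actual API: `Task`, `Runner`, `TitleAgent`, `evaluate`, and `pass_rate` are hypothetical names invented for this sketch, and the real `AgentRunner` interface may differ in signature and arguments.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for one stored web task."""
    task_id: str
    snapshot_html: str  # stored page content, identical on every run
    expected: Any       # explicit expected output for this task


class Runner(Protocol):
    """Hypothetical runner shape; not Bananalyzer's real AgentRunner."""

    async def run(self, task: Task) -> Any: ...


class TitleAgent:
    """Toy agent that returns the text between <title> tags, or None."""

    async def run(self, task: Task) -> Any:
        html = task.snapshot_html
        start = html.find("<title>")
        end = html.find("</title>")
        if start == -1 or end == -1:
            return None
        return html[start + len("<title>"):end]


async def evaluate(runner: Runner, tasks: list[Task]) -> dict[str, bool]:
    """Run every task once and record whether the output matched."""
    results: dict[str, bool] = {}
    for task in tasks:
        try:
            output = await runner.run(task)
        except Exception:
            output = None  # a crash is scored as a failed task
        results[task.task_id] = output == task.expected
    return results


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of tasks whose output matched the expected value."""
    return sum(results.values()) / len(results) if results else 0.0


# Two illustrative snapshots: one the toy agent can solve, one it cannot.
TASKS = [
    Task("title-ok", "<html><title>Pricing</title></html>", "Pricing"),
    Task("title-missing", "<html><body>No title</body></html>", "Home"),
]

if __name__ == "__main__":
    scores = asyncio.run(evaluate(TitleAgent(), TASKS))
    print(scores, pass_rate(scores))
```

Because the snapshots and expected values are fixed, two agents scored with the same task list can be compared directly, which is the property that separates a benchmark from a one-off demo.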