Skill Detail

Benchmark browser agents on repeatable Playwright web tasks with Bananalyzer

Run a repeatable evaluation suite for browser agents against static web task snapshots instead of judging them from demos or one-off tests.

Browser Automation, Multi-Framework, Security Reviewed

⭐ 327 GitHub stars

INSTALL WITH ANY AGENT

npx skills add agentskillexchange/skills --skill benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer

Works best when you want a reusable capability, not another fragile one-off prompt.

View source

At a glance

Tools required

Python environment, Playwright browser runtime, pytest-based test execution, a custom `AgentRunner` implementation, example web task snapshots

Install & setup

Install the project dependencies, create a test file that implements the `AgentRunner` interface, then run `bananalyze` against that file or test directory to execute the evaluation suite.
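As a rough illustration of what "a test file that implements the `AgentRunner` interface" might look like, here is a minimal, self-contained sketch. It does not import Bananalyzer: the `AgentRunner` base class, the `Example` dataclass, its fields, and the `run` signature below are hypothetical stand-ins written for this example, so check the upstream source for the real interface before copying anything.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Example:
    """Hypothetical stand-in for one stored web task (not the library's class)."""

    snapshot_url: str          # where the static page snapshot lives
    expected: Dict[str, Any]   # the explicit expected output for this task


class AgentRunner(ABC):
    """Hypothetical stand-in for the interface a test file implements."""

    @abstractmethod
    async def run(self, example: Example) -> Dict[str, Any]:
        """Drive the agent on one example and return its structured output."""


class EchoAgentRunner(AgentRunner):
    """Trivial runner that echoes the expected output, useful as a smoke test."""

    async def run(self, example: Example) -> Dict[str, Any]:
        # A real runner would open example.snapshot_url in a Playwright page
        # and let the agent act on it; that step is deliberately omitted here.
        await asyncio.sleep(0)
        return dict(example.expected)


example = Example(snapshot_url="snapshots/example.html", expected={"title": "Example"})
result = asyncio.run(EchoAgentRunner().run(example))
print(result == example.expected)
```

A runner like this is only useful for confirming that the harness is wired up; a real implementation would replace the echo with calls into the agent under test, and the file would then be passed to `bananalyze` as described above.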

Author

Reworkd

Publisher

Organization

Last updated

Apr 22, 2026

Quick brief

Use Bananalyzer when a team needs to evaluate a browser agent on repeatable web tasks backed by stored page snapshots and explicit expected outputs. Reach for it, rather than running Playwright or the agent directly, when the goal is benchmarking and comparison instead of completing the task itself. Its scope is browser-agent evaluation on replayable web-task datasets; it is not a general-purpose browser automation framework.
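The idea of scoring agents against fixed snapshots with explicit expected outputs can be sketched in a few lines. This is an illustration of the general technique only, not Bananalyzer's scoring code: `Task`, `pass_rate`, and both toy agents are hypothetical names invented for this example, and the "snapshots" are inline HTML strings rather than stored pages.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Task:
    """Hypothetical task record: a frozen page snapshot plus its expected output."""

    name: str
    snapshot: str
    expected: Any


Agent = Callable[[Task], Any]


def pass_rate(agent: Agent, tasks: List[Task]) -> float:
    """Fraction of tasks where the agent's output equals the expected output."""
    if not tasks:
        return 0.0
    passed = sum(1 for task in tasks if agent(task) == task.expected)
    return passed / len(tasks)


def first_text_agent(task: Task) -> str:
    """Toy agent: return the text between the first '>' and the next '<'."""
    start = task.snapshot.index(">") + 1
    end = task.snapshot.index("<", start)
    return task.snapshot[start:end]


def constant_agent(task: Task) -> str:
    """Toy agent that ignores the page and always gives the same answer."""
    return "Bananas"


TASKS = [
    Task("extract-title", "<h1>Bananas</h1>", "Bananas"),
    Task("extract-price", "<span class='price'>3.50</span>", "3.50"),
]

# Because every agent sees the same snapshots, the scores are directly comparable.
scores: Dict[str, float] = {
    "first_text": pass_rate(first_text_agent, TASKS),
    "constant": pass_rate(constant_agent, TASKS),
}
print(scores)
```

Because the snapshots never change between runs, a difference in pass rate reflects the agents rather than drift in the live sites, which is the point of benchmarking on replayable tasks.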

Best fit

When to reach for it

Best when the job fits Browser Automation.
Works naturally with Multi-Framework setups.
Requires a Python environment, Playwright browser runtime, pytest-based test execution, a custom…
Installation is straightforward: install the project dependencies, create a test file that implements the `AgentRunner` interface, then run `bananalyze`…

Trust & provenance

Why this listing is credible

Trust status: Security Reviewed.
327 GitHub stars on the linked upstream source.
Last updated Apr 22, 2026.