Skill Detail

Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable

Use Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction.

Browser AutomationMulti-Framework

Browser Automation Multi-Framework Security Reviewed

⭐ 12.6k GitHub stars ⬇ 83.7k/wk npm

INSTALL WITH ANY AGENT

npx skills add agentskillexchange/skills --skill drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable

Copy

Works best when you want a reusable capability, not another fragile one-off prompt.

View source Documentation

At a glance

Tools required

Midscene.js, Node.js, a supported vision model, and a target automation surface such as Playwright, Puppeteer, Android adb, or iOS WebDriverAgent

Install & setup

Install the core package with `npm install @midscene/core`, connect it to your browser or device automation surface using the upstream setup guide, then author natural-language UI actions, assertions, and extraction steps through the SDK, YAML flow, or playground tooling.

Author

web-infra-dev

Publisher

Organization

Last updated

Apr 14, 2026

Quick brief

Use Midscene.js when the workflow depends on visual understanding instead of stable selectors. It lets an agent describe goals in natural language, operate interfaces through screenshot-based localization, extract data, assert outcomes, and replay runs across browser, Android, iOS, and other UI surfaces. The scope boundary is specific enough to avoid being just another browser framework listing: this skill is for vision-driven UI action authoring and debugging when selector-first automation breaks down, not for promoting a general product platform.

Best fit

When to reach for it

Best when the job fits Browser Automation.
Works naturally with Multi-Framework setups.
Requires Midscene.js, Node.js, a supported vision model, and a target automation….
Installation is straightforward: Install the core package with `npm install @midscene/core`, connect it to your browser or device automation…

Trust & provenance

Why this listing is credible

Trust status: Security Reviewed.
12.6k GitHub stars on the linked upstream source.
83.7k/week npm downloads recorded.
Last updated Apr 14, 2026.

View source ↗ Documentation ↗