Skill Detail

Capture local screen and audio context so agents can search what happened on your device

Use Screenpipe when an agent needs private, local-first memory of what you saw or heard on your computer, including searchable screen text, app context, and transcripts, instead of relying on a chat-only memory layer.

Media & Transcription Multi-Framework Security Reviewed
⭐ 18.2k GitHub stars ⬇ 13.2k/wk npm
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill capture-local-screen-and-audio-context-so-agents-can-search-what-happened-on-your-device
Works best when you want a reusable capability, not another fragile one-off prompt.
At a glance
Tools required
Screenpipe desktop app or source build, local screen and audio permissions, sufficient local storage, and optionally an MCP-compatible agent client
Install & setup
Install Screenpipe from the upstream desktop releases on macOS or Windows, or build it from source on Linux. Grant the requested screen and audio permissions, then use its local search, pipes, or MCP integration to feed desktop context into agent workflows.
Author
screenpipe
Publisher
Organization
Last updated
Apr 14, 2026
Quick brief

Use Screenpipe when the missing context lives on the user’s desktop rather than inside the chat thread. It continuously captures screen changes and audio locally, extracts OCR and accessibility text, builds searchable history, and exposes that memory to automations, pipes, and MCP-aware agents that need to recall what happened on the machine. The scope is narrow enough to qualify as a skill: it covers local screen-and-audio context capture for agent recall and automation, and is neither a generic personal AI product listing nor merely a meeting recorder card.
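
As a rough illustration of the local search step, the sketch below builds a query against a locally running Screenpipe instance. The base URL (http://localhost:3030), the /search path, and the q, content_type, and limit parameters are assumptions that this listing does not confirm, and the helper names are hypothetical; check the upstream Screenpipe documentation before relying on any of them.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: Screenpipe serves a local HTTP API at this address.
# The port and path are not stated in this listing.
BASE_URL = "http://localhost:3030"


def build_search_url(query, content_type="ocr", limit=5, base_url=BASE_URL):
    """Build a search URL (hypothetical helper; parameter names are assumed)."""
    params = {"q": query, "content_type": content_type, "limit": limit}
    return f"{base_url}/search?{urlencode(params)}"


def search(query, **kwargs):
    """Send the query to the local instance and return the decoded JSON body.

    Only works while Screenpipe is running on this machine.
    """
    url = build_search_url(query, **kwargs)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

An agent tool could call search("invoice total", limit=3) and pass the returned matches to the model as context. Because the request targets localhost, the captured data stays on the device.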