Runbooks & Diagnostics
Security Reviewed
Uses git-sizer to identify the specific size and history characteristics that make a repository painful to clone, fetch, repack, or work in. Use it when an agent needs evidence about large blobs, oversized trees, too many refs, or other Git pathologies before proposing cleanup.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Uses Goss to express the expected state of a machine or container, then validates that reality still matches the contract. Reach for it after provisioning, image builds, or config changes when an agent needs a fast pass or fail answer about service health and system drift.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Use git-filter-repo when an agent needs to surgically rewrite repository history after a leaked secret, a huge binary commit, or a bad subtree split. The agent analyzes the problem, builds the rewrite command, and leaves a clean follow-up checklist for force-push, clone reset, and downstream cleanup.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
This ASE skill uses ghz to run repeatable gRPC load tests from proto files, protosets, or server reflection. An agent can replay request fixtures at controlled concurrency, capture latency and error rates, and export machine-readable reports for regression checks or performance investigations.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Use GitHub Next's pr-fix workflow when a pull request is blocked on failing checks and the likely repair can be automated. The agent inspects CI failures, traces the root cause, applies a focused fix on the PR branch, and leaves the result in reviewable Git history.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Use Toxiproxy when an agent needs to inject latency, disconnects, bandwidth limits, or packet-like failure modes into real service calls during development, CI, or incident reproduction. The agent routes app traffic through controlled TCP proxies, applies toxics at the right moment, and reports which dependency paths fail gracefully versus which ones crack under stress.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Use Qdrant's official qdrant-search-quality skill when an agent needs to diagnose weak recall, irrelevant matches, or embedding and chunking mistakes in a live retrieval pipeline. It is a bounded search-quality tuning workflow, not a generic database listing.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
This skill guides an agent through measuring, profiling, and narrowing slow WordPress behavior without relying on browser clicks. Use it when the job is to diagnose slow pages, REST endpoints, cron activity, autoload bloat, or query-heavy requests from the backend outward.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Lets an agent drive existing tmux sessions by sending keystrokes and scraping pane output, which is exactly what you need for interactive CLIs that cannot be handled as one-shot shell commands. Use it for session supervision and intervention, not for general terminal automation or starting new background jobs.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Guides an agent through the exact route, pairing, and auth checks needed when an OpenClaw companion node fails to connect over LAN, Tailscale, or a public URL. Use it when a node setup is broken and you need diagnosis, not when you simply want to list devices or advertise OpenClaw itself.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
BorgBackup (Borg) is a deduplicating backup program with optional compression and authenticated encryption. It uses content-defined chunking for space-efficient daily backups, making it ideal for automating secure incremental backups to local or remote SSH targets.
Multi-Framework Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Investigates broken checks with the Datadog Synthetics API, Monitors API, and Logs Search API to connect failed browser or API tests with the signals that explain them. Handy for turning a red synthetic check into an actionable diagnosis instead of a vague outage alarm.
⭐ 158 datadog
Claude Code Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses restart storms with the Kubernetes Events API, Pod status conditions, and the Metrics API to explain why workloads are stuck in CrashLoopBackOff. Great for agents that need to summarize cluster evidence before an operator starts digging through kubectl output by hand.
⭐ 121.4k kubernetes
MCP Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Coordinates remediation playbooks with AWS Systems Manager Automation, Incident Manager, and CloudWatch alarm context for repeatable operational recovery. Useful for agents that need to recommend or launch the right runbook when alarms cross into known failure territory.
ChatGPT Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Builds incident runbooks around the PagerDuty Events API v2, Incidents API, and Response Plays so agents can classify alerts, enrich context, and drive consistent handoffs. Useful when noisy monitoring signals need a repeatable escalation flow instead of ad hoc human triage.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses CrashLoopBackOff pods using kubectl and the Kubernetes API /api/v1/namespaces/{ns}/pods/{pod}/log endpoint. Correlates container exit codes with OOM kills, readiness probe failures, and config errors.
Gemini Runbooks & Diagnostics
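As a rough illustration of the exit-code correlation described above, the sketch below classifies a container's last termination record. It assumes the input is the `lastState.terminated` object from a pod's container status, with its `exitCode` and `reason` fields; the function name and the returned labels are hypothetical and not part of the listed skill.

```python
def classify_termination(terminated: dict) -> str:
    """Map a container's last termination record to a coarse cause label.

    `terminated` is assumed to look like the `lastState.terminated` object
    in a pod's container status, e.g. {"exitCode": 137, "reason": "OOMKilled"}.
    The labels returned here are illustrative only.
    """
    reason = terminated.get("reason", "")
    exit_code = terminated.get("exitCode")

    if reason == "OOMKilled":
        return "oom-kill"
    if exit_code == 137:
        # 128 + 9: the process received SIGKILL without an OOMKilled reason.
        return "sigkill"
    if exit_code == 143:
        # 128 + 15: the process received SIGTERM.
        return "sigterm"
    if exit_code == 0:
        return "clean-exit"
    return "application-error"
```

A caller would still need the pod's events and logs to confirm the cause; the label only narrows where to look first.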
Runbooks & Diagnostics
Security Reviewed
Validates Ansible playbooks in check mode using ansible-playbook --check --diff and the Ansible Python API. Detects idempotency issues, undefined variables, and unreachable hosts before production runs.
Cursor Runbooks & Diagnostics
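A minimal sketch of the check-mode run described above: it builds the `ansible-playbook --check --diff` argument list and parses one PLAY RECAP line into counters. The helper names and the sample recap line in the usage note are assumptions for illustration, not the skill's actual implementation.

```python
import re

def build_check_command(playbook, inventory=None):
    """Return the argv list for a dry run that also shows diffs."""
    cmd = ["ansible-playbook", "--check", "--diff"]
    if inventory:
        cmd += ["-i", inventory]
    cmd.append(playbook)
    return cmd

# Matches a recap line such as "web1 : ok=3 changed=1 unreachable=0 failed=0".
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")

def parse_recap_line(line):
    """Parse one PLAY RECAP line into (host, {counter: value}), or None."""
    match = RECAP_RE.match(line.strip())
    if not match:
        return None
    pairs = (item.split("=") for item in match.group("stats").split())
    return match.group("host"), {key: int(value) for key, value in pairs}
```

For example, `parse_recap_line("web1 : ok=3 changed=1 unreachable=0 failed=0")` yields the host `web1` with `changed` equal to 1. A nonzero `changed` count on a second consecutive run is one possible hint of a non-idempotent task, and a nonzero `unreachable` count flags a host to fix before a production run.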
Runbooks & Diagnostics
Security Reviewed
Analyzes Nginx error logs using GoAccess and custom regex parsers to identify recurring 502/503 patterns. Correlates upstream timeout errors with backend service health via Prometheus PromQL queries.
Custom Agents Runbooks & Diagnostics
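The custom regex parsing mentioned above might look roughly like the following sketch, which counts upstream timeout entries per upstream address. It assumes error log lines in Nginx's usual form, containing the phrase `upstream timed out` and an `upstream: "..."` field; the function name is hypothetical.

```python
import re
from collections import Counter

# Captures the quoted upstream URL that Nginx appends to proxy error lines.
UPSTREAM_RE = re.compile(r'upstream: "(?P<upstream>[^"]+)"')
TIMEOUT_MARKER = "upstream timed out"

def count_upstream_timeouts(lines):
    """Count timeout entries per upstream URL in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = UPSTREAM_RE.search(line)
        if match:
            counts[match.group("upstream")] += 1
    return counts
```

The resulting counts identify which backends to query in Prometheus; the PromQL side is not shown here.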
Runbooks & Diagnostics
Security Reviewed
Automatically executes diagnostic runbooks when PagerDuty incidents trigger, using the PagerDuty Events API v2 and the Rundeck API. Attaches diagnostic output as incident notes and suggests remediation actions.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Triages AWS CloudWatch alarms by correlating alarm state changes with CloudTrail events and EC2 instance health using boto3. Classifies alarms by severity, identifies root cause candidates, and updates Opsgenie alerts.
Cursor Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses Kubernetes pod crash loops by analyzing events, logs, and resource quotas via the Kubernetes API and kubectl debug. Correlates OOMKilled terminations with container memory profiles from Prometheus queries.
Claude Code Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Runs diagnostic analysis on Kubernetes clusters using kubectl, k9s terminal UI data, and the Troubleshoot.sh support-bundle collector framework. Generates remediation steps for common pod scheduling, networking, and storage failures.
ChatGPT Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Analyzes Terraform state files and plan outputs to detect drift, orphaned resources, and dependency cycles. Uses the Terraform CLI state commands, tfsec for security scanning, and Infracost API for cost impact analysis.
Cursor Runbooks & Diagnostics
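One way an agent could turn plan output into a drift signal is through `terraform plan -detailed-exitcode`, which exits with 0 when there is no diff, 1 on error, and 2 when changes are present. The sketch below only interprets that exit code; the function name and labels are hypothetical, and whether the listed skill uses this flag is an assumption.

```python
def interpret_plan_exit_code(code: int) -> str:
    """Translate the exit code of `terraform plan -detailed-exitcode`.

    0 means the plan succeeded with an empty diff, 2 means it succeeded
    with changes present, and anything else is treated as an error.
    """
    if code == 0:
        return "no-changes"
    if code == 2:
        # Changes may be real drift or simply unapplied configuration edits.
        return "changes-present"
    return "error"
```

A "changes-present" result still needs the plan itself to tell drift apart from intended, not yet applied edits.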
Runbooks & Diagnostics
Security Reviewed
Generates automated incident response runbooks triggered by PagerDuty webhooks via the PagerDuty Events API v2. Integrates with Datadog API and AWS CloudWatch for diagnostic data collection during incidents.
OpenClaw Runbooks & Diagnostics