Runbooks & Diagnostics
Security Reviewed
Diagnoses failed AWS CloudFormation stack operations using the AWS CLI (aws cloudformation describe-stack-events) and the cfn-lint validator. Traces resource creation failures, rollback causes, and nested stack dependency chains.
ChatGPT Agents Runbooks & Diagnostics
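As a rough illustration of the CloudFormation entry above, the sketch below picks the most likely root-cause event out of describe-stack-events JSON. The function name find_root_failure, the sample events, and the heuristic of skipping "Resource creation cancelled" entries are assumptions made for this example, not the skill's actual logic.

```python
import json

# Assumption: events whose reason starts with this text are side effects
# of an earlier failure rather than the cause itself.
SECONDARY_REASONS = ("Resource creation cancelled",)


def find_root_failure(events_json):
    """Return the earliest *_FAILED event that is not a cancellation, or None."""
    events = json.loads(events_json)["StackEvents"]
    failed = [
        e for e in events
        if e.get("ResourceStatus", "").endswith("_FAILED")
        and not e.get("ResourceStatusReason", "").startswith(SECONDARY_REASONS)
    ]
    # Timestamps in one consistent ISO 8601 format sort correctly as strings.
    failed.sort(key=lambda e: e["Timestamp"])
    return failed[0] if failed else None


# Invented sample shaped like `aws cloudformation describe-stack-events` output.
SAMPLE = json.dumps({"StackEvents": [
    {"Timestamp": "2024-01-01T10:00:05Z", "LogicalResourceId": "demo-stack",
     "ResourceStatus": "ROLLBACK_IN_PROGRESS",
     "ResourceStatusReason": "The following resource(s) failed to create: [AppBucket, AppQueue]."},
    {"Timestamp": "2024-01-01T10:00:04Z", "LogicalResourceId": "AppQueue",
     "ResourceStatus": "CREATE_FAILED",
     "ResourceStatusReason": "Resource creation cancelled"},
    {"Timestamp": "2024-01-01T10:00:03Z", "LogicalResourceId": "AppBucket",
     "ResourceStatus": "CREATE_FAILED",
     "ResourceStatusReason": "my-bucket already exists"},
    {"Timestamp": "2024-01-01T10:00:00Z", "LogicalResourceId": "AppBucket",
     "ResourceStatus": "CREATE_IN_PROGRESS"},
]})

root = find_root_failure(SAMPLE)
print(root["LogicalResourceId"], "-", root["ResourceStatusReason"])
```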
Runbooks & Diagnostics
Security Reviewed
Diagnoses systemd service failures using journalctl structured JSON output and systemctl show properties. Analyzes unit file configurations with systemd-analyze verify and detects dependency ordering issues via systemd-analyze dot.
OpenClaw Runbooks & Diagnostics
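To make the systemd entry above more concrete, here is a minimal sketch that filters journalctl -o json output (one JSON object per line) down to error-level messages. The helper name error_messages and the sample lines are hypothetical; only the PRIORITY, MESSAGE, and _SYSTEMD_UNIT journal fields are taken as given.

```python
import json


def error_messages(journal_lines, max_priority=3):
    """Return (unit, message) pairs for entries at or above error severity.

    Syslog priorities run from 0 (emerg) to 7 (debug), so lower is worse.
    """
    results = []
    for line in journal_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        try:
            priority = int(entry.get("PRIORITY", 6))
        except (TypeError, ValueError):
            continue
        message = entry.get("MESSAGE")
        # journalctl emits non-UTF-8 messages as byte arrays; skip those here.
        if priority <= max_priority and isinstance(message, str):
            results.append((entry.get("_SYSTEMD_UNIT", "unknown"), message))
    return results


# Invented sample lines in the shape produced by `journalctl -o json`.
SAMPLE_LINES = [
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "web.service", "MESSAGE": "Started web."}',
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "web.service", "MESSAGE": "Failed to bind port 80"}',
    '{"PRIORITY": "2", "_SYSTEMD_UNIT": "db.service", "MESSAGE": "Out of disk space"}',
]

for unit, message in error_messages(SAMPLE_LINES):
    print(f"{unit}: {message}")
```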
Runbooks & Diagnostics
Security Reviewed
Analyzes PostgreSQL slow queries using EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) output and pg_stat_statements views. Identifies missing indexes via pg_stat_user_tables sequential scan counters and suggests index creation with HypoPG extension.
Gemini Runbooks & Diagnostics
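For the PostgreSQL slow-query entry above, one way to spot index candidates is to walk the EXPLAIN (FORMAT JSON) plan tree and collect sequential scans. The sketch below assumes the standard "Node Type", "Relation Name", "Rows Removed by Filter", and "Plans" keys; the function name seq_scans and the sample plan are invented for illustration.

```python
def seq_scans(node):
    """Recursively collect sequential scans from an EXPLAIN JSON plan node."""
    found = []
    if node.get("Node Type") == "Seq Scan":
        found.append({
            "table": node.get("Relation Name"),
            # Only present with ANALYZE when a filter discarded rows.
            "rows_removed": node.get("Rows Removed by Filter", 0),
        })
    for child in node.get("Plans", []):
        found.extend(seq_scans(child))
    return found


# Invented sample; EXPLAIN (FORMAT JSON) returns a one-element list.
SAMPLE_PLAN = [{
    "Plan": {
        "Node Type": "Hash Join",
        "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "orders",
             "Actual Rows": 1000, "Rows Removed by Filter": 99000},
            {"Node Type": "Hash", "Plans": [
                {"Node Type": "Index Scan", "Relation Name": "customers",
                 "Actual Rows": 50},
            ]},
        ],
    }
}]

for scan in seq_scans(SAMPLE_PLAN[0]["Plan"]):
    print(scan["table"], "discarded", scan["rows_removed"], "rows")
```

A scan that discards far more rows than it returns is a reasonable hint that a filter column lacks an index, though the final call still depends on table size and write load.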
Runbooks & Diagnostics
Security Reviewed
Diagnoses CrashLoopBackOff and OOMKilled pod failures using the Kubernetes API via kubectl and the official kubernetes-client/python SDK. Correlates container logs, resource limits, and node conditions for root cause analysis.
Codex Runbooks & Diagnostics
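The pod-failure diagnosis described above largely comes down to reading status.containerStatuses on the pod object. The sketch below works on a pod already decoded from kubectl get pod -o json; the function name diagnose_pod and the sample pod are assumptions for this example.

```python
def diagnose_pod(pod):
    """Return human-readable findings for crash-looping or OOM-killed containers."""
    findings = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting") or {}
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            findings.append(
                f"{cs['name']}: CrashLoopBackOff after {cs.get('restartCount', 0)} restarts"
            )
        if terminated.get("reason") == "OOMKilled":
            findings.append(
                f"{cs['name']}: last termination was OOMKilled "
                f"(exit code {terminated.get('exitCode')})"
            )
    return findings


# Invented sample shaped like `kubectl get pod <name> -o json`.
SAMPLE_POD = {
    "status": {
        "containerStatuses": [{
            "name": "api",
            "restartCount": 7,
            "state": {"waiting": {"reason": "CrashLoopBackOff"}},
            "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
        }]
    }
}

for finding in diagnose_pod(SAMPLE_POD):
    print(finding)
```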
Runbooks & Diagnostics
Security Reviewed
Validates nginx.conf files using the gixy static analyzer and crossplane parser library. Tests configuration for security misconfigurations and HTTP header issues, and performs dry-run validation by invoking nginx -t as a subprocess.
ChatGPT Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Detects infrastructure drift using terraform plan -detailed-exitcode and the Terraform Cloud API. Compares state files against live resources across AWS, GCP, and Azure providers.
MCP Runbooks & Diagnostics
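The drift check above relies on the exit codes of terraform plan -detailed-exitcode: 0 means the plan succeeded with no changes, 1 means an error, and 2 means the plan succeeded with changes present. A minimal sketch of interpreting them follows; the function names and labels are hypothetical, and a code of 2 can also reflect unapplied configuration edits rather than true drift.

```python
import subprocess


def classify_plan_exit(code):
    """Map a `terraform plan -detailed-exitcode` exit code to a label."""
    return {0: "in_sync", 1: "error", 2: "changes_present"}.get(code, "unknown")


def run_plan(workdir):
    """Run the plan in `workdir` and classify the result (requires terraform)."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(proc.returncode), proc.stdout


# The classifier can be exercised without terraform installed.
for code in (0, 1, 2):
    print(code, "->", classify_plan_exit(code))
```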
Runbooks & Diagnostics
Security Reviewed
Automates incident response runbooks using the PagerDuty Events API v2 and REST API. Manages incident creation, escalation policies, and automated diagnostics triggered by alert severity.
Gemini Runbooks & Diagnostics
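For the PagerDuty entry above, incident creation through the Events API v2 means posting a JSON event to https://events.pagerduty.com/v2/enqueue. The sketch below only builds and validates that event body without sending it; the function name build_trigger_event and all sample values are placeholders.

```python
import json

# Severity values accepted by the Events API v2 payload.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity, dedup_key=None):
    """Build a 'trigger' event body for the PagerDuty Events API v2."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key:
        # Events sharing a dedup_key are grouped into the same alert.
        event["dedup_key"] = dedup_key
    return event


# Placeholder values; a real routing key comes from a service integration.
event = build_trigger_event(
    routing_key="PLACEHOLDER_ROUTING_KEY",
    summary="Disk usage above 90% on db-1",
    source="db-1.example.internal",
    severity="critical",
    dedup_key="disk-db-1",
)
print(json.dumps(event, indent=2))
```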
Runbooks & Diagnostics
Security Reviewed
Runs Ansible diagnostic playbooks using ansible-runner and the Ansible Collections ecosystem (ansible.builtin, community.general). Captures system health, service status, and log analysis across inventory hosts.
Gemini Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Triages application errors using the Sentry Web API (/api/0/issues/) and Sentry SDK breadcrumb data. Groups issues by stack trace similarity using Sentry fingerprinting rules and queries release health via the /api/0/organizations/{org}/releases/ endpoint.
Claude Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Monitors Docker Compose service health using the Docker Engine API (/containers/{id}/json) and docker-compose ps parsing. Tracks container restart counts via the RestartCount field and analyzes logs through the /containers/{id}/logs endpoint.
MCP Runbooks & Diagnostics
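The RestartCount tracking mentioned above can be sketched against the JSON returned by /containers/{id}/json (the same shape docker inspect prints). The helper names, the restart threshold, and the sample document are assumptions for this example.

```python
def container_health(inspect):
    """Summarize one container-inspect document."""
    state = inspect.get("State", {})
    # State.Health is absent when the image defines no healthcheck.
    health = (state.get("Health") or {}).get("Status", "none")
    return {
        "name": inspect.get("Name", "").lstrip("/"),
        "status": state.get("Status"),
        "health": health,
        "restarts": inspect.get("RestartCount", 0),
    }


def needs_attention(summary, max_restarts=3):
    """Flag unhealthy containers or ones restarting more than the threshold."""
    return summary["health"] == "unhealthy" or summary["restarts"] > max_restarts


# Invented sample trimmed to the fields used here.
SAMPLE_INSPECT = {
    "Name": "/shop_web_1",
    "RestartCount": 5,
    "State": {"Status": "running", "Health": {"Status": "healthy"}},
}

summary = container_health(SAMPLE_INSPECT)
print(summary, "attention needed:", needs_attention(summary))
```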
Runbooks & Diagnostics
Security Reviewed
Executes diagnostic runbooks against Kubernetes clusters using the official kubernetes/client-go SDK and kubectl commands. Checks pod health via the /healthz and /readyz endpoints and analyzes events with the CoreV1 Events API.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses slow PostgreSQL queries using pg_stat_statements, pg_stat_activity, and EXPLAIN ANALYZE output parsing. Integrates with the pgBadger log analyzer and pg_stat_user_tables for index recommendation.
Claude Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Validates docker-compose.yml files against the Compose Specification, checks image vulnerability status via the Docker Scout API, and verifies healthcheck configurations.
MCP Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Automates incident response for Prometheus alerts using PromQL queries, the Alertmanager API, and Grafana dashboards. Maps alerts to diagnostic runbooks with remediation steps.
Gemini Runbooks & Diagnostics
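The alert-to-runbook mapping described above could look roughly like the sketch below, which reads an Alertmanager webhook payload and attaches a runbook to each firing alert. The RUNBOOKS table, its file paths, and the route_alerts name are hypothetical; preferring a runbook_url annotation follows a common convention rather than anything stated in the listing.

```python
# Hypothetical fallback table from alert name to runbook location.
RUNBOOKS = {
    "HighErrorRate": "runbooks/high-error-rate.md",
    "DiskAlmostFull": "runbooks/disk-almost-full.md",
}


def route_alerts(payload):
    """Return one routing record per firing alert in a webhook payload."""
    routed = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue
        labels = alert.get("labels", {})
        name = labels.get("alertname", "unknown")
        # Prefer an explicit annotation, then fall back to the local table.
        runbook = alert.get("annotations", {}).get("runbook_url") or RUNBOOKS.get(name)
        routed.append({
            "alert": name,
            "severity": labels.get("severity", "none"),
            "runbook": runbook,
        })
    return routed


# Invented sample in the shape of an Alertmanager webhook body.
SAMPLE_PAYLOAD = {
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "HighErrorRate", "severity": "critical"},
         "annotations": {}},
        {"status": "resolved",
         "labels": {"alertname": "DiskAlmostFull", "severity": "warning"},
         "annotations": {}},
    ],
}

for record in route_alerts(SAMPLE_PAYLOAD):
    print(record)
```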
Runbooks & Diagnostics
Security Reviewed
Manages GitOps deployments using the ArgoCD API, the argocd CLI, and Kustomize overlays. Automates sync operations, rollback procedures, and application health monitoring.
Custom Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses CrashLoopBackOff pods in Kubernetes clusters using kubectl and the Kubernetes API. Fetches pod events and resource limits through the API and container logs via the /api/v1/namespaces/{ns}/pods/{name}/log endpoint. Provides structured root-cause analysis covering OOMKilled, missing ConfigMaps, failed liveness probes, and image pull errors.
OpenClaw Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Diagnoses ArgoCD application sync failures and degraded states using the ArgoCD REST API and argocd CLI. Queries /api/v1/applications/{name} for sync status, resource health, and operation state. Provides automated remediation steps for OutOfSync, Degraded, and Missing resource conditions.
OpenClaw Runbooks & Diagnostics
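For the ArgoCD sync-failure entry above, the response from /api/v1/applications/{name} carries sync and health state under status. The sketch below summarizes an already decoded application object; the function name summarize_app and the sample application are invented, and only the OutOfSync, Degraded, and Missing conditions named in the description are flagged.

```python
def summarize_app(app):
    """Summarize sync and health state, listing resources that need attention."""
    status = app.get("status", {})
    problems = []
    for res in status.get("resources", []):
        res_sync = res.get("status")
        # Some resource kinds carry no health assessment at all.
        res_health = (res.get("health") or {}).get("status")
        if res_sync == "OutOfSync" or res_health in ("Degraded", "Missing"):
            problems.append(
                f"{res.get('kind')}/{res.get('name')}: sync={res_sync}, health={res_health}"
            )
    return {
        "sync": status.get("sync", {}).get("status", "Unknown"),
        "health": status.get("health", {}).get("status", "Unknown"),
        "problems": problems,
    }


# Invented sample trimmed to the fields used here.
SAMPLE_APP = {
    "status": {
        "sync": {"status": "OutOfSync"},
        "health": {"status": "Degraded"},
        "resources": [
            {"kind": "Deployment", "name": "web", "status": "Synced",
             "health": {"status": "Degraded"}},
            {"kind": "ConfigMap", "name": "web-config", "status": "OutOfSync"},
            {"kind": "Service", "name": "web", "status": "Synced",
             "health": {"status": "Healthy"}},
        ],
    }
}

print(summarize_app(SAMPLE_APP))
```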
Runbooks & Diagnostics
Security Reviewed
Diagnoses Kafka consumer group lag using the Kafka AdminClient API and JMX metrics exposed via the Confluent Metrics API. Identifies slow consumers, topic partition hotspots, and broker rebalance storms that contribute to lag growth. Provides a step-by-step runbook to tune fetch.min.bytes, max.poll.records, and partition count.
Cursor Runbooks & Diagnostics
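Consumer lag, as used in the Kafka entry above, is the log end offset minus the group's committed offset for each partition. The sketch below computes it from two plain dictionaries, as one might assemble from AdminClient results; the function names, the sample offsets, and the choice to treat a missing committed offset as full lag are assumptions.

```python
def partition_lag(end_offsets, committed_offsets):
    """Return lag per (topic, partition) key.

    Assumption: a partition with no committed offset is reported with lag
    equal to its end offset, which is an upper bound rather than an exact value.
    """
    lag = {}
    for tp, end in end_offsets.items():
        committed = committed_offsets.get(tp)
        lag[tp] = end if committed is None else max(end - committed, 0)
    return lag


def hottest_partition(lag):
    """Return the (topic, partition) key with the largest lag, or None."""
    return max(lag, key=lag.get) if lag else None


# Invented offsets keyed by (topic, partition).
END = {("orders", 0): 1500, ("orders", 1): 9000, ("orders", 2): 400}
COMMITTED = {("orders", 0): 1490, ("orders", 1): 2000}

lag = partition_lag(END, COMMITTED)
print(lag, "hotspot:", hottest_partition(lag))
```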
Runbooks & Diagnostics
Security Reviewed
Diagnoses firing AWS CloudWatch alarms by querying CloudWatch Metrics, alarm history, and related AWS Config resource snapshots via the AWS SDK. Correlates metric anomalies with recent infrastructure changes to suggest root cause hypotheses. Outputs a structured incident summary with remediation options.
Gemini Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Automates diagnosis of CrashLoopBackOff pods using kubectl commands run against the Kubernetes API server. Fetches recent events, container logs, and resource quota status to identify root causes such as OOMKilled, misconfigured liveness probes, or missing ConfigMaps. Generates a step-by-step remediation runbook.
Claude Agents Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Parses Jenkins build console logs via the Jenkins Remote Access API to extract failure patterns, stack traces, and flaky test signatures. Uses regex heuristics and the Jenkins Test Results API to correlate failures with specific changes. Outputs a triage report ranked by recurrence frequency.
ChatGPT Agents Runbooks & Diagnostics
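The regex heuristics and recurrence ranking described above might be sketched as follows. The pattern only catches Java-style qualified exception names, which is an assumption about the logs being triaged; the function name rank_failures and the sample console logs are invented.

```python
import re
from collections import Counter

# Matches dotted names ending in Exception or Error, e.g. java.lang.NullPointerException.
EXCEPTION_PATTERN = re.compile(
    r"\b((?:[A-Za-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b"
)


def rank_failures(console_logs):
    """Rank exception signatures by the number of builds they appear in."""
    counts = Counter()
    for log in console_logs:
        # A set counts each signature at most once per build.
        counts.update(set(EXCEPTION_PATTERN.findall(log)))
    return counts.most_common()


# Invented console logs, one string per build.
SAMPLE_LOGS = [
    "java.lang.NullPointerException: x\n  at Foo.bar\njava.lang.NullPointerException: y",
    "BUILD FAILED\njava.util.concurrent.TimeoutException: timed out",
    "Test failed: java.lang.NullPointerException at Baz.qux",
]

for signature, builds in rank_failures(SAMPLE_LOGS):
    print(f"{builds} build(s): {signature}")
```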
Runbooks & Diagnostics
Security Reviewed
Runs pulumi refresh on a schedule to detect drift between live cloud resources and Pulumi state. Classifies drift by severity and opens a Jira ticket for destructive changes. Non-destructive drift is auto-reconciled via pulumi up --target for specific resources.
Codex Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Executes Ansible playbooks against dynamic inventories from AWS EC2 or Azure, decrypting Ansible Vault secrets via HashiCorp Vault KV v2 API. Streams task output in real time and posts a per-host pass/fail summary to Slack. Supports --check mode for dry-run validation before live runs.
⭐ 68.3k ansible
Claude Code Runbooks & Diagnostics
Runbooks & Diagnostics
Security Reviewed
Queries PagerDuty to show who is currently on-call for each escalation policy, surfaces unacknowledged incidents, and identifies schedule coverage gaps for the next 7 days. Useful for handoff checks and pre-weekend coverage audits. Read-only skill.
Claude Code Runbooks & Diagnostics
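Finding schedule coverage gaps over the next 7 days, as the on-call entry above describes, is an interval problem: sort the shifts and record every stretch of the window that no shift covers. The sketch below works on plain (start, end) datetime pairs; the function name coverage_gaps and the sample shifts are assumptions, and fetching the shifts from PagerDuty is left out.

```python
from datetime import datetime, timedelta


def coverage_gaps(shifts, window_start, window_end):
    """Return uncovered (start, end) intervals inside the window."""
    gaps = []
    cursor = window_start
    for start, end in sorted(shifts):
        if end <= cursor:
            continue  # shift lies entirely before the covered frontier
        if start >= window_end:
            break  # remaining shifts begin after the window
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < window_end:
        gaps.append((cursor, window_end))
    return gaps


# Invented shifts for a 7-day window.
WINDOW_START = datetime(2024, 1, 1)
WINDOW_END = WINDOW_START + timedelta(days=7)
SHIFTS = [
    (datetime(2024, 1, 1), datetime(2024, 1, 3)),
    (datetime(2024, 1, 3, 12), datetime(2024, 1, 8)),
]

for gap_start, gap_end in coverage_gaps(SHIFTS, WINDOW_START, WINDOW_END):
    print("uncovered:", gap_start, "to", gap_end)
```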