Runbooks & Diagnostics
Security Reviewed
Monitors AWS CloudFormation stacks for configuration drift using the AWS SDK DetectStackDrift and DescribeStackResourceDrifts APIs. Generates remediation templates and integrates with AWS Config rules for continuous compliance.
Gemini Runbooks & Diagnostics
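As a rough illustration of the drift-monitoring step described above, the sketch below groups drifted resources by status. It assumes the input is a dict shaped like a DescribeStackResourceDrifts response (for example, the return value of boto3's describe_stack_resource_drifts); the function name summarize_drift and the sample data are hypothetical and are not taken from the listed tool.

```python
from collections import defaultdict


def summarize_drift(response):
    """Group drifted resources by drift status.

    `response` is assumed to look like a DescribeStackResourceDrifts result:
    {"StackResourceDrifts": [{"LogicalResourceId": ..., "StackResourceDriftStatus": ...}]}
    """
    by_status = defaultdict(list)
    for drift in response.get("StackResourceDrifts", []):
        status = drift["StackResourceDriftStatus"]
        # IN_SYNC resources match the template, so they are not reported.
        if status != "IN_SYNC":
            by_status[status].append(drift["LogicalResourceId"])
    return dict(by_status)


# Hypothetical sample data for illustration only.
sample = {
    "StackResourceDrifts": [
        {"LogicalResourceId": "AppBucket", "ResourceType": "AWS::S3::Bucket",
         "StackResourceDriftStatus": "MODIFIED"},
        {"LogicalResourceId": "AppQueue", "ResourceType": "AWS::SQS::Queue",
         "StackResourceDriftStatus": "IN_SYNC"},
        {"LogicalResourceId": "OldRole", "ResourceType": "AWS::IAM::Role",
         "StackResourceDriftStatus": "DELETED"},
    ]
}
print(summarize_drift(sample))
```

A real run would first call DetectStackDrift and wait for detection to finish before reading the per-resource results.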
Security Reviewed
Executes automated diagnostics using the AWS Systems Manager Automation API and SSM Documents. Collects system metrics via the CloudWatch GetMetricData API and correlates with AWS Health events.
ChatGPT Agents Runbooks & Diagnostics
Security Reviewed
Automates alert triage using the Datadog Monitors API and Notebooks API. Correlates metrics with traces via the Datadog APM Trace Search API and generates root cause analysis (RCA) timelines from the Events Stream API.
MCP Runbooks & Diagnostics
Security Reviewed
Performs deep cluster troubleshooting using the Kubernetes API server /debug/pprof endpoints and ephemeral containers started with kubectl debug. Analyzes resource pressure via the Metrics Server API and kube-state-metrics.
Claude Code Runbooks & Diagnostics
Security Reviewed
Executes structured incident response playbooks using the PagerDuty Events API v2 for alerting, the Slack Web API for communication, and the Jira REST API for ticket creation. Automates evidence collection, timeline construction, and post-mortem generation.
MCP Multi-Framework Runbooks & Diagnostics
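The alerting step of such a playbook could start from something like the sketch below, which builds a trigger event in the shape the PagerDuty Events API v2 accepts. The helper name build_trigger_event and the sample values are hypothetical; the sketch only constructs the payload and does not send it.

```python
from datetime import datetime, timezone

# Severity values accepted by the Events API v2 payload.
VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="error", dedup_key=None):
    """Return a dict shaped like an Events API v2 'trigger' event."""
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    }
    # A dedup_key lets later 'acknowledge' or 'resolve' events target the same alert.
    if dedup_key:
        event["dedup_key"] = dedup_key
    return event


# Hypothetical values; a real routing key comes from a PagerDuty service integration.
event = build_trigger_event("EXAMPLE_ROUTING_KEY", "Checkout error rate above 5%",
                            "checkout-api", severity="critical", dedup_key="checkout-errors")
print(event["event_action"], event["payload"]["severity"])
```

The resulting dict would be serialized as JSON and POSTed to https://events.pagerduty.com/v2/enqueue.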
Security Reviewed
Analyzes PostgreSQL query performance using pg_stat_statements, pg_stat_user_tables, and EXPLAIN ANALYZE output. Identifies missing indexes via pg_stat_user_indexes and detects lock contention through pg_locks and pg_stat_activity.
MCP Runbooks & Diagnostics
Security Reviewed
Automates Kubernetes troubleshooting using kubectl and the Kubernetes Python client to diagnose CrashLoopBackOff, OOMKilled, and ImagePullBackOff states. Collects pod logs, events, node conditions, and resource quotas systematically.
OpenClaw Runbooks & Diagnostics
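To make the failure states named above concrete, here is a minimal sketch that scans a pod object for them. It assumes the pod is a plain dict in the JSON shape returned by kubectl get pod -o json; the function name find_container_problems and the sample pod are hypothetical.

```python
# Waiting reasons that usually indicate a pod needs attention.
WAITING_REASONS = {"CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull"}


def find_container_problems(pod):
    """Return (container_name, reason) pairs for known failure states."""
    problems = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        name = cs["name"]
        # Current state: the container is waiting to start or restart.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in WAITING_REASONS:
            problems.append((name, waiting["reason"]))
        # Previous run: the container was killed for exceeding its memory limit.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            problems.append((name, "OOMKilled"))
    return problems


# Hypothetical pod status for illustration only.
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "app",
             "state": {"waiting": {"reason": "CrashLoopBackOff"}},
             "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
            {"name": "sidecar", "state": {"running": {}}, "lastState": {}},
        ]
    }
}
print(find_container_problems(sample_pod))
```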
Security Reviewed
Diagnoses CrashLoopBackOff pods using kubectl describe, container exit code analysis, and the Kubernetes Events API. Cross-references OOMKilled signals with Prometheus container_memory_rss metrics and cAdvisor stats for root cause identification.
Cursor Runbooks & Diagnostics
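Container exit code analysis, as mentioned above, often comes down to the convention that codes above 128 mean the process was terminated by signal number (code - 128). The sketch below applies that convention; the function name explain_exit_code and the small signal table are illustrative choices, not the listed tool's logic.

```python
# Common signal numbers on Linux; extend as needed.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code):
    """Translate a container exit code into a short human-readable hint."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, f"signal {signum}")
        return f"terminated by {name}"
    return f"application error (exit code {code})"


for code in (0, 1, 137, 143):
    print(code, "->", explain_exit_code(code))
```

Exit code 137 (SIGKILL) frequently accompanies an OOMKilled reason, but a SIGKILL can have other causes, so it is worth confirming against the container's terminated reason and memory metrics.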
Security Reviewed
Automates PostgreSQL vacuum and autovacuum troubleshooting via the pg_stat_user_tables, pg_locks, and pg_stat_activity views. Detects table bloat using the pgstattuple extension and generates remediation SQL for long-running transaction conflicts.
Claude Agents Runbooks & Diagnostics
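One simple way to triage vacuum problems from pg_stat_user_tables is to rank tables by their share of dead tuples. The sketch below assumes the rows have already been fetched as dicts with the view's relname, n_live_tup, and n_dead_tup columns; the function name and both thresholds are hypothetical and would need tuning per workload.

```python
def tables_needing_vacuum(rows, min_dead=1000, max_dead_ratio=0.2):
    """Return (table, dead_ratio) pairs, worst first.

    A table is flagged when it has at least `min_dead` dead tuples and dead
    tuples make up more than `max_dead_ratio` of all tuples.
    """
    flagged = []
    for row in rows:
        live, dead = row["n_live_tup"], row["n_dead_tup"]
        total = live + dead
        if dead >= min_dead and total > 0 and dead / total > max_dead_ratio:
            flagged.append((row["relname"], round(dead / total, 3)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical rows for illustration only.
rows = [
    {"relname": "orders", "n_live_tup": 8000, "n_dead_tup": 4000},
    {"relname": "users", "n_live_tup": 10000, "n_dead_tup": 500},
    {"relname": "events", "n_live_tup": 1000, "n_dead_tup": 9000},
]
print(tables_needing_vacuum(rows))
```

The dead-tuple ratio is only a rough proxy; pgstattuple gives a more accurate bloat measurement at the cost of scanning the table.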
Security Reviewed
Diagnoses CrashLoopBackOff pods using the Kubernetes client-go API and kubectl debug. Analyzes container exit codes, OOMKilled events, and liveness probe failures, and suggests automated remediation.
OpenClaw Runbooks & Diagnostics
Security Reviewed
Performs safe Terraform state operations using the terraform CLI state subcommands and the Terraform Cloud API. Handles state imports, resource moves, and taint operations with automatic backup and rollback.
Claude Code Runbooks & Diagnostics
Security Reviewed
Tunes Prometheus alerting rules using the Prometheus HTTP API and PromQL query analysis. Reduces alert fatigue by analyzing firing history, adjusting thresholds via histogram_quantile, and configuring inhibition rules.
Gemini Runbooks & Diagnostics
Security Reviewed
Validates Ansible playbooks using ansible-lint with custom rule plugins and the Ansible Collections API. Checks for deprecated modules, missing handlers, insecure variable practices, and role dependency conflicts.
MCP Runbooks & Diagnostics
Security Reviewed
Parses Nginx error logs and access logs to diagnose 502, 504, and 413 errors. Uses GoAccess for real-time log visualization and integrates with nginx -t for configuration validation.
Claude Agents Runbooks & Diagnostics
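The access-log side of this kind of diagnosis can be sketched as a small parser that counts 413, 502, and 504 responses per request path. It assumes Nginx's default "combined" log format; the regular expression, the function name count_error_responses, and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Matches the start of a line in Nginx's default "combined" log format.
LINE_RE = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+)'
)
WATCHED = {413, 502, 504}


def count_error_responses(lines):
    """Count watched status codes per (status, path); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        status = int(match.group("status"))
        if status in WATCHED:
            parts = match.group("request").split()
            path = parts[1] if len(parts) >= 2 else match.group("request")
            counts[(status, path)] += 1
    return counts


# Hypothetical log lines for illustration only.
sample_lines = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "POST /upload HTTP/1.1" 413 585 "-" "curl/8.0"',
    '203.0.113.8 - - [10/Oct/2024:13:55:40 +0000] "GET /api HTTP/1.1" 502 157 "-" "curl/8.0"',
    '203.0.113.9 - - [10/Oct/2024:13:55:41 +0000] "GET /api HTTP/1.1" 502 157 "-" "curl/8.0"',
    '203.0.113.9 - - [10/Oct/2024:13:55:42 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    'not a log line',
]
for key, count in count_error_responses(sample_lines).most_common():
    print(key, count)
```

In a combined-format log, 502 and 504 point toward the upstream (crashed or slow backends), while 413 usually points at client_max_body_size; the error log carries the matching detail.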
Security Reviewed
Runs systematic health diagnostics on Docker containers using docker inspect, docker stats, and the Docker Engine API. Checks resource limits, network connectivity, and volume mount integrity.
Cursor Runbooks & Diagnostics
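A health check built on docker inspect might look roughly like the sketch below, which reads the JSON array that docker inspect prints and reports a few common findings. The field names (State.OOMKilled, State.ExitCode, State.Health.Status, HostConfig.Memory) are from the Docker inspect output; the function name check_containers and the sample document are hypothetical.

```python
import json


def check_containers(inspect_json):
    """Return (container_name, finding) pairs from `docker inspect` JSON output."""
    findings = []
    for container in json.loads(inspect_json):
        name = container.get("Name", "").lstrip("/")
        state = container.get("State", {})
        host_config = container.get("HostConfig", {})
        if state.get("OOMKilled"):
            findings.append((name, "was OOM-killed"))
        # HostConfig.Memory is 0 when no memory limit is configured.
        if host_config.get("Memory", 0) == 0:
            findings.append((name, "no memory limit set"))
        if (state.get("Health") or {}).get("Status") == "unhealthy":
            findings.append((name, "health check is failing"))
        if not state.get("Running") and state.get("ExitCode", 0) != 0:
            findings.append((name, f"exited with code {state['ExitCode']}"))
    return findings


# Hypothetical inspect output, trimmed to the fields used above.
sample_inspect = json.dumps([{
    "Name": "/web",
    "State": {"Running": False, "ExitCode": 137, "OOMKilled": True,
              "Health": {"Status": "unhealthy"}},
    "HostConfig": {"Memory": 0},
}])
for finding in check_containers(sample_inspect):
    print(finding)
```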
Security Reviewed
Automates Kubernetes pod troubleshooting using kubectl, crictl, and the Kubernetes API. Runs diagnostic sequences for CrashLoopBackOff, ImagePullBackOff, OOMKilled, and Pending pod states.
Codex Runbooks & Diagnostics
Security Reviewed
Diagnoses PostgreSQL performance issues using pg_stat_statements, pg_stat_activity, and EXPLAIN ANALYZE. Integrates with pgBadger for log analysis and pg_stat_user_tables for index recommendations.
MCP Runbooks & Diagnostics
Security Reviewed
Runs diagnostic queries against PostgreSQL using the pg_stat_statements, pg_stat_activity, and pg_locks system views. Identifies slow queries and lock contention, and measures table bloat with the pgstattuple extension to find candidates for pg_repack.
Gemini Runbooks & Diagnostics
Security Reviewed
Executes structured Kubernetes rollback procedures using kubectl and the kubernetes/client-go library. Monitors rollout status via the apps/v1 Deployment API and triggers PagerDuty incidents through the PagerDuty Events API v2.
OpenClaw Runbooks & Diagnostics
Security Reviewed
Constructs incident timelines from PagerDuty Events API v2, Datadog Monitors API, and Slack message archives. Correlates alerts with deployment events for root cause analysis.
OpenClaw Runbooks & Diagnostics
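Timeline construction of this kind is essentially a merge of events from several sources ordered by time, followed by a lookup of deployments shortly before an alert. The sketch below assumes every source has already been normalized to dicts with hypothetical time (ISO 8601 with UTC offset), source, and summary keys; none of these names come from the PagerDuty, Datadog, or Slack APIs.

```python
from datetime import datetime, timedelta


def _parse(event):
    return datetime.fromisoformat(event["time"])


def build_timeline(*sources):
    """Merge events from all sources into one list sorted by timestamp."""
    events = [event for source in sources for event in source]
    return sorted(events, key=_parse)


def recent_deploys(timeline, alert, window_minutes=30):
    """Return deploy events that happened within the window before the alert."""
    window = timedelta(minutes=window_minutes)
    alert_time = _parse(alert)
    return [
        event for event in timeline
        if event["source"] == "deploy"
        and timedelta(0) <= alert_time - _parse(event) <= window
    ]


# Hypothetical, already-normalized events for illustration only.
alerts = [{"time": "2024-05-01T12:07:00+00:00", "source": "pagerduty",
           "summary": "High error rate on checkout"}]
deploys = [{"time": "2024-05-01T12:02:00+00:00", "source": "deploy",
            "summary": "checkout v42 rolled out"},
           {"time": "2024-05-01T09:00:00+00:00", "source": "deploy",
            "summary": "search v7 rolled out"}]
chat = [{"time": "2024-05-01T12:09:30+00:00", "source": "slack",
         "summary": "On-call acknowledges"}]

timeline = build_timeline(alerts, deploys, chat)
for event in timeline:
    print(event["time"], event["source"], event["summary"])
print(recent_deploys(timeline, alerts[0]))
```

A deployment landing shortly before an alert is a lead worth checking, not proof of cause.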
Security Reviewed
Inspects and diagnoses Terraform state files using terraform CLI commands and the Terraform Cloud API v2. Detects drift, orphaned resources, and dependency cycles in state data.
Gemini Runbooks & Diagnostics
Security Reviewed
Analyzes AWS CloudWatch Logs using the CloudWatch Logs API and Logs Insights query syntax. Identifies error patterns, calculates error rates, and generates metric filters from log data.
ChatGPT Agents Runbooks & Diagnostics
Security Reviewed
Diagnoses and recovers failed systemd services using journalctl, systemctl status, and the D-Bus org.freedesktop.systemd1 interface. Analyzes exit codes, dependency chains via systemctl list-dependencies, and resource limits from cgroup controllers.
ChatGPT Agents Runbooks & Diagnostics
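As one possible starting point for the exit-code analysis described above, the sketch below parses the KEY=VALUE output of systemctl show and summarizes why a unit failed. ActiveState, Result, and ExecMainStatus are real unit properties; the function names and the sample text are hypothetical.

```python
def parse_show_output(text):
    """Parse `systemctl show` output (one KEY=VALUE per line) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            props[key] = value
    return props


def diagnose(props):
    """Summarize the failure recorded in a unit's properties."""
    if props.get("ActiveState") != "failed":
        return "unit is not in failed state"
    result = props.get("Result", "unknown")
    status = props.get("ExecMainStatus", "?")
    return f"failed with Result={result}, main process exit status {status}"


# Hypothetical output of:
#   systemctl show example.service -p ActiveState,SubState,Result,ExecMainStatus
sample_output = """ActiveState=failed
SubState=failed
Result=exit-code
ExecMainStatus=1
"""
print(diagnose(parse_show_output(sample_output)))
```

The Result property separates plain non-zero exits (exit-code) from cases such as signal, timeout, or oom-kill, which call for different follow-up steps.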
Security Reviewed
Runs automated diagnostic sequences on Kubernetes pods using kubectl exec, kubectl logs, and the Kubernetes API /api/v1/pods endpoint. Captures OOMKilled events, analyzes CrashLoopBackOff restarts, and reads resource utilization from metrics-server.
Cursor Runbooks & Diagnostics