Kubernetes Incident Runbook

Executes structured incident response procedures for Kubernetes clusters using kubectl, kube-state-metrics, and the Kubernetes Events API. Automates pod crash diagnosis, OOMKill analysis, and node pressure triage.

Runbooks & Diagnostics · Claude Code · Security Reviewed

Tool match: kubernetes · ⭐ 121.7k GitHub stars · Apache-2.0 license

INSTALL WITH ANY AGENT

npx skills add agentskillexchange/skills --skill kubernetes-incident-runbook

Works best when you want a reusable capability, not another fragile one-off prompt.

View source

At a glance

Last updated

Mar 20, 2026

Quick brief

The Kubernetes Incident Runbook skill provides automated incident response procedures for Kubernetes cluster issues. It uses kubectl commands, the Kubernetes API, and kube-state-metrics to systematically diagnose common failure modes including CrashLoopBackOff, OOMKilled, ImagePullBackOff, and node NotReady conditions.

How it works

What this skill actually does

When triggered, the skill follows a structured diagnostic tree. For pod failures, it inspects container exit codes, retrieves previous container logs via kubectl logs --previous, checks resource requests/limits against actual usage from metrics-server, and examines events for scheduling or volume mount failures.
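The pod branch of that diagnostic tree might look roughly like the sketch below. It is not the skill's actual implementation: it assumes the pod has already been fetched as JSON (for example with kubectl get pod NAME -o json) and parsed into a dict, and the function names classify_container and diagnose_pod are hypothetical. The fields it reads (status.containerStatuses, state.waiting.reason, lastState.terminated) come from the Kubernetes Pod API.

```python
from typing import Any

# By shell convention, exit codes above 128 mean "killed by signal (code - 128)".
SIGNAL_BASE = 128


def classify_container(status: dict[str, Any]) -> str:
    """Return a short diagnosis for one entry of status.containerStatuses."""
    name = status.get("name", "<unknown>")
    # "state" holds exactly one of waiting/running/terminated; the others are absent.
    waiting = status.get("state", {}).get("waiting") or {}
    last = status.get("lastState", {}).get("terminated") or {}
    reason = waiting.get("reason", "")

    if reason in ("ImagePullBackOff", "ErrImagePull"):
        return f"{name}: image pull failure ({reason}); check image name, tag, and pull secrets"
    if last.get("reason") == "OOMKilled":
        return f"{name}: OOMKilled; compare the memory limit with observed usage"
    if reason == "CrashLoopBackOff":
        code = last.get("exitCode")
        if isinstance(code, int) and code > SIGNAL_BASE:
            return f"{name}: crash loop, killed by signal {code - SIGNAL_BASE}; read previous logs"
        return f"{name}: crash loop, exit code {code}; read previous logs"
    return f"{name}: no known failure pattern"


def diagnose_pod(pod: dict[str, Any]) -> list[str]:
    """Classify every container reported in the pod's status."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return [classify_container(s) for s in statuses]


if __name__ == "__main__":
    # Hand-written sample shaped like `kubectl get pod -o json` output.
    sample = {
        "status": {
            "containerStatuses": [
                {
                    "name": "api",
                    "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                    "lastState": {"terminated": {"exitCode": 137, "reason": "OOMKilled"}},
                }
            ]
        }
    }
    for line in diagnose_pod(sample):
        print(line)
```

The OOMKilled check runs before the generic crash-loop branch because an OOM-killed container usually also shows CrashLoopBackOff, and the more specific cause is the more useful one to report.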

For node-level issues, it analyzes node conditions (MemoryPressure, DiskPressure, PIDPressure), checks kubelet logs, inspects systemd service status, and correlates with cloud provider instance health. The skill understands taints, tolerations, and affinity rules that may cause scheduling failures.
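As an illustration of the node-condition step only, here is a minimal sketch that assumes a node object already parsed from kubectl get node NAME -o json. The helper name node_problems is hypothetical; the condition types and the "True"/"False"/"Unknown" status strings are the ones the Kubernetes Node API reports. Kubelet logs, systemd status, and cloud instance health are outside what this sketch covers.

```python
from typing import Any

# Pressure conditions are healthy when "False"; Ready is healthy when "True".
PRESSURE_TYPES = ("MemoryPressure", "DiskPressure", "PIDPressure")


def node_problems(node: dict[str, Any]) -> list[str]:
    """List unhealthy conditions found in node.status.conditions."""
    conditions = {
        c["type"]: c for c in node.get("status", {}).get("conditions", [])
    }
    problems: list[str] = []

    ready = conditions.get("Ready")
    if ready is None:
        problems.append("NotReady (Ready condition missing)")
    elif ready.get("status") != "True":
        # "Unknown" usually means the control plane lost contact with the kubelet.
        problems.append(f"NotReady (Ready={ready.get('status', 'Unknown')})")

    for cond_type in PRESSURE_TYPES:
        cond = conditions.get(cond_type)
        if cond and cond.get("status") == "True":
            problems.append(f"{cond_type}: {cond.get('message', 'no message')}")
    return problems


if __name__ == "__main__":
    # Hand-written sample; real nodes report more conditions and fields.
    sample = {
        "status": {
            "conditions": [
                {"type": "Ready", "status": "False"},
                {"type": "MemoryPressure", "status": "True",
                 "message": "kubelet has insufficient memory available"},
                {"type": "DiskPressure", "status": "False"},
            ]
        }
    }
    for problem in node_problems(sample):
        print(problem)
```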

Advanced capabilities include tracing network connectivity issues using CoreDNS logs and NetworkPolicy analysis, diagnosing PersistentVolumeClaim binding failures across storage classes, and identifying resource quota exhaustion across namespaces. All findings are compiled into structured incident reports with remediation steps.
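The quota-exhaustion check could be sketched as follows, assuming a ResourceQuota object parsed from kubectl get resourcequota NAME -o json. The names parse_quantity and exhausted_quotas, and the 90% threshold, are illustrative assumptions rather than anything the skill documents. The quantity parser is deliberately simplified: it handles plain numbers, the milli suffix, and the common decimal and binary suffixes up to T/Ti, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal
from typing import Any

# Simplified subset of Kubernetes quantity suffixes (no P/E/Pi/Ei, no n/u).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '500m' or '4Gi' to a base-unit Decimal."""
    # Try two-letter suffixes first so 'Mi' is not mistaken for a shorter suffix.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def exhausted_quotas(quota: dict[str, Any], threshold: str = "0.9") -> list[str]:
    """Report resources whose used/hard ratio meets or exceeds the threshold."""
    status = quota.get("status", {})
    hard, used = status.get("hard", {}), status.get("used", {})
    findings: list[str] = []
    for resource, limit_text in sorted(hard.items()):
        limit = parse_quantity(limit_text)
        if limit == 0:
            continue  # a zero quota cannot be expressed as a ratio
        used_text = used.get(resource, "0")
        if parse_quantity(used_text) / limit >= Decimal(threshold):
            findings.append(f"{resource}: {used_text} of {limit_text} used")
    return findings


if __name__ == "__main__":
    # Hand-written sample shaped like a ResourceQuota's status block.
    sample = {
        "status": {
            "hard": {"pods": "10", "requests.cpu": "2", "requests.memory": "4Gi"},
            "used": {"pods": "10", "requests.cpu": "1900m", "requests.memory": "3Gi"},
        }
    }
    for finding in exhausted_quotas(sample):
        print(finding)
```

Decimal is used instead of float so that ratios such as 1900m against 2 compare exactly at the threshold boundary.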

Best fit

When to reach for it

Best when the job is runbook-style diagnosis of a Kubernetes cluster: pod crashes, OOMKills, image pull failures, or node pressure.
Works naturally with Claude Code setups.

Trust & provenance

Why this listing is credible

Built around the kubernetes toolchain.
Trust status: Security Reviewed.
121.7k GitHub stars on the linked upstream source.
License: Apache-2.0.
Last updated Mar 20, 2026.
