How-to guide | operations | 10–12 min | Intermediate

Writing Effective Runbooks for SRE Hand-off

Structure your runbooks for clarity, safety, and repeatability.

SRE | Eng leadership | Last updated 2025-11-26
Tags: runbooks, operations

Recommended runbook structure

  • Summary and when to use this runbook.
  • Quick checks / triage steps.
  • Safe remediation steps with copy-paste commands where appropriate.
  • Escalation instructions and business impact notes.

Minimal YAML example

Runbook YAML skeleton
name: K8s ingress 5xx spike
match: |
  labels.service == "api" && labels.env == "prod"
steps:
  - title: Check current error rate
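    # Counts all 5xx responses, not just lines containing "500"; assumes the default NGINX access-log format.
    command: kubectl -n ingress logs deploy/ingress-nginx --since=10m | grep -cE 'HTTP/[0-9.]+" 5[0-9]{2} '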
  - title: Verify upstream pods
    command: kubectl -n api get pods -o wide
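
The skeleton above covers only the triage steps. As a rough sketch of how the other sections in the recommended structure (summary, safe remediation, escalation) might sit alongside `steps`, the fragment below uses hypothetical field names: `summary`, `remediation`, `confirm`, `escalation`, `after`, `contact`, and `business_impact` are illustrative assumptions, not part of any confirmed runbook schema. The rolling restart is only an example of a copy-paste remediation command; whether it is the right action depends on your environment.

```yaml
# Hypothetical fields for illustration; adapt the names to your runbook tooling.
summary: >-
  Use when the prod API ingress shows a sustained spike in 5xx responses.
remediation:
  - title: Rolling restart of the ingress controller
    command: kubectl -n ingress rollout restart deploy/ingress-nginx
    confirm: true   # assumption: tooling asks the operator to acknowledge first
  - title: Watch the rollout complete
    command: kubectl -n ingress rollout status deploy/ingress-nginx --timeout=120s
escalation:
  after: 15m   # assumption: escalate if the error rate has not recovered by then
  contact: "<on-call rotation or channel>"       # placeholder, fill in per team
  business_impact: "<who is affected and how>"   # placeholder, fill in per service
```

Keeping remediation separate from triage makes it harder to run a state-changing command by accident while still investigating.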