⚡ Your Deployment Causes 30 Seconds of Downtime. What Went Wrong?

Fri, 20 Jun 2025 00:00:00 +0000

The question

“How do you achieve zero-downtime deployments in Kubernetes?”

The expected answer: rolling updates. That’s correct but incomplete. Rolling updates are the mechanism. They don’t give you zero downtime automatically — they give you a framework in which zero downtime is achievable, if you configure everything correctly.

Most clusters cause brief downtime on every deployment. Usually 5–30 seconds. Usually blamed on “the load balancer” or “DNS”. Almost always caused by one of four missing pieces.

The rolling update baseline

Kubernetes replaces pods in waves. You control the pace:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # how many extra pods can exist during update
      maxUnavailable: 0  # how many pods can be unavailable during update

maxUnavailable: 0 means Kubernetes never terminates a pod until a replacement is ready. This prevents the obvious failure mode where you have zero running pods mid-deployment.

maxSurge: 1 means one extra pod beyond the desired count runs during the update. For a deployment with 3 replicas, you’ll briefly have 4 pods running.
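
For comparison, here is a sketch of what a Deployment gets when the rollingUpdate block is left out. Kubernetes defaults both fields to 25%, with maxSurge rounded up and maxUnavailable rounded down. The replica counts in the comments are only illustrative arithmetic.

```yaml
# Defaults written out explicitly (what you get if you omit rollingUpdate)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # rounded up:   3 replicas -> 1, 8 replicas -> 2
      maxUnavailable: 25%  # rounded down: 3 replicas -> 0, 8 replicas -> 2
```

So with the defaults, a 3-replica Deployment happens to keep every pod up, while an 8-replica one may have two pods unavailable at a time. Setting maxUnavailable: 0 explicitly, as in the snippet above, removes that dependence on the replica count.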

This alone doesn’t prevent downtime.

Piece 1: The readiness probe (the most common missing piece)

Kubernetes considers a pod “ready” when all its containers pass their readiness probes. If you don’t define a readiness probe, Kubernetes considers the pod ready as soon as the container starts. Containers start before applications are ready to serve traffic.

# Without this, traffic arrives before your app is listening
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3

What happens without it: Kubernetes starts the new pod, marks it ready immediately, adds it to the Service endpoints, routes traffic to it — while your app is still initialising (loading config, connecting to the database, warming caches). The first few requests to the new pod fail or time out.

The fix: define a readiness probe that actually checks application readiness. An HTTP endpoint that returns 200 only after the app has finished starting is the minimum. A deeper check that verifies the database connection catches more, with one caveat: if the database goes down, every replica fails readiness at the same time.

Common mistake: using the same endpoint for liveness and readiness with the same thresholds. They serve different purposes:

Readiness: “am I ready to accept traffic?” — controls whether traffic is sent
Liveness: “am I still alive?” — controls whether the pod is restarted

A pod can fail its readiness probe (temporarily overloaded, warming up) without failing its liveness probe. If you make liveness too aggressive, Kubernetes restarts pods that would have recovered on their own.
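
One way to keep the two concerns apart is to give each probe its own endpoint and its own tolerance. The sketch below assumes the app exposes /readyz and /livez; those paths and all the timing values are illustrative, not something Kubernetes prescribes.

```yaml
# Readiness: strict and fast, so an overloaded pod drops out of rotation quickly
readinessProbe:
  httpGet:
    path: /readyz   # hypothetical: returns 200 only when the app can serve traffic
    port: 8080
  periodSeconds: 5
  failureThreshold: 3   # unready after roughly 15s of consecutive failures

# Liveness: lenient, so only a truly stuck process gets restarted
livenessProbe:
  httpGet:
    path: /livez    # hypothetical: returns 200 whenever the process can respond at all
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 5   # restarted after roughly 50s of consecutive failures
```

The gap between the two thresholds is the point: a struggling pod is taken out of rotation long before Kubernetes considers restarting it.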

Piece 2: The termination grace period (the other common missing piece)

When Kubernetes wants to terminate a pod, it sends SIGTERM. Your application has terminationGracePeriodSeconds (default: 30) to finish in-flight requests and shut down cleanly. After that, Kubernetes sends SIGKILL.

The problem: there’s a race condition. Kubernetes removes the pod from the Service endpoints and sends SIGTERM roughly simultaneously. The endpoint update has to propagate through the control plane, kube-proxy, and the load balancer. During that propagation window — typically 1–10 seconds — traffic can still arrive at a pod that has already started shutting down.

The fix is a preStop hook that sleeps briefly before the container receives SIGTERM:

lifecycle:
  preStop:
    exec:
      command: ["sleep", "5"]

This gives the endpoint removal time to propagate before your app receives SIGTERM. The total shutdown sequence is then:

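1. Kubernetes marks the pod as terminating and starts removing it from the Service endpoints
2. preStop hook runs (sleep 5s, enough for the endpoint removal to propagate)
3. SIGTERM is sent
4. App drains in-flight requests and shuts down
5. If still running once terminationGracePeriodSeconds (counted from step 1) has elapsed: SIGKILL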

Set terminationGracePeriodSeconds to cover the sleep plus your app’s actual shutdown time:

spec:
  terminationGracePeriodSeconds: 60  # 5s preStop + up to 55s for app shutdown

Without the sleep: requests fail during the propagation window. With it: the window is covered.
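
The exec form needs a sleep binary inside the image, which minimal or distroless images may not ship. Newer Kubernetes releases offer a built-in sleep action for lifecycle hooks. The sketch below assumes your cluster version supports it, so check before relying on it.

```yaml
lifecycle:
  preStop:
    sleep:
      seconds: 5   # same delay as the exec version, handled by the kubelet
```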

Piece 3: PodDisruptionBudgets (for node maintenance)

Rolling updates handle normal deployments. Node drains (kubectl drain, cloud provider maintenance windows, k3s upgrades) are a different code path that bypasses your rolling update strategy entirely.

When a node is drained, Kubernetes evicts all pods on it as fast as it can. Without constraints, it will evict all replicas of your deployment simultaneously if they all happen to land on the same node.

A PodDisruptionBudget sets a floor:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: myapp
spec:
  minAvailable: 1   # at least 1 replica must stay up during disruption
  selector:
    matchLabels:
      app: myapp

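Now a node drain goes through the eviction API, which refuses any eviction that would leave fewer than minAvailable healthy pods. With minAvailable: 1 and three replicas on the draining node, two can be evicted right away; the third waits until a replacement is ready elsewhere. If no replacement can be scheduled (e.g., you’re draining the only node), the drain will block rather than cause downtime.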

minAvailable: 1 is the minimum. For production with 3+ replicas, minAvailable: 2 or maxUnavailable: 1 is more appropriate.
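
As a sketch of the maxUnavailable variant, reusing the same hypothetical names as the PDB above. A budget takes either minAvailable or maxUnavailable, not both.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: myapp
spec:
  maxUnavailable: 1   # at most 1 replica may be down due to a voluntary disruption
  selector:
    matchLabels:
      app: myapp
```

Expressed this way, the budget scales with the Deployment: a fixed minAvailable has to be revisited whenever the replica count changes.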

Piece 4: minReadySeconds (the one everyone forgets)

Even with a correct readiness probe, there’s a subtle risk: a pod that passes its readiness probe briefly and then fails due to a transient startup issue (flapping). Kubernetes would add it to the endpoint pool, route traffic to it, watch it fail the readiness probe, remove it — and during that window, some requests fail.

minReadySeconds says: a pod must pass its readiness probe continuously for this many seconds before Kubernetes considers it “available” and allows the next pod in the rolling update to be terminated:

spec:
  minReadySeconds: 10

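This slows deployments slightly, but it stops the rollout from terminating healthy old pods on the strength of a new pod whose readiness is flapping. Note that minReadySeconds does not delay traffic: the Service routes to a pod as soon as it is ready. What it guarantees is that the old pods stay in rotation until the new one has proven stable.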

The complete deployment snippet

Putting it together:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 3
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: myapp
          image: myapp:latest  # placeholder; pin a version tag or digest so a rollback restores the old image
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "5"]
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
            failureThreshold: 5

And the PDB alongside it:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: myapp
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp

What interviewers are actually testing

The follow-up is usually: “What if your new version has a bug that isn’t caught immediately — how do you roll back?”

kubectl rollout undo deployment/myapp reverts to the previous ReplicaSet. Kubernetes keeps old ReplicaSets around for exactly this purpose (revisionHistoryLimit, default 10). The rollback uses the same rolling update mechanism, so it’s also zero-downtime.
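
The history depth is a field on the Deployment itself. A minimal sketch, showing the default value written out:

```yaml
spec:
  revisionHistoryLimit: 10   # old ReplicaSets kept for rollback; 0 means no rollback is possible
```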

The harder follow-up: “What if the bug only shows up after 10 minutes of load?” That’s where you need a canary deployment: send a small percentage of traffic to the new version, observe, then shift the rest. Argo Rollouts handles this natively. Without it, you’re doing it manually with two Deployments behind one Service, where the traffic split simply follows the replica ratio; precise weights need an ingress controller or service mesh that supports them.
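
A rough sketch of a canary with Argo Rollouts, assuming its controller and CRDs are installed in the cluster. The weights, pause durations, and image tag are illustrative, and the pod template is meant to match the one in the complete Deployment snippet.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:1.2.3   # hypothetical tag
          # probes and preStop hook as in the Deployment
  strategy:
    canary:
      steps:
        - setWeight: 10            # send roughly 10% to the new version
        - pause: {duration: 15m}   # observe for longer than the bug takes to appear
        - setWeight: 50
        - pause: {duration: 15m}
        # after the last step the rollout promotes to 100%
```

Without a traffic router configured, Argo Rollouts approximates the weight by scaling replica counts, so the split is only as fine-grained as the number of replicas allows.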

This is part of a series on Kubernetes interview questions. Previously: secrets in a GitOps repo. Next: network isolation between services.
