
2026-01

The Last Service Account Key

$ git log --all --oneline -- '**/service-account.json' | wc -l
47

$ git log --all --oneline -- '**/service-account.json' | head -1
a3f8c2e delete: remove production service account key

That commit sits in your history like a monument. Not because of what it added, but because of what it finally took away. Forty-seven commits that existed only to move secrets around, rotate them, revoke them, apologize for them, and eventually eliminate them.

That last deletion was the sound of the door closing on an entire class of infrastructure vulnerability.
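
On GKE, the usual way to make that deletion stick is Workload Identity: pods get short-lived Google credentials from the metadata server, so there is no exported JSON key left to commit. A rough sketch of the switch, where every name (project acme-prod, cluster prod, namespace payments, Kubernetes service account api, Google service account api-runtime, KEY_ID) is a hypothetical placeholder, not something from this post:

# Enable the workload identity pool on the cluster.
$ gcloud container clusters update prod \
    --workload-pool=acme-prod.svc.id.goog

# Let the Kubernetes service account impersonate the Google service account.
$ gcloud iam service-accounts add-iam-policy-binding \
    api-runtime@acme-prod.iam.gserviceaccount.com \
    --role=roles/iam.workloadIdentityUser \
    --member="serviceAccount:acme-prod.svc.id.goog[payments/api]"

# Point the Kubernetes service account at its Google counterpart.
$ kubectl annotate serviceaccount api --namespace payments \
    iam.gke.io/gcp-service-account=api-runtime@acme-prod.iam.gserviceaccount.com

# Only then does deleting the JSON file mean anything: revoke the exported key,
# so the copies still sitting in old commits stop working too.
$ gcloud iam service-accounts keys list \
    --iam-account=api-runtime@acme-prod.iam.gserviceaccount.com
$ gcloud iam service-accounts keys delete KEY_ID \
    --iam-account=api-runtime@acme-prod.iam.gserviceaccount.com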

The Architecture That Couldn't Be Breached

Container escape achieved. Attacker privilege: still none. Why?

The breach happened. The forensics confirmed it. Shellcode executed inside the container. Root user, full access to the container's filesystem, a live network interface. The container itself was completely compromised. Everything the attacker seemed to need to pivot was there.

None of it worked.

The escaped process had no network path to other services. Secrets were never mounted into the pod, so there was no credential to steal. The host firewall blocked outbound connections. The network policy denied access to the control plane. RBAC granted the pod's service account no permissions at all.

The container was compromised. The architecture was not.

This is what defense in depth looks like when it actually works.
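
None of those layers is magic, and each one can be verified long before an incident. A minimal sketch of the kind of checks that confirm them, assuming a hypothetical namespace prod and service account api:

# Does the pod's service account actually hold zero RBAC permissions?
$ kubectl auth can-i --list --as=system:serviceaccount:prod:api

# Is a token even mounted for an attacker to steal?
$ kubectl get pods -n prod \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.automountServiceAccountToken}{"\n"}{end}'

# Is a default-deny NetworkPolicy standing between this namespace and everything else?
$ kubectl get networkpolicy -n prod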

The GKE Cluster That Nobody Could Break

Day 1 of the pentest. The security firm arrives with methodology, tools, and confidence. The plan is simple: find gaps in the Kubernetes cluster, prove impact, deliver a detailed report of findings.

Day 2. They're quiet. Too quiet.

Day 3. Meeting request. Not the kind where they show you their findings.

"We found nothing. Well, nothing critical. Actually, we found nothing at all. This is the best-hardened cluster we've tested. Want to know what you did right?"

That's not how pentest reports usually end.
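
Part of why the report came back empty is that the cheap probes went nowhere. A sketch of the sort of first-day checks a tester runs (not the firm's actual methodology, and the account names are placeholders):

# What can an unauthenticated caller do?
$ kubectl auth can-i --list --as=system:anonymous

# What about the default service account every pod gets unless you say otherwise?
$ kubectl auth can-i create pods --as=system:serviceaccount:default:default

# From inside a compromised pod: is the API server reachable without credentials?
$ curl -sk https://kubernetes.default.svc/api --max-time 5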

The 3am Incident That Followed The Playbook

3:17am. The pager vibrates on the nightstand. A half-asleep hand fumbles for the phone. The message is three lines. Pod restart storm. API latency spiking. Customers seeing timeouts.

The engineer's first thought isn't "oh god, what now." It's automatic: "open the playbook."

Muscle memory takes over. Hands pull up a laptop still warm from yesterday. The playbook is right there: decision tree, diagnostic steps, escalation paths. No thinking required. Just follow the checklist.
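
For a restart storm, the first page of a playbook like that tends to be a handful of read-only commands in a fixed order, each with an expected answer. A sketch of what those steps could look like (not this team's actual runbook; the payments namespace and api deployment are placeholder names):

# Which pods are crash-looping, and how hard?
$ kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount' | tail -20

# What changed or failed most recently?
$ kubectl get events -A --sort-by='.lastTimestamp' | tail -30

# Is it resource pressure? (needs metrics-server)
$ kubectl top pods -n payments --sort-by=memory

# If the storm started right after a deploy, the decision tree ends here:
$ kubectl rollout undo deployment/api -n payments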

Twenty-three minutes later, the incident is closed. Every step documented. The postmortem writes itself.

This is what happens when you stop improvising and start codifying your response.