The silence
My house runs on quiet little robots. A tracker watches my kombucha ferment. A job narrates kids’ books in Hungarian. A media stack pulls and files things. Home Assistant minds the sensors. A dozen services, all doing their jobs, all completely mute. When a batch finished or an import failed, I found out the same way every time: by going to look.
Then the silence got expensive. Claude Code stopped dead in the middle of a task because I’d burned through my plan’s usage window — no warning, no countdown, just a wall. The information existed; a dashboard in my own cluster was already polling it. It just had no way to reach my pocket.
So I built one thing: a push bus. One place anything in the cluster can POST to, that actually buzzes my phone. And the first job I gave it was to warn me before my AI assistant goes dark.
The boring part (said honestly)
The bus is ntfy — a self-hosted pub/sub notifier. Picking it took about five minutes, because self-hosting ntfy for a homelab is a thoroughly solved problem. There are at least three off-the-shelf bridges from Prometheus Alertmanager to ntfy. I’m not going to pretend the bus is the clever bit.
What I did do deliberately:
- 📦 Deployed it GitOps-native: one entry in my app-of-apps, reconciled by Argo CD, no docker run anywhere.
- 🔒 Locked it to deny-all auth with bearer tokens. Security alerts ride this bus; a world-readable topic on a public URL was a non-starter. (Which also means it sits outside my usual OAuth gate: the phone app can’t do an interactive login flow, so ntfy does its own token auth.)
- 🏷️ Topics by severity: hl-crit, hl-warn, hl-info, hl-event. Subscribe and mute by how much I care.
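For a sense of what a producer looks like against that setup, here is a minimal sketch, not my actual producer. It builds an ntfy publish request with a bearer token and a priority chosen by topic. The function name, the TOPIC_PRIORITY table, and the priorities for hl-info and hl-event are assumptions made for illustration; Authorization, Priority and Title are standard ntfy publish headers.

```python
import urllib.request

# Severity topic -> ntfy priority. The topic names are the ones listed above;
# the priorities chosen for hl-info and hl-event are assumptions.
TOPIC_PRIORITY = {
    "hl-crit": "urgent",
    "hl-warn": "high",
    "hl-info": "default",
    "hl-event": "low",
}


def build_publish_request(base_url, topic, message, token, title=None):
    """Build (but do not send) an ntfy publish request for one severity topic."""
    if topic not in TOPIC_PRIORITY:
        raise ValueError(f"unknown topic: {topic}")
    headers = {
        # Deny-all auth means every publish has to carry a token.
        "Authorization": f"Bearer {token}",
        "Priority": TOPIC_PRIORITY[topic],
    }
    if title:
        headers["Title"] = title
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Sending is one more call (hypothetical token and host):
#   req = build_publish_request("http://ntfy.example.internal", "hl-info",
#                               "batch finished", "tk_example")
#   urllib.request.urlopen(req, timeout=10)
```

Keeping the build step separate from the send step makes the producer easy to check without a live server.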
Then the interesting parts showed up at the edges, where they always do.
Edge one: my own firewall 403’d me
First test, the usage producer POSTing to https://ntfy.hippotion.com:
HTTP 403 Forbidden
error code: 1010
That 1010 looks like ntfy rejecting my token. It isn’t. It’s Cloudflare.
Error 1010 means “your browser signature is banned” — Cloudflare’s bot protection
took one look at a Python script’s urllib User-Agent and slammed the door.
My own producer couldn’t reach my own bus, because the request left the cluster, went all the way out to my own edge, and got flagged as a bot on the way back in.
The fix is the architecture I should’ve had from the start: in-cluster producers POST to the internal service address and never touch the public internet at all.
# wrong: out to Cloudflare and back, gets bot-blocked
https://ntfy.hippotion.com/hl-warn
# right: stays inside the cluster
http://ntfy.web-ntfy.svc.cluster.local/hl-warn
The phone still uses the public URL happily — the real ntfy app carries a signature Cloudflare trusts. Only scripts trip 1010. Lesson: your own edge is not your friend when you’re a script. Keep cluster traffic in the cluster.
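One way to make that rule hard to get wrong is to let the producer pick its own endpoint. This is a rough sketch under one assumption: that the script can tell it is running in a pod by checking KUBERNETES_SERVICE_HOST, which Kubernetes normally sets in a pod's environment. The function name is hypothetical; the two URLs are the ones shown above.

```python
import os

PUBLIC_URL = "https://ntfy.hippotion.com"
INTERNAL_URL = "http://ntfy.web-ntfy.svc.cluster.local"


def ntfy_base_url(env=None):
    """Return the in-cluster address when running inside a pod, else the public URL."""
    env = os.environ if env is None else env
    # Assumption: the presence of KUBERNETES_SERVICE_HOST means "I am in a pod".
    if env.get("KUBERNETES_SERVICE_HOST"):
        return INTERNAL_URL
    return PUBLIC_URL
```

The fallback only helps clients that Cloudflare lets through. A plain script running outside the cluster would still hit the same 1010 on the public URL.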
Edge two: the obvious data source was lying
To warn me about Claude usage, the naïve move is to parse Claude Code’s local
logs — they sit right there in ~/.claude/projects/.../*.jsonl, token counts and
all.
Don’t. Those counts are unreliable for accounting — known to undercount, wildly, in some cases by ~100x. Every tool that parses that JSONL inherits the bug.
The number that’s actually true lives in the claude.ai usage API — the same
five_hour and seven_day windows your plan enforces against. And I already had
a service polling exactly that. So the producer is just a tiny sidecar on that
existing pod, reading its /api/usage over localhost (same pod — no network
policy to negotiate, no second credential, nothing else hammering claude.ai):
- 📈 ≥80% of a window → hl-warn (high).
- 🚨 ≥95% → hl-crit (urgent).
- 🔁 One ping per window per reset cycle, escalating warn→crit, keyed on the reset timestamp so it never spams.
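Those three rules fit in a few lines. What follows is a sketch of the logic rather than the sidecar itself: the function names, the in-memory sent dictionary, and the idea that each window reports a percentage plus a reset timestamp are assumptions, since the shape of the /api/usage response isn't shown here.

```python
WARN_AT = 80.0  # percent of a window
CRIT_AT = 95.0

# Higher rank wins when deciding whether an alert is an escalation.
_RANK = {"hl-warn": 1, "hl-crit": 2}


def topic_for(percent):
    """Map a usage percentage to a severity topic, or None below the warn line."""
    if percent >= CRIT_AT:
        return "hl-crit"
    if percent >= WARN_AT:
        return "hl-warn"
    return None


def next_alert(window, percent, resets_at, sent):
    """Return the topic to publish to, or None if nothing new should be sent.

    `sent` maps (window, resets_at) to the highest topic already sent in that
    reset cycle. A new reset timestamp is a new key, so alerts re-arm by themselves.
    """
    topic = topic_for(percent)
    if topic is None:
        return None
    key = (window, resets_at)
    already = sent.get(key)
    if already is not None and _RANK[already] >= _RANK[topic]:
        return None  # same or lower severity already sent this cycle
    sent[key] = topic
    return topic
```

Called on every poll with the same sent dictionary, this pings once at 80%, once more at 95%, and then stays quiet until the window resets. A long-running process would also want to drop keys for cycles that have already passed.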
The first time it mattered, my phone buzzed at 80% with hours of runway left instead of a brick wall mid-task.
What I’d tell past me
Three things, none of them about ntfy:
- Reuse the signal you already have. I didn’t build a usage poller — I bolted a sidecar onto the one already running. The smallest producer is one that reads localhost.
- Your own edge can betray you. A firewall that protects you from bots will happily block your own automation. In-cluster talks in-cluster.
- Check whether your data source is telling the truth before you build an alert on it. An alert you don’t trust is worse than no alert — you’ll learn to ignore it, and then it’ll be right once.
Next, the high-leverage move: point Prometheus Alertmanager at the same bus, and every infra alert I have — plus every one I’ll ever add — lands on the phone through one bridge. The kombucha ping can wait. The disk-full one can’t.
The house is still full of quiet robots. The difference is now they know my number.
