<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Security on hippotion</title><link>https://blog.hippotion.com/tags/security/</link><description>Recent content in Security on hippotion</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 12 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.hippotion.com/tags/security/index.xml" rel="self" type="application/rss+xml"/><item><title>Two Birds That Read the Web for Me: One Hoards, One Scatters</title><link>https://blog.hippotion.com/posts/two-birds-read-the-web/</link><pubDate>Fri, 12 Jun 2026 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/two-birds-read-the-web/</guid><description>I gave my second brain two agents that read the outside world and collide it against my notes. A Magpie watches my GitHub stars and only speaks when something hits live work. A Blue Jay reads a handful of RSS feeds and surfaces the distant, not-yet-relevant connection. They share a security spine — and they have deliberately opposite jobs. Here&amp;rsquo;s why the split is the whole design.</description><content:encoded><![CDATA[<p>I have a <a href="/posts/a-second-brain-you-can-git-clone/">vault of markdown notes</a> that
I treat as a second brain, and I <a href="/posts/gitops-for-my-brain/">run GitOps over it</a>
like it&rsquo;s production infrastructure. It already has agents that work on it from
the <em>inside</em>: a <a href="/posts/an-ai-gardener-for-your-second-brain/">nightly gardener</a>
that weeds orphans and suggests links, and a Wanderer that collides random pairs
of my own notes looking for connections I missed.</p>
<p>The obvious next move is to point an agent at the <em>outside</em> — let it read the web
and tell me what matters. That move is also a small landmine, and most &ldquo;AI reads
the internet for you&rdquo; tooling steps right on it. So this week I built two of them
instead of one, named them after corvids, and the reason there are two is the
entire point of this post.</p>
<p>Meet the <strong>Magpie</strong> and the <strong>Blue Jay</strong>.</p>
<h2 id="the-same-fear-twice">The same fear, twice</h2>
<p>Before either bird got a name, both inherited a single non-negotiable rule, and
it&rsquo;s worth saying plainly because it&rsquo;s the part everyone skips:</p>
<blockquote>
<p>An agent that reads the internet and writes to your notes is a prompt-injection
pipeline aimed straight at your trust root.</p>
</blockquote>
<p>My vault isn&rsquo;t just storage. Every <em>other</em> agent — the gardener, the Wanderer,
the search that answers &ldquo;what am I building?&rdquo; — reads it as <strong>trusted context</strong>.
So the moment one agent ingests a GitHub README or a news headline
(attacker-influenceable text) and is allowed to write a note, a stranger on the internet
gets to whisper instructions into the thing my whole system believes. &ldquo;Structured
API&rdquo; narrows that surface. It does not close it.</p>
<p>Both birds are built on the same chassis as the gardener, and that chassis
<em>enforces</em> the fear rather than trusting the model to behave:</p>
<ul>
<li><strong>Two phases, hard split.</strong> A wrapper-owned <strong>FETCH</strong> step pulls the external
text in plain Bash — Claude is not in the loop, can&rsquo;t be talked into anything,
because it isn&rsquo;t running yet. Then a <strong>COLLIDE</strong> step starts <code>claude -p</code> with
the fetched text handed in as inline <em>data</em>, and that process gets only
<code>Read</code> / <code>Glob</code> / <code>Grep</code> / <code>Write</code>. <strong>No Bash, no git, no network, no MCP.</strong>
While untrusted text is in the context window, the agent has no tool that can
reach the outside world or rewrite history.</li>
<li><strong>Allowlist, not the open web.</strong> Each bird reads a short, named list of
sources. Nothing else.</li>
<li><strong>Quarantine, not the vault.</strong> Findings land in <code>quarantine/&lt;bird&gt;/</code>, which
lives <em>outside</em> <code>vault/</code>. The indexer never sees it. Nothing it writes is ever
auto-wikilinked into the graph. Promotion to a real note is a thing <strong>I</strong> do,
by hand, after reading it.</li>
<li><strong>Blast radius is checked, not assumed.</strong> A run may modify <em>only</em> its quarantine
directory. Anything written anywhere else is discarded and reported as a
<code>violation</code>.</li>
<li><strong>&ldquo;Nothing found&rdquo; is a successful run.</strong> Neither bird has a quota. This is the
honesty contract I stole from the Wanderer — an agent under pressure to produce
N findings will manufacture N findings, and manufactured insight is worse than
silence.</li>
</ul>
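<p>As a rough illustration of the blast-radius check, here is a minimal sketch, not the actual wrapper code. The function name <code>find_violations</code> and the example paths are assumptions. It takes the list of paths a run touched (for instance from a diff the wrapper takes after the run) and returns everything that falls outside the bird&rsquo;s quarantine directory. It compares normalized path strings only, so a real check would also need to resolve symlinks.</p>

```python
import posixpath


def find_violations(changed_paths, quarantine_dir):
    """Return every changed path that falls outside quarantine_dir.

    Paths are normalized first, so "quarantine/magpie/../../vault/x.md"
    is correctly treated as a write into vault/.
    """
    root = posixpath.normpath(quarantine_dir)
    violations = []
    for path in changed_paths:
        norm = posixpath.normpath(path)
        # Require an exact match or a true sub-path; "quarantine/magpie-evil"
        # must not pass just because it shares a string prefix.
        inside = norm == root or norm.startswith(root + "/")
        if not inside:
            violations.append(path)
    return violations
```

<p>With these assumptions, <code>find_violations(["quarantine/magpie/a.md", "vault/x.md"], "quarantine/magpie")</code> returns <code>["vault/x.md"]</code>, which is what a wrapper would discard and report.</p>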
<p>That&rsquo;s the shared spine. Now the interesting part: given the <em>same</em> security
model, the two birds do almost opposite things, and trying to make one bird do
both jobs would have quietly ruined it.</p>
<h2 id="the-magpie-hoards-whats-already-shiny">The Magpie hoards what&rsquo;s already shiny</h2>
<p>A magpie collects shiny objects and keeps them close. Mine watches <strong>my own
GitHub stars</strong>.</p>
<p>The premise is <em>slow public signal × private context</em>. I starred some repo three
weeks ago, forgot about it, and moved on. Meanwhile my projects shifted. The
Magpie runs weekly, pulls my starred repos through one allowlisted endpoint
(<code>gh api user/starred</code>), and collides each one against what I&rsquo;m <strong>actively
building right now</strong> — the live projects, the open hubs.</p>
<p>Its output contract is a tight one: it is a <strong>relevance filter</strong>. It fires <em>only</em>
when a star actually touches live work, and every finding has to name three
concrete things — the repo, the project it connects to, and one &ldquo;so what.&rdquo; A
vague &ldquo;these are thematically related&rdquo; doesn&rsquo;t count as a hit. It&rsquo;s a watchdog on
the dials, not a newsletter.</p>
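<p>One way to make that three-part contract mechanical is to reject any finding with a blank field before it is written out. This is only a sketch under my own naming: the <code>Finding</code> class and its <code>is_complete</code> method are hypothetical, not part of the Magpie as described.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One Magpie hit: the repo, the live project it touches, and the so-what."""
    repo: str
    project: str
    so_what: str

    def is_complete(self):
        # A hit only counts when all three parts are actually filled in.
        return all(part.strip() for part in (self.repo, self.project, self.so_what))
```

<p>A finding with an empty <code>project</code> or <code>so_what</code> fails the check, which matches the rule that a vague thematic link is not a hit.</p>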
<p>The supervised proof run, over 28 stars, surfaced exactly two real hits and
refused to invent a third:</p>
<ol>
<li><strong><code>supertonic</code> (on-device multilingual TTS)</strong> × my
<a href="/posts/clone-your-voice-hungarian-audiobooks/">Hungarian-audiobook voice-cloning project</a>
— a possible escape from a TTS fight I&rsquo;d been losing. I checked: it genuinely
supports Hungarian. That&rsquo;s a hit with a so-what.</li>
<li><strong><code>agentmemory</code></strong> × the exocortex itself — prior art for persistent AI memory,
notably <em>with benchmarks</em> my own notes lacked. (And if you&rsquo;ve read about
<a href="/posts/graph-hurt-my-search/">the time I benchmarked my own search and it lost</a>,
you&rsquo;ll know how much I needed that nudge.)</li>
</ol>
<p>The remaining stars mapped to tidy thematic clusters and were correctly <em>not</em>
reported. That restraint is the feature.</p>
<h2 id="the-blue-jay-scatters-acorns-and-forgets-where">The Blue Jay scatters acorns and forgets where</h2>
<p>Here&rsquo;s the bird that explains why there are two.</p>
<p>Blue jays don&rsquo;t hoard close like magpies. They <strong>cache acorns far and wide and
forget where they buried some</strong> — and the forgotten ones grow into oak trees.
Ecologists think blue jays are why oak forests spread north after the last ice
age. Seed dispersal, by way of a bad memory. That is <em>exactly</em> the job I wanted
for the second bird, and the metaphor was too good to pass up.</p>
<p>The Blue Jay reads an allowlist of <strong>eight RSS feeds</strong>, picked so tech and science
cross-pollinate:</p>
<ul>
<li><strong>Tech:</strong> Hacker News (high-score front page), lobste.rs, Ars Technica</li>
<li><strong>Science &amp; ideas:</strong> phys.org, Quanta, Aeon, Nautilus</li>
<li><strong>Wildcard:</strong> Medium — but scoped to specific <em>tag</em> feeds, never the raw
firehose of crypto and self-help</li>
</ul>
<p>Quanta, Aeon, and Nautilus are on that list on purpose: they&rsquo;re the connective
tissue, the feeds where &ldquo;huh, that&rsquo;s weirdly similar to&hellip;&rdquo; happens before my
vault even gets involved.</p>
<p>And its output contract is the <strong>opposite</strong> of the Magpie&rsquo;s. The Blue Jay is a
<strong>serendipity filter</strong>. Its job is to surface the connection that <em>isn&rsquo;t</em> in my
projects yet — the distant idea, the acorn worth burying. If I ran it through the
Magpie&rsquo;s &ldquo;only fire on a live-work hit&rdquo; rule, I would strangle the one thing it
exists to do. Relevance and serendipity pull in opposite directions, and you
can&rsquo;t tune a single agent to maximize both.</p>
<p>One more load-bearing detail, half design and half security: the Blue Jay
<strong>collides on the RSS summary only</strong> — title, abstract, link. It never pulls the
full article body into context. That&rsquo;s simultaneously the lower-injection path
<em>and</em> the right cognitive shape (a headline is a seed; I click through myself from
quarantine if the seed is interesting). The narrow input is doing double duty.</p>
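<p>The summary-only rule can be sketched as a parser that reads exactly three fields per item and ignores everything else in the feed, including any full-body element. This is an illustration, not the Blue Jay&rsquo;s real fetch step: the function name <code>rss_summaries</code> and the <code>limit</code> parameter are assumptions. It uses the standard library&rsquo;s <code>xml.etree.ElementTree</code>, which the Python documentation warns is not hardened against malicious XML, so a real fetcher for untrusted feeds would likely want a hardened parser instead.</p>

```python
import xml.etree.ElementTree as ET


def rss_summaries(rss_xml, limit=20):
    """Extract only title, summary, and link from each RSS item.

    Any other child element (for example a full article body) is never
    read, so it cannot end up in the text handed to the model.
    """
    root = ET.fromstring(rss_xml)
    results = []
    for item in root.iter("item"):
        if len(results) == limit:
            break
        results.append({
            "title": (item.findtext("title") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return results
```

<p>The narrow dictionary is the point: whatever the feed carries beyond those three fields simply has no key to land in.</p>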
<h2 id="why-two-birds-and-not-one-with-a-flag">Why two birds and not one with a flag</h2>
<p>I genuinely considered making this one agent with a <code>--mode=relevance|serendipity</code>
switch. I&rsquo;m glad I didn&rsquo;t, and the reasoning generalizes past birds:</p>
<table>
	<thead>
			<tr>
					<th></th>
					<th><strong>Magpie</strong></th>
					<th><strong>Blue Jay</strong></th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Source</td>
					<td>my GitHub stars (structured API)</td>
					<td>8 RSS feeds (open prose)</td>
			</tr>
			<tr>
					<td>Injection risk</td>
					<td>low</td>
					<td>higher</td>
			</tr>
			<tr>
					<td>Fires when</td>
					<td>a star hits <strong>live work</strong></td>
					<td>a summary sparks a <strong>distant</strong> idea</td>
			</tr>
			<tr>
					<td>Output</td>
					<td>relevance: repo → project → so-what</td>
					<td>serendipity: the not-yet-relevant connection</td>
			</tr>
			<tr>
					<td>Failure mode it guards against</td>
					<td>noise / false relevance</td>
					<td>being strangled into silence</td>
			</tr>
	</tbody>
</table>
<p>Two things made the split non-negotiable. First, <strong>the output contracts are too
different to share one brain</strong> — &ldquo;only speak on a hit&rdquo; and &ldquo;speak about the thing
that isn&rsquo;t a hit yet&rdquo; are contradictory prompts, and a single agent told to do
both does neither well. Second, <strong>open news is a higher injection frontier than a
structured stars API</strong>, so the riskier bird deserves its own enforced blast-radius
wrapper, not a code path bolted onto the safe one. When two jobs disagree on both
<em>what good output is</em> and <em>how dangerous the input is</em>, that&rsquo;s not a flag. That&rsquo;s
two programs.</p>
<p>So now my vault has two more agents reading the world on a cron. The Magpie runs
Saturday at 06:00 and tells me when something I bookmarked finally became
relevant. The Blue Jay runs Saturday at 07:00 and buries acorns in a quarantine
folder, most of which I&rsquo;ll ignore — but I only need one of them to grow into an
oak.</p>
<p>Both are on probation for their first few runs, because I don&rsquo;t trust a thing that
reads the internet until I&rsquo;ve watched it behave. But the part I&rsquo;m actually happy
about isn&rsquo;t the agents. It&rsquo;s that building the <em>second</em> one forced me to say out
loud what the first one was secretly assuming — and the names made the difference
impossible to forget. A magpie hoards. A blue jay scatters. You want both, and
you do not want them to be the same bird.</p>
]]></content:encoded></item><item><title>Is Anyone Knocking? A Security Pass on My Homelab</title><link>https://blog.hippotion.com/posts/is-anyone-knocking/</link><pubDate>Fri, 22 May 2026 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/is-anyone-knocking/</guid><description>I set out to answer a simple worry — is someone trying to get into my server? — and found the scarier question underneath it: if they did, would I even know? My front door was solid. The inside had an alarm with the wires cut, a web terminal sitting on the open internet, and no floor under the blast radius. Here&amp;rsquo;s the audit, and the three things I fixed.</description><content:encoded><![CDATA[<h2 id="the-question-i-actually-had">The question I actually had</h2>
<p>It started as a nervous-Sunday kind of question: <em>is a third party trying to
get into my server — over SSH, or some other way?</em> I run a single-node
Kubernetes homelab that hosts a couple dozen little apps, some of them public.
You read about credential-stuffing bots and you start to wonder who&rsquo;s been
rattling the handle while you slept.</p>
<p>So I did the audit. The good news came first, and it&rsquo;s worth saying plainly
because it&rsquo;s the part most homelabs get wrong: <strong>the front door is solid.</strong>
Nothing is reachable from the internet except through a Cloudflare Tunnel —
an outbound-only connection, zero open inbound ports on my router. Almost
every service sits behind OAuth. The cluster has 140 network policies doing
real east-west segmentation. And the login history? Eleven straight weeks
where every single shell login came from one IP — my own workstation on the
LAN. No strangers. No 3 a.m. logins from a VPS in another hemisphere.</p>
<p>I could have stopped there feeling good. That would have been a mistake.</p>
<h2 id="the-scary-finding-wasnt-an-attacker">The scary finding wasn&rsquo;t an attacker</h2>
<p>The useful question turned out not to be <em>&ldquo;is someone knocking?&rdquo;</em> but
<em>&ldquo;if someone got in, would anything tell me?&rdquo;</em> And when I traced that wire,
it ended in the dark.</p>
<p>I have a full monitoring stack — Prometheus, Grafana, Alertmanager, the works.
Alertmanager was running. It was also configured to notify exactly <strong>no one</strong>:
no receivers, and upstream, <strong>no alert rules at all</strong>. It was a smoke detector
with the battery taken out and, for good measure, no smoke sensor either. If an
attacker had walked in, the alarm would have stayed perfectly, silently green.</p>
<p>That reframed the whole job. Three gaps, in priority order.</p>
<h2 id="gap-1--an-alarm-with-no-one-to-call">Gap 1 — an alarm with no one to call</h2>
<p>I built the missing chain end to end. A small exporter on the host parses the
SSH journal and <code>fail2ban</code> state and writes metrics into node_exporter&rsquo;s
textfile collector — so it rides the monitoring I already had instead of adding
a new moving part. On top sit the alert rules that were never there. The one
that matters most is blunt:</p>
<blockquote>
<p><strong>A shell login succeeded from a non-LAN IP.</strong></p>
</blockquote>
<p>That should be impossible in normal life, so if it ever fires, I want it
shouting. It now emails me the instant it happens, alongside quieter alerts for
brute-force spikes, distributed scans, <code>fail2ban</code> going down, and — the
meta-alert I&rsquo;m fondest of — <em>the watchdog itself going stale</em>, because a
security monitor that silently dies is worse than none. And <code>fail2ban</code> now
actually bans the bots, with escalating ban times and my LAN permanently on the
allow-list.</p>
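<p>To make the exporter idea concrete, here is a minimal sketch of the last step: turning a list of successful login IPs into a snippet in the Prometheus text format that node_exporter&rsquo;s textfile collector reads. Everything named here is an assumption for illustration: the metric name <code>ssh_logins_non_lan</code>, the function <code>render_metrics</code>, and the LAN range. Parsing the SSH journal itself is left out.</p>

```python
import ipaddress

# Assumption: the LAN is a single private IPv4 range; adjust to the real network.
LAN = ipaddress.ip_network("192.168.0.0/16")


def render_metrics(login_ips, lan=LAN):
    """Render a textfile-collector snippet counting non-LAN shell logins."""
    # An address of a different IP version than the LAN network is never
    # "in" it, so an IPv6 login is counted as non-LAN here.
    non_lan = sum(1 for ip in login_ips if ipaddress.ip_address(ip) not in lan)
    lines = [
        "# HELP ssh_logins_non_lan Successful shell logins from outside the LAN.",
        "# TYPE ssh_logins_non_lan gauge",
        f"ssh_logins_non_lan {non_lan}",
    ]
    return "\n".join(lines) + "\n"
```

<p>An alert rule would then fire whenever that gauge is above zero. When writing the result into the collector directory, it is safer to write a temporary file and rename it into place, so node_exporter never scrapes a half-written file.</p>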
<p>The honest lesson: I&rsquo;d been treating &ldquo;I have Prometheus&rdquo; as if it meant &ldquo;I have
monitoring.&rdquo; Dashboards you have to remember to look at are not monitoring.
<strong>Monitoring is the thing that interrupts you.</strong> Until an alert can reach your
phone, you don&rsquo;t have a security alarm — you have a security <em>museum</em>.</p>
<h2 id="gap-2--there-was-a-web-terminal-on-the-open-internet">Gap 2 — there was a web terminal on the open internet</h2>
<p>This is the one that made me wince. Among my public hostnames was <code>ttyd</code> — a
browser-based shell. A full terminal on my server, reachable from anywhere,
sitting behind a single OAuth proxy. One misconfiguration, one OAuth bypass,
and that&rsquo;s not &ldquo;an app is compromised,&rdquo; that&rsquo;s <em>root on the box from a browser
tab.</em></p>
<p>The fix here isn&rsquo;t more locks. It&rsquo;s the realization that <strong>the strongest
control is not exposing the thing at all.</strong> I deleted the web terminal
entirely — app, manifests, dashboard tile, all of it. Then I went down the
public hostname list and pulled everything with no business being public off
the tunnel: the secrets UI, the ingress dashboard, Prometheus, Alertmanager,
the network-observability console, the DNS admin. They still work — on my LAN,
over the same wildcard cert — they&rsquo;re just not the internet&rsquo;s business anymore.
A service that isn&rsquo;t exposed has no attack surface to harden.</p>
<h2 id="gap-3--no-floor-under-the-blast-radius">Gap 3 — no floor under the blast radius</h2>
<p>The network policies limit how far a compromised pod can talk sideways. But
nothing stopped a workload from running as root, mounting the host filesystem,
or grabbing the host network in the first place. So I turned on Kubernetes'
built-in Pod Security Admission: every namespace now at least <em>reports</em>
baseline violations, and the clean app namespaces <em>enforce</em> baseline —
meaning a compromised app there simply cannot request privileged mode or a
hostPath mount. It&rsquo;s a floor. Floors are underrated.</p>
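<p>Pod Security Admission is driven by labels on each namespace, so the two tiers described above can be sketched as a small label generator. The label keys are the standard <code>pod-security.kubernetes.io</code> ones; the function name <code>psa_labels</code> is hypothetical, and mapping &ldquo;reports&rdquo; to the <code>audit</code> and <code>warn</code> modes is my assumption about the setup, not a description of the actual manifests.</p>

```python
def psa_labels(enforce):
    """Build Pod Security Admission labels for one namespace.

    Every namespace reports baseline violations (audit + warn); only
    namespaces already known to be clean also enforce baseline.
    """
    labels = {
        "pod-security.kubernetes.io/audit": "baseline",
        "pod-security.kubernetes.io/warn": "baseline",
    }
    if enforce:
        labels["pod-security.kubernetes.io/enforce"] = "baseline"
    return labels
```

<p>Starting every namespace in report-only mode and flipping <code>enforce</code> on once its violations are gone keeps the rollout from breaking workloads that still need elevated access.</p>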
<h2 id="what-the-audit-was-really-about">What the audit was really about</h2>
<p>I went looking for an intruder and didn&rsquo;t find one — the logs were clean, the
front door held. What I found instead was that I&rsquo;d built something secure at
the perimeter and then never asked the uncomfortable follow-up: <em>what happens
after the perimeter?</em> The answer had been &ldquo;nothing happens, and no one is
told,&rdquo; and I just hadn&rsquo;t looked.</p>
<p>Three principles I&rsquo;m taking with me:</p>
<ul>
<li><strong>An alarm that can&rsquo;t reach you is decoration.</strong> Wire the notification first;
the rules are easy once something is listening.</li>
<li><strong>&ldquo;Don&rsquo;t expose it&rdquo; beats &ldquo;add more auth.&rdquo;</strong> Every hostname you take off the
public internet is a class of attack you no longer have to be clever about.</li>
<li><strong>Give the blast radius a floor.</strong> Assume one thing gets popped, and decide
in advance how far it gets.</li>
</ul>
<p>The best part: all of it is GitOps. The intrusion alerts, the un-exposing, the
pod-security floor — every change is a commit, reviewable and revertible, and
my cluster reconciles itself to match. The audit didn&rsquo;t just make the homelab
safer. It wrote down <em>why</em> it&rsquo;s safer, in a form the next version of me can
read.</p>
<p>Now if someone knocks, I&rsquo;ll know. And the web terminal isn&rsquo;t answering the
door anymore — because it&rsquo;s gone.</p>
]]></content:encoded></item><item><title>🚩 I Built a Usage Dashboard and Tripped Claude Fable 5's Safety Net</title><link>https://blog.hippotion.com/posts/when-claude-flagged-my-own-dashboard/</link><pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/when-claude-flagged-my-own-dashboard/</guid><description>I asked Claude Fable 5 to help me self-host a dashboard for my own Claude usage. Halfway through, its dual-use safety measures flagged the conversation and downshifted me to Opus 4.8. Nothing I did was wrong — the request just had the shape of something that is. That gap, between what a thing looks like and what it&amp;rsquo;s for, turns out to be the whole story.</description><content:encoded><![CDATA[<h2 id="the-thing-i-was-actually-building">The thing I was actually building</h2>
<p>I wanted a small web page on my homelab that shows my Claude usage — the 5-hour
session window, the weekly limits, the per-model split. There&rsquo;s a nice Electron
widget out there that does this on the desktop, but I don&rsquo;t want a desktop app; I
want a URL behind my own OAuth that I can glance at from my phone.</p>
<p>The mechanics are unremarkable. The claude.ai web app reads those numbers from a
couple of undocumented endpoints using your logged-in session cookie. So a
self-hosted version does the same thing server-side: hold the session token as a
secret, replay the same calls, cache the result, render some bars. An afternoon&rsquo;s
work. I was pairing with <strong>Claude Fable 5</strong> on it — Anthropic&rsquo;s newest model, and
the one that ships with extra safety measures around dual-use capability.</p>
<p>Then, partway through, I got the message: <em>Fable 5 flagged something in this
session and switched to a more conservative model.</em> It dropped me to <strong>Opus 4.8</strong>
for the rest of the conversation. Safe conversations sometimes trip it, the notice
said. Send feedback.</p>
<h2 id="i-wasnt-doing-anything-wrong-thats-the-interesting-part">I wasn&rsquo;t doing anything wrong. That&rsquo;s the interesting part.</h2>
<p>My first reaction was the obvious one — <em>what did I say?</em> But I knew exactly what
I&rsquo;d built, and none of it was sketchy. It was my account, my usage data, my
hardware, my OAuth in front of it.</p>
<p>So I went looking at the request the way a classifier would — not &ldquo;what did he
mean&rdquo; but &ldquo;what does this look like.&rdquo; And from that angle it&rsquo;s a different
picture entirely. Stack up the surface features:</p>
<ul>
<li>🔑 capturing a <strong>session token</strong> and storing it to replay later</li>
<li>🌐 sending it to an <strong>undocumented API</strong> that isn&rsquo;t meant for third parties</li>
<li>🕵️ spoofing a <strong>browser User-Agent</strong> so the request blends in</li>
<li>🧱 detecting and working around a <strong>Cloudflare bot challenge</strong></li>
</ul>
<p>Read that list cold, with no context. That&rsquo;s not a usage dashboard. That&rsquo;s the
exact signature of credential theft and scraping tooling. Every individual move
is one a malicious script would also make. The only thing separating my afternoon
project from the bad version is <em>whose</em> account it touches and <em>why</em> — and intent
is precisely the part that doesn&rsquo;t show up in the tokens.</p>
<h2 id="surface-vs-intent">Surface vs. intent</h2>
<p>This is the part worth sitting with, because it&rsquo;s not a Claude quirk — it&rsquo;s the
shape of every content classifier, every WAF rule, every fraud model I&rsquo;ve ever
run in production.</p>
<p>A detector scores what it can see. It cannot see intent; it sees features. And
the features of &ldquo;monitor my own usage&rdquo; and &ldquo;harvest someone else&rsquo;s session&rdquo;
overlap almost completely, because the <em>technique</em> is identical — the difference
lives entirely in context the model has been deliberately built not to over-trust.
You can&rsquo;t tune that gap away. You can only pick where to sit on the
precision/recall curve, and Fable 5 — being the high-capability model with the
extra dual-use measures bolted on — sits where it catches the pattern even when it
costs some false positives, then hands off to Opus 4.8. I was the false positive.
The system did roughly the right thing for roughly the right reason; it just
doesn&rsquo;t feel that way when it&rsquo;s pointed at you.</p>
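<p>The precision/recall trade-off is easier to see with numbers. The sketch below uses a tiny, entirely made-up set of classifier scores (the <code>TOY</code> list is illustrative toy data, not anything measured from a real model) and computes both metrics at two thresholds.</p>

```python
def precision_recall(scored, threshold):
    """Compute (precision, recall) for items flagged at or above threshold.

    scored is a list of (score, is_actually_bad) pairs.
    """
    flagged = [(score, bad) for score, bad in scored if score >= threshold]
    true_pos = sum(1 for _, bad in flagged if bad)
    total_bad = sum(1 for _, bad in scored if bad)
    precision = true_pos / len(flagged) if flagged else 1.0
    recall = true_pos / total_bad if total_bad else 1.0
    return precision, recall


# Made-up scores for illustration only: (classifier score, truly abusive?)
TOY = [(0.95, True), (0.90, True), (0.70, False),
       (0.60, True), (0.55, False), (0.10, False)]
```

<p>On this toy data a strict threshold of 0.8 flags only real abuse but misses one of the three bad items, while a looser threshold of 0.5 catches all three at the cost of flagging two benign ones. A detector tuned the second way is the one that produces false positives like mine.</p>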
<p>The honest engineering takeaway is the one I keep relearning: <strong>if a benign task
has the silhouette of an abusive one, expect to get treated like the silhouette.</strong>
Not just by AI — by rate limiters, by bot detection, by the fraud team. The fix
isn&rsquo;t to be offended. It&rsquo;s to recognize the silhouette, and where it matters,
make the legitimate context legible up front.</p>
<h2 id="what-id-do-differently">What I&rsquo;d do differently</h2>
<p>Practically, very little: the project was fine, and the session downshifted to a model
that finished the job. But the framing changed how I built it. I leaned harder
into the parts that make intent <em>visible in the design</em>: the session token never
leaves the server, it lives in Vault and arrives as an injected secret, the whole
thing sits behind OAuth, and it polls on a leash instead of hammering. Not because
a classifier made me, but because those are the same choices that make it
obviously a personal dashboard and not a harvesting bot — to a reviewer, to
future-me, and yes, to a model reading over my shoulder.</p>
<p>The widget rides your credential on your desktop. Mine keeps it server-side behind
my own front door. Turns out building it the trustworthy way and building it the
<em>legibly</em> trustworthy way are the same work — and getting flagged is what made me
notice the difference.</p>
]]></content:encoded></item><item><title>🔒 Building a PII Guardrail Proxy for Cloud LLM Calls</title><link>https://blog.hippotion.com/posts/ai-pii-guardrail-proxy/</link><pubDate>Fri, 26 Sep 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/ai-pii-guardrail-proxy/</guid><description>A local model classifies every prompt before it leaves the cluster. If it&amp;rsquo;s sensitive, it&amp;rsquo;s blocked. If it&amp;rsquo;s clean, it goes to NVIDIA NIM. 150 lines of FastAPI, deployed on k3s.</description><content:encoded><![CDATA[<h2 id="the-problem-with-cloud-llm-access">The problem with cloud LLM access</h2>
<p>Running a local model is great for privacy. But local models hit a ceiling — for the heavy lifting, you want a cloud API like NVIDIA NIM with Llama 3.3 70B.</p>
<p>The moment you open that channel, you have a new risk: what if someone (or some automation) accidentally pastes a password, a private key, or someone&rsquo;s personal data into the chat? It leaves the cluster. It&rsquo;s logged somewhere you don&rsquo;t control.</p>
<p>The standard answer is &ldquo;train your users.&rdquo; I&rsquo;d rather have a technical control.</p>
<h2 id="the-architecture">The architecture</h2>
<pre tabindex="0"><code>Open WebUI → ai-guard proxy
                 │
        ┌────────┴────────┐
        │                 │
  llama-server       if SAFE:
  (classify)         forward to NVIDIA NIM
        │
   if SENSITIVE:
   block + explain
</code></pre><p>Every request to NVIDIA NIM goes through ai-guard first. ai-guard pulls the user message, sends it to the local llama.cpp server with a classification prompt, and makes a binary decision:</p>
<ul>
<li><code>SAFE</code> → forward to NVIDIA NIM with the real API key (which ai-guard holds, not the client)</li>
<li><code>SENSITIVE: &lt;reason&gt;</code> → return HTTP 400, log the block, nothing leaves the cluster</li>
</ul>
<p>The local model is already running for inference — this reuses it as a privacy gatekeeper at zero extra infrastructure cost.</p>
<h2 id="the-implementation">The implementation</h2>
<p>The proxy is ~150 lines of FastAPI. The classifier call:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">CLASSIFIER_PROMPT</span> <span class="o">=</span> <span class="s2">&#34;&#34;&#34;You are a data security classifier. Check if the text below contains sensitive information:
</span></span></span><span class="line"><span class="cl"><span class="s2">passwords, API keys, tokens, credentials, personally identifiable information (names, emails, phone numbers, SSNs, addresses), financial data (card numbers, bank accounts), or private keys.
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">Reply with ONLY one of:
</span></span></span><span class="line"><span class="cl"><span class="s2">SAFE
</span></span></span><span class="line"><span class="cl"><span class="s2">SENSITIVE: &lt;one-line reason&gt;
</span></span></span><span class="line"><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="cl"><span class="s2">Text to check:
</span></span></span><span class="line"><span class="cl"><span class="s2">&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">async</span> <span class="k">def</span> <span class="nf">classify</span><span class="p">(</span><span class="n">text</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">bool</span><span class="p">,</span> <span class="nb">str</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">    <span class="k">async</span> <span class="k">with</span> <span class="n">httpx</span><span class="o">.</span><span class="n">AsyncClient</span><span class="p">(</span><span class="n">timeout</span><span class="o">=</span><span class="mi">60</span><span class="p">)</span> <span class="k">as</span> <span class="n">client</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="n">client</span><span class="o">.</span><span class="n">post</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">LLAMA_BASE</span><span class="si">}</span><span class="s2">/chat/completions&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">json</span><span class="o">=</span><span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;model&#34;</span><span class="p">:</span> <span class="s2">&#34;phi-3.5-mini&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;messages&#34;</span><span class="p">:</span> <span class="p">[{</span><span class="s2">&#34;role&#34;</span><span class="p">:</span> <span class="s2">&#34;user&#34;</span><span class="p">,</span> <span class="s2">&#34;content&#34;</span><span class="p">:</span> <span class="n">CLASSIFIER_PROMPT</span> <span class="o">+</span> <span class="n">text</span><span class="p">[:</span><span class="mi">3000</span><span class="p">]}],</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;max_tokens&#34;</span><span class="p">:</span> <span class="mi">30</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;temperature&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;stream&#34;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="n">headers</span><span class="o">=</span><span class="p">{</span><span class="s2">&#34;Authorization&#34;</span><span class="p">:</span> <span class="s2">&#34;Bearer sk-no-key&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">answer</span> <span class="o">=</span> <span class="n">resp</span><span class="o">.</span><span class="n">json</span><span class="p">()[</span><span class="s2">&#34;choices&#34;</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">&#34;message&#34;</span><span class="p">][</span><span class="s2">&#34;content&#34;</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">answer</span><span class="o">.</span><span class="n">upper</span><span class="p">()</span><span class="o">.</span><span class="n">startswith</span><span class="p">(</span><span class="s2">&#34;SENSITIVE&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">reason</span> <span class="o">=</span> <span class="n">answer</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">&#34;:&#34;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)[</span><span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> <span class="k">if</span> <span class="s2">&#34;:&#34;</span> <span class="ow">in</span> <span class="n">answer</span> <span class="k">else</span> <span class="s2">&#34;sensitive content detected&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">True</span><span class="p">,</span> <span class="n">reason</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">False</span><span class="p">,</span> <span class="s2">&#34;&#34;</span>
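</span></span></code></pre></div><p><code>temperature=0</code> and <code>max_tokens=30</code> keep the response deterministic and fast. The model only needs to output one word or one line. Note that only the first 3,000 characters of the prompt are classified (<code>text[:3000]</code>), so sensitive content further into a long prompt goes unchecked.</p>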
<p>The main handler:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="nd">@app.post</span><span class="p">(</span><span class="s2">&#34;/v1/chat/completions&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">async</span> <span class="k">def</span> <span class="nf">proxy_chat</span><span class="p">(</span><span class="n">request</span><span class="p">:</span> <span class="n">Request</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">body</span> <span class="o">=</span> <span class="k">await</span> <span class="n">request</span><span class="o">.</span><span class="n">json</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">user_text</span> <span class="o">=</span> <span class="n">extract_user_text</span><span class="p">(</span><span class="n">body</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;messages&#34;</span><span class="p">,</span> <span class="p">[]))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">user_text</span><span class="o">.</span><span class="n">strip</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">        <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">is_sensitive</span><span class="p">,</span> <span class="n">reason</span> <span class="o">=</span> <span class="k">await</span> <span class="n">classify</span><span class="p">(</span><span class="n">user_text</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">exc</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">log</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">&#34;classifier error: </span><span class="si">%s</span><span class="s2"> — allowing request through&#34;</span><span class="p">,</span> <span class="n">exc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">is_sensitive</span> <span class="o">=</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">is_sensitive</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">400</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;error&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;message&#34;</span><span class="p">:</span> <span class="sa">f</span><span class="s2">&#34;Request blocked by ai-guard: </span><span class="si">{</span><span class="n">reason</span><span class="si">}</span><span class="s2">. Remove sensitive content before sending to external models.&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;content_policy_violation&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">})</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Safe — forward to upstream with streaming support</span>
</span></span><span class="line"><span class="cl">    <span class="o">...</span>
</span></span></code></pre></div><p>Fail-open: if the classifier itself errors (llama-server down, timeout), the request goes through and the error is logged. Fail-closed would be safer for high-stakes environments, but this is a homelab and I&rsquo;d rather not block all cloud LLM access because the local model is warming up.</p>
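<p>If you want both behaviours, the choice can be turned into a setting. A minimal sketch, assuming a hypothetical <code>AI_GUARD_FAIL_CLOSED</code> environment variable (my name for illustration, not something ai-guard is documented to read):</p>

```python
import os


def verdict_on_classifier_error(environ=os.environ):
    """Return (is_sensitive, reason) to use when the classifier call itself fails.

    AI_GUARD_FAIL_CLOSED is a hypothetical setting used only for this sketch.
    """
    raw = environ.get("AI_GUARD_FAIL_CLOSED", "false")
    fail_closed = raw.strip().lower() in {"1", "true", "yes"}
    if fail_closed:
        # Treat an unavailable classifier as a block, so nothing leaves unchecked.
        return True, "classifier unavailable"
    # Homelab default: let the request through, as the handler above does.
    return False, ""
```

<p>In the <code>except</code> branch of the handler this would replace the hard-coded <code>is_sensitive = False</code> with <code>is_sensitive, reason = verdict_on_classifier_error()</code>.</p>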
<h2 id="kubernetes-deployment">Kubernetes deployment</h2>
<p>ai-guard runs in the same namespace as llama-server and Open WebUI (<code>web-ai-engine</code>). The Cilium policies on this cluster already allow intra-namespace traffic, so no new network policy is needed.</p>
<p>Open WebUI uses semicolon-separated lists for multiple API backends:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">OPENAI_API_BASE_URLS</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;http://llama-server.web-ai-engine.svc:8080/v1;http://ai-guard.web-ai-engine.svc:8080/v1&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">OPENAI_API_KEYS</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;sk-no-key;sk-no-key&#34;</span><span class="w">
</span></span></span></code></pre></div><p>The second entry is ai-guard. Open WebUI passes <code>sk-no-key</code> as the API key — ai-guard ignores it and uses its own <code>UPSTREAM_API_KEY</code> from a Kubernetes Secret (pulled from Vault via External Secrets Operator). The real NVIDIA API key never touches the client.</p>
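<p>As I understand it, the two variables are matched by position: the first key goes with the first URL, the second with the second. The sketch below only illustrates that pairing; it is not Open WebUI&rsquo;s code, and <code>pair_backends</code> is a made-up helper name.</p>

```python
def pair_backends(base_urls: str, api_keys: str) -> list[tuple[str, str]]:
    """Pair semicolon-separated URLs and keys by position."""
    urls = [u.strip() for u in base_urls.split(";") if u.strip()]
    keys = [k.strip() for k in api_keys.split(";")]
    if len(urls) != len(keys):
        # A mismatch would attach a key to the wrong backend, so fail loudly.
        raise ValueError(f"{len(urls)} URLs but {len(keys)} keys")
    return list(zip(urls, keys))


backends = pair_backends(
    "http://llama-server.web-ai-engine.svc:8080/v1;http://ai-guard.web-ai-engine.svc:8080/v1",
    "sk-no-key;sk-no-key",
)
```

<p>Here <code>backends[1]</code> is the ai-guard entry, which is why the order of the two lists has to stay in sync when a backend is added or removed.</p>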
<h2 id="the-latency-tradeoff">The latency tradeoff</h2>
<p>The classification step adds 5–15 seconds with CPU inference. That&rsquo;s the cost of keeping the check fully private: the classifier never sends data anywhere.</p>
<p>For a personal homelab assistant, this is fine. For a high-throughput production setup, you&rsquo;d want the classifier on a GPU or a dedicated smaller model purpose-built for classification.</p>
<h2 id="what-it-catches">What it catches</h2>
<p>The classifier prompt targets:</p>
<ul>
<li>Passwords, API keys, tokens, credentials</li>
<li>PII: names, emails, phone numbers, SSNs, addresses</li>
<li>Financial data: card numbers, bank accounts</li>
<li>Private keys</li>
</ul>
<p>False negatives are possible — no classifier is perfect. This is a first line of defense, not a compliance control. The value is catching the obvious, accidental leaks.</p>
<h2 id="source">Source</h2>
<p><a href="https://github.com/janos-gyorgy/ai-guard">github.com/janos-gyorgy/ai-guard</a> — MIT licensed, Kubernetes manifests included.</p>
]]></content:encoded></item><item><title>🕵️ Privacy-Preserving LLM Pipelines: Anonymize Before You Send</title><link>https://blog.hippotion.com/posts/llm-anonymizer-privacy-pipeline/</link><pubDate>Fri, 12 Sep 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/llm-anonymizer-privacy-pipeline/</guid><description>Replace PII with semantically realistic fakes before sending to a cloud LLM, then restore the originals from the response. Started with a general model and prompt engineering — then upgraded to a purpose-built 1.7B fine-tune via Ollama.</description><content:encoded><![CDATA[<h2 id="the-problem-with-blocking">The problem with blocking</h2>
<p>The <a href="/posts/ai-pii-guardrail-proxy/">PII guardrail proxy I built last week</a> works by classifying prompts and blocking the sensitive ones. That&rsquo;s fine for a chat interface where a human can rephrase. It doesn&rsquo;t work for automated pipelines.</p>
<p>If a Jira ticket contains someone&rsquo;s name and an internal hostname, you don&rsquo;t want the agent to fail — you want it to process the ticket without exposing that data. Blocking is the wrong primitive for pipelines. Anonymization is the right one.</p>
<h2 id="the-pattern">The pattern</h2>
<pre tabindex="0"><code>Input text
  → anonymizer: extract PII, replace with semantic fakes
  → &#34;Nathan Chen from DataSoft LLC needs ProjectX fixed on dev.internal.net&#34;
  + mapping: {&#34;Nathan Chen&#34; → &#34;John Smith&#34;, &#34;DataSoft LLC&#34; → &#34;ACME&#34;, ...}
  → cloud LLM: processes coherent text, never sees real values
  → &#34;Nathan Chen should check the ProjectX docs with the DataSoft LLC team&#34;
  → string substitution with reverse mapping
  → &#34;John Smith should check the OAuth docs with the ACME team&#34;
</code></pre><p>Two things that make this work:</p>
<p><strong>Deanonymization needs no LLM.</strong> Once you have the mapping, restoring is pure string substitution. The model call only happens on the way in.</p>
<p><strong>Semantic fakes beat placeholder tokens.</strong> An earlier version of this used <code>[PERSON_1]</code>, <code>[ORG_1]</code> tokens. The problem: cloud models see bracketed text and subtly change behaviour — shorter responses, hedging, dropped context. When the cloud model sees <code>Nathan Chen from DataSoft LLC</code>, it treats it as real text and responds naturally. Quality is noticeably better.</p>
<h2 id="prior-art--what-already-exists">Prior art — what already exists</h2>
<p>This is a well-established pattern. Worth knowing what&rsquo;s out there:</p>
<p><strong><a href="https://llm-guard.com/output_scanners/deanonymize/">LLM Guard</a></strong> (Protect AI) — the most complete open-source implementation. Anonymize + Deanonymize scanner pair with a Vault for the mapping. Production-grade, actively maintained. Start here if you&rsquo;re building this for anything serious.</p>
<p><strong><a href="https://techcommunity.microsoft.com/blog/azuredevcommunityblog/introducing-pii-shield-a-privacy-proxy-for-every-llm-call/4514726">Microsoft PII Shield</a></strong> — session-based proxy. Returns a session ID with the anonymized text, uses it to deanonymize the response.</p>
<p><strong><a href="https://github.com/fsndzomga/anonLLM">anonLLM</a></strong> — uses GLiNER (a proper NER model) + Faker for realistic replacements. Better accuracy than a general chat model.</p>
<p><strong><a href="https://ieeexplore.ieee.org/document/11140717/">REDACT</a></strong> — IEEE paper describing a system using Ollama for PII redaction in documents.</p>
<p><strong><a href="https://huggingface.co/blog/pratyushrt/anonymizerslm">HuggingFace Anonymizer SLM series</a></strong> — purpose-built models (0.6B/1.7B/4B) fine-tuned specifically for anonymization. 9.20/10 quality score for 1.7B, close to GPT-4.1&rsquo;s 9.77.</p>
<p>That last one is what this implementation actually uses.</p>
<h2 id="the-model-anonymizer-17b">The model: Anonymizer-1.7B</h2>
<p><a href="https://huggingface.co/eternisai/Anonymizer-1.7B">eternisai/Anonymizer-1.7B</a> is a Qwen3-1.7B fine-tune trained on ~30k anonymization samples using GRPO with GPT-4.1 as judge. It outputs structured tool calls instead of free text:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;replace_entities&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;arguments&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;replacements&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span><span class="nt">&#34;original&#34;</span><span class="p">:</span> <span class="s2">&#34;John Smith&#34;</span><span class="p">,</span> <span class="nt">&#34;replacement&#34;</span><span class="p">:</span> <span class="s2">&#34;Nathan Chen&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span><span class="nt">&#34;original&#34;</span><span class="p">:</span> <span class="s2">&#34;ACME Corp&#34;</span><span class="p">,</span> <span class="nt">&#34;replacement&#34;</span><span class="p">:</span> <span class="s2">&#34;DataSoft LLC&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span><span class="nt">&#34;original&#34;</span><span class="p">:</span> <span class="s2">&#34;auth.acme.internal&#34;</span><span class="p">,</span> <span class="nt">&#34;replacement&#34;</span><span class="p">:</span> <span class="s2">&#34;dev.internal.net&#34;</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>No prompt engineering needed. The model knows exactly what it&rsquo;s doing and outputs a structured contract. Compare that to the first version of this service, which sent a long JSON-format prompt to Phi-3.5-mini and hoped the output parsed correctly.</p>
<p>The model runs via Ollama (which handles the Qwen3 chat template and tool calling natively), pointed at the GGUF version from HuggingFace: <code>hf.co/gabriellarson/Anonymizer-1.7B-GGUF</code>.</p>
<h2 id="the-implementation">The implementation</h2>
<p><code>llm-anonymizer</code> is a FastAPI service with two endpoints.</p>
<p><strong><code>POST /anonymize</code></strong> — calls Ollama with the tool definition, parses the response:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">TOOLS</span> <span class="o">=</span> <span class="p">[{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;function&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;function&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;replace_entities&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;description&#34;</span><span class="p">:</span> <span class="s2">&#34;Replace PII entities with anonymized versions&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;parameters&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;object&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;properties&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;replacements&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;array&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;items&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;object&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;properties&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                            <span class="s2">&#34;original&#34;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;string&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">                            <span class="s2">&#34;replacement&#34;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;string&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">                        <span class="p">},</span>
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;required&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;original&#34;</span><span class="p">,</span> <span class="s2">&#34;replacement&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">                    <span class="p">},</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;required&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;replacements&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl"><span class="p">}]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="n">client</span><span class="o">.</span><span class="n">post</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">OLLAMA_BASE</span><span class="si">}</span><span class="s2">/api/chat&#34;</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;model&#34;</span><span class="p">:</span> <span class="n">MODEL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;messages&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span><span class="s2">&#34;role&#34;</span><span class="p">:</span> <span class="s2">&#34;system&#34;</span><span class="p">,</span> <span class="s2">&#34;content&#34;</span><span class="p">:</span> <span class="n">SYSTEM_PROMPT</span><span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span><span class="s2">&#34;role&#34;</span><span class="p">:</span> <span class="s2">&#34;user&#34;</span><span class="p">,</span> <span class="s2">&#34;content&#34;</span><span class="p">:</span> <span class="n">text</span> <span class="o">+</span> <span class="s2">&#34;</span><span class="se">\n</span><span class="s2">/no_think&#34;</span><span class="p">},</span>  <span class="c1"># skip Qwen3 thinking mode</span>
</span></span><span class="line"><span class="cl">    <span class="p">],</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;tools&#34;</span><span class="p">:</span> <span class="n">TOOLS</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;stream&#34;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">})</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">tool_calls</span> <span class="o">=</span> <span class="n">resp</span><span class="o">.</span><span class="n">json</span><span class="p">()[</span><span class="s2">&#34;message&#34;</span><span class="p">][</span><span class="s2">&#34;tool_calls&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">replacements</span> <span class="o">=</span> <span class="n">tool_calls</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="s2">&#34;function&#34;</span><span class="p">][</span><span class="s2">&#34;arguments&#34;</span><span class="p">][</span><span class="s2">&#34;replacements&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Build reverse mapping: replacement → original (for deanonymization)</span>
</span></span><span class="line"><span class="cl"><span class="n">anonymized</span> <span class="o">=</span> <span class="n">text</span>
</span></span><span class="line"><span class="cl"><span class="n">mapping</span> <span class="o">=</span> <span class="p">{}</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">pair</span> <span class="ow">in</span> <span class="n">replacements</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">anonymized</span> <span class="o">=</span> <span class="n">anonymized</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="n">pair</span><span class="p">[</span><span class="s2">&#34;original&#34;</span><span class="p">],</span> <span class="n">pair</span><span class="p">[</span><span class="s2">&#34;replacement&#34;</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">    <span class="n">mapping</span><span class="p">[</span><span class="n">pair</span><span class="p">[</span><span class="s2">&#34;replacement&#34;</span><span class="p">]]</span> <span class="o">=</span> <span class="n">pair</span><span class="p">[</span><span class="s2">&#34;original&#34;</span><span class="p">]</span>
</span></span></code></pre></div><p>The <code>/no_think</code> suffix tells the model to skip its chain-of-thought — faster response, same accuracy for this task.</p>
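<p>The snippet above indexes straight into <code>tool_calls[0]</code>, which raises if the model answers in plain text instead of calling the tool. A more defensive sketch of the parsing step follows. The helper name <code>extract_replacements</code> is hypothetical, and accepting JSON-string arguments is an assumption added for OpenAI-style backends, not something the Ollama call above requires.</p>

```python
import json


def extract_replacements(response_json: dict) -> list[dict]:
    """Pull replace_entities pairs out of a chat response.

    Returns an empty list when the model made no usable tool call.
    """
    message = response_json.get("message") or {}
    pairs = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        if function.get("name") != "replace_entities":
            continue
        arguments = function.get("arguments") or {}
        if isinstance(arguments, str):
            # Assumption: tolerate arguments delivered as a JSON string.
            arguments = json.loads(arguments)
        for pair in arguments.get("replacements") or []:
            if not isinstance(pair, dict):
                continue
            original = pair.get("original")
            replacement = pair.get("replacement")
            # Skip incomplete or no-op pairs instead of raising.
            if original and replacement and original != replacement:
                pairs.append({"original": original, "replacement": replacement})
    return pairs
```

<p>An empty result then becomes an explicit decision point: pass the text through unchanged, or refuse to forward it, depending on how much you trust the model on that input.</p>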
<p><strong><code>POST /deanonymize</code></strong> — no model call, just substitution:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="k">for</span> <span class="n">replacement</span><span class="p">,</span> <span class="n">original</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">mapping</span><span class="o">.</span><span class="n">items</span><span class="p">(),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">text</span> <span class="o">=</span> <span class="n">text</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="n">replacement</span><span class="p">,</span> <span class="n">original</span><span class="p">)</span>
</span></span></code></pre></div><p>Sorted by length descending so longer tokens don&rsquo;t get partially overwritten by shorter ones.</p>
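<p>To see why the ordering matters, here is a small self-contained comparison. The mapping is a made-up case where one fake value is a prefix of another; the function names are mine, not the service&rsquo;s.</p>

```python
def deanonymize(text: str, mapping: dict[str, str]) -> str:
    """Restore originals, longest fake first, as in the loop above."""
    ordered = sorted(mapping.items(), key=lambda kv: len(kv[0]), reverse=True)
    for fake, original in ordered:
        text = text.replace(fake, original)
    return text


def deanonymize_unsorted(text: str, mapping: dict[str, str]) -> str:
    """Same substitution in insertion order, shown only for contrast."""
    for fake, original in mapping.items():
        text = text.replace(fake, original)
    return text


# Hypothetical mapping: "Nathan" is a prefix of "Nathan Chen".
mapping = {"Nathan": "John", "Nathan Chen": "John Smith"}
reply = "Nathan Chen should ask Nathan's team."

print(deanonymize(reply, mapping))           # John Smith should ask John's team.
print(deanonymize_unsorted(reply, mapping))  # John Chen should ask John's team.
```

<p>In the unsorted run the short fake is replaced first, so the longer one no longer appears in the text and half of the name stays fake.</p>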
<h2 id="the-kubernetes-stack">The Kubernetes stack</h2>
<p>Ollama runs as a separate deployment in the same namespace as everything else (<code>web-ai-engine</code>). Intra-namespace traffic is already allowed there, so no new network policies are needed.</p>
<pre tabindex="0"><code>llm-anonymizer (FastAPI) → Ollama (port 11434) → Anonymizer-1.7B GGUF
</code></pre><p>One-time model pull after first deploy:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">kubectl <span class="nb">exec</span> -n web-ai-engine deploy/ollama -- <span class="se">\
</span></span></span><span class="line"><span class="cl">  ollama pull hf.co/gabriellarson/Anonymizer-1.7B-GGUF
</span></span></code></pre></div><p>Ollama caches it on a 10Gi PVC, so pod restarts don&rsquo;t re-download.</p>
<h2 id="the-n8n-pipeline">The n8n pipeline</h2>
<p>Five-node chain triggered by webhook:</p>
<pre tabindex="0"><code>Webhook → /anonymize → NVIDIA NIM → /deanonymize → Respond
</code></pre><p>The NVIDIA NIM call includes a system prompt instructing it to treat the text as normal input. No mention of tokens, no special handling — because the text looks like real text.</p>
<p>Wire any upstream source to the webhook: Jira event, Slack slash command, a scheduled job that processes internal docs. The pipeline is source-agnostic.</p>
<h2 id="the-caveats">The caveats</h2>
<p><strong>1.7B isn&rsquo;t GPT-4.1.</strong> The model scores 9.20/10 on the benchmark. That is a quality score rather than a per-request error rate, but it still means some entities will be missed or replaced incorrectly. Test with real examples from your domain before depending on it.</p>
<p><strong>Deanonymization breaks on heavy rephrasing.</strong> If the cloud model restructures a sentence enough that the fake value no longer appears verbatim, the substitution silently misses it. The prompt helps but doesn&rsquo;t eliminate the risk.</p>
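<p>One way to make the silent miss visible is a check before substitution. This is a heuristic sketch, not part of the service as described: <code>suspicious_fakes</code> is a hypothetical helper, and the three-character word threshold is an arbitrary assumption.</p>

```python
def suspicious_fakes(llm_reply: str, mapping: dict[str, str]) -> list[str]:
    """Flag fakes that are missing verbatim but partly present in the reply.

    A fake that is absent altogether is not flagged: the model may simply
    not have mentioned that entity.
    """
    flagged = []
    for fake in mapping:
        if fake in llm_reply:
            continue  # plain substitution will restore this one
        words = [w for w in fake.split() if len(w) >= 3]
        if any(w in llm_reply for w in words):
            # Part of the fake survived, e.g. "Chen" without "Nathan".
            flagged.append(fake)
    return flagged


mapping = {"Nathan Chen": "John Smith", "DataSoft LLC": "ACME Corp"}
reply = "Mr. Chen should loop in the DataSoft LLC team."
print(suspicious_fakes(reply, mapping))  # ['Nathan Chen']
```

<p>A non-empty result is a reason to log the request or retry with a stricter prompt, since the restored text would still contain a fragment of a fake name.</p>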
<p><strong>Ollama adds a deployment.</strong> It&rsquo;s a ~500MB image plus the model weights (~1GB at Q4). On a constrained single-node cluster that&rsquo;s real overhead. llama-server already covers general chat; Ollama is purely for this model&rsquo;s tool-calling support.</p>
<h2 id="source">Source</h2>
<p><a href="https://github.com/janos-gyorgy/llm-anonymizer">github.com/janos-gyorgy/llm-anonymizer</a> — MIT licensed, Kubernetes manifests and n8n workflow included.</p>
]]></content:encoded></item><item><title>🔄 Someone kubectl apply'd a Hotfix Directly. How Do You Detect and Prevent It?</title><link>https://blog.hippotion.com/posts/k8s-config-drift/</link><pubDate>Fri, 06 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/k8s-config-drift/</guid><description>Manual kubectl in production is the Kubernetes equivalent of SSH&amp;rsquo;ing into a server and editing files. It works until it doesn&amp;rsquo;t, and when it doesn&amp;rsquo;t, nobody knows why.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;How do you prevent configuration drift in a Kubernetes cluster?&rdquo;</em></p>
<p>Configuration drift: the cluster&rsquo;s actual state diverges from what&rsquo;s declared in your source of truth. Someone runs <code>kubectl edit deployment myapp</code> to bump a memory limit during an incident. Someone adds a debug sidecar directly. Someone applies a YAML file from their laptop that was never committed to Git. The fix works. It goes undocumented. Six months later, a new deployment overwrites it. The incident recurs.</p>
<p>There are two distinct problems here that require different solutions:</p>
<ol>
<li><strong>Detection and remediation</strong>: how do you notice drift and revert it?</li>
<li><strong>Prevention</strong>: how do you stop non-compliant resources from being created in the first place?</li>
</ol>
<hr>
<h2 id="detection-and-remediation-argo-cd-selfheal">Detection and remediation: Argo CD selfHeal</h2>
<p>If you&rsquo;re using GitOps with Argo CD, detection and remediation are handled for you:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">syncPolicy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">automated</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">prune</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">selfHeal</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><p><code>selfHeal: true</code> means Argo CD continuously compares the cluster state to the Git repo and reverts any divergence. Someone runs <code>kubectl edit deployment myapp</code> and changes the replica count? Argo CD detects the diff, at the latest on its next reconciliation cycle (default: every 3 minutes) and often sooner because it watches live resources, and reverts it.</p>
<p><code>prune: true</code> means resources that exist in the cluster but not in Git are deleted. Someone <code>kubectl apply</code>&rsquo;d a debug pod directly? Gone on the next sync.</p>
<p>This is the audit trail story too. Every legitimate change is a Git commit with an author, a timestamp, and a commit message. Everything that isn&rsquo;t in Git doesn&rsquo;t survive past the next reconciliation. If you want to know what changed and when, <code>git log</code> is the answer.</p>
<hr>
<h2 id="the-gap-selfheal-doesnt-close">The gap selfHeal doesn&rsquo;t close</h2>
<p><code>selfHeal</code> reverts drift after the fact. There&rsquo;s a window — up to 3 minutes — where a drifted resource is serving traffic. For most changes, that&rsquo;s fine. For a bad resource (wrong RBAC, missing network policy, container running as root), 3 minutes is enough to be a problem.</p>
<p>The other gap: <code>selfHeal</code> doesn&rsquo;t tell you <em>who</em> made the change or generate an alert. It just silently fixes it. You need audit logging (<code>kube-apiserver --audit-log-path</code>) or an alerting rule on Argo CD&rsquo;s health events to know that drift happened.</p>
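<p>As a rough sketch of the audit-logging side, an API server audit policy along these lines records who changed workload objects. It assumes you control the kube-apiserver flags and pass this file with <code>--audit-policy-file</code> alongside <code>--audit-log-path</code>; the selection of resources is illustrative:</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record user, verb, and timestamp for writes to workload objects.
  - level: Metadata
    verbs: [&#34;create&#34;, &#34;update&#34;, &#34;patch&#34;, &#34;delete&#34;]
    resources:
      - group: &#34;apps&#34;
        resources: [&#34;deployments&#34;, &#34;statefulsets&#34;, &#34;daemonsets&#34;]
  # Skip everything else to keep the log small.
  - level: None
</code></pre>
<p>The <code>Metadata</code> level logs the requesting user and the object touched but not the request body, which is usually enough to answer &ldquo;who did this?&rdquo;</p>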
<hr>
<h2 id="prevention-kyverno">Prevention: Kyverno</h2>
<p>Kyverno is a policy engine that runs as a Kubernetes admission webhook. Every resource creation or modification goes through it before being persisted. If the resource violates a policy, Kyverno can reject it outright (enforce mode) or allow it with a warning (audit mode).</p>
<p>The policies are Kubernetes resources themselves — they live in Git, they&rsquo;re applied via GitOps, they&rsquo;re versioned. No separate policy language to learn.</p>
<p>A policy that requires readiness probes on all Deployments:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">require-readiness-probe</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validationFailureAction</span><span class="p">:</span><span class="w"> </span><span class="l">Enforce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">check-readiness-probe</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">match</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">any</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kinds</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span>- <span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">message</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Deployments must define a readiness probe.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pattern</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span>- <span class="nt">(name)</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;*&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="nt">readinessProbe</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                      </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;&gt;0&#34;</span><span class="w">  </span><span class="c"># true for any defined probe, whatever its handler type</span><span class="w">
</span></span></span></code></pre></div><p>With this policy active: <code>kubectl apply -f deployment-without-probe.yaml</code> is rejected at the API server. The error message is the one you defined in <code>message</code>. The deployment never reaches etcd.</p>
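<p>For comparison, a Deployment shaped like the following sketch should pass the check, since its container defines a readiness probe. The image, port, and probe path are made up for illustration:</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: web
          image: registry.example.com/myapp:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz                        # hypothetical endpoint
              port: 8080
            periodSeconds: 10
</code></pre>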
<p>A policy that blocks containers running as root:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">disallow-root-containers</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validationFailureAction</span><span class="p">:</span><span class="w"> </span><span class="l">Enforce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">check-runAsNonRoot</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">match</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">any</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kinds</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Deployment, StatefulSet, DaemonSet]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">message</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Containers must not run as root.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pattern</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span>- <span class="nt">(name)</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;*&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="nt">securityContext</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                      </span><span class="nt">runAsNonRoot</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><p>A policy that enforces resource limits (common in multi-tenant clusters):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">require-resource-limits</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validationFailureAction</span><span class="p">:</span><span class="w"> </span><span class="l">Enforce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">check-limits</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">match</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">any</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kinds</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Deployment]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">message</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;CPU and memory limits are required.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pattern</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                      </span><span class="nt">limits</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="nt">memory</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;?*&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;?*&#34;</span><span class="w">
</span></span></span></code></pre></div><hr>
<h2 id="kyverno-can-also-mutate-and-generate">Kyverno can also mutate and generate</h2>
<p>Policies aren&rsquo;t only for validation. Kyverno can mutate incoming resources (add default labels, inject sidecars, set default resource requests) and generate new resources in response to events (create a NetworkPolicy whenever a new namespace is created).</p>
<p>Auto-add a standard label to every Deployment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">add-labels</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">add-team-label</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">match</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">any</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kinds</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Deployment]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">mutate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">patchStrategicMerge</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">managed-by</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno</span><span class="w">
</span></span></span></code></pre></div><p>Auto-create a default NetworkPolicy when a namespace is created:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">kyverno.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">add-default-networkpolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">default-deny</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">match</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">any</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kinds</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Namespace]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">generate</span><span class="p">:</span><span class="w">
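</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">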
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">default-deny-all</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{request.object.metadata.name}}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">data</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="l">Egress</span><span class="w">
</span></span></span></code></pre></div><hr>
<h2 id="the-complete-drift-prevention-picture">The complete drift prevention picture</h2>
<pre tabindex="0"><code>Developer runs: kubectl apply -f bad-deployment.yaml
  → API server receives request
  → Kyverno admission webhook intercepts
  → Policy check: no readiness probe → Rejected
  → API server rejects the request and returns Kyverno&#39;s message
  → Resource never reaches etcd

Developer runs: kubectl edit deployment myapp (valid change, just not via Git)
  → Edit succeeds (no policy violation)
  → Argo CD reconciliation fires (within 3 minutes)
  → Diff detected: cluster state ≠ Git state
  → selfHeal: revert to Git state
  → If audit logging enabled: event recorded with username and timestamp
</code></pre><p>Git is the audit trail for what <em>should</em> be there. kube-apiserver audit logs are the trail for what <em>was attempted</em>. Kyverno is the enforcer at admission time. Argo CD is the continuous reconciler. Four layers, each with a different job.</p>
<hr>
<h2 id="what-interviewers-are-actually-testing">What interviewers are actually testing</h2>
<p>The follow-up is usually: <em>&ldquo;What&rsquo;s the difference between Kyverno and OPA Gatekeeper?&rdquo;</em></p>
<p>Both are admission webhook policy engines. The practical differences:</p>
<ul>
<li><strong>Kyverno</strong>: policies are k8s-native YAML, no separate language to learn. Generate and mutate policies built in. Easier to get started with.</li>
<li><strong>OPA Gatekeeper</strong>: policies are written in Rego, a purpose-built policy language that&rsquo;s more expressive but has a steeper learning curve. Better if you&rsquo;re already using OPA elsewhere (Terraform, microservice authorization).</li>
</ul>
<p>For a Kubernetes-only environment, Kyverno is the pragmatic choice. For a platform team that uses OPA across the stack, Gatekeeper gives you policy consistency.</p>
<p>The deeper follow-up: <em>&ldquo;How do you test policies before enforcing them?&rdquo;</em> Use <code>Audit</code> mode first (<code>validationFailureAction: Audit</code>). Violations are recorded in PolicyReport objects but requests aren&rsquo;t rejected. Review the reports, fix the existing violations, then switch to <code>Enforce</code>. Never flip directly to Enforce in production: workloads that already violate the policy keep running, but their next update or rollout gets rejected.</p>
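<p>A minimal sketch of that first step, reusing the resource-limits policy from earlier with only the mode changed. <code>background: true</code> is set explicitly here on the assumption that you also want resources that already exist to be scanned:</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Audit   # report violations, do not block requests
  background: true                 # also scan resources that already exist
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds: [Deployment]
      validate:
        message: &#34;CPU and memory limits are required.&#34;
        pattern:
          spec:
            template:
              spec:
                containers:
                  - resources:
                      limits:
                        memory: &#34;?*&#34;
                        cpu: &#34;?*&#34;
</code></pre>
<p>Once it has been running for a while, <code>kubectl get policyreport -A</code> lists the results per namespace. When the reports come back clean, change <code>Audit</code> to <code>Enforce</code> in Git and let the sync roll it out.</p>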
<hr>
<p><em>This is part of a series on Kubernetes interview questions. Previously: <a href="/posts/k8s-network-isolation/">network isolation between services</a>.</em></p>
]]></content:encoded></item><item><title>🛡️ How Do You Prevent a Compromised Pod From Calling Your Database?</title><link>https://blog.hippotion.com/posts/k8s-network-isolation/</link><pubDate>Fri, 23 May 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/k8s-network-isolation/</guid><description>Default Kubernetes is a flat network. Every pod can reach every other pod. In a cluster with ten services, that&amp;rsquo;s ten potential blast radiuses instead of one.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;How do you enforce network isolation between services in a Kubernetes cluster?&rdquo;</em></p>
<p>The default Kubernetes network model is flat. Every pod can reach every other pod, in any namespace, on any port. There are no firewalls, no ACLs, no segmentation. A compromised frontend pod can connect directly to your PostgreSQL port, your Redis port, your internal admin API, and every other service in the cluster.</p>
<p>This is intentional — Kubernetes doesn&rsquo;t assume you want isolation, because not everyone does. But if you do want it, you need to add it.</p>
<hr>
<h2 id="networkpolicy-the-primitive">NetworkPolicy: the primitive</h2>
<p>A <code>NetworkPolicy</code> is a Kubernetes resource that selects a set of pods and defines what traffic is allowed to reach them (ingress) and what traffic they&rsquo;re allowed to send (egress). Once a policy selects a pod for a given direction, traffic in that direction that isn&rsquo;t explicitly allowed is dropped; pods that no policy selects stay wide open.</p>
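<p>The catch: <code>NetworkPolicy</code> resources have no effect unless something in your cluster enforces them. Flannel on its own does not. k3s ships Flannel by default but bundles a separate network policy controller to fill that gap, unless you start it with <code>--disable-network-policy</code>. Calico, Cilium, and Canal enforce policies themselves. If you&rsquo;re running plain Flannel and you apply a NetworkPolicy, it will be silently ignored: no error, no warning.</p>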
<hr>
<h2 id="the-default-deny-pattern">The default-deny pattern</h2>
<p>The correct starting point is a default-deny policy that blocks everything, applied to the namespace. You then add explicit allow policies for the traffic you actually need.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Block all ingress and egress in this namespace by default</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">default-deny-all</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">        </span><span class="c"># matches all pods in the namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
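</span></span></span></code></pre></div><p>With this in place, your pods can&rsquo;t receive traffic and can&rsquo;t send traffic. That includes DNS lookups, so service names stop resolving until you allow egress to the cluster DNS. You then add back what you need.</p>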
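<p>The first thing to add back is usually DNS, because a default-deny on egress also blocks lookups against the cluster resolver. A sketch of that allowance, assuming the resolver runs in <code>kube-system</code> with the common <code>k8s-app: kube-dns</code> label (check the labels on your own CoreDNS pods before relying on it):</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: myapp
spec:
  podSelector: {}                # every pod in the namespace may resolve names
  policyTypes:
    - Egress
  egress:
    - to:
        # Both selectors sit in one list item, so they are ANDed:
        # pods labelled kube-dns, and only in the kube-system namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns  # assumption: label used by your DNS pods
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
</code></pre>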
<hr>
<h2 id="allowing-specific-traffic">Allowing specific traffic</h2>
<p>Allow the web frontend to receive traffic from the ingress controller:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-ingress-from-traefik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">frontend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">from</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">sys-traefik</span><span class="w">
</span></span></span></code></pre></div><p>Allow the backend to talk to PostgreSQL:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-egress-to-postgres</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">backend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">postgres</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">5432</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span></code></pre></div><p>After these two policies: the frontend receives traffic from Traefik, and the backend can reach Postgres. The frontend cannot reach Postgres. The backend cannot receive traffic from the ingress controller. Neither can call anything else.</p>
<hr>
<h2 id="the-dns-gotcha">The DNS gotcha</h2>
<p>Once you add a default-deny egress policy, DNS stops working. Your pods can no longer resolve service names because they can&rsquo;t reach <code>kube-dns</code> in the <code>kube-system</code> namespace.</p>
<p>You need to explicitly allow it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-egress-dns</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">53</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">UDP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">53</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span></code></pre></div><p>Missing this is one of the most common reasons behind &ldquo;everything broke after I added NetworkPolicies&rdquo;. Add it to every namespace that has a default-deny egress policy.</p>
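<p>If port 53 to everything in <code>kube-system</code> feels too broad, the rule can be narrowed to the DNS pods themselves. A minimal sketch, assuming your cluster DNS pods carry the conventional <code>k8s-app: kube-dns</code> label (verify with <code>kubectl get pods -n kube-system --show-labels</code>) and that you are not running a node-local DNS cache, which would need its own rule. The policy name is made up for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-dns-strict   # hypothetical name
  namespace: myapp
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # One list item with both selectors means: pods matching the
        # podSelector inside namespaces matching the namespaceSelector.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns   # assumption: the usual CoreDNS / kube-dns label
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
</code></pre></div><p>Note the indentation: putting <code>podSelector</code> in a separate list item would turn the AND into an OR and allow far more than intended.</p>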
<hr>
<h2 id="cilium-the-same-model-with-more-power">Cilium: the same model with more power</h2>
<p>Cilium implements the standard <code>NetworkPolicy</code> API and adds its own <code>CiliumNetworkPolicy</code> CRD with L7 capabilities.</p>
<p>Standard NetworkPolicy works at L3/L4 — IP addresses and ports. Cilium&rsquo;s CRD adds:</p>
<p><strong>L7 HTTP filtering</strong>: allow specific HTTP methods and paths, not just port 8080.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cilium.io/v2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">CiliumNetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-api-reads</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">endpointSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">api</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">fromEndpoints</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">frontend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">toPorts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;8080&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">http</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="nt">method</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;GET&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;/api/v1/.*&#34;</span><span class="w">
</span></span></span></code></pre></div><p><strong>DNS-based egress</strong>: allow egress to <code>github.com</code> by hostname rather than IP address. This matters for external services with dynamic IPs.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">toFQDNs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">matchName</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;github.com&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">toPorts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;443&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span></code></pre></div><p><strong>Identity-based policies</strong>: Cilium assigns each pod a numeric security identity derived from its labels. Policies are enforced by identity, not IP address, so pod restarts (which change IPs) don&rsquo;t break policy enforcement.</p>
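<p>One caveat on the <code>toFQDNs</code> fragment above: Cilium learns which IPs belong to a hostname by observing DNS lookups through its DNS proxy, so the same policy generally needs a DNS rule alongside the FQDN rule. A sketch following the pattern in the Cilium documentation; the policy name and the <code>app: backend</code> selector are assumptions for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-github   # hypothetical name
  namespace: myapp
spec:
  endpointSelector:
    matchLabels:
      app: backend            # assumption: the workload that calls github.com
  egress:
    # Allow DNS to kube-dns and let the DNS proxy inspect the lookups.
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # Allow HTTPS only to IPs that were resolved for github.com.
    - toFQDNs:
        - matchName: "github.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
</code></pre></div>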
<hr>
<h2 id="what-a-real-namespace-policy-set-looks-like">What a real namespace policy set looks like</h2>
<p>For a typical web app with frontend, backend, and database:</p>
<pre tabindex="0"><code>Namespace: myapp
├── default-deny-all (ingress + egress, all pods)
├── allow-egress-dns (egress, all pods, port 53)
├── allow-ingress-frontend (ingress frontend, from sys-traefik namespace)
├── allow-egress-frontend-to-backend (egress frontend, to backend:8080)
├── allow-ingress-backend (ingress backend, from frontend)
├── allow-egress-backend-to-postgres (egress backend, to postgres:5432)
└── allow-ingress-postgres (ingress postgres, from backend)
</code></pre><p>Seven policies. The database has exactly one inbound path: from the backend. The frontend has no path to the database at all. A compromised frontend pod cannot scan the internal network, because egress to arbitrary destinations is blocked.</p>
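<p>The last entry in that list, the database side of the backend-to-Postgres path, could look roughly like this. It is a sketch that reuses the <code>app: backend</code> and <code>app: postgres</code> labels from the earlier examples and assumes both pods live in <code>myapp</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-postgres
  namespace: myapp
spec:
  # Applies to the database pods only.
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    # The single inbound path: backend pods in the same namespace, port 5432.
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - port: 5432
          protocol: TCP
</code></pre></div><p>With default-deny in place, both halves are needed: the egress rule on the backend and this ingress rule on Postgres. Either one alone leaves the connection blocked.</p>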
<hr>
<h2 id="what-interviewers-are-actually-testing">What interviewers are actually testing</h2>
<p>The follow-up is usually: <em>&ldquo;How do you manage this at scale? Writing NetworkPolicies for every namespace by hand doesn&rsquo;t scale.&rdquo;</em></p>
<p>The answer: you don&rsquo;t write them by hand. You template them. In a GitOps setup, your namespace configuration declares what network access the service needs in a structured form, and a Helm chart or operator generates the actual NetworkPolicy resources from those declarations.</p>
<p>For example, an <code>applications.yml</code> entry might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">networkPolicies</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">denyAll</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">allowIngressFromIngress</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">allowEgressToNamespaces</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;sys-postgres&#34;</span><span class="p">]</span><span class="w">
</span></span></span></code></pre></div><p>And a Helm chart translates that into four concrete NetworkPolicy objects. The developer declares intent; the platform enforces it. No one writes raw YAML for each namespace.</p>
<p>The second follow-up: <em>&ldquo;What about east-west traffic between services in the same namespace?&rdquo;</em> Add <code>allowIntraNamespace: true</code> as a flag that generates a policy allowing all pod-to-pod traffic within the namespace, while still blocking cross-namespace traffic.</p>
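<p>A Helm template behind such a flag might look like the sketch below. The values path <code>.Values.networkPolicies.allowIntraNamespace</code> and the policy name are assumptions that mirror the flag names used above, not a published chart:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># templates/allow-intra-namespace.yaml (hypothetical file in the platform chart)
{{- if .Values.networkPolicies.allowIntraNamespace }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-intra-namespace
  namespace: {{ .Release.Namespace }}
spec:
  # Applies to every pod in the namespace.
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # A bare podSelector only matches pods in this same namespace.
    - from:
        - podSelector: {}
  egress:
    - to:
        - podSelector: {}
{{- end }}
</code></pre></div><p>Because a <code>podSelector</code> without a <code>namespaceSelector</code> never matches pods in other namespaces, cross-namespace traffic stays subject to the default-deny policy.</p>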
<hr>
<p><em>This is part of a series on Kubernetes interview questions. Previously: <a href="/posts/k8s-zero-downtime/">zero-downtime deployments</a>. Next: <a href="/posts/k8s-config-drift/">preventing configuration drift</a>.</em></p>
]]></content:encoded></item><item><title>🔑 Deploy to Kubernetes Without Storing Any Cluster Credentials in CI</title><link>https://blog.hippotion.com/posts/k8s-cicd-no-credentials/</link><pubDate>Fri, 09 May 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/k8s-cicd-no-credentials/</guid><description>A common interview question in 2026. If your answer is &amp;lsquo;kubeconfig in a CI secret&amp;rsquo;, you&amp;rsquo;re not wrong — but you&amp;rsquo;re also not getting the job.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;How would you design a CI/CD pipeline that deploys to Kubernetes without storing any cluster credentials anywhere?&rdquo;</em></p>
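<p>The expected wrong answer: export your kubeconfig, base64-encode it, paste it into a CI secret named <code>KUBE_CONFIG</code>, and call it a day. This works. It also parks a long-lived, often cluster-admin credential in a system that runs other people&rsquo;s code on every push, so a CI compromise becomes a cluster compromise.</p>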
<p>There are two correct answers in 2026, and which one you reach for depends on what you&rsquo;re actually deploying.</p>
<hr>
<h2 id="answer-1-gitops-the-one-your-interviewer-probably-wants">Answer 1: GitOps (the one your interviewer probably wants)</h2>
<p>In a GitOps setup, your CI pipeline never touches the cluster. It can&rsquo;t leak credentials it doesn&rsquo;t have.</p>
<p>The flow:</p>
<pre tabindex="0"><code>Developer pushes code
  → CI builds and tests
  → CI updates the image tag in the Git repo (a commit, not a kubectl command)
  → Argo CD detects the change
  → Argo CD applies it to the cluster
</code></pre><p>The cluster reaches out to Git. CI never reaches into the cluster. The only thing with cluster credentials is Argo CD itself — running inside the cluster, with no credentials to leak externally.</p>
<p>For self-hosted setups on Hetzner or Vultr, this is particularly clean because there&rsquo;s no cloud IAM to configure. You point Argo CD at your GitLab repo, tell it which branch to watch, and you&rsquo;re done.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># The Argo CD Application CRD — the only thing you need</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">argoproj.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Application</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">argocd</span><span class="w">
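</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">project</span><span class="p">:</span><span class="w"> </span><span class="l">default</span><span class="w">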
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">source</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">repoURL</span><span class="p">:</span><span class="w"> </span><span class="l">https://gitlab.example.com/myorg/myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">targetRevision</span><span class="p">:</span><span class="w"> </span><span class="l">main</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">helm-charts/myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">destination</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">server</span><span class="p">:</span><span class="w"> </span><span class="l">https://kubernetes.default.svc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">syncPolicy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">automated</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">prune</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">selfHeal</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><p><code>selfHeal: true</code> means if someone manually <code>kubectl apply</code>s something, Argo CD reverts it. The Git repo is the only source of truth.</p>
<p>The CI image-tag update step looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># .gitlab-ci.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">deploy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">stage</span><span class="p">:</span><span class="w"> </span><span class="l">deploy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">      # Update the image tag in values/myapp.yml and push
</span></span></span><span class="line"><span class="cl"><span class="sd">      sed -i &#34;s/tag: .*/tag: ${CI_COMMIT_SHORT_SHA}/&#34; values/myapp.yml
</span></span></span><span class="line"><span class="cl"><span class="sd">      git config user.email &#34;ci@example.com&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      git config user.name &#34;CI&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      git add values/myapp.yml
</span></span></span><span class="line"><span class="cl"><span class="sd">      git commit -m &#34;chore: bump myapp to ${CI_COMMIT_SHORT_SHA}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      git push origin &#34;HEAD:${CI_COMMIT_BRANCH}&#34;</span><span class="w">
</span></span></span></code></pre></div><p>CI needs write access to the Git repo — but that&rsquo;s a deploy key, not a cluster credential. If it leaks, someone can push code. You&rsquo;d rotate the deploy key and audit the commits. If a cluster credential leaks, someone owns your cluster.</p>
<hr>
<h2 id="answer-2-oidc-federation-for-when-you-genuinely-need-push-based">Answer 2: OIDC federation (for when you genuinely need push-based)</h2>
<p>Some operations don&rsquo;t fit the GitOps model. Infrastructure provisioning (<code>terraform apply</code>), one-off database migrations, or initial cluster bootstrapping — these need direct cluster access. The correct pattern here is OIDC federation.</p>
<p>The idea: your CI platform (GitLab, GitHub Actions) can issue a signed JWT to every job. The JWT is signed by the CI platform and carries claims such as which repo, which branch, and which pipeline triggered the job. You configure your Kubernetes API server to trust those JWTs, and the CI job authenticates directly with the token it was issued.</p>
<p>No stored credentials. Every job gets a fresh token. The token expires when the job ends.</p>
<p>For a self-hosted GitLab, configure your k8s API server to trust GitLab as an OIDC issuer:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># /etc/rancher/k3s/config.yaml (or kube-apiserver flags)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kube-apiserver-arg</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;oidc-issuer-url=https://gitlab.example.com&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;oidc-client-id=your_client_id&#34;</span><span class="w">
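</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;oidc-username-claim=sub&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># &#34;-&#34; disables the default issuer-URL prefix, so the username is the raw sub claim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;oidc-username-prefix=-&#34;</span><span class="w">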
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;oidc-groups-claim=groups_direct&#34;</span><span class="w">
</span></span></span></code></pre></div><p>Then create a <code>ClusterRoleBinding</code> that maps a specific GitLab identity to a Kubernetes role:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">rbac.authorization.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterRoleBinding</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">gitlab-ci-deployer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">subjects</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">User</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;project_path:myorg/myapp:ref_type:branch:ref:main&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">apiGroup</span><span class="p">:</span><span class="w"> </span><span class="l">rbac.authorization.k8s.io</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">roleRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterRole</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">deploy-role</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">apiGroup</span><span class="p">:</span><span class="w"> </span><span class="l">rbac.authorization.k8s.io</span><span class="w">
</span></span></span></code></pre></div><p>The subject name is the <code>sub</code> claim from the GitLab JWT — it encodes the repo path and branch. Only jobs running on <code>main</code> in <code>myorg/myapp</code> get this binding. A job on a feature branch gets nothing.</p>
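<p>The binding above references a <code>deploy-role</code> that has to exist. A minimal sketch, assuming the job only needs to restart or patch Deployments; the exact rule set is an assumption and should be trimmed or extended to match what your pipeline actually runs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deploy-role
rules:
  # kubectl rollout restart reads the Deployment and then patches it.
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch"]
</code></pre></div><p>A <code>ClusterRoleBinding</code> grants these verbs in every namespace. If the job only deploys to <code>myapp</code>, a <code>RoleBinding</code> in that namespace pointing at the same <code>ClusterRole</code> keeps the scope to one namespace.</p>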
<p>In the CI job:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">deploy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">stage</span><span class="p">:</span><span class="w"> </span><span class="l">deploy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">id_tokens</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">K8S_TOKEN</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">aud</span><span class="p">:</span><span class="w"> </span><span class="l">your_client_id</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="p">|</span><span class="sd">
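</span></span></span><span class="line"><span class="cl"><span class="sd">      # Assumed CI variables: K8S_API_URL (API server address), K8S_CA_FILE (path to the cluster CA cert)
</span></span></span><span class="line"><span class="cl"><span class="sd">      kubectl config set-cluster mycluster --server=&#34;${K8S_API_URL}&#34; --certificate-authority=&#34;${K8S_CA_FILE}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      kubectl config set-credentials gitlab-ci \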
</span></span></span><span class="line"><span class="cl"><span class="sd">        --token=&#34;${K8S_TOKEN}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      kubectl config set-context deploy \
</span></span></span><span class="line"><span class="cl"><span class="sd">        --cluster=mycluster \
</span></span></span><span class="line"><span class="cl"><span class="sd">        --user=gitlab-ci
</span></span></span><span class="line"><span class="cl"><span class="sd">      kubectl config use-context deploy
</span></span></span><span class="line"><span class="cl"><span class="sd">      kubectl rollout restart deployment/myapp -n myapp</span><span class="w">
</span></span></span></code></pre></div><p>The token in <code>K8S_TOKEN</code> is injected by GitLab. It expires with the job. On every request, the API server validates the signature using the public keys published at GitLab&rsquo;s JWKS endpoint, which it discovers through the issuer URL.</p>
<hr>
<h2 id="which-one-to-use">Which one to use</h2>
<table>
	<thead>
			<tr>
					<th></th>
					<th>GitOps</th>
					<th>OIDC federation</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>CI needs cluster access</td>
					<td>No</td>
					<td>Yes (short-lived token)</td>
			</tr>
			<tr>
					<td>Audit trail</td>
					<td>Git history</td>
					<td>kube-apiserver audit log</td>
			</tr>
			<tr>
					<td>Revocability</td>
					<td>Revert the commit</td>
					<td>Token expires with the job</td>
			</tr>
			<tr>
					<td>Self-hosted setup effort</td>
					<td>Low</td>
					<td>Moderate (OIDC config)</td>
			</tr>
			<tr>
					<td>Works for infra provisioning</td>
					<td>Not really</td>
					<td>Yes</td>
			</tr>
	</tbody>
</table>
<p>For application deployments: GitOps. The cluster reconciles continuously, drift is reverted automatically (with <code>selfHeal</code> on), and CI is completely decoupled from cluster state.</p>
<p>For infrastructure provisioning or one-off operations: OIDC federation. Short-lived credentials, branch-scoped permissions, nothing to rotate.</p>
<p>What you should never do: store a kubeconfig or a long-lived ServiceAccount token in CI secrets. Not because it&rsquo;s hard to make work (it&rsquo;s easy), but because the blast radius of a leak is as wide as the credential&rsquo;s permissions, the audit log can&rsquo;t tell your pipeline from an attacker holding the same token, and there&rsquo;s no expiry. Everything that goes wrong with static secrets goes wrong eventually.</p>
<hr>
<p><em>This is part of a series on Kubernetes interview questions. Next: <a href="/posts/k8s-gitops-secrets/">how to handle secrets in a GitOps repository</a>.</em></p>
]]></content:encoded></item><item><title>🤫 How Do You Handle Secrets in a GitOps Repository?</title><link>https://blog.hippotion.com/posts/k8s-gitops-secrets/</link><pubDate>Fri, 25 Apr 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/k8s-gitops-secrets/</guid><description>GitOps says Git is the source of truth. Secrets say don&amp;rsquo;t put them in Git. These two things appear to be in direct conflict. They&amp;rsquo;re not.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;You&rsquo;re using GitOps — everything goes through Git. How do you handle secrets?&rdquo;</em></p>
<p>The wrong answer: base64-encode them and commit them as Kubernetes <code>Secret</code> objects. Base64 is not encryption. Anyone with read access to the repo has your secrets. If the repo is public, everyone does.</p>
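<p>To make the &ldquo;base64 is not encryption&rdquo; point concrete, here is a minimal Python sketch. The <code>reveal</code> helper is a name made up for this illustration, and the value is a throwaway example password. Decoding needs no key at all:</p>

```python
import base64

# What a committed Secret manifest effectively stores in its data field:
# the value is only base64-encoded, so nothing secret is needed to read it.
encoded = base64.b64encode(b"hunter2").decode()


def reveal(b64_value: str) -> str:
    """Recover the plaintext from a base64-encoded Secret value."""
    return base64.b64decode(b64_value).decode()


print(encoded)
print(reveal(encoded))
```

<p>Anyone who can read the repo can run the equivalent of <code>reveal</code> on every value in it.</p>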
<p>The slightly better wrong answer: use a private repo and just not think about it. This works until a deploy key leaks, someone joins and then leaves the company, or you need to rotate one secret and have to find every place it&rsquo;s referenced.</p>
<p>There are three real answers. They make different tradeoffs.</p>
<hr>
<h2 id="the-constraint">The constraint</h2>
<p>The constraint is actually tighter than &ldquo;don&rsquo;t commit secrets&rdquo;. It&rsquo;s: <strong>your Git repo should be safe to make public at any point</strong>, and <strong>secrets must be rotatable without touching Git</strong>.</p>
<p>If rotating a password requires a new commit, someone has to be awake to merge and deploy it. That&rsquo;s not how you want to handle a 3am incident.</p>
<hr>
<h2 id="option-1-external-secrets-operator--vault">Option 1: External Secrets Operator + Vault</h2>
<p>This is the most robust pattern and the one worth knowing for interviews.</p>
<p>The idea: secrets live in a dedicated secret store (HashiCorp Vault, or a cloud equivalent). A Kubernetes operator, the External Secrets Operator (ESO), watches <code>ExternalSecret</code> objects in the cluster and syncs each referenced secret into a real Kubernetes <code>Secret</code>. The <code>ExternalSecret</code> manifest is safe to commit: it says where the secret lives, not what it is.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># This lives in Git — safe to commit</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">external-secrets.io/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ExternalSecret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">myapp-db-credentials</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">refreshInterval</span><span class="p">:</span><span class="w"> </span><span class="l">1h</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretStoreRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">vault</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterSecretStore</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">target</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">myapp-db-credentials  </span><span class="w"> </span><span class="c"># the k8s Secret it creates</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">data</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">secretKey</span><span class="p">:</span><span class="w"> </span><span class="l">DB_PASSWORD</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">remoteRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">key</span><span class="p">:</span><span class="w"> </span><span class="l">secret/myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">property</span><span class="p">:</span><span class="w"> </span><span class="l">db-password</span><span class="w">
</span></span></span></code></pre></div><p>Rotation: you update the secret in Vault. ESO syncs it to the cluster within <code>refreshInterval</code>. No Git commit, no deployment. A pod that consumes the <code>Secret</code> as an env var sees the new value on its next restart; a pod that mounts it as a volume sees the file change after the kubelet&rsquo;s next sync, provided the app re-reads it (for example on <code>SIGHUP</code>).</p>
<p>Audit trail: Vault logs every read and write. You know exactly which service account read which secret at what time.</p>
<p>The cost: you&rsquo;re running Vault. For a homelab or small team, that&rsquo;s an extra thing to operate. For production, it&rsquo;s worth it.</p>
<p>Self-hosted setup:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># ClusterSecretStore — connects ESO to your Vault instance</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">external-secrets.io/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterSecretStore</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">vault</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">provider</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">vault</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">server</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;http://sys-vault.sys-vault.svc.cluster.local:8200&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;secret&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;v2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">auth</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">kubernetes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;kubernetes&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">role</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;external-secrets&#34;</span><span class="w">
</span></span></span></code></pre></div><p>ESO authenticates to Vault using the pod&rsquo;s Kubernetes ServiceAccount token. Vault validates it against the cluster&rsquo;s token review endpoint. No static credentials anywhere.</p>
<hr>
<h2 id="option-2-sealed-secrets">Option 2: Sealed Secrets</h2>
<p>Sealed Secrets uses asymmetric encryption. The cluster holds a private key. You use the <code>kubeseal</code> CLI to encrypt a secret with the cluster&rsquo;s public key. The resulting <code>SealedSecret</code> object is safe to commit — only the cluster can decrypt it.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Encrypt a secret for committing to Git</span>
</span></span><span class="line"><span class="cl">kubectl create secret generic myapp-db <span class="se">\
</span></span></span><span class="line"><span class="cl">  --from-literal<span class="o">=</span><span class="nv">DB_PASSWORD</span><span class="o">=</span>hunter2 <span class="se">\
</span></span></span><span class="line"><span class="cl">  --dry-run<span class="o">=</span>client -o yaml <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="p">|</span> kubeseal --format yaml <span class="se">\
</span></span></span><span class="line"><span class="cl">  &gt; sealed-secrets/myapp-db.yaml
</span></span></code></pre></div><p>The resulting YAML looks like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">bitnami.com/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">SealedSecret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">myapp-db</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">encryptedData</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">DB_PASSWORD</span><span class="p">:</span><span class="w"> </span><span class="l">AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...</span><span class="w">
</span></span></span></code></pre></div><p>This gets committed. The Sealed Secrets controller in the cluster decrypts it and creates the real <code>Secret</code> automatically.</p>
<p>The tradeoff: rotation means re-sealing. You need the cluster&rsquo;s public key (which is public) and access to the plaintext secret. You commit a new <code>SealedSecret</code>. That&rsquo;s a Git commit, which means a review, a merge, and a deploy. For a 3am incident, that&rsquo;s a lot of friction.</p>
<p>Also: if the cluster&rsquo;s private key is lost, you can&rsquo;t decrypt any of your sealed secrets. Back up the private key.</p>
<p>Good fit for: small teams, homelab, situations where secrets change rarely and the GitOps review process is actually desirable.</p>
<hr>
<h2 id="option-3-sops">Option 3: SOPS</h2>
<p>SOPS (Secrets OPerationS) encrypts files at rest using age keys or cloud KMS. You commit encrypted files. CI decrypts them during deployment using a key it holds in memory (not stored in Git).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Encrypt a file for Git</span>
</span></span><span class="line"><span class="cl">sops --encrypt --age age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8q <span class="se">\
</span></span></span><span class="line"><span class="cl">  secrets/myapp.yaml &gt; secrets/myapp.enc.yaml
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># In CI: decrypt and pipe straight to kubectl, no plaintext file on disk</span>
</span></span><span class="line"><span class="cl">sops --decrypt secrets/myapp.enc.yaml <span class="p">|</span> kubectl apply -f -
</span></span></code></pre></div><p>The difference from Sealed Secrets: SOPS encrypts at the file level, not the k8s object level. You can use it outside of Kubernetes (application configs, Terraform variables). The key can live in the CI environment, a cloud KMS, or a personal age key.</p>
<p>The tradeoff: CI needs the decryption key, which puts you back in &ldquo;secret in CI&rdquo; territory — just for the encryption key rather than the actual secrets. If you use a cloud KMS, OIDC federation handles that (no stored key). If you use an age key, it lives in CI secrets.</p>
<p>Good fit for: teams already using Helm and Helm Secrets, polyglot environments where not everything is Kubernetes, small teams where Vault feels like overengineering.</p>
<hr>
<h2 id="comparison">Comparison</h2>
<table>
	<thead>
			<tr>
					<th></th>
					<th>ESO + Vault</th>
					<th>Sealed Secrets</th>
					<th>SOPS</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Rotation without Git commit</td>
					<td>Yes</td>
					<td>No</td>
					<td>Depends</td>
			</tr>
			<tr>
					<td>Audit trail</td>
					<td>Full (Vault)</td>
					<td>None</td>
					<td>Depends on KMS</td>
			</tr>
			<tr>
					<td>Complexity</td>
					<td>High</td>
					<td>Low</td>
					<td>Medium</td>
			</tr>
			<tr>
					<td>Works outside k8s</td>
					<td>With effort</td>
					<td>No</td>
					<td>Yes</td>
			</tr>
			<tr>
					<td>Recovery if key lost</td>
					<td>Vault backup</td>
					<td>Lose all secrets</td>
					<td>Key backup</td>
			</tr>
			<tr>
					<td>CI needs secret material</td>
					<td>No</td>
					<td>No</td>
					<td>Yes (decrypt key)</td>
			</tr>
	</tbody>
</table>
<hr>
<h2 id="what-interviewers-are-actually-testing">What interviewers are actually testing</h2>
<p>The interesting follow-up question is: <em>&ldquo;How do you rotate a secret without downtime?&rdquo;</em></p>
<p>The answer requires you to understand that pods read <code>Secret</code> values when their containers start. Updating the <code>Secret</code> in Kubernetes doesn&rsquo;t automatically restart the pod, and env vars never change inside a running container. Your options are:</p>
<ol>
<li>Mount the secret as a volume and have the app watch for file changes (good)</li>
<li>Restart the deployment after rotation (<code>kubectl rollout restart</code>, automatable)</li>
<li>Use a sidecar such as Vault Agent (added by the Vault Agent Injector) that refreshes the secret inside the pod (complex but zero-restart)</li>
</ol>
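<p>Option 1 can be sketched in a few lines of Python. This is an illustration under assumptions, not a production reloader: <code>ReloadingSecret</code> and the file name are hypothetical, and a temp directory stands in for the mounted volume. The app polls the file and compares content, so it picks up whatever the kubelet writes after a rotation:</p>

```python
import os
import tempfile


class ReloadingSecret:
    """Re-read a secret from a mounted file whenever its content changes."""

    def __init__(self, path):
        self.path = path
        self.value = self._read()

    def _read(self):
        with open(self.path, encoding="utf-8") as f:
            return f.read().strip()

    def refresh(self):
        """Return True if the secret changed since the last read."""
        current = self._read()
        changed = current != self.value
        self.value = current
        return changed


# Demo: a temp directory stands in for the Secret volume mount.
with tempfile.TemporaryDirectory() as mount:
    secret_path = os.path.join(mount, "DB_PASSWORD")
    with open(secret_path, "w", encoding="utf-8") as f:
        f.write("old-password")
    secret = ReloadingSecret(secret_path)

    with open(secret_path, "w", encoding="utf-8") as f:
        f.write("new-password")  # simulated rotation
    if secret.refresh():
        print("secret rotated, reconnect with the new value")
```

<p>In a real app, <code>refresh()</code> would run on a timer or from a file watcher, and a change would trigger a reconnect with the new credentials.</p>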
<p>The correct answer depends on the app. An API key that can be rotated gradually is different from a database password where the old one is invalidated immediately.</p>
<hr>
<p><em>This is part of a series on Kubernetes interview questions. Previously: <a href="/posts/k8s-cicd-no-credentials/">deploying without cluster credentials</a>. Next: <a href="/posts/k8s-zero-downtime/">zero-downtime deployments</a>.</em></p>
]]></content:encoded></item><item><title>📱 Building a QR Code Login for a Homelab (And Learning oauth2-proxy's Session Format the Hard Way)</title><link>https://blog.hippotion.com/posts/qr-device-login/</link><pubDate>Fri, 14 Mar 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/qr-device-login/</guid><description>My homelab uses oauth2-proxy for GitLab SSO. I wanted a QR code login for the TV dashboard. Two days and four complete rewrites later, I knew more about oauth2-proxy&amp;rsquo;s session format than I ever planned to.</description><content:encoded><![CDATA[<h2 id="the-problem">The problem</h2>
<p>My homelab runs a single-node k3s cluster with a full GitOps stack — Argo CD, Traefik, oauth2-proxy for GitLab SSO, the usual over-engineered personal project. One thing that always bothered me: when I want to show the Homer dashboard on the living room TV, I have to type my credentials on a keyboard that wasn&rsquo;t designed for the living room.</p>
<p>The obvious fix is a QR code. Phone scans it, phone authenticates, TV unlocks. Conceptually simple. In practice, a two-day debugging adventure that took me deep into oauth2-proxy&rsquo;s source code.</p>
<hr>
<h2 id="the-design">The design</h2>
<p>The flow I wanted:</p>
<ol>
<li>TV opens <code>qr.hippotion.com</code>, shows a QR code and polls for completion</li>
<li>Phone scans, opens the device URL, taps &ldquo;Continue with GitLab&rdquo;</li>
<li>Phone completes GitLab OAuth</li>
<li>Server marks the session as ready</li>
<li>TV&rsquo;s poll fires, gets redirected to Homer</li>
<li>Later: phone taps &ldquo;End Session&rdquo;, TV locks immediately</li>
</ol>
<p>This is the <a href="https://datatracker.ietf.org/doc/html/rfc8628">OAuth 2.0 Device Authorization Grant</a> pattern adapted for a single trusted user. I wrote it in Go with Redis for session storage. The service generates a device token, stores it with a 5-minute TTL, and uses it as the OAuth <code>state</code> parameter. The phone completes GitLab OAuth and the callback handler links the resulting session to the device token. The TV&rsquo;s poll loop picks it up and redirects.</p>
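<p>The token lifecycle is small enough to sketch. The real service is Go with Redis; the Python below is only an illustration, with an in-memory dict standing in for Redis and <code>DeviceSessions</code>, <code>create</code>, <code>complete</code>, and <code>poll</code> as made-up names:</p>

```python
import secrets
import time


class DeviceSessions:
    """In-memory stand-in for the Redis store (an assumption of this sketch)."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._pending = {}  # device token -> (expiry timestamp, user or None)

    def create(self, now=None):
        """TV side: mint a device token that doubles as the OAuth state."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[token] = (now + self.ttl, None)
        return token

    def _live_entry(self, token, now):
        entry = self._pending.get(token)
        if entry is None or now > entry[0]:
            self._pending.pop(token, None)  # unknown or expired
            return None
        return entry

    def complete(self, token, user, now=None):
        """Phone side: the OAuth callback links the user to the token."""
        now = time.time() if now is None else now
        entry = self._live_entry(token, now)
        if entry is None:
            return False
        self._pending[token] = (entry[0], user)
        return True

    def poll(self, token, now=None):
        """TV side: returns the user once the phone has finished, else None."""
        now = time.time() if now is None else now
        entry = self._live_entry(token, now)
        return None if entry is None else entry[1]
```

<p>The TV calls <code>create()</code> once and then <code>poll()</code> in a loop; the callback handler calls <code>complete()</code> with the <code>state</code> value it received. Expired tokens fail both calls, which is what the 5-minute TTL buys.</p>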
<p>That part was straightforward. The hard part was making the TV&rsquo;s session work for <em>all</em> protected apps on the domain, not just the QR page.</p>
<hr>
<h2 id="the-oauth2-proxy-problem">The oauth2-proxy problem</h2>
<p>My homelab uses oauth2-proxy as a ForwardAuth backend for Traefik. Every protected app (<code>home.hippotion.com</code>, <code>argo.hippotion.com</code>, <code>grafana.hippotion.com</code>, etc.) sends unauthenticated requests through oauth2-proxy, which redirects to GitLab if no valid <code>_oauth2_proxy</code> session cookie is present.</p>
<p>The QR auth service creates its own session cookie (<code>qr_session</code>), but oauth2-proxy knows nothing about it. After QR login, clicking any link from Homer would immediately ask for GitLab credentials again.</p>
<p>The obvious solution: after the phone authenticates, set a valid <code>_oauth2_proxy</code> cookie on the TV&rsquo;s browser. If I can forge a cookie that oauth2-proxy accepts, all apps work instantly.</p>
<p>How hard can it be?</p>
<hr>
<h2 id="attempt-1-aes-gcm--json">Attempt 1: AES-GCM + JSON</h2>
<p>I looked at the oauth2-proxy source and found what looked like the session format: a JSON struct with short field names (<code>&quot;e&quot;</code> for email, <code>&quot;ca&quot;</code> for created-at, etc.), encrypted with AES-GCM, base64url-encoded.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">type</span><span class="w"> </span><span class="nx">oauthSession</span><span class="w"> </span><span class="kd">struct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">CreatedAt</span><span class="w"> </span><span class="o">*</span><span class="nx">time</span><span class="p">.</span><span class="nx">Time</span><span class="w"> </span><span class="s">`json:&#34;ca&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">ExpiresOn</span><span class="w"> </span><span class="o">*</span><span class="nx">time</span><span class="p">.</span><span class="nx">Time</span><span class="w"> </span><span class="s">`json:&#34;ea&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">Email</span><span class="w">     </span><span class="kt">string</span><span class="w">     </span><span class="s">`json:&#34;e&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">User</span><span class="w">      </span><span class="kt">string</span><span class="w">     </span><span class="s">`json:&#34;u&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>SHA256-hash the cookie secret → 32-byte AES key → GCM encrypt → base64url encode. Set as <code>_oauth2_proxy</code> cookie. Clean, simple, wrong.</p>
<p>oauth2-proxy returned 302 every time. I added debug logging to print the cookie value, copied it, and tested it directly against the ForwardAuth endpoint with curl. The logs revealed everything:</p>
<pre tabindex="0"><code>Error loading cookied session: cookie signature not valid, removing session
</code></pre><p><em>Cookie signature not valid.</em> Not &ldquo;decryption failed&rdquo;, not &ldquo;session expired&rdquo;. A signature check.</p>
<hr>
<h2 id="finding-the-real-format">Finding the real format</h2>
<p>The error came from <code>pkg/middleware/stored_session.go:94</code>. I fetched the source:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="nx">val</span><span class="p">,</span><span class="w"> </span><span class="nx">_</span><span class="p">,</span><span class="w"> </span><span class="nx">ok</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">encryption</span><span class="p">.</span><span class="nf">Validate</span><span class="p">(</span><span class="nx">c</span><span class="p">,</span><span class="w"> </span><span class="nx">secret</span><span class="p">,</span><span class="w"> </span><span class="nx">s</span><span class="p">.</span><span class="nx">Cookie</span><span class="p">.</span><span class="nx">Expire</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="p">!</span><span class="nx">ok</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="kc">nil</span><span class="p">,</span><span class="w"> </span><span class="nx">errors</span><span class="p">.</span><span class="nf">New</span><span class="p">(</span><span class="s">&#34;cookie signature not valid&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><code>encryption.Validate</code> splits the cookie value on <code>|</code> and expects three parts. Looking at <code>utils.go</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="nf">Validate</span><span class="p">(</span><span class="nx">cookie</span><span class="w"> </span><span class="o">*</span><span class="nx">http</span><span class="p">.</span><span class="nx">Cookie</span><span class="p">,</span><span class="w"> </span><span class="nx">seed</span><span class="w"> </span><span class="kt">string</span><span class="p">,</span><span class="w"> </span><span class="nx">expiration</span><span class="w"> </span><span class="nx">time</span><span class="p">.</span><span class="nx">Duration</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="nx">value</span><span class="w"> </span><span class="p">[]</span><span class="kt">byte</span><span class="p">,</span><span class="w"> </span><span class="nx">t</span><span class="w"> </span><span class="nx">time</span><span class="p">.</span><span class="nx">Time</span><span class="p">,</span><span class="w"> </span><span class="nx">ok</span><span class="w"> </span><span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">parts</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">strings</span><span class="p">.</span><span class="nf">Split</span><span class="p">(</span><span class="nx">cookie</span><span class="p">.</span><span class="nx">Value</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;|&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="nb">len</span><span class="p">(</span><span class="nx">parts</span><span class="p">)</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="mi">3</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="nf">checkSignature</span><span class="p">(</span><span class="nx">parts</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span><span class="w"> </span><span class="nx">seed</span><span class="p">,</span><span class="w"> </span><span class="nx">cookie</span><span class="p">.</span><span class="nx">Name</span><span class="p">,</span><span class="w"> </span><span class="nx">parts</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span><span class="w"> </span><span class="nx">parts</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The cookie format is <code>encryptedValue|timestamp|hmac</code>. My cookie was just <code>encryptedValue</code>. Three-part, not one. First problem found.</p>
<p>For the HMAC, I needed to verify against a real cookie to get the key format right. oauth2-proxy sets <code>_oauth2_proxy_csrf</code> cookies during the login flow — I captured one from a 302 response and reverse-engineered it in Python:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">key</span> <span class="o">=</span> <span class="n">secret_raw</span><span class="o">.</span><span class="n">encode</span><span class="p">()</span>  <span class="c1"># raw string, not decoded</span>
</span></span><span class="line"><span class="cl"><span class="n">data</span> <span class="o">=</span> <span class="p">(</span><span class="n">cookie_name</span> <span class="o">+</span> <span class="n">enc_val</span> <span class="o">+</span> <span class="n">ts</span><span class="p">)</span><span class="o">.</span><span class="n">encode</span><span class="p">()</span>  <span class="c1"># concatenated, NO separators</span>
</span></span><span class="line"><span class="cl"><span class="n">sig</span> <span class="o">=</span> <span class="n">base64</span><span class="o">.</span><span class="n">urlsafe_b64encode</span><span class="p">(</span><span class="n">hmac</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">sha256</span><span class="p">)</span><span class="o">.</span><span class="n">digest</span><span class="p">())</span>
</span></span></code></pre></div><p>Two surprises: the HMAC key is the <strong>raw cookie secret string</strong> (not base64-decoded), and the input is a <strong>bare concatenation</strong> with no <code>|</code> separators between fields.</p>
<p>I ran the test. The CSRF cookie&rsquo;s signature matched. I had the format.</p>
<p>But oauth2-proxy still rejected the session.</p>
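<p>For reference, the signing rule above can be written as a self-contained round trip. This is a sketch of the format as I reverse-engineered it, not oauth2-proxy&rsquo;s own code: the function names are mine, the timestamp is assumed to be a decimal Unix-seconds string, and the expiry check that the real <code>Validate</code> performs is left out.</p>

```python
import base64
import hashlib
import hmac


def sign(cookie_name, enc_val, ts, secret_raw):
    """HMAC-SHA256 over name + value + timestamp, keyed by the raw secret."""
    key = secret_raw.encode()  # raw string, not base64-decoded
    data = (cookie_name + enc_val + ts).encode()  # bare concatenation
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def build_cookie(cookie_name, enc_val, ts, secret_raw):
    """Assemble the three-part value: encryptedValue|timestamp|hmac."""
    return "|".join([enc_val, ts, sign(cookie_name, enc_val, ts, secret_raw)])


def validate(cookie_name, cookie_value, secret_raw):
    """Check the shape and the signature; expiry is not checked here."""
    parts = cookie_value.split("|")
    if len(parts) != 3:
        return False
    enc_val, ts, sig = parts
    expected = sign(cookie_name, enc_val, ts, secret_raw)
    return hmac.compare_digest(sig, expected)  # constant-time comparison
```

<p>A one-part value, like my first attempt, fails the <code>len(parts)</code> check before any cryptography happens.</p>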
<hr>
<h2 id="the-wrong-cipher">The wrong cipher</h2>
<p>I switched from the bare AES-GCM blob to the correct HMAC-signed three-part format and tried again. Still 302. <code>cookie signature not valid</code> again.</p>
<p>Wait — was it even getting to the signature check? If decryption failed first, it wouldn&rsquo;t reach that error. I added more debug logging to print the full cookie value and tested it with Python&rsquo;s <code>cryptography</code> library:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">candidates</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s1">&#39;24-byte std-b64 decode&#39;</span><span class="p">:</span>  <span class="n">base64</span><span class="o">.</span><span class="n">b64decode</span><span class="p">(</span><span class="n">secret_str</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="s1">&#39;32-byte raw string&#39;</span><span class="p">:</span>      <span class="n">secret_str</span><span class="o">.</span><span class="n">encode</span><span class="p">(),</span>
</span></span><span class="line"><span class="cl">    <span class="s1">&#39;32-byte sha256 of b64&#39;</span><span class="p">:</span>   <span class="n">hashlib</span><span class="o">.</span><span class="n">sha256</span><span class="p">(</span><span class="n">base64</span><span class="o">.</span><span class="n">b64decode</span><span class="p">(</span><span class="n">secret_str</span><span class="p">))</span><span class="o">.</span><span class="n">digest</span><span class="p">(),</span>
</span></span><span class="line"><span class="cl">    <span class="o">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">label</span><span class="p">,</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">candidates</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">pt</span> <span class="o">=</span> <span class="n">AESGCM</span><span class="p">(</span><span class="n">key</span><span class="p">)</span><span class="o">.</span><span class="n">decrypt</span><span class="p">(</span><span class="n">nonce</span><span class="p">,</span> <span class="n">ct_tag</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;SUCCESS [</span><span class="si">{</span><span class="n">label</span><span class="si">}</span><span class="s1">]: </span><span class="si">{</span><span class="n">pt</span><span class="o">.</span><span class="n">decode</span><span class="p">()</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;FAIL    [</span><span class="si">{</span><span class="n">label</span><span class="si">}</span><span class="s1">]: </span><span class="si">{</span><span class="n">e</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
</span></span></code></pre></div><p>The 24-byte base64-decoded key decrypted successfully, so the cookie I was producing was at least internally consistent. It was still rejected, which meant <em>something else</em> was wrong: oauth2-proxy had to be decoding the cookie differently from how I was encoding it.</p>
<p>I went back to the source. <code>session_store.go</code> → <code>NewCookieSessionStore</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="nx">cipher</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">encryption</span><span class="p">.</span><span class="nf">NewCFBCipher</span><span class="p">(</span><span class="nx">encryption</span><span class="p">.</span><span class="nf">SecretBytes</span><span class="p">(</span><span class="nx">secret</span><span class="p">))</span><span class="w">
</span></span></span></code></pre></div><p><strong>AES-CFB. Not GCM.</strong> The cookie session store uses CFB. GCM exists in the codebase for a different purpose (the Redis ticket store, which I hadn&rsquo;t discovered yet). I had been encrypting with the wrong cipher the entire time.</p>
<p>And <code>SecretBytes</code> — a function I&rsquo;d been reading but not understanding:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="nf">SecretBytes</span><span class="p">(</span><span class="nx">secret</span><span class="w"> </span><span class="kt">string</span><span class="p">)</span><span class="w"> </span><span class="p">[]</span><span class="kt">byte</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">b</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">base64</span><span class="p">.</span><span class="nx">RawURLEncoding</span><span class="p">.</span><span class="nf">DecodeString</span><span class="p">(</span><span class="nx">strings</span><span class="p">.</span><span class="nf">TrimRight</span><span class="p">(</span><span class="nx">secret</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;=&#34;</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">nil</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="nx">_</span><span class="p">,</span><span class="w"> </span><span class="nx">i</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="k">range</span><span class="w"> </span><span class="p">[]</span><span class="kt">int</span><span class="p">{</span><span class="mi">16</span><span class="p">,</span><span class="w"> </span><span class="mi">24</span><span class="p">,</span><span class="w"> </span><span class="mi">32</span><span class="p">}</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="nb">len</span><span class="p">(</span><span class="nx">b</span><span class="p">)</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="nx">i</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="nx">b</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="p">[]</span><span class="nb">byte</span><span class="p">(</span><span class="nx">secret</span><span class="p">)</span><span class="w">  </span><span class="c1">// fallback: raw string</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The cookie secret <code>q7OF9sK2/Pnt9QKNoBBmxWRL3GAbWzvj</code> contains <code>/</code>. That&rsquo;s valid standard base64 but not URL-safe base64 — <code>RawURLEncoding</code> fails. Fallback to raw string: 32 bytes, valid AES-256 key. My Python test had used standard base64 decoding, which <em>did</em> succeed (and produced a different 24-byte key). My Go implementation had done the same. Both were deriving the wrong key.</p>
<p>I rewrote the cipher to AES-CFB with the raw-string key. New test. Same error. Still rejecting.</p>
<hr>
<h2 id="messagepack-and-lz4">MessagePack and LZ4</h2>
<p>Back to the source. <code>EncodeSessionState</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="p">(</span><span class="nx">s</span><span class="w"> </span><span class="o">*</span><span class="nx">SessionState</span><span class="p">)</span><span class="w"> </span><span class="nf">EncodeSessionState</span><span class="p">(</span><span class="nx">c</span><span class="w"> </span><span class="nx">encryption</span><span class="p">.</span><span class="nx">Cipher</span><span class="p">,</span><span class="w"> </span><span class="nx">compress</span><span class="w"> </span><span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">([]</span><span class="kt">byte</span><span class="p">,</span><span class="w"> </span><span class="kt">error</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">packed</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">msgpack</span><span class="p">.</span><span class="nf">Marshal</span><span class="p">(</span><span class="nx">s</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">compressed</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nf">lz4Compress</span><span class="p">(</span><span class="nx">packed</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="nx">c</span><span class="p">.</span><span class="nf">Encrypt</span><span class="p">(</span><span class="nx">compressed</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>MessagePack. LZ4 compression. Then AES-CFB.</strong></p>
<p>I had been encrypting raw JSON. The whole time.</p>
<p>The struct tags confirmed it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">type</span><span class="w"> </span><span class="nx">SessionState</span><span class="w"> </span><span class="kd">struct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">CreatedAt</span><span class="w"> </span><span class="o">*</span><span class="nx">time</span><span class="p">.</span><span class="nx">Time</span><span class="w"> </span><span class="s">`msgpack:&#34;ca,omitempty&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">ExpiresOn</span><span class="w"> </span><span class="o">*</span><span class="nx">time</span><span class="p">.</span><span class="nx">Time</span><span class="w"> </span><span class="s">`msgpack:&#34;eo,omitempty&#34;`</span><span class="w">  </span><span class="c1">// &#34;eo&#34;, not &#34;ea&#34; as I&#39;d assumed</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">AccessToken</span><span class="w"> </span><span class="kt">string</span><span class="w">   </span><span class="s">`msgpack:&#34;at,omitempty&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">Email</span><span class="w">      </span><span class="kt">string</span><span class="w">    </span><span class="s">`msgpack:&#34;e,omitempty&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">User</span><span class="w">       </span><span class="kt">string</span><span class="w">    </span><span class="s">`msgpack:&#34;u,omitempty&#34;`</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Even the ExpiresOn field name was different from what I&rsquo;d guessed (<code>&quot;eo&quot;</code> not <code>&quot;ea&quot;</code>).</p>
<p>I added the <code>vmihailenco/msgpack</code> and <code>pierrec/lz4</code> dependencies and rewrote the encoding pipeline: msgpack → lz4 → AES-CFB(raw-string key) → base64url(encrypted) → sign with HMAC.</p>
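<p>The last stage of that pipeline, the signature, can be sketched with the standard library alone. This assumes what I found earlier: HMAC-SHA256 keyed with the raw secret string, over a bare concatenation of the fields. The <code>sign</code> and <code>verify</code> helpers, the field values, and the base64url encoding of the digest are assumptions for illustration.</p>

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// sign computes HMAC-SHA256 over the fields written back to back, with no
// separators, keyed with the raw secret string (not base64-decoded).
func sign(secret string, fields ...string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	for _, f := range fields {
		mac.Write([]byte(f))
	}
	// Encoding the digest as base64url is an assumption of this sketch.
	return base64.URLEncoding.EncodeToString(mac.Sum(nil))
}

// verify recomputes the signature and compares it in constant time.
func verify(secret, sig string, fields ...string) bool {
	return hmac.Equal([]byte(sig), []byte(sign(secret, fields...)))
}

func main() {
	secret := "abcdefgh/jklmnopqrstuvwxyz012345" // placeholder secret
	// Hypothetical field values: cookie name, encoded value, timestamp.
	sig := sign(secret, "_oauth2_proxy", "ZW5jcnlwdGVk", "1700000000")
	fmt.Println("valid:", verify(secret, sig, "_oauth2_proxy", "ZW5jcnlwdGVk", "1700000000"))
	fmt.Println("tampered:", verify(secret, sig, "_oauth2_proxy", "ZW5jcnlwdGVk", "1700000001"))
}
```

<p>Because the fields are concatenated without separators, <code>sign(k, "a", "bc")</code> and <code>sign(k, "ab", "c")</code> produce the same signature. That also explains why inserting <code>|</code> between fields produces a signature that never matches.</p>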
<p>Ran the curl test. <strong>HTTP 200.</strong></p>
<p>After three days and four complete rewrites of the encoding logic, oauth2-proxy accepted the forged session.</p>
<hr>
<h2 id="the-access-token-problem">The access token problem</h2>
<p>Celebrating was premature. The request worked from curl, but real ForwardAuth requests from the browser kept failing intermittently. Looking at the logs:</p>
<pre tabindex="0"><code>Error loading cookied session: session is invalid
</code></pre><p>This came from <code>validateSession</code> in the storedSessionLoader — after successfully loading the session, it was calling the provider&rsquo;s <code>ValidateSession</code> method and getting false back. I checked the GitLab provider:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="p">(</span><span class="nx">p</span><span class="w"> </span><span class="o">*</span><span class="nx">GitLabProvider</span><span class="p">)</span><span class="w"> </span><span class="nf">ValidateSession</span><span class="p">(</span><span class="nx">ctx</span><span class="w"> </span><span class="nx">context</span><span class="p">.</span><span class="nx">Context</span><span class="p">,</span><span class="w"> </span><span class="nx">s</span><span class="w"> </span><span class="o">*</span><span class="nx">sessions</span><span class="p">.</span><span class="nx">SessionState</span><span class="p">)</span><span class="w"> </span><span class="kt">bool</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="nf">validateToken</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span><span class="w"> </span><span class="nx">p</span><span class="p">,</span><span class="w"> </span><span class="nx">s</span><span class="p">.</span><span class="nx">AccessToken</span><span class="p">,</span><span class="w"> </span><span class="nf">makeOIDCHeader</span><span class="p">(</span><span class="nx">s</span><span class="p">.</span><span class="nx">IDToken</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>oauth2-proxy calls GitLab&rsquo;s <code>/oauth/token/info</code> endpoint with the access token to verify the session is still active. My forged session had an empty <code>AccessToken</code> field. Empty access token → <code>validateToken</code> returns false immediately → session rejected.</p>
<p>The fix: during the phone&rsquo;s GitLab OAuth flow, <code>exchangeCode</code> was already calling GitLab&rsquo;s token endpoint and receiving an access token, but I&rsquo;d been discarding it. I changed the function signature to return it, stored it in the session, and included it in the forged session&rsquo;s <code>at</code> field.</p>
<p>The token was issued for my qr-auth GitLab app, not oauth2-proxy&rsquo;s app. But GitLab&rsquo;s <code>/oauth/token/info</code> endpoint doesn&rsquo;t check the issuing application — it just validates the token is active and returns 200. oauth2-proxy only checks for a 200 response. The token worked.</p>
<p>Everything worked.</p>
<hr>
<h2 id="the-end-session-problem--three-attempts">The End Session problem — three attempts</h2>
<h3 id="attempt-1-delete-qr_session-lock-the-qr-page">Attempt 1: Delete qr_session, lock the QR page</h3>
<p>The first End Session implementation deleted the <code>qr_session</code> key from Redis. To make this actually lock the screen, I restored the Homer proxy at <code>qr.hippotion.com</code> — the TV would show Homer via an ExternalName Kubernetes service pointing at the Homer pod, guarded by a Traefik ForwardAuth middleware that checked the <code>qr_session</code> cookie. Homer makes status API calls every ~30 seconds, which re-triggered ForwardAuth, and deleting <code>qr_session</code> meant the screen would lock within 30 seconds automatically.</p>
<p>This worked for <code>qr.hippotion.com</code>, but the <code>_oauth2_proxy</code> cookie was stateless — a signed, self-contained encrypted blob in the browser. There was no server-side record to delete. Other apps (<code>argo.hippotion.com</code>, <code>grafana.hippotion.com</code>, etc.) kept working until the 8-hour cookie expiry.</p>
<p>The TV screen was locked. The session wasn&rsquo;t.</p>
<h3 id="attempt-2-shorter-cookie-ttl">Attempt 2: Shorter cookie TTL</h3>
<p>The tempting quick fix: reduce the forged cookie&rsquo;s TTL from 8 hours to something shorter, like 30 minutes. End Session would lock the TV immediately. Other apps would expire within 30 minutes on their own.</p>
<p>Rejected. 30 minutes of residual access on a shared TV is too long, and the TTL is arbitrary — it doesn&rsquo;t match what End Session is supposed to mean.</p>
<h3 id="attempt-3-redis-backed-oauth2-proxy-sessions">Attempt 3: Redis-backed oauth2-proxy sessions</h3>
<p>The correct fix is what oauth2-proxy calls <em>persistence tickets</em>. Instead of encoding the entire session into the cookie, oauth2-proxy stores the session in Redis and puts only a ticket reference in the cookie. When the ticket is deleted from Redis, the session is gone on the next request.</p>
<p>The ticket format, from <code>pkg/sessions/persistence/ticket.go</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="c1">// ticketID format: &#34;_oauth2_proxy-&lt;hex(16 random bytes)&gt;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nx">ticketID</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">fmt</span><span class="p">.</span><span class="nf">Sprintf</span><span class="p">(</span><span class="s">&#34;%s-%s&#34;</span><span class="p">,</span><span class="w"> </span><span class="nx">cookieOpts</span><span class="p">.</span><span class="nx">Name</span><span class="p">,</span><span class="w"> </span><span class="nx">hex</span><span class="p">.</span><span class="nf">EncodeToString</span><span class="p">(</span><span class="nx">rawID</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// ticket string in the cookie: &#34;v2.&lt;base64url(ticketID)&gt;.&lt;base64url(ticketSecret)&gt;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="p">(</span><span class="nx">t</span><span class="w"> </span><span class="o">*</span><span class="nx">ticket</span><span class="p">)</span><span class="w"> </span><span class="nf">encodeTicket</span><span class="p">()</span><span class="w"> </span><span class="kt">string</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="nx">fmt</span><span class="p">.</span><span class="nf">Sprintf</span><span class="p">(</span><span class="s">&#34;v2.%s.%s&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nx">base64</span><span class="p">.</span><span class="nx">RawURLEncoding</span><span class="p">.</span><span class="nf">EncodeToString</span><span class="p">([]</span><span class="nb">byte</span><span class="p">(</span><span class="nx">t</span><span class="p">.</span><span class="nx">id</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nx">base64</span><span class="p">.</span><span class="nx">RawURLEncoding</span><span class="p">.</span><span class="nf">EncodeToString</span><span class="p">(</span><span class="nx">t</span><span class="p">.</span><span class="nx">secret</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// session stored in Redis, encrypted with the *ticket* secret (not the cookie secret)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="p">(</span><span class="nx">t</span><span class="w"> </span><span class="o">*</span><span class="nx">ticket</span><span class="p">)</span><span class="w"> </span><span class="nf">saveSession</span><span class="p">(</span><span class="nx">s</span><span class="w"> </span><span class="o">*</span><span class="nx">sessions</span><span class="p">.</span><span class="nx">SessionState</span><span class="p">,</span><span class="w"> </span><span class="nx">saver</span><span class="w"> </span><span class="nx">saveFunc</span><span class="p">)</span><span class="w"> </span><span class="kt">error</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">c</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">encryption</span><span class="p">.</span><span class="nf">NewGCMCipher</span><span class="p">(</span><span class="nx">t</span><span class="p">.</span><span class="nx">secret</span><span class="p">)</span><span class="w">  </span><span class="c1">// GCM, not CFB</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">ciphertext</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">s</span><span class="p">.</span><span class="nf">EncodeSessionState</span><span class="p">(</span><span class="nx">c</span><span class="p">,</span><span class="w"> </span><span class="kc">false</span><span class="p">)</span><span class="w">  </span><span class="c1">// msgpack, NO lz4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="nf">saver</span><span class="p">(</span><span class="nx">t</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span><span class="w"> </span><span class="nx">ciphertext</span><span class="p">,</span><span class="w"> </span><span class="nx">t</span><span class="p">.</span><span class="nx">options</span><span class="p">.</span><span class="nx">Expire</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is a completely different format from the cookie session:</p>
<table>
	<thead>
			<tr>
					<th></th>
					<th>Cookie session</th>
					<th>Redis session (ticket)</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Cipher</td>
					<td>AES-256-CFB</td>
					<td>AES-128-GCM</td>
			</tr>
			<tr>
					<td>Key</td>
					<td>cookie secret (raw string)</td>
					<td>per-session ticket secret</td>
			</tr>
			<tr>
					<td>Serialization</td>
					<td>msgpack</td>
					<td>msgpack</td>
			</tr>
			<tr>
					<td>Compression</td>
					<td>lz4</td>
					<td><strong>none</strong></td>
			</tr>
			<tr>
					<td>Storage</td>
					<td>in the cookie</td>
					<td>Redis, keyed by ticket ID</td>
			</tr>
			<tr>
					<td>Revocable</td>
					<td>no</td>
					<td>yes</td>
			</tr>
	</tbody>
</table>
<p>I rewrote the session creation to generate a random ticket ID and secret, encrypt the msgpack session with AES-GCM using the ticket secret, store it in Redis, and set the signed ticket reference as the <code>_oauth2_proxy</code> cookie.</p>
<p>I stored the ticket ID alongside the <code>qr_session</code> in Redis:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;email&#34;</span><span class="p">:</span> <span class="s2">&#34;user@example.com&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;username&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;access_token&#34;</span><span class="p">:</span> <span class="s2">&#34;...&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;oauth2_ticket_id&#34;</span><span class="p">:</span> <span class="s2">&#34;_oauth2_proxy-eeeb18501625dee77f344c0a6193d0bc&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>End Session now does two Redis deletes:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span><span class="w"> </span><span class="nf">handleLogout</span><span class="p">(</span><span class="nx">w</span><span class="w"> </span><span class="nx">http</span><span class="p">.</span><span class="nx">ResponseWriter</span><span class="p">,</span><span class="w"> </span><span class="nx">r</span><span class="w"> </span><span class="o">*</span><span class="nx">http</span><span class="p">.</span><span class="nx">Request</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">sessionID</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">r</span><span class="p">.</span><span class="nf">FormValue</span><span class="p">(</span><span class="s">&#34;session_id&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">ctx</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">r</span><span class="p">.</span><span class="nf">Context</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="nx">raw</span><span class="p">,</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">rdb</span><span class="p">.</span><span class="nf">Get</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;session:&#34;</span><span class="o">+</span><span class="nx">sessionID</span><span class="p">).</span><span class="nf">Result</span><span class="p">();</span><span class="w"> </span><span class="nx">err</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">nil</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">var</span><span class="w"> </span><span class="nx">sd</span><span class="w"> </span><span class="nx">sessionData</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="nx">json</span><span class="p">.</span><span class="nf">Unmarshal</span><span class="p">([]</span><span class="nb">byte</span><span class="p">(</span><span class="nx">raw</span><span class="p">),</span><span class="w"> </span><span class="o">&amp;</span><span class="nx">sd</span><span class="p">)</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">nil</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="nx">sd</span><span class="p">.</span><span class="nx">OAuth2TicketID</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="s">&#34;&#34;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nx">rdb</span><span class="p">.</span><span class="nf">Del</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span><span class="w"> </span><span class="nx">sd</span><span class="p">.</span><span class="nx">OAuth2TicketID</span><span class="p">)</span><span class="w">  </span><span class="c1">// kills oauth2-proxy session</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nx">rdb</span><span class="p">.</span><span class="nf">Del</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;session:&#34;</span><span class="o">+</span><span class="nx">sessionID</span><span class="p">)</span><span class="w">  </span><span class="c1">// kills qr session</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I configured oauth2-proxy to use Redis session storage pointing at the same Redis instance, added the Cilium network policy to allow ingress from the oauth2-proxy namespace, and removed the Homer proxy from <code>qr.hippotion.com</code> — it was no longer needed.</p>
<p>One final gotcha: <code>session_store_type = &quot;redis&quot;</code> in oauth2-proxy&rsquo;s legacy config file did nothing. There was no error and no warning; oauth2-proxy silently ignored the option. In my deployment, the setting only took effect when passed as an actual CLI argument via <code>extraArgs</code> in the Helm chart values:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">extraArgs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">session-store-type</span><span class="p">:</span><span class="w"> </span><span class="l">redis</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">redis-connection-url</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;redis://qr-auth-redis:6379&#34;</span><span class="w">
</span></span></span></code></pre></div><p>After that, End Session worked correctly. The phone taps the button, the ticket is deleted from Redis, and the next ForwardAuth request for any app on the domain immediately redirects to the QR lock screen.</p>
<hr>
<h2 id="what-the-final-architecture-looks-like">What the final architecture looks like</h2>
<pre tabindex="0"><code>Phone: scan QR
  → /device?token=xxx → intermediate page (&#34;Continue with GitLab&#34;)
  → GitLab OAuth on phone (already logged in → direct callback)
  → /callback: exchange code → get email + access token
  → create Redis ticket: AES-128-GCM(msgpack(session), ticketSecret)
  → store ticket in Redis at &#34;_oauth2_proxy-&lt;hex&gt;&#34;
  → mark device token as authed, store ticketID in qr session

TV: poll fires
  → read qr session from Redis (has email, accessToken, ticketID)
  → set _oauth2_proxy cookie: signed ticket reference
  → set qr_session cookie
  → redirect to home.hippotion.com

Any protected app (home, argo, grafana, ...):
  → Traefik ForwardAuth → oauth2-proxy
  → oauth2-proxy reads _oauth2_proxy cookie → decodes ticket
  → looks up &#34;_oauth2_proxy-&lt;hex&gt;&#34; in Redis → decrypts session
  → validates email, access token → 200 OK

Phone: &#34;End Session&#34;
  → POST /logout with session_id
  → delete &#34;_oauth2_proxy-&lt;hex&gt;&#34; from Redis (oauth2 ticket gone)
  → delete &#34;session:&lt;id&gt;&#34; from Redis (qr session gone)
  → next ForwardAuth on TV: Redis lookup fails → redirect to login
</code></pre><p>The intermediate page on the phone (&ldquo;Continue with GitLab&rdquo; button instead of auto-redirect) was an unexpected requirement. Mobile browsers opened by the camera app often don&rsquo;t share sessions with the browser where GitLab is logged in. When you auto-redirect to GitLab in a browser with no existing session, GitLab redirects to the sign-in page. The OAuth state is stored in a session cookie that GitLab sets during the initial authorize request. On mobile, the sign-in form submission can lose this cookie due to SameSite restrictions — after sign-in, GitLab can&rsquo;t resume the OAuth flow and falls back to <code>/users/sign_in</code> with no further redirect. The intermediate page gives the user a visible moment to confirm they&rsquo;re in a browser with an active GitLab session before initiating the OAuth redirect.</p>
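<p>The ticket step in the flow above can be sketched in isolation. This is a minimal sketch, not oauth2-proxy&rsquo;s code: it assumes the session is already serialized (a plain string stands in for the msgpack bytes), that the ticket secret is 16 random bytes (which is what selects AES-128), and that the nonce is prepended to the ciphertext. The names <code>newGCM</code>, <code>sealTicket</code> and <code>openTicket</code> are hypothetical.</p>

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from the secret.
// A 16-byte secret selects AES-128.
func newGCM(secret []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(secret)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealTicket encrypts an already-serialized session.
// Output layout (an assumption of this sketch): nonce || ciphertext || tag.
func sealTicket(secret, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(secret)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends ciphertext and tag to its first argument (the nonce).
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openTicket reverses sealTicket and fails if the blob was modified
// or the secret is wrong.
func openTicket(secret, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(secret)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed ticket is shorter than the nonce")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	secret := make([]byte, 16) // hypothetical per-ticket secret
	if _, err := rand.Read(secret); err != nil {
		panic(err)
	}
	session := []byte("stand-in for the msgpack-encoded session")

	sealed, err := sealTicket(secret, session)
	if err != nil {
		panic(err)
	}
	plain, err := openTicket(secret, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", string(plain) == string(session))

	// Flip one bit in the authentication tag: decryption must now fail.
	sealed[len(sealed)-1] ^= 0xff
	_, err = openTicket(secret, sealed)
	fmt.Println("tampered ticket rejected:", err != nil)
}
```

<p>GCM is authenticated, so a modified blob or the wrong secret produces an error rather than garbage plaintext.</p>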
<hr>
<h2 id="lessons">Lessons</h2>
<p><strong>Read the source, not the docs.</strong> The docs say &ldquo;AES encryption&rdquo; without specifying the mode or how the key is derived. The source has the answer in twenty lines.</p>
<p><strong>Test at the boundary.</strong> The curl test against the ForwardAuth endpoint was the most valuable debugging step. It isolated exactly which layer was failing and gave me the real error message instead of a browser redirect loop. Without it, I&rsquo;d still be guessing.</p>
<p><strong>Format assumptions are fragile.</strong> I assumed JSON because JSON is the default for everything. oauth2-proxy uses MessagePack because it produces smaller cookies. LZ4 because it decompresses fast. AES-CFB because that&rsquo;s what was chosen when the code was written. None of this is unreasonable, but none of it is obvious from the outside.</p>
<p><strong>Two formats, same codebase.</strong> Cookie sessions and Redis ticket sessions use different ciphers, different compression, different key derivation. The GCM cipher I found first is correct — but for Redis sessions, not cookie sessions. The CFB cipher is for cookie sessions. I had the right code in the wrong place.</p>
<p><strong>Config files can silently ignore options.</strong> <code>session_store_type = &quot;redis&quot;</code> in oauth2-proxy&rsquo;s legacy config file does nothing. <code>--session-store-type=redis</code> on the command line works. No error, no warning, no indication that the option was parsed but not applied.</p>
<p><strong>Revocability requires server-side state.</strong> A self-contained encrypted cookie cannot be revoked without adding a denylist (which has its own scaling problems). If you need End Session to mean something, you need a server-side session store. oauth2-proxy supports Redis sessions precisely for this reason — the ticket design is clean and the revocation path is a single Redis delete.</p>
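<p>To make the revocation argument concrete, here is a minimal sketch with an in-memory map standing in for Redis. The <code>ticketStore</code> type, the <code>authorize</code> function, and the key <code>_oauth2_proxy-abc123</code> are all hypothetical; the only point is that once the key is deleted, the very next lookup fails, no matter what cookie the browser still holds.</p>

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNoSession = errors.New("session not found")

// ticketStore is a stand-in for Redis: ticket key -> encrypted session blob.
type ticketStore struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newTicketStore() *ticketStore {
	return &ticketStore{data: make(map[string][]byte)}
}

func (s *ticketStore) Set(key string, value []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

func (s *ticketStore) Get(key string) ([]byte, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.data[key]
	if !ok {
		return nil, errNoSession
	}
	return v, nil
}

func (s *ticketStore) Del(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// authorize mimics the per-request auth decision: the cookie only carries
// a reference, so the request is allowed only if the store still has the ticket.
func authorize(s *ticketStore, ticketKey string) bool {
	_, err := s.Get(ticketKey)
	return err == nil
}

func main() {
	store := newTicketStore()
	key := "_oauth2_proxy-abc123" // hypothetical ticket key

	store.Set(key, []byte("encrypted session blob"))
	fmt.Println(authorize(store, key)) // true

	// "End Session": a single delete revokes every later request.
	store.Del(key)
	fmt.Println(authorize(store, key)) // false
}
```

<p>A self-contained cookie has no equivalent of that <code>Del</code> call, which is why it stays valid until it expires.</p>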
<p>The code is at <a href="https://github.com/janos-gyorgy/qr-device-login">github.com/janos-gyorgy/qr-device-login</a>.</p>
]]></content:encoded></item></channel></rss>