<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Cilium on hippotion</title><link>https://blog.hippotion.com/tags/cilium/</link><description>Recent content in Cilium on hippotion</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 27 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.hippotion.com/tags/cilium/index.xml" rel="self" type="application/rss+xml"/><item><title>🌱 My Second Brain Weeds Itself Now</title><link>https://blog.hippotion.com/posts/an-ai-gardener-for-your-second-brain/</link><pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/an-ai-gardener-for-your-second-brain/</guid><description>I gave my markdown knowledge base a nightly gardener — an AI that finds orphan notes and missing links and fixes them, every change a reviewable git commit. The fun part was the Kubernetes wall I hit on the way.</description><content:encoded><![CDATA[<p>A few weeks ago I <a href="/posts/a-second-brain-you-can-git-clone/">rebuilt my second brain as a folder of markdown in git</a> — vault is the source of truth, everything else (search index, graph, 3D viewer) is a derived layer I can delete and rebuild. I love it. But a knowledge base has a dirty secret: <strong>it rots.</strong></p>
<p>Not the files — those are fine. The <em>connections</em> rot. You capture a note at 11pm and never link it to anything, so it becomes an orphan floating off the graph. A project note&rsquo;s one-line summary describes what the project was three weeks ago. Two notes are obviously about the same thing and neither knows the other exists. Do this for a few months and you don&rsquo;t have a second brain, you have a junk drawer with good search.</p>
<p>The honest fix is to weed the garden regularly. The honest truth is that nobody does, including me.</p>
<p>So I stopped relying on myself and built a gardener.</p>
<h2 id="what-it-actually-does">What it actually does</h2>
<p>Every night at 3am, on my homelab box, a script runs:</p>
<ol>
<li><strong>Detect</strong> — <code>exo garden</code>, a plain query over the index, produces a report: here are the orphans, here are notes that should probably link to each other, here are summaries that look stale. <strong>No AI in this step.</strong> It&rsquo;s SQL and graph traversal. Deterministic, boring, trustworthy.</li>
<li><strong>Decide and write</strong> — that report gets piped to <code>claude -p</code> (Claude Code in headless mode). Claude reads the vault&rsquo;s operating contract, makes <em>only high-confidence</em> edits — add a <code>[[wikilink]]</code> between two genuinely related notes, refresh a stale summary — caps itself at ~10 notes a night, and writes a dated log note explaining exactly what it changed and what it deliberately skipped.</li>
<li><strong>Commit</strong> — the wrapper reindexes and lands everything as a single <code>garden: 2026-06-09 …</code> git commit, then pushes. My 3D graph viewer picks it up on the next sync.</li>
</ol>
<p>The first real run, it found one orphan (<code>90-meta/README</code>), linked it into the notes it actually indexes, and then — this is the part I liked — <em>declined</em> to touch the 12 &ldquo;stale summary&rdquo; candidates because, on inspection, every one of them was already accurate. It wrote: <em>&ldquo;flagged by length, not staleness; churning them would add noise.&rdquo;</em> A gardener that knows when <strong>not</strong> to prune is the one you can leave alone.</p>
<h2 id="isnt-this-a-solved-problem">&ldquo;Isn&rsquo;t this a solved problem?&rdquo;</h2>
<p>Mostly, no — but partly, yes, and I want to be straight about it. AI-assisted note-linking exists: Obsidian plugins like Smart Connections suggest related notes, and apps like Mem and Reflect auto-organize as you write. They&rsquo;re good.</p>
<p>Three things make this different enough to build:</p>
<ul>
<li><strong>Every change is a reviewable git diff, authored by a named agent.</strong> Not silent magic that rearranges your notes while you&rsquo;re not looking. <code>git log -p</code> shows you exactly what the gardener did last night; <code>git revert</code> undoes a bad night in one command. For something as personal as a knowledge base, &ldquo;show me the diff&rdquo; beats &ldquo;trust me.&rdquo;</li>
<li><strong>It&rsquo;s mine, end to end.</strong> Runs on my hardware, on my schedule, with a model I point at. No SaaS holds my brain hostage.</li>
<li><strong>The detection is deterministic; the model only acts.</strong> The LLM never decides <em>what&rsquo;s wrong</em> — a boring query does that. The model only decides <em>how to fix the things already found</em>. That split keeps the whole thing auditable and cheap.</li>
</ul>
<p>If you already live in a tool that does this and you trust it, great. I wanted the git-diff trail and the local control.</p>
<h2 id="the-part-i-actually-want-to-tell-you-about">The part I actually want to tell you about</h2>
<p>The plan was tidy: I run n8n on the same cluster, so n8n would be the scheduler — fire nightly, <strong>SSH into the node</strong>, run the gardener. Clean, visual, one workflow.</p>
<p>n8n could not reach the node. At all. Every port: <code>ECONNREFUSED</code>.</p>
<p>This sent me down a genuinely interesting hole, because the homelab runs <strong>Cilium</strong> for networking, and Cilium has opinions about your own node that plain Kubernetes does not.</p>
<p>First instinct: a NetworkPolicy allowing egress to the node&rsquo;s IP. Wrote it, synced it, still refused. The reason is a Cilium subtlety worth knowing: <strong>the node isn&rsquo;t a CIDR, it&rsquo;s an identity.</strong> Cilium classifies your cluster&rsquo;s own node as the special <code>host</code> identity, and ordinary <code>ipBlock</code> CIDR rules <em>do not match it</em> unless you flip a cluster-wide setting (<code>policy-cidr-match-mode: nodes</code>). My <code>192.168.0.109/32</code> rule was a no-op.</p>
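<p>For reference, a minimal sketch of what flipping that setting could look like. It assumes an install that keeps the agent configuration in a ConfigMap called <code>cilium-config</code> in <code>kube-system</code>; both names are assumptions about the install, not something taken from my cluster. It is a fragment, not something to apply as-is: the real ConfigMap holds many more keys, so the value is better set through whatever manages your Cilium install, and the agents likely need a restart to pick it up.</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml"># Fragment only: shows where the key lives, not a complete ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config        # assumed default name
  namespace: kube-system     # assumed install namespace
data:
  # Lets CIDR / ipBlock rules select node IPs as well as external addresses.
  policy-cidr-match-mode: &#34;nodes&#34;
</code></pre>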
<p>So I switched to the Cilium-native tool: a <code>CiliumNetworkPolicy</code> with <code>toEntities: [host]</code>. I confirmed it applied: <code>reserved:host</code> showed up as allowed right there in the datapath&rsquo;s BPF policy map. I confirmed the node&rsquo;s IP really does resolve to identity <code>1</code> (host). I confirmed Cilium&rsquo;s host firewall was <em>disabled</em>. Everything said &ldquo;allowed.&rdquo;</p>
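<p>A policy of that shape, sketched under assumptions: the name, namespace, pod label, and port below are all hypothetical stand-ins, and only the <code>toEntities: [host]</code> part comes from what I actually ran.</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-n8n-to-host            # hypothetical name
  namespace: n8n                     # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: n8n    # hypothetical pod label
  egress:
    - toEntities:
        - host                       # the local node, matched by identity, not by IP
      toPorts:
        - ports:
            - port: &#34;22&#34;             # SSH; assumed port
              protocol: TCP
</code></pre>
<p>The point of <code>toEntities</code> is that it selects on Cilium&rsquo;s identity for the node, which is exactly what the <code>ipBlock</code> rule could not do.</p>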
<p>Still <code>ECONNREFUSED</code>.</p>
<p>That&rsquo;s the wall. The packet leaves the pod with Cilium&rsquo;s blessing, hits the host&rsquo;s own network stack, and <em>something there</em> sends a reset. I couldn&rsquo;t see what, because inspecting the host&rsquo;s own packet-filter rules (as opposed to Cilium&rsquo;s host firewall feature, which I had already ruled out) needs root, and this automation deliberately doesn&rsquo;t have it. I could have kept digging with a password. But I stopped and asked a better question: <strong>why am I making a pod reach back into the host it&rsquo;s running on at all?</strong></p>
<p>That&rsquo;s an awkward direction. The work has to happen <em>on</em> the host (that&rsquo;s where the vault, git creds, and Claude live). A pod straining to SSH into its own node is fighting the grain of the platform.</p>
<p>So I inverted it. <strong>The node schedules itself</strong> — a plain cron entry, rock-solid, no network gymnastics. And n8n, instead of <em>triggering</em> the job, <em>receives</em> it: at the end of each run the node POSTs a summary to an n8n webhook. Node→n8n works perfectly (it&rsquo;s just an outbound HTTPS call to a URL). n8n keeps the run history and is the place I&rsquo;ll later wire a phone notification.</p>
<p>I lost nothing that mattered. n8n is still my dashboard; the schedule just lives where the work lives. And I deleted the SSH key and the network-policy hole I&rsquo;d opened — the cleanup felt better than the original plan would have.</p>
<h2 id="the-lesson-such-as-it-is">The lesson, such as it is</h2>
<p>Two, actually.</p>
<p><strong>One:</strong> when you&rsquo;re automating something to run unattended, the bug you want to find is the one that shows up in a <em>dry run at 2pm</em>, not at <em>3am three weeks from now</em>. I almost shipped a version where a brand-new note (untracked by git) was invisible to my change-detection and would&rsquo;ve been silently wiped each night. The dry run caught it. Always build the dry run.</p>
<p><strong>Two, the bigger one:</strong> I spent an hour trying to make a pod punch into its host because that was <em>my</em> plan, and the platform kept saying no in increasingly specific ways. The fix wasn&rsquo;t a cleverer NetworkPolicy. It was noticing I was pushing against the design and turning around. The node scheduling itself and <em>reporting up</em> to n8n is simpler, safer, and more honest about where the work actually lives.</p>
<p>My brain weeds itself now. Every morning there&rsquo;s maybe one small, sensible commit waiting — a link I&rsquo;d have never made, a summary nudged back to true — and I can read exactly what changed before my coffee&rsquo;s done. That&rsquo;s the whole dream of a second brain that isn&rsquo;t a junk drawer: it stays a garden, and I barely have to touch it.</p>
]]></content:encoded></item><item><title>🧱 How Do You Isolate Two n8n Tenants on Kubernetes — and Prove Each Wall Holds?</title><link>https://blog.hippotion.com/posts/n8n-multitenant/</link><pubDate>Fri, 19 Dec 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/n8n-multitenant/</guid><description>Multi-tenant isolation is easy to assert and hard to verify. Three walls — network, secret, resource — and the actual 403s, timeouts, and admission rejections that prove each one holds.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;You&rsquo;re running n8n for multiple customers on the same Kubernetes cluster. What stops Customer A from reading Customer B&rsquo;s API keys, calling Customer B&rsquo;s services, or starving Customer B&rsquo;s workflows by burning the whole node?&rdquo;</em></p>
<p>Three different walls, three different mechanisms. Most articles I&rsquo;ve read on K8s multi-tenancy list the primitives — namespaces, NetworkPolicies, ResourceQuotas, RBAC — without showing what each one actually catches when you try to cross it. This post does the second part. The receipts are the point.</p>
<p>The setup: two namespaces, <code>web-tenant-acme</code> and <code>web-tenant-globex</code>, each running their own n8n instance on the same node. The only thing keeping them apart is the walls we build around each namespace.</p>
<hr>
<h2 id="the-mental-model-subtractive-isolation">The mental model: subtractive isolation</h2>
<p>Kubernetes is a flat network with shared everything by default. You don&rsquo;t <em>add</em> isolation by writing allow rules. You <em>subtract</em> trust by adding default-deny rules, and then carefully allow back only the connections each tenant actually needs.</p>
<p>A tenant doesn&rsquo;t have access to another tenant because there is <em>no rule allowing it</em>. The absence of an allow rule is the wall.</p>
<p>Three of these absences make up the picture:</p>
<table>
	<thead>
			<tr>
					<th>Wall</th>
					<th>Primitive</th>
					<th>Failure mode when crossed</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Network</td>
					<td>Cilium NetworkPolicy, default-deny egress</td>
					<td>Connection times out (silent drop)</td>
			</tr>
			<tr>
					<td>Secret</td>
					<td>Vault Kubernetes-auth, per-tenant policy</td>
					<td><code>403 permission denied</code> from Vault itself</td>
			</tr>
			<tr>
					<td>Resource</td>
					<td>ResourceQuota + LimitRange</td>
					<td>Pod rejected at admission time</td>
			</tr>
	</tbody>
</table>
<p>Different layers, different error messages. That&rsquo;s how you can tell what stopped you.</p>
<hr>
<h2 id="wall-1--network-cilium-networkpolicy">Wall 1 — Network: Cilium NetworkPolicy</h2>
<p>n8n in <code>web-tenant-acme</code> can reach <code>whoami.web-tenant-acme.svc.cluster.local</code> (its own service in its own namespace) but not <code>whoami.web-tenant-globex.svc.cluster.local</code>. The same DNS shape, the same cluster, the same node. One succeeds, the other hangs.</p>
<p>The primitive is a default-deny egress policy applied to every pod in the namespace, with two narrow exceptions: intra-namespace traffic (so n8n can still reach its own service) and DNS to <code>kube-system</code> (otherwise nothing resolves anything).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Effective policy on every pod in web-tenant-acme:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Egress, Ingress]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">                                     </span><span class="c"># intra-namespace traffic OK</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">                                     </span><span class="c"># DNS to kube-dns OK</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>{<span class="nt">port: 53, protocol</span><span class="p">:</span><span class="w"> </span><span class="l">UDP}]</span><span class="w">
</span></span></span></code></pre></div><p>There is no rule for <code>web-tenant-globex</code>. Cilium&rsquo;s eBPF datapath drops the SYN packet on the way out.</p>
<p><strong>The receipt</strong> — an n8n HTTP node configured to GET <code>http://whoami.web-tenant-globex.svc.cluster.local/</code>. It hangs for the full timeout, then errors with <code>AxiosError: timeout of 5000ms exceeded</code> / <code>code: ECONNABORTED</code>.</p>
<p>The interesting bit: <strong>DNS still works.</strong> kube-dns is allowed, so the cross-namespace Service still resolves. The TCP handshake is what gets dropped. That&rsquo;s a useful signal in real incident response: &ldquo;DNS resolves but the connection hangs&rdquo; is a strong hint that a NetworkPolicy (or some other packet filter) is silently dropping the traffic.</p>
<hr>
<h2 id="wall-2--secret-vault-kubernetes-auth--eso">Wall 2 — Secret: Vault Kubernetes-auth + ESO</h2>
<p>Now imagine Acme&rsquo;s n8n misbehaves: somebody pushes a workflow that tries to read Globex&rsquo;s API keys via an <code>ExternalSecret</code>. The network isn&rsquo;t the issue — both tenants need to reach Vault, so they both have an egress rule for <code>sys-vault</code>. The wall has to be at the identity layer.</p>
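<p>That shared rule would be one more entry in each tenant&rsquo;s egress list, roughly like the sketch below. The namespace and port are read off the Vault URL in the receipt further down; the exact shape of the rule is an assumption, since the policy excerpt above doesn&rsquo;t show it.</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml"># Extra egress entry, same in both tenant namespaces (sketch).
egress:
  - to:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: sys-vault
    ports:
      - port: 8200          # Vault API port, taken from the URL in the 403 receipt
        protocol: TCP
</code></pre>
<p>Because this rule is identical for Acme and Globex, it cannot tell them apart, which is why the network layer has nothing to say about who reads which secret.</p>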
<p>Each tenant gets three things:</p>
<ol>
<li>A dedicated <code>ServiceAccount</code> (<code>n8n-acme</code>, <code>n8n-globex</code>).</li>
<li>A Vault Kubernetes-auth <code>role</code> bound to that SA in that namespace, mapped to a Vault <code>policy</code> that grants <code>read</code> on <em>only its own</em> KV path.</li>
<li>A namespaced External Secrets <code>SecretStore</code> that authenticates as the SA via the Kubernetes TokenRequest API.</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="cl"><span class="c1"># Vault policy: tenant-acme can read its own secrets, nothing else.
</span></span></span><span class="line"><span class="cl"><span class="n">path &#34;secret/data/web-tenant-acme&#34;     { capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;read&#34;</span><span class="p">]</span> }
</span></span><span class="line"><span class="cl"><span class="n">path &#34;secret/metadata/web-tenant-acme&#34; { capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;read&#34;</span><span class="p">]</span> }
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">vault write auth/kubernetes/role/tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">bound_service_account_names</span><span class="o">=</span>n8n-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">bound_service_account_namespaces</span><span class="o">=</span>web-tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">policies</span><span class="o">=</span>tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">ttl</span><span class="o">=</span>1h
</span></span></code></pre></div><p>When Acme&rsquo;s n8n tries an <code>ExternalSecret</code> pointing at <code>secret/web-tenant-globex/...</code>, ESO authenticates fine (the SA is valid), Vault recognises the caller, looks up the <code>tenant-acme</code> policy, and answers with the most satisfying line in this whole demo:</p>
<pre tabindex="0"><code>URL: GET http://sys-vault.sys-vault.svc.cluster.local:8200/v1/secret/data/web-tenant-globex
Code: 403. Errors:
* permission denied
</code></pre><p>This is the bit that separates &ldquo;namespace isolation&rdquo; from real multi-tenant secret isolation. Plain Kubernetes Secrets + RBAC stop a tenant from <em>listing</em> another tenant&rsquo;s Secret objects, but the moment you go upstream — to Vault, to a cloud KMS, to an SSM Parameter Store — the secret store needs to enforce identity itself. The network said yes; the secret store still says no.</p>
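<p>For completeness, here is a sketch of what the External Secrets side of that attempt could look like. The resource names, the secret property, and the <code>apiVersion</code> (which depends on the ESO release) are assumptions; the Vault address, role, ServiceAccount, and paths are the ones used above.</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: external-secrets.io/v1beta1   # assumed; newer ESO releases use a later version
kind: SecretStore
metadata:
  name: vault-acme                        # hypothetical name
  namespace: web-tenant-acme
spec:
  provider:
    vault:
      server: &#34;http://sys-vault.sys-vault.svc.cluster.local:8200&#34;
      path: &#34;secret&#34;                      # KV mount, inferred from the policy paths
      version: &#34;v2&#34;                       # KV v2, inferred from the secret/data/ prefix
      auth:
        kubernetes:
          mountPath: &#34;kubernetes&#34;         # matches auth/kubernetes in the role command
          role: &#34;tenant-acme&#34;
          serviceAccountRef:
            name: &#34;n8n-acme&#34;
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: reach-into-globex                 # hypothetical name
  namespace: web-tenant-acme
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-acme
    kind: SecretStore
  target:
    name: globex-keys                     # hypothetical Secret to create
  data:
    - secretKey: api-key                  # hypothetical key in the target Secret
      remoteRef:
        key: web-tenant-globex            # resolves to secret/data/web-tenant-globex
        property: api-key                 # hypothetical property
</code></pre>
<p>The store logs in as <code>n8n-acme</code>, so every read it makes carries the <code>tenant-acme</code> policy, no matter which path the <code>ExternalSecret</code> asks for. That is what produces the 403.</p>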
<hr>
<h2 id="wall-3--resource-resourcequota--limitrange">Wall 3 — Resource: ResourceQuota + LimitRange</h2>
<p>The third concern is the noisy neighbour: Acme&rsquo;s runaway workflow allocating a 4Gi pod and OOM-killing everything else on the node. The network policy doesn&rsquo;t catch this (no network call), and Vault doesn&rsquo;t catch this (no secret request). The kernel will, <em>eventually</em> — but you don&rsquo;t want eventually. You want admission-time rejection.</p>
<p>Two primitives:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ResourceQuota</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">name: tenant-quota, namespace</span><span class="p">:</span><span class="w"> </span><span class="l">web-tenant-acme }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hard</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests.cpu</span><span class="p">:</span><span class="w">    </span><span class="s2">&#34;1&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests.memory</span><span class="p">:</span><span class="w"> </span><span class="l">1Gi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">limits.cpu</span><span class="p">:</span><span class="w">      </span><span class="s2">&#34;2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">limits.memory</span><span class="p">:</span><span class="w">   </span><span class="l">2Gi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pods</span><span class="p">:</span><span class="w">            </span><span class="s2">&#34;10&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">LimitRange</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">name: tenant-limits, namespace</span><span class="p">:</span><span class="w"> </span><span class="l">web-tenant-acme }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">limits</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">Container</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">default</span><span class="p">:</span><span class="w">        </span>{<span class="w"> </span><span class="nt">cpu: 500m, memory</span><span class="p">:</span><span class="w"> </span><span class="l">512Mi }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">defaultRequest</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">cpu: 50m,  memory</span><span class="p">:</span><span class="w"> </span><span class="l">128Mi }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">max</span><span class="p">:</span><span class="w">            </span>{<span class="w"> </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;2&#34;</span><span class="nt">,  memory</span><span class="p">:</span><span class="w"> </span><span class="l">1Gi }</span><span class="w">
</span></span></span></code></pre></div><p><code>ResourceQuota</code> caps the namespace total. <code>LimitRange</code> bounds any <em>individual</em> container and supplies defaults so pods that don&rsquo;t declare requests/limits still get reasonable ones. The defaults matter because once a quota covers <code>requests.*</code> and <code>limits.*</code>, the API server rejects any pod that leaves those values unset, so without the <code>LimitRange</code> every manifest lacking them would fail admission.</p>
<p><strong>The receipt</strong> — a server-side dry-run of a single 4Gi pod, which never gets created:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ kubectl apply -n web-tenant-acme --dry-run<span class="o">=</span>server -f noisy-neighbor.yaml
</span></span><span class="line"><span class="cl">Error from server <span class="o">(</span>Forbidden<span class="o">)</span>: error when creating <span class="s2">&#34;noisy-neighbor.yaml&#34;</span>:
</span></span><span class="line"><span class="cl">pods <span class="s2">&#34;noisy-neighbor&#34;</span> is forbidden:
</span></span><span class="line"><span class="cl">  maximum memory usage per Container is 1Gi, but limit is 4Gi
</span></span></code></pre></div><p>Not a kernel OOMKill. Not a pod stuck in <code>Pending</code>. A flat refusal from the API server before the scheduler even sees the request.</p>
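<p>A manifest that would trigger this rejection can be very small. The sketch below is an assumption about what <code>noisy-neighbor.yaml</code> might contain: the container name, image, and command are hypothetical, and the only part that matters is the 4Gi memory limit.</p>
<pre tabindex="0"><code class="language-yaml" data-lang="yaml">apiVersion: v1
kind: Pod
metadata:
  name: noisy-neighbor
spec:
  containers:
    - name: hog                          # hypothetical container name
      image: busybox:1.36                # hypothetical image
      command: [&#34;sleep&#34;, &#34;3600&#34;]
      resources:
        limits:
          memory: 4Gi                    # above the LimitRange max of 1Gi per container
</code></pre>
<p>The namespace is left out on purpose so the <code>-n web-tenant-acme</code> flag decides which tenant&rsquo;s walls the pod runs into.</p>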
<hr>
<h2 id="what-this-does-not-prove">What this does <em>not</em> prove</h2>
<p>A homelab demo on one node with two synthetic tenants is not n8n Cloud. The honest gaps:</p>
<ul>
<li><strong>Execution sandboxing.</strong> A workflow can still run arbitrary code via the <code>Code</code> node or shell-outs. These walls stop <em>infrastructure</em> leakage; they don&rsquo;t sandbox what n8n itself executes. Real n8n Cloud needs more than namespace walls for that — gVisor / Firecracker / per-tenant worker pools are the usual answers, and n8n&rsquo;s <a href="https://docs.n8n.io/hosting/scaling/queue-mode/">queue mode</a> lends itself to the last.</li>
<li><strong>Pooled worker queues.</strong> Queue mode runs main/webhook/worker as separate deployments backed by Redis + Postgres. Two tenants sharing a worker pool need additional checks at the job-routing layer to keep workflows from accessing the wrong tenant&rsquo;s binary data. Out of scope for the homelab demo.</li>
<li><strong>Control plane.</strong> Both tenants reach the same API server. A cluster-admin-equivalent compromise breaks everything. This is the assumption every shared K8s setup makes.</li>
<li><strong>Node-level.</strong> Same kernel. Container escape, CPU side channels, the usual list — all apply. For paranoid tenants the answer is dedicated nodes via taints/tolerations or separate clusters entirely.</li>
</ul>
<p>The demo proves the <em>namespace-shaped</em> walls hold. It does not prove the whole stack is safe against a determined attacker already running code inside a tenant. That&rsquo;s a different post.</p>
<hr>
<p><em>Part of a Kubernetes-on-the-homelab series — previously: <a href="/posts/k8s-network-isolation/">preventing a compromised pod from calling your database</a>, <a href="/posts/k8s-gitops-secrets/">GitOps secrets</a>.</em></p>
]]></content:encoded></item><item><title>🛡️ How Do You Prevent a Compromised Pod From Calling Your Database?</title><link>https://blog.hippotion.com/posts/k8s-network-isolation/</link><pubDate>Fri, 23 May 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/k8s-network-isolation/</guid><description>Default Kubernetes is a flat network. Every pod can reach every other pod. In a cluster with ten services, that&amp;rsquo;s ten potential blast radiuses instead of one.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;How do you enforce network isolation between services in a Kubernetes cluster?&rdquo;</em></p>
<p>The default Kubernetes network model is flat. Every pod can reach every other pod, in any namespace, on any port. There are no firewalls, no ACLs, no segmentation. A compromised frontend pod can connect directly to your PostgreSQL port, your Redis port, your internal admin API, and every other service in the cluster.</p>
<p>This is intentional — Kubernetes doesn&rsquo;t assume you want isolation, because not everyone does. But if you do want it, you need to add it.</p>
<hr>
<h2 id="networkpolicy-the-primitive">NetworkPolicy: the primitive</h2>
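<p>A <code>NetworkPolicy</code> is a Kubernetes resource that selects a set of pods and defines what traffic is allowed to reach them (ingress) and what traffic they&rsquo;re allowed to send (egress). Once a pod is selected by at least one policy for a direction, traffic in that direction that isn&rsquo;t explicitly allowed is dropped; pods that no policy selects keep the allow-everything default.</p>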
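<p>The catch: <code>NetworkPolicy</code> resources have no effect unless something in the cluster enforces them, usually the CNI plugin. Flannel, the default k3s CNI, does not enforce them on its own (k3s ships a separate embedded network policy controller to cover that gap). Calico, Cilium, and Canal do. If you&rsquo;re running plain Flannel with no policy controller and you apply a NetworkPolicy, it will be silently ignored: no error, no warning.</p>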
<hr>
<h2 id="the-default-deny-pattern">The default-deny pattern</h2>
<p>The correct starting point is a default-deny policy that blocks everything, applied to the namespace. You then add explicit allow policies for the traffic you actually need.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Block all ingress and egress in this namespace by default</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">default-deny-all</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">        </span><span class="c"># matches all pods in the namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
</span></span></span></code></pre></div><p>With this in place, your pods can&rsquo;t receive traffic and can&rsquo;t send traffic. You then add back what you need.</p>
<hr>
<h2 id="allowing-specific-traffic">Allowing specific traffic</h2>
<p>Allow the web frontend to receive traffic from the ingress controller:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-ingress-from-traefik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">frontend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">from</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">sys-traefik</span><span class="w">
</span></span></span></code></pre></div><p>Allow the backend to talk to PostgreSQL:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-egress-to-postgres</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">backend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">podSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">postgres</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">5432</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
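</span></span></span></code></pre></div><p>After these two policies, the frontend accepts traffic from Traefik and the backend is allowed to send traffic to Postgres on port 5432. Because the default-deny policy also covers the Postgres pod, that connection only succeeds once Postgres has its own ingress rule admitting the backend. The frontend cannot reach Postgres. The backend cannot receive traffic from the ingress controller. Neither can call anything else.</p>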
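<p>The Postgres side of that connection could look like this. A minimal sketch, assuming the database pods carry an <code>app: postgres</code> label and run in the same <code>myapp</code> namespace; under default-deny the database pod needs its own ingress allow:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-postgres
  namespace: myapp
spec:
  podSelector:
    matchLabels:
      app: postgres          # assumed label on the database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend   # only the backend may connect
      ports:
        - port: 5432
          protocol: TCP
</code></pre></div>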
<hr>
<h2 id="the-dns-gotcha">The DNS gotcha</h2>
<p>Once you add a default-deny egress policy, DNS stops working. Your pods can no longer resolve service names because they can&rsquo;t reach <code>kube-dns</code> in the <code>kube-system</code> namespace.</p>
<p>You need to explicitly allow it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">NetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-egress-dns</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">Egress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">53</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">UDP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">53</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span></code></pre></div><p>Missing this is the most common reason &ldquo;everything broke after I added NetworkPolicies&rdquo;. Add it to every namespace that has a default-deny policy.</p>
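<p>The policy above opens port 53 to every pod in <code>kube-system</code>. If you want it tighter, a pod selector can be combined with the namespace selector in the same entry. A sketch, assuming your DNS pods carry the conventional <code>k8s-app: kube-dns</code> label (check with <code>kubectl get pods -n kube-system --show-labels</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-dns
  namespace: myapp
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # one list entry holding both selectors means namespace AND pod label
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns   # assumed label; varies by distribution
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
</code></pre></div><p>Watch the indentation: giving <code>podSelector</code> its own leading dash turns it into a second list entry, which means OR instead of AND.</p>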
<hr>
<h2 id="cilium-the-same-model-with-more-power">Cilium: the same model with more power</h2>
<p>Cilium implements the standard <code>NetworkPolicy</code> API and adds its own <code>CiliumNetworkPolicy</code> CRD with L7 capabilities.</p>
<p>Standard NetworkPolicy works at L3/L4 — IP addresses and ports. Cilium&rsquo;s CRD adds:</p>
<p><strong>L7 HTTP filtering</strong>: allow specific HTTP methods and paths, not just port 8080.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cilium.io/v2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">CiliumNetworkPolicy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">allow-api-reads</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">myapp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">endpointSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">api</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">fromEndpoints</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">frontend</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">toPorts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;8080&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">http</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="nt">method</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;GET&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;/api/v1/.*&#34;</span><span class="w">
</span></span></span></code></pre></div><p><strong>DNS-based egress</strong>: allow egress to <code>github.com</code> by hostname rather than IP address. This matters for external services with dynamic IPs.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">toFQDNs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">matchName</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;github.com&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">toPorts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;443&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span></code></pre></div><p><strong>Identity-based policies</strong>: Cilium derives a numeric security identity from each pod&rsquo;s labels, and pods with the same labels share an identity. Policies are enforced by identity, not IP address, so pod restarts (which change IPs) don&rsquo;t break policy enforcement.</p>
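<p>One caveat for the <code>toFQDNs</code> fragment above: Cilium learns which IPs belong to a hostname by observing DNS responses through its DNS proxy, so the policy also needs a DNS rule. A fuller sketch, where the policy name, the <code>app: backend</code> selector, and the kube-dns labels are assumptions:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-github      # hypothetical name
  namespace: myapp
spec:
  endpointSelector:
    matchLabels:
      app: backend               # assumed workload label
  egress:
    # let the DNS proxy see lookups so Cilium can map names to IPs
    - toEndpoints:
        - matchLabels:
            &#34;k8s:io.kubernetes.pod.namespace&#34;: kube-system
            &#34;k8s:k8s-app&#34;: kube-dns
      toPorts:
        - ports:
            - port: &#34;53&#34;
              protocol: ANY
          rules:
            dns:
              - matchPattern: &#34;*&#34;
    # then allow HTTPS to whatever github.com resolved to
    - toFQDNs:
        - matchName: &#34;github.com&#34;
      toPorts:
        - ports:
            - port: &#34;443&#34;
              protocol: TCP
</code></pre></div>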
<hr>
<h2 id="what-a-real-namespace-policy-set-looks-like">What a real namespace policy set looks like</h2>
<p>For a typical web app with frontend, backend, and database:</p>
<pre tabindex="0"><code>Namespace: myapp
├── default-deny-all (ingress + egress, all pods)
├── allow-egress-dns (egress, all pods, port 53)
├── allow-ingress-frontend (ingress frontend, from sys-traefik namespace)
├── allow-egress-frontend-to-backend (egress frontend, to backend:8080)
├── allow-ingress-backend (ingress backend, from frontend)
├── allow-egress-backend-to-postgres (egress backend, to postgres:5432)
└── allow-ingress-postgres (ingress postgres, from backend)
</code></pre><p>Seven policies. The database has exactly one inbound path: from the backend. The frontend has no path to the database at all. A compromised frontend pod cannot scan the internal network, because egress to arbitrary destinations is blocked.</p>
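<p>The frontend-to-backend pair is the only part of that tree not shown earlier. A sketch of both halves, assuming the backend listens on TCP 8080 as the tree suggests:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-frontend-to-backend
  namespace: myapp
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - port: 8080         # assumed backend port
          protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-backend
  namespace: myapp
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 8080         # must match the egress side
          protocol: TCP
</code></pre></div><p>Under default-deny in both directions, each hop needs an egress allow on the sender and an ingress allow on the receiver. That is why the tree lists them in pairs.</p>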
<hr>
<h2 id="what-interviewers-are-actually-testing">What interviewers are actually testing</h2>
<p>The follow-up is usually: <em>&ldquo;How do you manage this at scale? Writing NetworkPolicies for every namespace by hand doesn&rsquo;t scale.&rdquo;</em></p>
<p>The answer: you don&rsquo;t write them by hand. You template them. In a GitOps setup, your namespace configuration declares what network access the service needs in a structured form, and a Helm chart or operator generates the actual NetworkPolicy resources from those declarations.</p>
<p>For example, an <code>applications.yml</code> entry might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">networkPolicies</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">denyAll</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">allowIngressFromIngress</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">allowEgressToNamespaces</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;sys-postgres&#34;</span><span class="p">]</span><span class="w">
</span></span></span></code></pre></div><p>And a Helm chart translates that into four concrete NetworkPolicy objects. The developer declares intent; the platform enforces it. No one writes raw YAML for each namespace.</p>
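<p>As an illustration, the <code>allowEgressToNamespaces</code> entry above might render to something like the following. The generated name and the choice to select all pods are assumptions about a hypothetical chart, not the output of any particular one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-sys-postgres   # hypothetical generated name
  namespace: myapp
spec:
  podSelector: {}                      # every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: sys-postgres
</code></pre></div>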
<p>The second follow-up: <em>&ldquo;What about east-west traffic between services in the same namespace?&rdquo;</em> Add <code>allowIntraNamespace: true</code> as a flag that generates a policy allowing all pod-to-pod traffic within the namespace, while still blocking cross-namespace traffic.</p>
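<p>What that flag generates could be as small as this. A sketch, assuming the chart emits a single policy covering both directions; a <code>podSelector</code> without a <code>namespaceSelector</code> only matches pods in the policy&rsquo;s own namespace, which is what keeps cross-namespace traffic blocked:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-intra-namespace    # hypothetical generated name
  namespace: myapp
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector: {}        # any pod in this namespace
  egress:
    - to:
        - podSelector: {}        # any pod in this namespace
</code></pre></div>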
<hr>
<p><em>This is part of a series on Kubernetes interview questions. Previously: <a href="/posts/k8s-zero-downtime/">zero-downtime deployments</a>. Next: <a href="/posts/k8s-config-drift/">preventing configuration drift</a>.</em></p>
]]></content:encoded></item></channel></rss>