<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>External-Secrets on hippotion</title><link>https://blog.hippotion.com/tags/external-secrets/</link><description>Recent content in External-Secrets on hippotion</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 19 Dec 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.hippotion.com/tags/external-secrets/index.xml" rel="self" type="application/rss+xml"/><item><title>🧱 How Do You Isolate Two n8n Tenants on Kubernetes — and Prove Each Wall Holds?</title><link>https://blog.hippotion.com/posts/n8n-multitenant/</link><pubDate>Fri, 19 Dec 2025 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/n8n-multitenant/</guid><description>Multi-tenant isolation is easy to assert and hard to verify. Three walls — network, secret, resource — and the actual 403s, timeouts, and admission rejections that prove each one holds.</description><content:encoded><![CDATA[<h2 id="the-question">The question</h2>
<p><em>&ldquo;You&rsquo;re running n8n for multiple customers on the same Kubernetes cluster. What stops Customer A from reading Customer B&rsquo;s API keys, calling Customer B&rsquo;s services, or starving Customer B&rsquo;s workflows by burning the whole node?&rdquo;</em></p>
<p>Three different walls, three different mechanisms. Most articles I&rsquo;ve read on K8s multi-tenancy list the primitives — namespaces, NetworkPolicies, ResourceQuotas, RBAC — without showing what each one actually catches when you try to cross it. This post does the second part. The receipts are the point.</p>
<p>The setup: two namespaces, <code>web-tenant-acme</code> and <code>web-tenant-globex</code>, each running its own n8n instance on the same node. The only things keeping them apart are the walls we build around each namespace.</p>
<hr>
<h2 id="the-mental-model-subtractive-isolation">The mental model: subtractive isolation</h2>
<p>By default, Kubernetes is a flat network where everything is shared. You don&rsquo;t <em>add</em> isolation by writing allow rules. You <em>subtract</em> trust by adding default-deny rules, and then carefully allow back only the connections each tenant actually needs.</p>
<p>A tenant cannot reach another tenant because there is <em>no rule allowing it</em>. The absence of an allow rule is the wall.</p>
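<p>For the network layer, that starting point could be a policy that selects every pod and allows nothing. This is a minimal sketch rather than the policy used in the demo, and the name <code>default-deny-all</code> is an assumption:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: deny all ingress and egress for every pod in the namespace.
# Both policy types are listed and no rules follow, so nothing is allowed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # hypothetical name
  namespace: web-tenant-acme
spec:
  podSelector: {}                 # empty selector = every pod in the namespace
  policyTypes: [Ingress, Egress]
</code></pre></div>
<p>Whatever is not explicitly allowed back on top of this stays blocked.</p>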
<p>Three of these absences make up the picture:</p>
<table>
	<thead>
			<tr>
					<th>Wall</th>
					<th>Primitive</th>
					<th>Failure mode when crossed</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td>Network</td>
					<td>Cilium NetworkPolicy, default-deny egress</td>
					<td>Connection times out (silent drop)</td>
			</tr>
			<tr>
					<td>Secret</td>
					<td>Vault Kubernetes-auth, per-tenant policy</td>
					<td><code>403 permission denied</code> from Vault itself</td>
			</tr>
			<tr>
					<td>Resource</td>
					<td>ResourceQuota + LimitRange</td>
					<td>Pod rejected at admission time</td>
			</tr>
	</tbody>
</table>
<p>Different layers, different error messages. That&rsquo;s how you can tell what stopped you.</p>
<hr>
<h2 id="wall-1--network-cilium-networkpolicy">Wall 1 — Network: Cilium NetworkPolicy</h2>
<p>n8n in <code>web-tenant-acme</code> can reach <code>whoami.web-tenant-acme.svc.cluster.local</code> (its own service in its own namespace) but not <code>whoami.web-tenant-globex.svc.cluster.local</code>. The same DNS shape, the same cluster, the same node. One succeeds, the other hangs.</p>
<p>The primitive is a default-deny egress policy, written in the standard Kubernetes <code>NetworkPolicy</code> schema and enforced by Cilium. It applies to every pod in the namespace, with two narrow exceptions: intra-namespace traffic (so n8n can still reach its own service) and DNS to <code>kube-system</code> (otherwise nothing resolves anything).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Effective policy on every pod in web-tenant-acme:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policyTypes</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">Egress, Ingress]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">egress</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">                                     </span><span class="c"># intra-namespace traffic OK</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">podSelector</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">to</span><span class="p">:</span><span class="w">                                     </span><span class="c"># DNS to kube-dns OK</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">namespaceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">kubernetes.io/metadata.name</span><span class="p">:</span><span class="w"> </span><span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ports</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>{<span class="nt">port: 53, protocol</span><span class="p">:</span><span class="w"> </span><span class="l">UDP}]</span><span class="w">
</span></span></span></code></pre></div><p>There is no rule for <code>web-tenant-globex</code>. Cilium&rsquo;s eBPF datapath drops the SYN packet on the way out.</p>
<p><strong>The receipt</strong> — an n8n HTTP node configured to GET <code>http://whoami.web-tenant-globex.svc.cluster.local/</code>. It hangs for the full timeout, then errors with <code>AxiosError: timeout of 5000ms exceeded</code> / <code>code: ECONNABORTED</code>.</p>
<p>The interesting bit: <strong>DNS still works.</strong> kube-dns is allowed, so the cross-namespace Service still resolves. The TCP handshake is what gets dropped. That&rsquo;s a useful signal in real incident response: &ldquo;DNS resolves but the connection hangs&rdquo; usually points to a NetworkPolicy (or some other packet filter) silently dropping traffic.</p>
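<p>The same check can be reproduced without n8n. The Job below is a sketch under assumptions: the <code>curlimages/curl</code> image, the Job name, and the five-second budget are illustrative choices, not part of the original demo.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: one-shot probe launched from inside web-tenant-acme.
apiVersion: batch/v1
kind: Job
metadata:
  name: cross-tenant-probe        # hypothetical name
  namespace: web-tenant-acme
spec:
  backoffLimit: 0                 # run once, do not retry
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: probe
          image: curlimages/curl  # assumed image; pin a tag in practice
          command:
            - curl
            - --silent
            - --show-error
            - --max-time          # overall time budget in seconds
            - &#34;5&#34;
            - http://whoami.web-tenant-globex.svc.cluster.local/
</code></pre></div>
<p>If the wall holds, curl should give up with a timeout (exit code 28) rather than a name-resolution failure, which matches the &ldquo;DNS resolves, connection hangs&rdquo; pattern. Pointing the same Job at <code>whoami.web-tenant-acme.svc.cluster.local</code> is the control case.</p>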
<hr>
<h2 id="wall-2--secret-vault-kubernetes-auth--eso">Wall 2 — Secret: Vault Kubernetes-auth + ESO</h2>
<p>Now imagine Acme&rsquo;s n8n misbehaves: somebody pushes a workflow that tries to read Globex&rsquo;s API keys via an <code>ExternalSecret</code>. The network isn&rsquo;t the issue. Both tenants need to reach Vault, so both namespaces carry an additional egress rule for the <code>sys-vault</code> namespace (omitted from the snippet above). The wall has to be at the identity layer.</p>
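<p>The Vault egress rule could be sketched as one more entry under <code>spec.egress</code>. The namespace label and port here are assumptions, inferred from the Vault URL <code>sys-vault.sys-vault.svc.cluster.local:8200</code> that appears in the 403 receipt later in this section:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: additional item in the tenant policy's egress list.
- to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: sys-vault   # assumed Vault namespace
  ports:
    - port: 8200                                   # assumed Vault API port
      protocol: TCP
</code></pre></div>
<p>Because both tenants get the identical rule, the network layer cannot tell them apart at Vault&rsquo;s door.</p>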
<p>Each tenant gets three things:</p>
<ol>
<li>A dedicated <code>ServiceAccount</code> (<code>n8n-acme</code>, <code>n8n-globex</code>).</li>
<li>A Vault Kubernetes-auth <code>role</code> bound to that SA in that namespace, mapped to a Vault <code>policy</code> that grants <code>read</code> on <em>only its own</em> KV path.</li>
<li>A namespaced External Secrets <code>SecretStore</code> that authenticates as the SA via the Kubernetes TokenRequest API.</li>
</ol>
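<p>The third item could look roughly like the sketch below. The store name, the <code>kubernetes</code> auth mount path, the KV v2 engine mounted at <code>secret</code>, and the API version are assumptions (older ESO releases serve <code>v1beta1</code>, newer ones <code>v1</code>); the role, ServiceAccount, and Vault address are taken from this post.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: namespaced SecretStore that logs in to Vault as the tenant's SA.
apiVersion: external-secrets.io/v1beta1   # assumed; depends on the ESO version
kind: SecretStore
metadata:
  name: vault-tenant-acme                 # hypothetical name
  namespace: web-tenant-acme
spec:
  provider:
    vault:
      server: &#34;http://sys-vault.sys-vault.svc.cluster.local:8200&#34;
      path: &#34;secret&#34;                      # assumed KV mount
      version: &#34;v2&#34;                       # assumed KV engine version
      auth:
        kubernetes:
          mountPath: &#34;kubernetes&#34;         # assumed auth mount
          role: &#34;tenant-acme&#34;             # Vault role bound to the SA
          serviceAccountRef:
            name: &#34;n8n-acme&#34;              # ESO requests a token for this SA
</code></pre></div>
<p>Using a namespaced <code>SecretStore</code> rather than a cluster-wide one keeps the Vault identity tied to the tenant&rsquo;s own namespace.</p>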
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="cl"><span class="c1"># Vault policy: tenant-acme can read its own secrets, nothing else.
</span></span></span><span class="line"><span class="cl"><span class="n">path &#34;secret/data/web-tenant-acme&#34;     { capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;read&#34;</span><span class="p">]</span> }
</span></span><span class="line"><span class="cl"><span class="n">path &#34;secret/metadata/web-tenant-acme&#34; { capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;read&#34;</span><span class="p">]</span> }
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">vault write auth/kubernetes/role/tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">bound_service_account_names</span><span class="o">=</span>n8n-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">bound_service_account_namespaces</span><span class="o">=</span>web-tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">policies</span><span class="o">=</span>tenant-acme <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="nv">ttl</span><span class="o">=</span>1h
</span></span></code></pre></div><p>When an <code>ExternalSecret</code> in Acme&rsquo;s namespace points at <code>secret/web-tenant-globex/...</code>, ESO authenticates fine (the SA is valid), Vault recognises the caller, evaluates the <code>tenant-acme</code> policy, and answers with the most satisfying line in this whole demo:</p>
<pre tabindex="0"><code>URL: GET http://sys-vault.sys-vault.svc.cluster.local:8200/v1/secret/data/web-tenant-globex
Code: 403. Errors:
* permission denied
</code></pre><p>This is the bit that separates &ldquo;namespace isolation&rdquo; from real multi-tenant secret isolation. Plain Kubernetes Secrets + RBAC stop a tenant from <em>listing</em> another tenant&rsquo;s Secret objects, but the moment you go upstream — to Vault, to a cloud KMS, to an SSM Parameter Store — the secret store needs to enforce identity itself. The network said yes; the secret store still says no.</p>
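<p>An <code>ExternalSecret</code> that would provoke this 403 might look like the sketch below. The resource names, the referenced <code>SecretStore</code>, the <code>api-key</code> property, and the API version are all assumptions; only the namespace and the Globex key come from the receipt above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: Acme asking for Globex's secret through its own store.
apiVersion: external-secrets.io/v1beta1   # assumed; depends on the ESO version
kind: ExternalSecret
metadata:
  name: cross-tenant-read                 # hypothetical name
  namespace: web-tenant-acme
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: vault-tenant-acme               # hypothetical store using the n8n-acme SA
  target:
    name: cross-tenant-read               # Secret that would be created on success
  data:
    - secretKey: api-key                  # hypothetical key in the target Secret
      remoteRef:
        key: web-tenant-globex            # another tenant's KV path
        property: api-key                 # hypothetical field inside that secret
</code></pre></div>
<p>With a KV v2 mount at <code>secret</code>, that <code>key</code> corresponds to the <code>/v1/secret/data/web-tenant-globex</code> request shown in the receipt, which the <code>tenant-acme</code> policy does not cover.</p>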
<hr>
<h2 id="wall-3--resource-resourcequota--limitrange">Wall 3 — Resource: ResourceQuota + LimitRange</h2>
<p>The third concern is the noisy neighbour: Acme&rsquo;s runaway workflow allocating a 4Gi pod and OOM-killing everything else on the node. The network policy doesn&rsquo;t catch this (no network call), and Vault doesn&rsquo;t catch this (no secret request). The kernel will, <em>eventually</em> — but you don&rsquo;t want eventually. You want admission-time rejection.</p>
<p>Two primitives:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ResourceQuota</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">name: tenant-quota, namespace</span><span class="p">:</span><span class="w"> </span><span class="l">web-tenant-acme }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hard</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests.cpu</span><span class="p">:</span><span class="w">    </span><span class="s2">&#34;1&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests.memory</span><span class="p">:</span><span class="w"> </span><span class="l">1Gi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">limits.cpu</span><span class="p">:</span><span class="w">      </span><span class="s2">&#34;2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">limits.memory</span><span class="p">:</span><span class="w">   </span><span class="l">2Gi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pods</span><span class="p">:</span><span class="w">            </span><span class="s2">&#34;10&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">LimitRange</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">name: tenant-limits, namespace</span><span class="p">:</span><span class="w"> </span><span class="l">web-tenant-acme }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">limits</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">Container</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">default</span><span class="p">:</span><span class="w">        </span>{<span class="w"> </span><span class="nt">cpu: 500m, memory</span><span class="p">:</span><span class="w"> </span><span class="l">512Mi }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">defaultRequest</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">cpu: 50m,  memory</span><span class="p">:</span><span class="w"> </span><span class="l">128Mi }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">max</span><span class="p">:</span><span class="w">            </span>{<span class="w"> </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;2&#34;</span><span class="nt">,  memory</span><span class="p">:</span><span class="w"> </span><span class="l">1Gi }</span><span class="w">
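</span></span></span></code></pre></div><p><code>ResourceQuota</code> caps the namespace total. <code>LimitRange</code> bounds any <em>individual</em> container and supplies defaults, so pods that don&rsquo;t declare requests/limits still get reasonable ones. Both halves matter: once a quota covers <code>requests.*</code> and <code>limits.*</code>, pods that leave those fields unset are rejected unless something fills in defaults, and the per-container <code>max</code> keeps a single container from claiming the whole namespace budget in one allocation.</p>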
<p><strong>The receipt</strong> — a server-side dry-run of a single 4Gi pod, which never gets created:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ kubectl apply -n web-tenant-acme --dry-run<span class="o">=</span>server -f - &lt; noisy-neighbor.yaml
</span></span><span class="line"><span class="cl">Error from server <span class="o">(</span>Forbidden<span class="o">)</span>: error when creating <span class="s2">&#34;STDIN&#34;</span>:
</span></span><span class="line"><span class="cl">pods <span class="s2">&#34;noisy-neighbor&#34;</span> is forbidden:
</span></span><span class="line"><span class="cl">  maximum memory usage per Container is 1Gi, but limit is 4Gi
</span></span></code></pre></div><p>Not a kernel OOMKill. Not a pod stuck in <code>Pending</code>. A flat refusal from the API server before the scheduler even sees the request.</p>
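<p>The manifest itself isn&rsquo;t shown above. A minimal <code>noisy-neighbor.yaml</code> that should trip the same <code>LimitRange</code> check could look like this; the container name, image, and command are assumptions, and only the pod name and the 4Gi limit come from the receipt:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: a pod whose memory limit exceeds the LimitRange max of 1Gi.
apiVersion: v1
kind: Pod
metadata:
  name: noisy-neighbor
spec:
  containers:
    - name: hog                   # hypothetical container name
      image: busybox              # assumed image; any image works for a dry run
      command: [&#34;sleep&#34;, &#34;3600&#34;]
      resources:
        limits:
          memory: 4Gi             # above the per-container max, so admission refuses it
</code></pre></div>
<p>Because the check runs during admission, the image is never pulled and the pod never reaches a node.</p>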
<hr>
<h2 id="what-this-does-not-prove">What this does <em>not</em> prove</h2>
<p>A homelab demo on one node with two synthetic tenants is not n8n Cloud. The honest gaps:</p>
<ul>
<li><strong>Execution sandboxing.</strong> A workflow can still run arbitrary code via the <code>Code</code> node or shell-outs. These walls stop <em>infrastructure</em> leakage; they don&rsquo;t sandbox what n8n itself executes. Real n8n Cloud needs more than namespace walls for that — gVisor / Firecracker / per-tenant worker pools are the usual answers, and n8n&rsquo;s <a href="https://docs.n8n.io/hosting/scaling/queue-mode/">queue mode</a> lends itself to the last.</li>
<li><strong>Pooled worker queues.</strong> Queue mode runs main/webhook/worker as separate deployments backed by Redis + Postgres. Two tenants sharing a worker pool need additional checks at the job-routing layer to keep workflows from accessing the wrong tenant&rsquo;s binary data. Out of scope for the homelab demo.</li>
<li><strong>Control plane.</strong> Both tenants reach the same API server. A cluster-admin-equivalent compromise breaks everything. This is the assumption every shared K8s setup makes.</li>
<li><strong>Node-level.</strong> Same kernel. Container escape, CPU side channels, the usual list — all apply. For paranoid tenants the answer is dedicated nodes via taints/tolerations or separate clusters entirely.</li>
</ul>
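<p>For the dedicated-nodes option, a pod-template fragment might look like the sketch below. The <code>tenant=acme</code> label and taint are hypothetical and would have to be applied to the nodes beforehand:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Sketch: pin Acme's pods to nodes reserved for Acme.
# Assumes the nodes were prepared with, for example:
#   kubectl label nodes &lt;node&gt; tenant=acme
#   kubectl taint nodes &lt;node&gt; tenant=acme:NoSchedule
spec:
  nodeSelector:
    tenant: acme                  # hypothetical label: schedule only onto Acme's nodes
  tolerations:
    - key: tenant                 # hypothetical taint key
      operator: Equal
      value: acme
      effect: NoSchedule          # the taint keeps other tenants' pods off these nodes
</code></pre></div>
<p>The toleration only <em>permits</em> scheduling onto the tainted nodes; the <code>nodeSelector</code> is what forces the pod there, so both halves are needed.</p>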
<p>The demo proves the <em>namespace-shaped</em> walls hold. It does not prove the whole stack is safe against a determined attacker already running code inside a tenant. That&rsquo;s a different post.</p>
<hr>
<p><em>Part of a Kubernetes-on-the-homelab series — previously: <a href="/posts/k8s-network-isolation/">preventing a compromised pod from calling your database</a>, <a href="/posts/k8s-gitops-secrets/">GitOps secrets</a>.</em></p>
]]></content:encoded></item></channel></rss>