<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Notes on hippotion</title><link>https://blog.hippotion.com/tags/notes/</link><description>Recent content in Notes on hippotion</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 21 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.hippotion.com/tags/notes/index.xml" rel="self" type="application/rss+xml"/><item><title>📝 Dev Notes</title><link>https://blog.hippotion.com/posts/dev-notes/</link><pubDate>Sun, 21 Jun 2026 00:00:00 +0000</pubDate><guid>https://blog.hippotion.com/posts/dev-notes/</guid><description>Running notes on things I&amp;rsquo;ve hit, fixed, or found worth remembering.</description><content:encoded><![CDATA[<h2 id="kubernetes-init-container-crash-loop-leaves-dirty-emptydir">Kubernetes: init container crash loop leaves dirty emptyDir</h2>
<p>When a pod&rsquo;s init container crashes, Kubernetes restarts <strong>only the init container</strong> — not the whole pod. The <code>emptyDir</code> volume survives between retries. If your init container does a <code>git clone</code> into a fixed path, the second attempt fails with &ldquo;destination path already exists.&rdquo;</p>
<p>Fix: <code>rm -rf</code> the target dir before cloning.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">rm -rf /git/repo
</span></span><span class="line"><span class="cl">git clone --depth<span class="o">=</span><span class="m">10</span> --branch<span class="o">=</span>main https://... /git/repo
</span></span></code></pre></div><p>Even after many restarts, no manual cleanup is needed: events expire after about an hour (the API server&rsquo;s default event TTL), and the Deployment controller replaces old pods automatically. Check recovery with:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">kubectl get events -n &lt;namespace&gt; --sort-by<span class="o">=</span><span class="s1">&#39;.lastTimestamp&#39;</span> <span class="p">|</span> tail -10
</span></span></code></pre></div><h2 id="a-cpu-spike-that-was-actually-memory-thrashing-adding-ga4-to-hugo">A &ldquo;CPU spike&rdquo; that was actually memory thrashing (adding GA4 to Hugo)</h2>
<p>Wanted Google Analytics on this blog. PaperMod already calls a <code>google_analytics.html</code> partial in <code>head.html</code>, but it&rsquo;s gated behind <code>hugo.IsProduction | or (eq site.Params.env &quot;production&quot;)</code>. My blog pod runs <code>hugo server</code>, which reports the environment as <em>development</em> <strong>by default</strong> (unless started with <code>--environment production</code>), so the partial never fires. I &ldquo;fixed&rdquo; that by setting <code>env = &quot;production&quot;</code> in the site params.</p>
<p>That was the wrong lever. <code>env = &quot;production&quot;</code> flips on the theme&rsquo;s whole production path: minification, OpenGraph, Twitter cards, and schema JSON across every page. The next full rebuild blew past the pod&rsquo;s <strong>128Mi</strong> memory limit and the container got <strong>OOMKilled</strong> (exit 137). Server load jumped.</p>
<p>The right way to add GA without touching the build mode: drop the tag in <code>layouts/_partials/extend_head.html</code>. PaperMod includes that partial <em>unconditionally</em>, above the production guard — so it loads under <code>hugo server</code> too.</p>
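<p>A minimal sketch of that partial, assuming the stock GA4 <code>gtag.js</code> snippet. <code>G-XXXXXXXXXX</code> is a placeholder measurement ID, not a real one:</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh"># Run from the site root. This overwrites any existing extend_head.html,
# so merge by hand if the file already has content.
mkdir -p layouts/_partials
cat &gt; layouts/_partials/extend_head.html &lt;&lt;'EOF'
&lt;script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX"&gt;&lt;/script&gt;
&lt;script&gt;
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', 'G-XXXXXXXXXX');
&lt;/script&gt;
EOF
</code></pre>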
<p>But here&rsquo;s the part that fooled me. After reverting <code>env</code>, load was <em>still</em> climbing — to ~14 on a single node — and <code>ps</code> showed hugo at &ldquo;500% CPU&rdquo;. Looked like a runaway compute loop. It wasn&rsquo;t:</p>
<pre tabindex="0"><code>%Cpu(s): 2.1 us, 41.0 sy, 6.9 id, 50.0 wa     &lt;- 50% iowait, 2% userspace
PID ... S  %CPU  COMMAND
... D  333  hugo    &lt;- state D, RES pinned at 127MiB (the 128Mi cgroup limit)
</code></pre><p>Two lessons:</p>
<ol>
<li><strong><code>ps %CPU</code> is a lifetime average</strong>, not instantaneous. A process that ran hot for 1s then blocked still shows a big number for a while. Use <code>top</code> for what&rsquo;s happening <em>now</em>.</li>
<li><strong>High load + high <code>%wa</code> + a <code>D</code>-state process sitting at its cgroup memory limit = memory thrashing, not CPU.</strong> Hugo wasn&rsquo;t computing — it was wedged against the 128Mi ceiling, and every allocation triggered kernel reclaim/swap. A sub-second build dragged out for minutes in uninterruptible I/O sleep, and all those blocked tasks are what inflate load average (Linux counts <code>D</code>-state in load).</li>
</ol>
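<p>Roughly the checks that separate the two cases, as a sketch. It assumes a Linux node with procps installed and a cgroup v2 hierarchy; the cgroup file paths are as seen from inside the container and differ on cgroup v1:</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh"># Current view, one sample per second: watch "wa" (iowait), "si"/"so"
# (swap in/out) and "b" (tasks blocked in uninterruptible sleep).
vmstat 1 5

# Process state and resident memory; "D" in STAT means uninterruptible sleep.
ps -o pid,stat,rss,pcpu,comm -C hugo

# Inside the container (cgroup v2): the memory limit and current usage,
# then counters for how often the limit was hit ("max") and OOM kills.
cat /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory.current
cat /sys/fs/cgroup/memory.events
</code></pre>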
<p>The actual fix was boring: 128Mi was always marginal for <code>hugo-extended</code> + PaperMod. Bumped the limit to 512Mi and the thrash vanished.</p>
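<p>One way to confirm the kill and raise the limit, as a sketch. The <code>blog</code> namespace, Deployment and container names and the <code>app=blog</code> label are hypothetical:</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh"># Last termination reason per pod; an OOM kill shows up as "OOMKilled".
kubectl get pods -n blog -l app=blog \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'

# Raise the memory limit on the live Deployment. If the manifest lives in Git,
# change it there instead so the next sync does not revert it.
kubectl set resources deployment/blog -n blog -c blog --limits=memory=512Mi
</code></pre>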
<p>Takeaway: when load spikes, read <code>%wa</code> and process state before blaming the CPU. And don&rsquo;t flip <code>env=production</code> on a long-lived <code>hugo server</code> just to ungate one partial — use <code>extend_head.html</code>.</p>
<h2 id="self-hosting-supabase-lean-on-k3s-the-gotcha-checklist">Self-hosting Supabase (lean) on k3s: the gotcha checklist</h2>
<p>Ran the community <code>supabase/supabase</code> chart on a 16Gi single node — enabled db, rest, auth, meta, studio, kong + the log pipeline (analytics/Logflare + vector); left realtime, storage, imgproxy, edge-functions off. The deploy is easy; these are the things that actually bit:</p>
<ul>
<li><strong>Studio shows &ldquo;no tables&rdquo;.</strong> Supabase is single-database by design — Studio, PostgREST and auth all use the database named <code>postgres</code>. App tables in a <em>separate</em> database are invisible to all of it. Put your schema in <code>postgres</code>&rsquo;s <code>public</code> schema.</li>
<li><strong>Studio won&rsquo;t schedule with edge-functions disabled.</strong> Its Deployment mounts the functions PVC unconditionally. Either run functions, or create the PVC yourself and leave functions off.</li>
<li><strong>edge-functions crashloops</strong> if you keep it: it boots by fetching a Deno module from the internet, which a deny-all egress policy blocks. You usually only want the PVC it leaves behind anyway.</li>
<li><strong>vector (log collector) stays silent</strong> under a deny-all policy. It discovers pods via the Kubernetes API, so it needs <strong>API egress</strong>, not just app ports (<code>allowEgressToKubeApi</code>). A log shipper that can&rsquo;t reach the API collects nothing and doesn&rsquo;t say why.</li>
<li><strong><code>secretRef</code> must contain <em>every</em> key the chart maps</strong> — including non-secret ones like <code>database</code> and <code>openAiApiKey</code>. Miss one and pods sit in <code>CreateContainerConfigError</code>.</li>
<li><strong>ESO <code>ExternalSecret</code> shows perpetual <code>OutOfSync</code> in Argo CD</strong> unless you spell out the remoteRef defaults (<code>conversionStrategy: Default</code>, <code>decodingStrategy: None</code>, <code>metadataPolicy: None</code>) — ESO writes them back, and the compact form drifts.</li>
<li><strong><code>postgres</code> is not a superuser.</strong> <code>CREATE DATABASE … OWNER app</code> fails with <code>must be member of role</code>. Supabase keeps the real superuser (<code>supabase_admin</code>) to itself; <code>GRANT app TO postgres</code> first.</li>
<li><strong>Logflare needs no BigQuery.</strong> It runs on the self-hosted Postgres backend (the <code>_supabase</code> database, <code>_analytics</code> schema) — logs land in <code>_analytics.log_events_*</code>.</li>
</ul>
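<p>A few of these can be checked or fixed from a shell. This is a sketch: the <code>supabase</code> namespace, the <code>supabase-db</code> Secret name, the <code>appdb</code> database and the <code>PGURL</code> / <code>PGURL_SUPABASE</code> connection strings are all hypothetical, and the <code>psql</code> calls assume you connect as the <code>postgres</code> role.</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh"># List the keys a Secret actually holds (values are not printed), to compare
# against the keys the chart maps.
kubectl describe secret supabase-db -n supabase

# Pods stuck on a missing key or an unbound PVC show up here.
kubectl get pods,pvc -n supabase

# PGURL points at the "postgres" database. Grant membership first, then
# create the database owned by the app role.
psql "$PGURL" -c 'GRANT app TO postgres;' -c 'CREATE DATABASE appdb OWNER app;'

# PGURL_SUPABASE points at the "_supabase" database. List the Logflare
# log tables to confirm logs are landing in Postgres.
psql "$PGURL_SUPABASE" -c '\dt _analytics.log_events_*'
</code></pre>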
<p>None of this is in the README. It&rsquo;s the gap between &ldquo;I deployed Supabase&rdquo; and &ldquo;I run it.&rdquo;</p>
]]></content:encoded></item></channel></rss>