Automation on hippotion

Two Birds That Read the Web for Me: One Hoards, One Scatters

Fri, 12 Jun 2026 00:00:00 +0000

I have a vault of markdown notes that I treat as a second brain, and I run GitOps over it like it’s production infrastructure. It already has agents that work on it from the inside: a nightly gardener that weeds orphans and suggests links, and a Wanderer that collides random pairs of my own notes looking for connections I missed.

The obvious next move is to point an agent at the outside — let it read the web and tell me what matters. That move is also a small landmine, and most “AI reads the internet for you” tooling steps right on it. So this week I built two of them instead of one, named them after corvids, and the reason there are two is the entire point of this post.

Meet the Magpie and the Blue Jay.

The same fear, twice

Before either bird got a name, both inherited a single non-negotiable rule, and it’s worth saying plainly because it’s the part everyone skips:

An agent that reads the internet and writes to your notes is a prompt-injection pipeline aimed straight at your trust root.

My vault isn’t just storage. Every other agent (the gardener, the Wanderer, the search that answers “what am I building?”) reads it as trusted context. So the moment one agent ingests a GitHub README or a news headline (attacker-influenceable text) and is allowed to write a note, a stranger on the internet gets to whisper instructions into the thing my whole system believes. “Structured API” narrows that surface. It does not close it.

Both birds are built on the same chassis as the gardener, and that chassis enforces the fear rather than trusting the model to behave:

Two phases, hard split. A wrapper-owned FETCH step pulls the external text in plain Bash. Claude is not in the loop and can’t be talked into anything, because it isn’t running yet. Then a COLLIDE step starts claude -p with the fetched text handed in as inline data, and that process gets only Read / Glob / Grep / Write. No Bash, no git, no network, no MCP. While untrusted text is in the context window, the agent has no tool that can reach the outside world or rewrite history.
Allowlist, not the open web. Each bird reads a short, named list of sources. Nothing else.
Quarantine, not the vault. Findings land in quarantine//, which lives outside vault/. The indexer never sees it. Nothing it writes is ever auto-wikilinked into the graph. Promotion to a real note is a thing I do, by hand, after reading it.
Blast radius is checked, not assumed. A run may modify only its quarantine directory. Anything written anywhere else is discarded and reported as a violation.
“Nothing found” is a successful run. Neither bird has a quota. This is the honesty contract I stole from the Wanderer — an agent under pressure to produce N findings will manufacture N findings, and manufactured insight is worse than silence.
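The blast-radius rule can be sketched as a pure path check. This is a minimal illustration, not the wrapper's actual code: the function name and the `quarantine/magpie` directory used in the usage note are hypothetical, and the check is lexical only, so a real wrapper would also have to refuse symlinks.

```python
import posixpath
from pathlib import PurePosixPath


def find_violations(changed_paths, quarantine_dir):
    """Return every changed path that is not inside the run's quarantine directory."""
    allowed = PurePosixPath(posixpath.normpath(quarantine_dir))
    violations = []
    for raw in changed_paths:
        # Collapse "a/../b" segments before comparing, so a path cannot
        # start inside quarantine and climb back out of it.
        path = PurePosixPath(posixpath.normpath(raw))
        if allowed not in path.parents:
            violations.append(raw)
    return violations
```

Called with the list of files a run touched and a directory such as `quarantine/magpie`, it returns the paths to discard and report. An empty list means the run stayed inside its box.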

That’s the shared spine. Now the interesting part: given the same security model, the two birds do almost opposite things, and trying to make one bird do both jobs would have quietly ruined it.

The Magpie hoards what’s already shiny

A magpie collects shiny objects and keeps them close. Mine watches my own GitHub stars.

The premise is slow public signal × private context. I starred some repo three weeks ago, forgot about it, and moved on. Meanwhile my projects shifted. The Magpie runs weekly, pulls my starred repos through one allowlisted endpoint (gh api user/starred), and collides each one against what I’m actively building right now — the live projects, the open hubs.

Its output contract is a tight one: it is a relevance filter. It fires only when a star actually touches live work, and every finding has to name three concrete things — the repo, the project it connects to, and one “so what.” A vague “these are thematically related” doesn’t count as a hit. It’s a watchdog on the dials, not a newsletter.
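One way to make that contract mechanical is a record with three required fields and a filter that drops anything incomplete. This is a sketch under assumptions: the `Finding` type and `reportable` helper are hypothetical names, and whether a "so what" is any good is still the agent's judgment, which no field check can capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    repo: str      # the starred repository
    project: str   # the live project it connects to
    so_what: str   # the one concrete consequence


def reportable(findings):
    """Keep only findings that name all three things; an empty list is a valid result."""
    return [
        f for f in findings
        if all(part.strip() for part in (f.repo, f.project, f.so_what))
    ]
```

A finding with a blank project or a blank so-what is dropped rather than padded out, which matches the no-quota rule: zero reportable findings is a successful run.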

The supervised proof run, over 28 stars, surfaced exactly two real hits and refused to invent a third:

supertonic (on-device multilingual TTS) × my Hungarian-audiobook voice-cloning project — a possible escape from a TTS fight I’d been losing. I checked: it genuinely supports Hungarian. That’s a hit with a so-what.
agentmemory × the exocortex itself — prior art for persistent AI memory, notably with benchmarks my own notes lacked. (And if you’ve read about the time I benchmarked my own search and it lost, you’ll know how much I needed that nudge.)

The other ~22 stars mapped to tidy thematic clusters and were correctly not reported. That restraint is the feature.

The Blue Jay scatters acorns and forgets where

Here’s the bird that explains why there are two.

Blue jays don’t hoard close like magpies. They cache acorns far and wide and forget where they buried some — and the forgotten ones grow into oak trees. Ecologists think blue jays are why oak forests spread north after the last ice age. Seed dispersal, by way of a bad memory. That is exactly the job I wanted for the second bird, and the metaphor was too good to pass up.

The Blue Jay reads an allowlist of eight RSS feeds, picked so tech and science cross-pollinate:

Tech: Hacker News (high-score front page), lobste.rs, Ars Technica
Science & ideas: phys.org, Quanta, Aeon, Nautilus
Wildcard: Medium — but scoped to specific tag feeds, never the raw firehose of crypto and self-help

Quanta, Aeon, and Nautilus are on that list on purpose: they’re the connective tissue, the feeds where “huh, that’s weirdly similar to…” happens before my vault even gets involved.

And its output contract is the opposite of the Magpie’s. The Blue Jay is a serendipity filter. Its job is to surface the connection that isn’t in my projects yet — the distant idea, the acorn worth burying. If I ran it through the Magpie’s “only fire on a live-work hit” rule, I would strangle the one thing it exists to do. Relevance and serendipity pull in opposite directions, and you can’t tune a single agent to maximize both.

One more load-bearing detail, half design and half security: the Blue Jay collides on the RSS summary only — title, abstract, link. It never pulls the full article body into context. That’s simultaneously the lower-injection path and the right cognitive shape (a headline is a seed; I click through myself from quarantine if the seed is interesting). The narrow input is doing double duty.
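A minimal sketch of the summary-only rule for an RSS 2.0 feed, using the standard library. The sample feed and the `summaries` function are illustrative assumptions, not the Blue Jay's real fetch step. Python's documentation warns that `xml.etree` is not hardened against maliciously constructed XML, so a wrapper reading untrusted feeds would want a hardened parser in front of this.

```python
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Jays and oaks</title>
      <link>https://example.org/jays</link>
      <description>How forgotten caches become forests.</description>
      <content:encoded>FULL ARTICLE BODY, never forwarded</content:encoded>
    </item>
  </channel>
</rss>"""


def summaries(feed_xml):
    """Extract only title, abstract and link from each item; ignore everything else."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "abstract": (item.findtext("description") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items
```

Because the function copies three named fields instead of removing unwanted ones, a feed that starts shipping full bodies in a new element does not widen what reaches the context window.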

Why two birds and not one with a flag

I genuinely considered making this one agent with a --mode=relevance|serendipity switch. I’m glad I didn’t, and the reasoning generalizes past birds:

| Aspect | Magpie | Blue Jay |
| --- | --- | --- |
| Source | my GitHub stars (structured API) | 8 RSS feeds (open prose) |
| Injection risk | low | the highest frontier |
| Fires when | a star hits live work | a summary sparks a distant idea |
| Output | relevance: repo → project → so-what | serendipity: the not-yet-relevant connection |
| Failure mode it guards against | noise / false relevance | being strangled into silence |

Two things made the split non-negotiable. First, the output contracts are too different to share one brain — “only speak on a hit” and “speak about the thing that isn’t a hit yet” are contradictory prompts, and a single agent told to do both does neither well. Second, open news is a higher injection frontier than a structured stars API, so the riskier bird deserves its own enforced blast-radius wrapper, not a code path bolted onto the safe one. When two jobs disagree on both what good output is and how dangerous the input is, that’s not a flag. That’s two programs.

So now my vault has two more agents reading the world on a cron. The Magpie runs Saturday at 06:00 and tells me when something I bookmarked finally became relevant. The Blue Jay runs Saturday at 07:00 and buries acorns in a quarantine folder, most of which I’ll ignore — but I only need one of them to grow into an oak.

Both are on probation for their first few runs, because I don’t trust a thing that reads the internet until I’ve watched it behave. But the part I’m actually happy about isn’t the agents. It’s that building the second one forced me to say out loud what the first one was secretly assuming — and the names made the difference impossible to forget. A magpie hoards. A blue jay scatters. You want both, and you do not want them to be the same bird.

Mind the gap: I pointed monitoring at my own skill set

Fri, 27 Mar 2026 00:00:00 +0000

A while back I applied for a senior platform role at n8n and didn’t land it. Fair enough — but “fair enough” isn’t actionable. Rejections come with no logs, no metrics, no trace. For someone who runs thirty-odd services with full observability, having vibes as the only instrumentation on my own career felt architecturally embarrassing.

So I built mind-the-gap: a pipeline that measures what the market demands, diffs it against what I can prove, and renders the gap as a private dashboard on my cluster. The job hunt is now a monitored system. This post is about the non-obvious decisions.

Demand: an LLM reads job listings so I don’t have to

I already had a job poller — an n8n workflow that polls the public ATS APIs (Greenhouse / Lever / Ashby) of ~33 companies plus a broad remote-jobs feed every six hours. A sibling workflow now re-fetches the same boards and, for every listing that passes the role+location gate, asks a small hosted LLM (Llama-3.1-8B) for a structured extraction:

{"seniority": "senior", "skills": [{"name": "kubernetes", "importance": "must"}, ...]}

One row per (job, skill) lands in an n8n Data Table. Decisions that mattered:

One LLM call per job, not one batch. Free-tier inference times out on batches; per-job calls are slower but fail independently. A lesson the poller already paid for.
Insert doubles as the processed-marker. A job whose extraction fails to parse produces no rows — so it’s retried next run, for free. No status column, no second table.
Canonicalization in code, not in the prompt. The model says “K8s”, “k3s”, “EKS” on different days regardless of instructions. A dumb alias map (k8s→kubernetes, eks→aws) beats prompt engineering for consistency.
8B is good enough — with a guard. It occasionally echoed the seniority enum back literally ("junior|mid|senior|staff|lead|unspecified"). The fix is one line of validation, not a bigger model.
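The last three decisions fit in one small function. This is a sketch, not the workflow's actual Code node: the alias map holds only the two aliases named above, the row fields are hypothetical, and falling back to "unspecified" on an invalid seniority is an assumption about what the one-line guard does.

```python
import json

ALIASES = {"k8s": "kubernetes", "eks": "aws"}  # only the two aliases the post names
SENIORITY = {"junior", "mid", "senior", "staff", "lead", "unspecified"}


def canonical(name):
    """Lowercase and trim a skill name, then apply the alias map."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def rows_for(job_id, raw_reply):
    """Turn one model reply into (job, skill) rows; any parse problem yields no rows."""
    try:
        data = json.loads(raw_reply)
        seniority = data["seniority"]
        if seniority not in SENIORITY:
            seniority = "unspecified"  # assumed fallback, e.g. for an echoed enum
        rows = {}
        for skill in data["skills"]:
            name = canonical(skill["name"])
            # One row per (job, skill): the first mention of a skill wins.
            rows.setdefault(name, {
                "job_id": job_id,
                "skill": name,
                "importance": skill["importance"],
                "seniority": seniority,
            })
        return list(rows.values())
    except (ValueError, KeyError, TypeError, AttributeError):
        return []  # no rows inserted, so the job is retried next run
```

Returning an empty list on failure is what lets the insert double as the processed-marker: a job with no rows looks unprocessed the next time the workflow runs.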

Supply: no artifact, no credit

The other side of the diff is a skills registry — markdown in my knowledge vault, with a machine-parseable YAML block. Every skill has a state, and the rule that keeps the whole thing honest is brutal: a skill counts as proven only if an artifact exists — a public repo, a blog post, documented production experience. Otherwise it’s claimed, and claimed earns half credit.

That rule immediately produced the most useful insight of the project: “invisible skill” is a real category. Python turned out to be the market’s #5 ask. I use it constantly — and could point to nothing public that shows it. The cheapest score increase isn’t learning something new; it’s a weekend making an existing skill visible. No gut-feeling gap analysis would have ranked “write about what you already do” above “learn the shiny thing.”

The score: distinct companies, not mentions

First naive aggregation: Canonical’s listings mention Ubuntu nine times, all marked must-have — suddenly Ubuntu looks like the hottest skill in Europe. Employer skew is the noise floor of small samples. The fix: demand weight = distinct companies naming the skill, not total mentions. One enthusiastic employer can’t move the radar.

Two more scoring rules I’d defend in review:

Skills named by fewer than two companies don’t count at all — single-listing noise stays out.
Demand the registry hasn’t classified yet shows up as “unreviewed” and counts fully against the score. An unreviewed market signal is a gap until proven otherwise; the dashboard nags me to triage it.
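Put together, the scoring rules might look like the sketch below. The post does not give the exact formula, so the weighted average here is an assumption, as are the function names and the `(company, skill)` row shape. The stated rules are the distinct-company weight, the two-company floor, half credit for claimed skills, and zero credit for anything unreviewed.

```python
from collections import defaultdict

CREDIT = {"proven": 1.0, "claimed": 0.5}  # any other state, including unreviewed, earns 0


def demand_weights(rows, min_companies=2):
    """Weight each skill by distinct companies naming it; drop single-company skills."""
    companies = defaultdict(set)
    for company, skill in rows:
        companies[skill].add(company)
    return {
        skill: len(names)
        for skill, names in companies.items()
        if len(names) >= min_companies
    }


def coverage(rows, registry):
    """Share of demand weight that is backed by supply credit, between 0.0 and 1.0."""
    weights = demand_weights(rows)
    total = sum(weights.values())
    if total == 0:
        return 0.0
    earned = sum(
        weight * CREDIT.get(registry.get(skill, "unreviewed"), 0.0)
        for skill, weight in weights.items()
    )
    return earned / total
```

With this shape, one employer naming Ubuntu nine times contributes a single company and falls under the floor, while a skill the registry has never heard of drags the score down until it is triaged.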

Rendering: the page is a git commit

The dashboard is a single static HTML file, and the pipeline that produces it never touches the cluster. render.js lives in this repo as the single source of truth; a nightly n8n workflow fetches it raw from GitLab, eval()s it against the Data Table rows and the registry, and — only if the result differs from what’s committed (timestamps stripped, or every night is a “change”) — PUTs the new index.html back via the GitLab API.
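The "only if the result differs" check depends on ignoring timestamps on both sides. A minimal sketch, assuming the page embeds ISO 8601 timestamps; the real render.js may mark its generation time differently, and both function names are hypothetical.

```python
import re

# Assumed timestamp shape: 2026-03-27T02:00:00Z, with optional fraction or offset.
ISO_TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)


def without_timestamps(html):
    """Remove every ISO 8601 timestamp so two renders can be compared by content."""
    return ISO_TIMESTAMP.sub("", html)


def needs_commit(rendered_html, committed_html):
    """True only when something other than a timestamp changed."""
    return without_timestamps(rendered_html) != without_timestamps(committed_html)
```

Stripping on both sides of the comparison, rather than only on the new render, keeps the check symmetric, so the committed file can still carry its real generation time.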

Serving is the same pattern as this blog: nginx plus a git-pull sidecar, deployed by Argo CD, behind the cluster’s OAuth middleware. The renderer has no kubeconfig, no SSH, no cluster access of any kind. GitLab stays the only source of truth — even for a page that rewrites itself nightly. If the workflow goes rogue, the worst it can do is a reviewable commit.

Day-one verdict

First run: 2,297 postings fetched, 25 in scope, 257 skill rows. Coverage score: 63%. Kubernetes and AWS tied at the top of demand — which means the AWS gap-closing project already in flight stopped being a hunch and became the measured top of the market. Go is the only top-ten demand with zero supply. The dashboard doesn’t get anyone a job; it just makes sure every learning Saturday is pointed where the data says, not where the hype does.

The job board rejected me. The data didn’t.

Workflows, render.js, and setup: github.com/janos-gyorgy/mind-the-gap.

🌙 Killing Mildew in the Dark

Sun, 15 Mar 2026 00:00:00 +0000

I saw a clip of an autonomous farm robot — TRIC Robotics — driving strawberry beds in total darkness, killing pathogens with UV light instead of spraying them. Zero chemicals, zero runoff. My first reaction was “that’s a marketing robot.” My second, after reading, was “no, the science is real — and the robot is the least interesting part.”

The interesting part is why it works at night.

The trick is the darkness, not the light

UV-C light (254 nm) shreds the DNA of fungal pathogens like powdery mildew. Nothing new there — it’s the same wavelength that sterilises water and hospital rooms. The problem is that in daylight those pathogens repair the damage, using a light-activated enzyme (photoreactivation). Zap them at noon and they patch themselves up by evening.

So you do it in the dark. With the repair pathway switched off, a tiny dose sticks. Cornell’s Gadoury lab spent years on this: nighttime UV-C at doses around 85 J/m² once a week gave season-long powdery mildew control on strawberries that beat the best available fungicides. Grapes, cucumbers, roses — same story. Applied about 30 minutes after sunset, finished within a couple of hours.

That’s a genuinely beautiful result. Not a new chemical, not a stronger lamp — just the same old light, applied when the enemy can’t fix itself.

What it is, and what it absolutely isn’t

Before anyone rips out their whole garden routine: this is not a general pesticide replacement. The evidence is strong for one specific class of problem — surface fungal pathogens, mostly powdery and downy mildew on susceptible plants (strawberry, grape, cucurbits, roses). It does nothing for slugs, most insects, or anything in the soil.

So the honest pitch is narrow: if you fight recurring mildew every summer, this is a chemical-free tool that genuinely works. If your real enemy is aphids, don’t build this — you’d be solving the wrong problem with a dangerous toy.

Which brings me to the toy being dangerous.

UV-C is not mood lighting. Seconds of direct exposure burn your eyes (welder’s-flash) and skin, and it’s a long-term cancer risk. This is the single reason a home version has to be designed carefully — and the reason I’d never run an exposed source in a garden where my kids play.

Any home rig needs, non-negotiably:

A physical enclosure or skirt so the light only hits the bed, never a person.
A hard interlock — a motion sensor or door contact that cuts power instantly if anything moves into range.
A schedule that only ever runs in the dead of night, when everyone’s inside and asleep.

You can also over-dose the plants — too much UV-C scorches leaves. The whole point is that the effective dose is tiny, so more is not better.

The build (the home version of “while you sleep”)

You don’t need TRIC’s autonomous navigation. A home garden has fixed beds — so the robot problem collapses into a much simpler one: get a shielded lamp over a known bed, for a known number of seconds, at night. That’s not robotics. That’s a timer and a rail.

Here’s the plan I’d build:

The lamp. A low-pressure UV-C tube (254 nm — not the “UV-C LED” novelties, and not ozone-generating 185 nm lamps). Mounted in a hooded reflector so the light points down and is blocked from the sides.
The geometry. Fix it at a set height over the bed — on a simple cart that rolls a track, or just a static fixture over a raised bed. Fixed height = repeatable dose.
The dose, measured not guessed. This is the one place you can’t wing it: borrow or buy a UV-C meter, measure the irradiance (W/m²) at canopy height, then time = 85 ÷ irradiance. If the lamp delivers, say, 5 W/m² at the leaves, that’s ~17 seconds of exposure. Seventeen seconds, once a week. That tiny number is the whole reason this is plant-safe and low-energy — and why a slow-moving robot pass is enough on a farm.
The brain. This is the bit that’s actually in my wheelhouse: an ESP32 + a relay, on the homelab. Fires at 2 a.m. for N seconds, once a week. A PIR sensor wired as a kill-switch. A mind-the-gap-style cron and a log line to my phone when it ran. The “autonomous robot working while you sleep” headline, minus the $100k of autonomy I don’t need for four raised beds.
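The dose arithmetic from the list above, written out. Dose in J/m² is irradiance in W/m² multiplied by time in seconds, so time is dose divided by irradiance. The function name is hypothetical, and the 85 J/m² target and 5 W/m² reading are the figures quoted above, not measurements.

```python
def exposure_seconds(dose_j_per_m2, irradiance_w_per_m2):
    """Seconds of exposure needed to deliver a dose at a measured irradiance."""
    if irradiance_w_per_m2 <= 0:
        raise ValueError("irradiance must be a positive meter reading")
    # 1 W = 1 J/s, so (J/m²) / (W/m²) leaves seconds.
    return dose_j_per_m2 / irradiance_w_per_m2


print(exposure_seconds(85, 5))  # 17.0
```

Halving the irradiance doubles the time: a meter reading of 2.5 W/m² at canopy height would call for 34 seconds. That is why the measurement has to be taken at the fixed mounting height, not estimated.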

Verdict

I haven’t built this yet — it’s a someday project, parked here so I stop losing the idea. But it’s the rare someday project where the science is settled, the materials are cheap, and the only real engineering is safety and dose control, both of which are squarely the kind of problem I like.

The farm robot’s pitch is “pesticide-free at scale.” The home version’s pitch is smaller and more honest: if mildew is your summer tax, you can pay it in seventeen seconds of midnight light instead of a spray bottle. I’ll take that trade.

When I build it, the failure log gets its own post.

🎯 Know the Market Without Job-Hunting: An LLM-Scored Job Poller in n8n

Fri, 13 Feb 2026 00:00:00 +0000

You don’t have to be about to change jobs to want to know the landscape. What’s being built, what it pays, where you’d actually fit — staying current on the market (and your own worth) is just good professional hygiene. The trouble is that checking is tedious, so most of us don’t, until we’re already job-hunting and starting cold.

So I automated mine. An n8n workflow on my homelab polls job boards every six hours, scores each new posting against my profile with an LLM, and emails me only the strong matches — the ones scoring 80%+. When it’s quiet, it’s silent. When something genuinely fits, I know the same day. Here’s what I learned building it. Repo at the bottom.

Three APIs cover most of the market

Company career pages look bespoke, but underneath, the vast majority run on one of three ATS — and all three hand you the jobs as unauthenticated JSON:

Greenhouse — boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true
Lever — api.lever.co/v0/postings/{token}?mode=json
Ashby — api.ashbyhq.com/posting-api/job-board/{token}?includeCompensation=true

No scraping, no headless browser. You poll the API the page itself calls, normalize the three shapes into one { company, title, location, remote, url, posted_at, description, external_id }, and you’re done with the hard part.
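A sketch of the normalizer. The payload field names used here (`absolute_url`, `hostedUrl`, `jobUrl` and so on) are my understanding of the three public responses and should be treated as assumptions to verify against a live board. The function names and the "remote appears in the location text" heuristic are hypothetical. The external_id follows the {ats}:{company}:{id} pattern the dedup step uses.

```python
def _normalized(ats, company, job_id, title, location, remote, url, posted_at, description):
    """Build the one shared shape every later step works with."""
    return {
        "company": company,
        "title": title or "",
        "location": location or "",
        "remote": bool(remote),
        "url": url or "",
        "posted_at": posted_at,  # passed through as-is; formats differ per ATS
        "description": description or "",
        "external_id": f"{ats}:{company}:{job_id}",
    }


def _says_remote(location):
    """Heuristic (assumed): treat a location mentioning 'remote' as remote."""
    return "remote" in (location or "").lower()


def from_greenhouse(company, job):
    location = (job.get("location") or {}).get("name")
    return _normalized("greenhouse", company, job.get("id"), job.get("title"),
                       location, _says_remote(location), job.get("absolute_url"),
                       job.get("updated_at"), job.get("content"))


def from_lever(company, job):
    location = (job.get("categories") or {}).get("location")
    return _normalized("lever", company, job.get("id"), job.get("text"),
                       location, _says_remote(location), job.get("hostedUrl"),
                       job.get("createdAt"), job.get("descriptionPlain"))


def from_ashby(company, job):
    location = job.get("location")
    remote = job.get("isRemote") or _says_remote(location)
    return _normalized("ashby", company, job.get("id"), job.get("title"),
                       location, remote, job.get("jobUrl"),
                       job.get("publishedAt"), job.get("descriptionPlain"))
```

Every accessor uses `.get`, so a board that omits a field produces an empty value instead of an exception, and everything downstream of the normalizer only ever sees the eight shared keys.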

“Resolve the token” is half the battle

The naive assumption — the token is the company name, and everyone’s on one of the three — is half right. When I probed my initial wishlist, roughly half 404’d everywhere: HashiCorp (now under IBM → Workday), SUSE (SuccessFactors), Aiven (Teamtailor), Hugging Face. They’re on a fourth or fifth system entirely. The honest move was to ship the ~33 that actually resolve and leave the rest as disabled config stubs. Verify before you trust a slug.

Dedup without a database

I didn’t want to stand up Postgres just to remember which jobs I’d already seen. n8n’s Data Tables handle it natively: a seen_jobs table, an external_id namespaced {ats}:{company}:{id}, and the rowNotExists operation drops anything already recorded. State lives inside n8n, backed up with it. Zero extra infrastructure.

The ordering matters: notify first, mark seen second. The insert only happens after the email sends, so a failed send retries next run instead of silently swallowing a posting.
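The ordering can be shown with an in-memory set standing in for the seen_jobs Data Table. This is a sketch of the idea, not the n8n node graph: `notify_then_mark` and the `send_email` callable are hypothetical stand-ins.

```python
def notify_then_mark(jobs, seen, send_email):
    """Email unseen jobs, then record them; a failed send records nothing."""
    fresh = [job for job in jobs if job["external_id"] not in seen]
    if not fresh:
        return 0
    # If this raises, the update below never runs, so the same
    # postings are still unseen on the next poll.
    send_email(fresh)
    seen.update(job["external_id"] for job in fresh)
    return len(fresh)
```

Swapping the last two statements would give the failure mode the ordering exists to prevent: a posting marked seen but never delivered.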

The location filter is a trap

My first version kept everything that wasn’t explicitly US-based. The inbox filled with “Senior Platform Engineer — Spain (Remote)” and “… — United Kingdom (Remote)”. Those aren’t remote-for-me; they’re remote if you live in Spain. Useless from where I sit.

The fix was to invert the logic. Keep only three things:

globally-remote / worldwide / anywhere,
pan-EU (EMEA / Europe / EU / EEA),
my own country.

…and drop single-country remote, even EU ones. Region and home matches win over the country deny-list, and ambiguous locations are kept (a missed match is worse than one extra line to skim). That one change cut the noise more than anything else.
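The inverted filter, as a sketch. The precedence is the one described above: global, region and home matches first, then the country deny-list, then keep whatever is left. The term lists are illustrative assumptions, the deny-list holds only a few example countries, and `home_terms` is a parameter because the post does not name the home country.

```python
import re

GLOBAL_TERMS = ("worldwide", "anywhere", "global", "globally")
REGION_TERMS = ("emea", "europe", "eu", "eea")
DENY_COUNTRIES = ("united states", "usa", "spain", "united kingdom")  # illustrative only


def _mentions(text, terms):
    """True if any term appears as a whole word, so 'eu' does not match 'europe'."""
    return any(re.search(rf"\b{re.escape(term)}\b", text) for term in terms)


def keep_location(location, home_terms):
    text = location.lower()
    # Allow rules are checked first, so they win over the deny-list.
    if _mentions(text, GLOBAL_TERMS + REGION_TERMS + tuple(home_terms)):
        return True
    if _mentions(text, DENY_COUNTRIES):
        return False
    return True  # ambiguous: one extra line to skim beats a missed match
```

Whole-word matching matters for short tokens such as "eu", and the final `return True` is the deliberate bias toward false positives.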

Let an LLM read the actual job

Keyword + location filtering gets you a candidate list, but it can’t tell a “Platform Engineer” who herds Kubernetes from a “Platform Engineer” who owns a Figma design system. The job description can.

So the last step scores each new posting against my CV. My first version batched all of them into one big LLM call, which promptly timed out on the free tier. The fix was the opposite: one small call per job, which also means a single slow or rate-limited job never sinks the batch. Each call asks an NVIDIA NIM model (Llama 3.1 8B, OpenAI-compatible) for one number and a reason:

Score this job 0–100 for fit against my profile. Return {score, reason}.

That score is what lets me widen the net instead of narrowing it. On top of the curated company list I pull a broad remote-jobs feed (every company, all categories); the cheap keyword + location filters do the first pass, then I only email the roles scoring 80%+. Casting wide is fine when a model is the bar at the door. A line ends up looking like:

92% — Grafana Labs — Senior Platform Engineer (Remote, EMEA) — strong k8s/GitOps overlap — link

Scoring is fail-safe: if a call hiccups, that job is just skipped, and every posting gets marked seen either way — so nothing re-scores forever, and a rare bad run never floods or stalls the inbox.
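A sketch of the per-job, fail-safe scoring loop. The `ask_model` callable is a hypothetical stand-in for the hosted model call and is assumed to return the `{score, reason}` JSON as a string. Marking postings as seen happens elsewhere and is not shown.

```python
import json


def strong_matches(jobs, ask_model, threshold=80):
    """Score each job independently; a failed or malformed reply skips that job only."""
    matches = []
    for job in jobs:
        try:
            reply = json.loads(ask_model(job))
            score = int(reply["score"])
        except Exception:
            # Timeout, rate limit, bad JSON or missing score: skip this one job.
            continue
        if score >= threshold:
            matches.append({**job, "score": score, "reason": reply.get("reason", "")})
    return matches
```

The broad `except` is deliberate here: for this loop every failure mode has the same correct response, which is to move on to the next posting.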

The unglamorous bits that make it trustworthy

One bad source can’t kill the run — every fetch is wrapped; failures become a ⚠️ N sources failing footer so a company quietly changing ATS is visible, not invisible.
A prime run seeds the table silently the first time, so I’m not buried under every currently-open role on day one.
Everything tunable lives in one Config node — companies, keywords, location lists, the profile, the model — so adding a company is a one-line edit, not a graph safari.
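The first item in that list, sketched with a hypothetical `fetch` callable standing in for one board request. The footer text follows the "N sources failing" wording above; everything else is an illustrative assumption.

```python
def fetch_all(sources, fetch):
    """Fetch every source independently and collect failures instead of raising."""
    jobs, failing = [], []
    for name in sources:
        try:
            jobs.extend(fetch(name))
        except Exception:
            failing.append(name)  # remembered for the footer, never fatal
    footer = f"⚠️ {len(failing)} sources failing" if failing else ""
    return jobs, failing, footer
```

Returning the failing names alongside the footer keeps the email short while still leaving a trail to the company that changed ATS.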

Takeaways

The “scrape job boards” problem mostly isn’t a scraping problem — it’s three public APIs and a normalizer.
For personal automation, reach for the boring-but-correct primitive: native dedup state beats a database you have to operate.
An LLM works best here as the bar at the door: cheap deterministic filters keep the candidate set (and the cost) small, then the model gates on real fit — which is what lets you cast a wide net without drowning in it.

Workflow JSON, the full node-by-node breakdown, and setup notes: github.com/janos-gyorgy/ats-job-poller.