Nvidia

🍵 I A/B-Tested Cloud vs Local LLMs in One n8n Agent. The Local One Faked It.

I built an AI agent in self-hosted n8n over my kombucha-tracking app, then gave it two brains — NVIDIA’s 70B and a local Phi-3.5 — sharing the same tools. The cloud model called the tools and answered from real data. The local one couldn’t, so it made things up.

the little robot stands guard at a doorway like a friendly bouncer, holding up a hand to check a stack of papers, while the boy in the cap watches; a shield symbol floats above them, protective and watchful

🔒 Building a PII Guardrail Proxy for Cloud LLM Calls

A local model classifies every prompt before it leaves the cluster. If it’s sensitive, it’s blocked. If it’s clean, it goes to NVIDIA NIM. 150 lines of FastAPI, deployed on k3s.