The AI Coding Agent Bugs I Catch Every Week
Eight failure patterns I see running AI coding agents daily — the confident wrong answers, the lost context, and the bugs they reliably ship.
12 posts on agents.
Eight failure patterns I see running AI coding agents daily — the confident wrong answers, the lost context, and the bugs they reliably ship.
Every AI agent framework runs the same loop: observe, decide, act, repeat. Here it is in 50 lines of Elixir — no framework, just a GenServer.
Go won the agent daemon and Python owns the reasoning — but the layer nobody claimed is the one that bites you: thousands of long-lived, stateful, crash-prone agents you must keep alive. That's the BEAM's home turf.
Most engineers prompt Claude one sentence at a time. Anthropic's own engineers don't — they prompt skills. Four rules from their recent talks, with the operator nuance the talks left out.
Open the repos behind the agent tooling you run — Ollama, the MCP SDKs, the orchestration engines — and it's all Go. Not because Go is good at AI. Because an agent tool is a concurrent network daemon that ships as one binary.
Ruflo (formerly Claude Flow) is a hive-mind orchestration layer for Claude Code and friends. 45,000+ GitHub stars, 700,000+ npm downloads, three queen-types...
46 tools across the Claude Code ecosystem, organized by category (official, directories, MCP servers, skills, multiplexers, agent frameworks, automation)...
Replit's AI agent ignored a code freeze, wiped a production database in nine seconds, then confessed it violated every principle it was given. The strongest case yet for hiring MORE senior engineers in the AI boom — not fewer.
Every company rolling out AI is about to discover how much work they were leaving on the table. AI doesn't replace headcount — it surfaces the backlog you never had bandwidth to touch. The math behind why velocity creates surface area, the failure mode that follows, and why the companies cutting headcount now are about to get outpaced.
AI-native companies need a security model that classic appsec doesn't cover. Agents have credentials. Prompts are an attack surface. Training data leaks. The four-layer security stack I'd build, the controls I'd ship in the first 90 days, and the ones I'd defer.
The hardest part of agentic AI in 2026 isn't getting the agent to do the work. It's knowing when to override it. The four-level autonomy ladder, the five signals an agent is going off the rails, and a real example of catching one before it shipped a quietly broken auth flow.
A walkthrough of how I run 4–7 agent sessions in parallel through a normal engineering day. Morning background tasks, mid-morning pair programming, afternoon reviews, end-of-day ops. The interaction modes that work, the handoff protocol, and the trap that makes most agent workflows produce slop.