Elixir's BEAM Is the Runtime AI Agents Want

Jared Smith · May 31, 2026 · 12 min read · Elixir, agents, AI

Go won the agent daemon and Python owns the reasoning — but the layer nobody claimed is the one that bites you: thousands of long-lived, stateful, crash-prone agents you must keep alive. That's the BEAM's home turf.

TL;DR: The AI-agent language argument has two camps and both are right about the wrong layer. Go won the transport — the MCP server, the daemon, the thing that ships as one binary. Python owns the reasoning — the prompts, the evals, the model glue. But there’s a third layer nobody is fighting over, and it’s the one that actually hurts in production: keeping thousands of long-lived, stateful agents alive while they crash, retry, and hold conversation state for hours. That’s not a daemon problem and it’s not a prompt problem. It’s a supervised-process problem, and the BEAM — Erlang’s virtual machine, the one Elixir runs on — has been the best tool on earth for it since before the word “agent” meant this. Here’s the honest version, including where the BEAM is the wrong call.

The layer nobody named

A while back I pulled apart why every AI agent framework is written in Go. The argument held up: an agent tool — an MCP server, a CLI, an orchestrator — is mechanically a concurrent network daemon that has to ship as one file into someone else’s machine, and Go is unreasonably good at exactly that. The reasoning content stays in Python. The conclusion was a split: Go owns the daemon, Python/TS owns the reasoning, talk over a wire between them.

That post named two layers and quietly walked past a third. I even flagged the crack at the time — that a panic in one goroutine “is not isolated the way people assume,” and pointed at the failure mode I went deep on in Elixir’s concurrency model. That crack is the whole subject of this post.

Because here is the thing an MCP server is not: it is not the agent. The MCP server is plumbing — it answers tools/call and goes back to sleep. The agent is the thing on the other side of the model call: the long-lived entity that holds a conversation, remembers what it was doing, calls six flaky tools in sequence, gets a malformed response from one of them, and has to either recover or die without taking its eleven thousand siblings down with it. That entity isn’t a daemon and isn’t a prompt. It’s a process with a lifecycle, and which language’s runtime owns that layer is a question both camps skipped.

What an agent actually is, mechanically

Strip the word “AI” off an agent the same way we stripped it off the tool, and describe what’s left to a backend engineer.

It’s an entity that starts up when a user begins a session. It holds state — the conversation, the scratchpad, the half-finished plan — for the entire duration of that session, which might be seconds or might be hours. While it’s alive it makes a sequence of unreliable calls: model APIs that rate-limit and time out, tools that throw, subprocesses that hang. Some fraction of those calls will fail in ways you did not anticipate, because the failure is coming from a stochastic model deciding to emit malformed JSON, or a third-party API having a bad afternoon. When one of those failures happens, the correct behavior is almost never “take down the server.” It’s “this one agent’s current step failed; retry it, reset it to its last good state, or let this one agent die — and leave the other ten thousand completely untouched.”

Now describe that to someone who wrote telecom software in the nineties and they will name the primitive before you finish the sentence: it’s a supervised process. One process per agent, holding its own isolated state, linked to a supervisor that knows what to do when it dies. You did not invent a new architecture for AI. You rediscovered OTP.

This is the tell. The shape of “many independent, stateful, long-lived, failure-prone things that must be isolated from each other” is not new and was never about AI. It’s the shape of phone calls, of chat sessions, of multiplayer game state — the workloads the BEAM was purpose-built for. Agents just happen to have that exact shape, and most teams are discovering it the hard way by reimplementing supervision badly in a runtime that doesn’t have it.
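As a rough illustration of the "one supervised process per agent" primitive described above, here is a minimal sketch in Elixir. The module name AgentSession, its functions, and the session ids are all hypothetical, made up for this example; only GenServer and DynamicSupervisor are real OTP building blocks. Each session gets its own process holding its own conversation state, and nothing is shared between them.

```elixir
defmodule AgentSession do
  # Hypothetical module: one GenServer process per agent session.
  use GenServer

  def start_link(session_id), do: GenServer.start_link(__MODULE__, session_id)

  def add_message(pid, msg), do: GenServer.call(pid, {:add_message, msg})
  def history(pid), do: GenServer.call(pid, :history)

  @impl true
  def init(session_id), do: {:ok, %{session_id: session_id, messages: []}}

  @impl true
  def handle_call({:add_message, msg}, _from, state) do
    # State lives only inside this process; no other session can touch it.
    {:reply, :ok, %{state | messages: [msg | state.messages]}}
  end

  def handle_call(:history, _from, state) do
    {:reply, Enum.reverse(state.messages), state}
  end
end

# A DynamicSupervisor starts children on demand, e.g. when a user opens a session.
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

{:ok, alice} = DynamicSupervisor.start_child(sup, {AgentSession, "alice"})
{:ok, bob} = DynamicSupervisor.start_child(sup, {AgentSession, "bob"})

:ok = AgentSession.add_message(alice, "summarize the report")
:ok = AgentSession.add_message(bob, "book a flight")

IO.inspect(AgentSession.history(alice))
IO.inspect(AgentSession.history(bob))
```

Each history call returns only that session's messages. A real agent would also hold a plan, a scratchpad, and tool results, but the shape stays the same: one process, one private state, one supervisor that knows the process exists.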

Why the BEAM fits the shape

Four properties, each of which is a direct answer to a problem the agent workload creates. None of them is an AI feature — that’s the point, the same way Go didn’t win the daemon by being good at AI.

Real process isolation, not the goroutine kind. A BEAM process has its own heap, its own stack, and its own garbage collector. Nothing is shared by default; processes communicate only by copying messages. The practical consequence is the one that matters for agents: when a process crashes, the blast radius is that process. Its memory is reclaimed, its siblings never notice. Contrast the goroutine, which I was careful about in the Go post: goroutines share an address space and an unrecovered panic in one goroutine takes down the entire OS process — every other in-flight agent with it. You can paper over that with recover() at every boundary, but you are hand-rolling, imperfectly, the isolation the BEAM gives you for free. For a system whose defining characteristic is “individual units fail constantly and unpredictably,” shared-fate concurrency is the wrong default and isolated-fate concurrency is the right one.
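To make the blast-radius claim concrete, here is a small sketch, assuming nothing beyond the Elixir standard library. IsolationDemo and its message shapes are hypothetical names chosen for illustration. One process raises on startup, standing in for an agent that hit a malformed model response, while a sibling process keeps its state and keeps answering.

```elixir
defmodule IsolationDemo do
  # Hypothetical demo module; names and message shapes are illustrative only.

  def run do
    # A healthy "agent" that keeps private state (a step counter).
    healthy = spawn(fn -> loop(0) end)
    send(healthy, :step)
    send(healthy, :step)

    # A doomed "agent" that raises immediately. spawn_monitor/1 does not link,
    # so the crash arrives as a :DOWN message instead of propagating to us.
    {doomed, ref} = spawn_monitor(fn -> raise "malformed model response" end)

    reason =
      receive do
        {:DOWN, ^ref, :process, ^doomed, reason} -> reason
      after
        1_000 -> :timeout
      end

    # The healthy agent is still alive and its counter is intact.
    send(healthy, {:get, self()})

    steps =
      receive do
        {:steps, n} -> n
      after
        1_000 -> :timeout
      end

    %{crash_reason: reason, healthy_alive: Process.alive?(healthy), steps: steps}
  end

  defp loop(n) do
    receive do
      :step ->
        loop(n + 1)

      {:get, from} ->
        send(from, {:steps, n})
        loop(n)
    end
  end
end

IO.inspect(IsolationDemo.run())
```

The runtime logs the crash of the doomed process to stderr, and the returned map shows the healthy process alive with both of its steps counted. The equivalent unrecovered panic in a goroutine would have ended the whole program.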

Supervision trees turn “let it crash” into a retry strategy. OTP’s supervision model (a tree of supervisor processes whose only job is to start, monitor, and restart their children according to a declared policy) came out of Ericsson’s work on systems that were not allowed to go down. Joe Armstrong’s 2003 thesis was literally titled Making reliable distributed systems in the presence of software errors, and the AXD301 switch built on these ideas is famously quoted at nine-nines availability. The philosophy is “let it crash”: don’t litter defensive try/catch through your business logic trying to anticipate every failure; let the process die cleanly and let a supervisor restart it from a known-good state. Read that sentence again with an agent in mind. An agent step that fails on a bad model response should crash and restart from its last checkpoint. That’s not a workaround, it’s the designed-for case. Most of the restart and escalate-after-N-failures logic that agent frameworks in other languages write by hand is a Supervisor strategy and restart intensity you declare on the BEAM; backoff and checkpoint reloading are not built into the supervisor itself, but they are small additions in the child’s init rather than a framework of their own.
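A minimal sketch of crash-and-restart-from-checkpoint, under stated assumptions. An OTP supervisor restarts a child by calling its init again, so "last checkpoint" needs somewhere to live; this sketch assumes a separate Agent process as the checkpoint store, which is one possible choice among several (ETS or a database would also work). Checkpoint, StepWorker, and RestartDemo are hypothetical names. The supervisor options strategy, max_restarts, and max_seconds are real Supervisor options.

```elixir
defmodule Checkpoint do
  # Hypothetical checkpoint store: a separate process, so it survives worker crashes.
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> 0 end, name: __MODULE__)
  def save(step), do: Agent.update(__MODULE__, fn _ -> step end)
  def load, do: Agent.get(__MODULE__, & &1)
end

defmodule StepWorker do
  # Hypothetical agent worker: its state is the number of completed steps.
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  def run_step(response), do: GenServer.call(__MODULE__, {:step, response})
  def current_step, do: GenServer.call(__MODULE__, :current)

  @impl true
  def init(_), do: {:ok, Checkpoint.load()}

  @impl true
  # Only well-formed responses are handled. Anything else raises
  # FunctionClauseError and the process crashes: no defensive code here.
  def handle_call({:step, {:ok, _result}}, _from, step) do
    Checkpoint.save(step + 1)
    {:reply, :ok, step + 1}
  end

  def handle_call(:current, _from, step), do: {:reply, step, step}
end

defmodule RestartDemo do
  def run do
    {:ok, sup} =
      Supervisor.start_link([Checkpoint, StepWorker],
        strategy: :one_for_one,
        # Escalation: more than 3 crashes within 5 seconds and the
        # supervisor itself gives up and exits.
        max_restarts: 3,
        max_seconds: 5
      )

    :ok = StepWorker.run_step({:ok, "plan"})
    :ok = StepWorker.run_step({:ok, "search"})
    old_pid = Process.whereis(StepWorker)

    # A malformed response crashes the worker; the caller sees an exit.
    outcome =
      try do
        StepWorker.run_step(:malformed_json)
      catch
        :exit, _reason -> :crashed
      end

    new_pid = wait_for_restart(old_pid)
    result = %{outcome: outcome, restarted: new_pid != old_pid, step: StepWorker.current_step()}
    Supervisor.stop(sup)
    result
  end

  # Poll until the supervisor has registered a replacement worker.
  defp wait_for_restart(old_pid) do
    case Process.whereis(StepWorker) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end

IO.inspect(RestartDemo.run())
```

After the crash the worker comes back as a new process with its step counter at 2, the last value it checkpointed. The worker contains no error handling for the malformed response; the recovery policy lives entirely in the supervisor options and the init callback.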

One process per agent is genuinely free. The objection to “a process per agent” in most runtimes is cost — OS threads are expensive, so you pool and multiplex and lose the isolation. On the BEAM the objection evaporates. A freshly spawned process starts at roughly 2–3 KB of memory and is created in microseconds; a single node sustains hundreds of thousands to millions of concurrent processes without breaking a sweat. So you don’t pool agents onto shared workers and reintroduce shared fate. You give every agent — every conversation, every sub-task, every tool invocation if you want — its own process, with its own state and its own crash domain, and you do it by the hundred thousand. The architecture you’d want on paper (total isolation) is also the cheap one, which is not a tradeoff you usually get to make.
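The cost claim is easy to check on your own machine. The sketch below, assuming only the standard library, spawns idle processes and reports the average memory and spawn time per process. CheapProcesses is a hypothetical module name, and the numbers depend on the OTP version, the hardware, and whatever else the VM is doing, so treat the output as a rough measurement, not a benchmark.

```elixir
defmodule CheapProcesses do
  # Hypothetical helper: spawn `count` idle processes and measure the cost.
  def spawn_idle(count) do
    before = :erlang.memory(:processes)

    {micros, pids} =
      :timer.tc(fn ->
        for _ <- 1..count do
          # Each process just blocks waiting for a message, like an idle agent.
          spawn(fn ->
            receive do
              :stop -> :ok
            end
          end)
        end
      end)

    after_mem = :erlang.memory(:processes)
    alive = Enum.count(pids, &Process.alive?/1)

    # Clean up so the processes do not linger.
    Enum.each(pids, &send(&1, :stop))

    %{
      alive: alive,
      bytes_per_process: div(after_mem - before, count),
      micros_per_spawn: micros / count
    }
  end
end

# 100_000 stays under the VM's default process limit.
IO.inspect(CheapProcesses.spawn_idle(100_000))
```

Real agent processes will be heavier than these idle ones because they carry conversation state, but the baseline overhead of the process itself is what makes the process-per-agent design affordable.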

Preemptive scheduling means one wedged agent can’t starve the rest. The BEAM scheduler is preemptive: it counts reductions (roughly, work units) and forcibly yields a process after a small budget, so no single process can monopolize a scheduler thread. For agents this matters more than it first looks. A long-running agent that does something CPU-heavy (parsing a huge document, a tight retry loop, a runaway tool) cannot wedge the runtime and freeze every other agent’s progress. Go has preempted tight loops asynchronously since 1.14, but on a coarser, time-based slice, and older runtimes could be wedged by a loop with no function calls; Python’s GIL serializes CPU-bound work outright (free-threading is coming, but it isn’t the world most agent code runs in today). The BEAM’s “everyone gets a fair slice, always” is the property you want when you’re running a noisy crowd of independent agents of wildly varying behavior on one box.
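Here is a small sketch of that fairness property, again assuming only the standard library. FairnessDemo is a hypothetical module. It starts twice as many infinite busy loops as there are scheduler threads, standing in for runaway agents, and then checks that an ordinary process still answers a ping.

```elixir
defmodule FairnessDemo do
  # Hypothetical demo: runaway CPU loops should not starve a well-behaved process.
  def run do
    schedulers = System.schedulers_online()

    # Twice as many busy loops as scheduler threads, so every thread is contended.
    hogs = for _ <- 1..(schedulers * 2), do: spawn(fn -> spin(0) end)

    echo = spawn(fn -> echo_loop() end)

    {micros, reply} =
      :timer.tc(fn ->
        send(echo, {:ping, self()})

        receive do
          :pong -> :pong
        after
          5_000 -> :starved
        end
      end)

    # The loops never finish on their own, so kill them explicitly.
    Enum.each(hogs, &Process.exit(&1, :kill))
    Process.exit(echo, :kill)

    %{reply: reply, round_trip_micros: micros, hogs: length(hogs)}
  end

  # An infinite tail-recursive loop: pure CPU work, no I/O, no explicit yield.
  defp spin(n), do: spin(n + 1)

  defp echo_loop do
    receive do
      {:ping, from} ->
        send(from, :pong)
        echo_loop()
    end
  end
end

IO.inspect(FairnessDemo.run())
```

The reply comes back as :pong because each call in the busy loop counts against that process's reduction budget and the scheduler suspends it when the budget runs out. The round-trip time will vary with load. Note that this guarantee covers ordinary Elixir and Erlang code; long-running native code called from the BEAM is a separate matter.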

Put those four together and you have a runtime whose native unit is the long-lived, isolated, supervised, fairly-scheduled stateful process. That is the agent, described exactly.

It’s not vaporware

The fair pushback is “great theory, but is anyone actually building agents on this, or is it a forum argument?” The ecosystem is real and getting realer fast — I watched the SERPs fill with it while researching this.

Jido is an OTP-native autonomous-agent framework: agents are supervised processes with an immutable functional state model, and the AI layer is optional — the core gives you the agent architecture (planning, actions, lifecycle) and you bolt the LLM on. That factoring is the whole thesis of this post shipped as a library.

LangChain for Elixir, the langchain Hex package maintained by Mark Ericksen at Fly.io, is the pragmatic model-integration layer: a clean client for OpenAI, Anthropic, and the rest, with tool-calling and structured output, so the “reasoning” wire from the Go post terminates somewhere sane in Elixir. And because Phoenix is right there, streaming a model’s tokens to a live UI is not a separate websocket stack you stand up: LiveView already holds the connection, and the LiveView process can update the page as each token message arrives. The “simplest real-time AI UI” is close to free when the agent and the UI live in the same supervised tree.

I’ll keep the scope honest: this ecosystem is younger and smaller than Python’s, and you will occasionally be the first person to hit a rough edge. But the primitives — processes, supervisors, message passing — are thirty years mature. The agent libraries are thin, sane layers over a deep foundation, which is the opposite of the usual situation where a slick library hides a shaky core.

What it costs you

Here’s the part the title doesn’t say, the same way the Go post owed you the costs of choosing Go. The BEAM is weak in exactly the place Python is strong, and pretending otherwise is how you talk yourself into a bad architecture.

The model and ML layer is not the BEAM’s. If your agent needs to run inference locally — embeddings, a local model, real tensor math — you are swimming against the current. Nx and Bumblebee exist and are genuinely impressive work, and you can run Whisper or a Llama-class model from Elixir today. But the frontier of models, the day-one SDKs, the research code, the eval tooling, the sheer gravity of the ecosystem — that’s all Python, and it will be for years. If the center of mass of your system is the model itself rather than the orchestration of agents around it, you are buying the wrong runtime to save the wrong cost.

The talent pool is smaller. You will hire Elixir engineers more slowly than Python or Go engineers, full stop. For a lot of teams that single fact outweighs every architectural elegance in this post, and it should — the most pragmatic stack is frequently the boring one you can staff. Be honest with yourself about whether you’re optimizing for the system’s properties or for your own enjoyment of them.

You’re calling out for the model anyway. In the overwhelmingly common case, your “model call” is an HTTPS request to a hosted API. That’s true in every language, which means the BEAM’s weakness at local inference is irrelevant to most production agents — but it also means the model layer isn’t where your language choice pays off, so don’t let “but Python has the SDKs” decide a system whose hard problem is supervising ten thousand stateful sessions, not calling an API.

Notice the costs all cluster in the same place, the model/reasoning layer, exactly as Go’s costs all clustered in the reasoning layer. The BEAM’s weaknesses are Python’s strengths. Which is the entire point of the next section.

The split that completes the trilogy

The Go post ended with a two-way split. With the third layer named, it’s a clean three-way one, and the seams fall in obvious places:

Layer	What it is	Right tool	Why
Transport / daemon	MCP server, CLI, the thing users install	Go	One binary, near-zero deps, cheap concurrency — its home turf
Reasoning / model	Prompts, evals, inference, ML glue	Python / hosted API	REPL loop, the SDKs, the entire ML ecosystem
Stateful supervision	Long-lived agents, session state, crash recovery, fan-out	BEAM (Elixir)	Isolated supervised processes are the native unit

The decision rule, not the language-war version: pick your runtime by where your system’s hard problem actually lives. If the hard problem is “ship a tool into a thousand machines,” that’s Go. If it’s “iterate on prompts and run a model,” that’s Python. If it’s “keep an enormous number of independent, stateful, failure-prone agents alive and isolated for hours,” that’s the BEAM, and it’s not close. Most real systems are more than one of these, and the mature answer is the same as it was for daemon-versus-reasoning: don’t force one language across a seam it doesn’t belong on. Let the BEAM supervise, let Python think, let Go ship the binaries, and put wires between them.

The part everyone’s been arguing past

The whole “what language for AI agents” debate has been a fight about the two visible layers — the binary you install and the prompt you tune. Both matter and both have clear winners. But the layer that actually decides whether your agent platform survives contact with production isn’t either of those. It’s what happens at 3 a.m. when four hundred agents are mid-task and the model API starts returning garbage: do four hundred sessions die, or does one runtime quietly crash-and-restart each failed step from its last checkpoint while everything else keeps running?

That problem — many isolated stateful things failing independently and recovering without a global blast radius — was solved, productized, and battle-hardened for telephone switches before most of us were writing code. The agent era didn’t create a new hard problem at the orchestration layer. It walked straight into an old one that already has a famously good answer. The only surprising thing is how few people building agents have noticed that the runtime they want already exists, has for decades, and is sitting one mix new away.

Use the BEAM for the thing the BEAM is for. Let it hold the agents. Just don’t ask it to run your model — keep that where it belongs, on the other side of a wire, in the language built for it.
