Cloudflare Dynamic Workers: Fast Sandboxes for AI Agents
Cloudflare released Dynamic Workers, isolated environments for running AI agent code. 5 ms startup, 100x faster than containers.

Cloudflare recently released Dynamic Workers into open beta: fast sandboxes for AI agents. The company claims they run 100x faster than traditional containers when executing AI-generated code. That's a serious claim. Let's see what's behind it.
Containers Are Too Heavy for AI Agents
Modern AI agents don't just answer questions; they can write and execute code on the fly. An agent might generate a small script to call an API, process some data, or complete a task. That code needs to run somewhere. For security reasons, it's best to run it in a sandbox, isolated from the rest of the system.
The standard approach today is to spin up a container for each task. Containers work, but they come with problems:
- Cold start takes hundreds of milliseconds.
- Each container consumes a lot of memory.
- To avoid delays, you keep containers warm, but then you’re paying for them even when they’re idle.
- To save money, you reuse containers across tasks, which weakens isolation and creates security issues.
When your system has a few dozen users, this is usually not a problem. But if you’re building a product where millions of requests simultaneously trigger AI agents, containers become a bottleneck.
V8 Isolates Instead of Containers
Cloudflare Dynamic Workers skip containers entirely. Instead, they use V8 isolates: lightweight execution contexts inside the same JavaScript engine that powers Google Chrome and the entire Cloudflare Workers platform (which has been running on this technology for eight years).
Some numbers to put things in perspective:
| | Container | Dynamic Worker |
|---|---|---|
| Startup time | ~500 ms | ~5 ms |
| Memory usage | Hundreds of MB | A few MB |
| Concurrency limits | Often capped | Unlimited |
| Location | Separate machine | Same machine, same thread |
A Dynamic Worker spins up in milliseconds, runs the code, returns the result, and disappears. No warm pools, no idle costs, no reuse compromises.
Here’s a simple analogy: spinning up a full container for a quick AI task is like renting a five-room apartment to make a single phone call. A Dynamic Worker is the phone booth. You step in, make the call, step out, done.
How It Works
You have a parent Worker (your main application) that receives requests. When it needs to execute AI-generated code, it calls the Dynamic Worker Loader API, which creates an isolated environment on the fly.
Say you’re building a customer support chatbot. A customer asks a complex question that requires calling multiple APIs. The LLM generates a short TypeScript function, and the parent Worker runs it in a sandbox:
```ts
// Parent Worker receives a customer query.
// `llmGeneratedCode` is a string holding the module the LLM just produced.
const worker = env.LOADER.load({
  compatibilityDate: "2026-03-01",
  mainModule: "agent.js",
  modules: { "agent.js": llmGeneratedCode },
  // Give the sandbox access only to the order API
  env: { ORDER_API: env.ORDER_SERVICE },
  // Block all other internet access
  globalOutbound: null,
});

// Execute and get the result
const result = await worker.getEntrypoint().handleQuery(customerId);
```
The sandbox can only access what you explicitly pass to it. No internet access, no database credentials, no way to reach anything outside the defined interface. Even if someone tricks the LLM into generating malicious code, the sandbox limits what it can do.
Why Generate Code at All?
For a simple query like “where’s my order?” you could write a handler in advance. Pre-written code is faster, cheaper, and far more predictable. So when does dynamic code generation actually make sense? Here are a few examples:
Unpredictable queries. A customer asks: “I ordered a blue jacket on Tuesday, then changed it to black and also added a scarf, can you check if those changes went through and tell me when everything arrives?” No pre-built handler covers that combination. But an LLM can figure out the right sequence of API calls and compose them on the spot.
Large API surfaces. If your platform has 200+ endpoints covering orders, returns, payments, subscriptions, and loyalty programs, maintaining pre-built handlers for every possible query is a huge amount of engineering work. With code generation, you just describe the typed API and the LLM assembles the right calls on demand.
Multi-tenant logic. Different clients have different rules. One retailer accepts returns for 30 days, another for 14, a third only for certain categories. Instead of building a complex rules engine that covers every variation, the LLM generates code that applies the correct rules for each context.
Changing integrations. You add a new shipping carrier or payment provider. With pre-written handlers, that means updating code, testing, deploying. With code generation, you update the TypeScript API definition and the LLM picks up the changes immediately.
In practice, most teams would use both approaches: common queries are handled by pre-written code, while complex or unusual queries get routed to the LLM. You only pay for code generation when you actually need it.
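This hybrid routing can be sketched in a few lines. Everything here is hypothetical (the handler table, the `generateAndRunCode` stand-in); it only illustrates the dispatch logic, not any Cloudflare API:

```ts
type Handler = (query: string) => string;

// Pre-written handlers for the common, predictable queries.
const handlers: Record<string, Handler> = {
  order_status: (q) => `Looked up order status for: ${q}`,
  return_policy: () => "Returns are accepted within 30 days.",
};

// Stand-in for the expensive path: ask the LLM to generate code,
// then run it in a sandbox. Here it just returns a marker string.
function generateAndRunCode(query: string): string {
  return `LLM-generated handler ran for: ${query}`;
}

function route(intent: string, query: string): string {
  const handler = handlers[intent];
  // Fast, cheap, predictable path when a pre-written handler exists...
  if (handler) return handler(query);
  // ...and the flexible, pay-per-use path when it doesn't.
  return generateAndRunCode(query);
}
```

The key property is that code generation sits behind the fallback branch, so its cost is only incurred for queries no pre-written handler can serve.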
The “Code Mode” Idea
Dynamic Workers are part of a broader concept that Cloudflare calls “Code Mode.” Instead of having an LLM make many sequential tool calls (call API A, wait, call API B, wait, call API C), the agent writes a single script that chains everything together, executes it in one pass, and returns only the final result.
Cloudflare claims this approach cuts token usage by 81% compared to sequential tool calls. Their own MCP server works exactly this way: it exposes the entire Cloudflare API through just two tools (search and execute) in under 1,000 tokens. The agent writes TypeScript against a typed API, runs it in a Dynamic Worker, and only the result goes back to the context window. Intermediate steps never reach the LLM.
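As a rough sketch of the difference, here is a hypothetical generated script that chains two API calls locally and returns only the final answer. The `orderApi` object is a made-up stand-in for the typed RPC stubs an agent would actually receive:

```ts
// Stand-in for the typed API surface the agent writes against.
const orderApi = {
  async getOrder(id: string) {
    return { id, itemIds: ["jacket-black", "scarf"] };
  },
  async getShippingEstimate(itemId: string) {
    return { itemId, days: itemId === "scarf" ? 2 : 5 };
  },
};

// Instead of several round trips through the LLM (get order, then one
// estimate per item), the generated script chains everything locally.
async function agentScript(orderId: string): Promise<number> {
  const order = await orderApi.getOrder(orderId);
  const estimates = await Promise.all(
    order.itemIds.map((id) => orderApi.getShippingEstimate(id))
  );
  // Only this final number goes back to the context window;
  // the intermediate responses never reach the LLM.
  return Math.max(...estimates.map((e) => e.days));
}
```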
TypeScript Over OpenAPI
Cloudflare uses TypeScript interfaces instead of OpenAPI specs to describe the APIs available to agents.
A chat room API as a TypeScript interface takes about 15 lines. The equivalent OpenAPI spec runs to over 60 lines of YAML. TypeScript is more compact, easier to read for both humans and LLMs, and burns fewer tokens per API call. If you have millions of agent interactions, that difference becomes very noticeable on the bill.
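To illustrate the compactness argument, here is a hypothetical chat-room API as a TypeScript interface (the method names are invented; Cloudflare's actual example may differ). The interface plus a toy implementation still fits in far less space than the equivalent YAML spec:

```ts
// The entire API surface an agent needs to see, in a few lines.
interface ChatRoom {
  sendMessage(user: string, text: string): Promise<void>;
  getHistory(limit?: number): Promise<{ user: string; text: string }[]>;
  listMembers(): Promise<string[]>;
}

// A minimal in-memory implementation, just to show the shape compiles.
class InMemoryRoom implements ChatRoom {
  private messages: { user: string; text: string }[] = [];
  async sendMessage(user: string, text: string) {
    this.messages.push({ user, text });
  }
  async getHistory(limit = 50) {
    return this.messages.slice(-limit);
  }
  async listMembers() {
    return Array.from(new Set(this.messages.map((m) => m.user)));
  }
}
```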
Companion Libraries
Alongside the runtime, Cloudflare released three npm packages:
- `@cloudflare/codemode` wraps sandbox creation and MCP server integration; it can take an existing MCP server and collapse all its tools into a single `code()` tool.
- `@cloudflare/worker-bundler` resolves npm dependencies at runtime, so agent-generated code can import third-party libraries like Hono or the Stripe SDK without any pre-configuration.
- `@cloudflare/shell` provides a virtual filesystem inside a Dynamic Worker (backed by SQLite and R2) with typed methods for file operations. The filesystem supports transactional batch writes: if any operation in a batch fails, everything rolls back.
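The rollback behavior can be illustrated with a toy stage-and-commit sketch. This is not the `@cloudflare/shell` API, just the all-or-nothing semantics it describes:

```ts
type WriteOp = { path: string; data: string };

// Stage every write against a copy; return the new state only if all
// operations succeed, so a failure can never leave partial writes.
function applyBatch(fs: Map<string, string>, ops: WriteOp[]): Map<string, string> {
  const staged = new Map(fs);
  for (const op of ops) {
    if (op.path.startsWith("/")) {
      staged.set(op.path, op.data);
    } else {
      // Any failure abandons the staged copy: the original state is untouched.
      throw new Error(`invalid path: ${op.path}`);
    }
  }
  return staged;
}
```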
The Security Model
Cloudflare is quite open about where they had to make compromises. V8 security bugs are more common than hypervisor vulnerabilities. So their defense strategy has multiple layers:
- V8 security patches reach production within hours (faster than Chrome itself).
- A custom second-layer sandbox with dynamic tenant separation based on risk level.
- Hardware memory protection via MPK (Memory Protection Keys).
- Spectre defenses developed in collaboration with researchers.
- Automated code scanning that blocks malicious patterns.
The short lifespan of isolates also helps. Since they’re cheap to create and destroy on every request, there’s no temptation to reuse them across tasks, which is a common security weakness with warm container pools.
Limitations
For now, Dynamic Workers only work effectively with JavaScript. Python and WebAssembly are technically supported, but they’re significantly slower for short code snippets. A large part of the AI ecosystem runs on Python, especially data science and ML pipelines. For those use cases, this isn’t a fit.
Also, the code execution layer is tied to Cloudflare. Your databases, APIs, and LLM providers can live anywhere, but the sandbox itself runs only on Cloudflare’s infrastructure. So you’ll need a paid Workers plan to use them.
Pricing
Dynamic Workers cost $0.002 per day for each unique loaded Worker, plus standard CPU and invocation charges. During the beta, the per-Worker fee is waived, so you can experiment freely. I’d recommend starting with the documentation and getting started guide.
For one-off code generation, the execution cost is negligible compared to the inference cost of generating the code itself. An LLM call might cost a few cents. Sandbox execution costs a fraction of a cent.
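A quick back-of-envelope calculation makes the point. The $0.02 per LLM call is an assumed average, and the per-Worker fee is currently waived during the beta:

```ts
const WORKER_FEE_PER_DAY = 0.002; // USD per unique loaded Worker (article's figure)
const LLM_COST_PER_CALL = 0.02;   // USD, assumed average inference cost

function dailyCost(uniqueWorkers: number, llmCalls: number): number {
  return uniqueWorkers * WORKER_FEE_PER_DAY + llmCalls * LLM_COST_PER_CALL;
}

// At 1,000 unique sandboxes and 1,000 generations per day, the sandbox
// fee ($2) is a tenth of the inference bill ($20).
```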
Who Actually Needs This?
Not everyone.
High-volume consumer products are the primary audience. E-commerce platforms, consumer SaaS, messaging apps, large marketplaces, fintech apps with millions of users. The pattern: lots of small, independent AI tasks triggered by users at the same time.
Multi-tenant platforms are the secondary audience. Companies that let their customers build custom automations. Even if the load from each individual client is modest, thousands of tenants running their own custom code add up to serious demand. Containers get expensive at that scale. Isolates don’t.
Enterprise AI tools with a manageable number of users? Probably not the target audience. If your concurrent agent executions are measured in hundreds rather than hundreds of thousands, the difference between 5 ms and 500 ms cold start doesn’t matter. A container, a serverless function, or even running the logic directly in your workflow engine will work just fine without Dynamic Workers.
The Bigger Picture
The AI infrastructure market is splitting along workload lines. Some vendors are building long-lived agent environments with persistent memory and stateful execution. Cloudflare is moving in the opposite direction: disposable execution that appears instantly and vanishes after use.
I think both approaches will coexist, but Cloudflare is making a specific bet: that the number of companies needing millions of concurrent, short-lived agent executions will keep growing. If they’re right, Dynamic Workers could quietly become a standard building block for the next wave of AI products.
And even if not, it’s still a beautiful piece of technology looking for a wider audience.