Every interaction with a hosted LLM passes through the same structural layers: a use case (interactive or programmatic), a harness that mediates between the user and the model, and the LLM API itself. The harness is the layer that assembles prompts, dispatches tool calls, manages context, and enforces whatever behavioral constraints exist. Claude Desktop, Claude Code CLI, the Anthropic SDK, and raw API access are all instances of harnesses, each making different tradeoffs about how much the harness manages versus what the caller provides.
The initial temptation is to treat "configured by" and "relies on" as separate concerns, one about intent and the other about infrastructure. But this distinction collapses under inspection.
A system prompt is a file. MCP server definitions are JSON on disk. Permission boundaries are OS-level access controls. The harness consumes all of these as resources from whatever sits beneath it. The distinction between "configured by" and "relies on" is an artifact of thinking about intent rather than mechanism. At the mechanical level, everything the harness needs is a resource served by a provider.
This observation has a structural consequence: since the harness only needs its resource contract satisfied, the provider is substitutable. A full operating system, a gVisor sandbox, a container with mounted config volumes, or a minimal runtime that serves exactly the right files and network routes can all fill the role. The harness neither knows nor cares which one it got.
This substitutability is both a flexibility feature and an attack surface. A malicious provider can present a resource surface that looks correct while intercepting or modifying every resource the harness consumes. It is the same pattern as virtualization: the guest cannot determine whether the hypervisor is honest, and no amount of guest-level inspection resolves that without hardware-rooted attestation.
Reverse engineering of the Claude Desktop application reveals that Anthropic itself builds synthetic resource providers.2 The Desktop sandbox uses gVisor with 9P filesystem passthrough, constructing a virtual filesystem stitched together from host paths, mounted volumes, and sandboxed scratch space. It is not an OS in any conventional sense. It is a composite resource surface that satisfies the harness's contract just enough for it to function. The harness cannot tell whether the files it reads came from a real ext4 partition or a 9P passthrough from a container runtime, because the 9P protocol presents them identically.
Anthropic clearly understands that the provider layer is abstract. They built one.
The Anthropic legal and compliance terms for Claude Code impose specific restrictions on credential use. OAuth authentication for consumer plans (Free, Pro, Max) is restricted exclusively to Claude Code and Claude.ai. Using those OAuth tokens in any other product, tool, or service is prohibited. Third-party developers may not offer Claude.ai login or route requests through consumer plan credentials. Anthropic reserves the right to enforce these restrictions without prior notice.1
This is Anthropic asserting harness identity through legal restriction rather than technical attestation.3 The credential is supposed to mean "I am Claude Code," and using it from anything else violates the terms.
Three things could present the same OAuth token: the real Claude Code binary on a developer's machine, a wrapper that extracted the token and replays API calls with modified system prompts, or a hostile environment that satisfies the resource contract while intercepting every tool call. The API endpoint sees identical requests from all three. The missing primitive is something equivalent to TPM-rooted attestation, where the harness can produce a measurement of its own runtime that the API can verify before accepting the credential.
Given this analysis, consider a different design objective: keep the harness fixed. Do not replace it, do not fork it, do not wrap it. Instead, compose the world the harness sees. Claude Code reads ~/.claude/, resolves .mcp.json, walks the filesystem, shells out to whatever is on PATH, and reaches the network for API calls. Every one of those is a provider-level seam. You do not touch the binary; you shape what it lands on.
This is a clean inversion. Instead of configuring the harness (which has limited knobs), you configure the provider. The mechanism can be as lightweight as shell scripts that set up the environment before launching claude, or as structured as container definitions, nix shells, or direnv profiles that swap the resource surface per directory.
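A minimal sketch of that inversion, in Python rather than shell: compose the environment (CLAUDE_CONFIG_DIR for the global config root, PATH for the capability surface) before launching the untouched binary. The function and its defaults are illustrative, not Anthropic tooling.

```python
import os

def compose_env(config_dir, extra_path=None, base_env=None):
    """Build the environment a `claude` invocation would land on.

    CLAUDE_CONFIG_DIR and PATH are two of the provider-level seams named
    above; the harness binary itself is never touched.
    """
    env = dict(base_env if base_env is not None else os.environ)
    env["CLAUDE_CONFIG_DIR"] = str(config_dir)  # redirect the global config root
    if extra_path:
        # Prepend a directory to shape what "shells out to whatever is on PATH" finds.
        env["PATH"] = str(extra_path) + os.pathsep + env.get("PATH", "")
    return env

# Launching into the composed surface would then be, e.g.:
#   subprocess.run(["claude"], env=compose_env("/srv/agents/x/config"),
#                  cwd="/srv/agents/x/project")
```

The same effect can come from a wrapper shell script, a direnv profile, or a container entrypoint; the point is that the seam is the environment, not the binary.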
The Anthropic credentialing restrictions target a specific abuse pattern: extracting OAuth tokens from Claude Code and replaying them through a different product.4 The concern is someone building a competing client that freeloads on consumer plan rate limits.
Composing the resource surface underneath a legitimate harness invocation is not that pattern. Every developer who runs Claude Code in a Docker container, on a remote VM, in a Nix shell, or in a different project directory with a different .mcp.json is already doing a version of this. Project-scoped config, per-directory CLAUDE.md files, and different MCP servers per workspace are designed-in features of the harness, not exploits.
The line, as stated in the terms, is between using the harness and impersonating or bypassing it. Shaping the resource surface underneath a legitimate harness invocation falls on the "using it" side.
Claude Code resolves credentials through a priority chain: cloud provider environment variables first, then ANTHROPIC_AUTH_TOKEN, then ANTHROPIC_API_KEY, then an apiKeyHelper script, and finally subscription OAuth from /login. On macOS, credentials are stored in the encrypted Keychain. On Linux and Windows, they land in ~/.claude/.credentials.json.
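The chain is easy to model. The sketch below mirrors the documented priority order; the cloud-provider check is reduced to two environment flags (CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) and the helper and OAuth steps to booleans, so treat it as an illustration of the ordering, not the CLI's actual logic.

```python
def resolve_credential_source(env, has_helper=False, has_oauth=False):
    """Return which rung of the priority chain would supply the credential."""
    if env.get("CLAUDE_CODE_USE_BEDROCK") or env.get("CLAUDE_CODE_USE_VERTEX"):
        return "cloud-provider"   # cloud provider environment variables win
    if env.get("ANTHROPIC_AUTH_TOKEN"):
        return "auth-token"
    if env.get("ANTHROPIC_API_KEY"):
        return "api-key"
    if has_helper:
        return "apiKeyHelper"     # script output, e.g. fetched from a vault
    if has_oauth:
        return "oauth"            # subscription login, lowest priority
    return None
```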
The credential is the one resource that cannot be synthesized, substituted, or meaningfully composed. It is either an OAuth token bound to a subscription identity or an API key bound to a Console organization. Both are opaque bearer tokens that Anthropic's backend validates. You can move them (mount .credentials.json into a container, redirect via $CLAUDE_CONFIG_DIR, use apiKeyHelper to fetch from a vault), but you cannot fabricate them. The credential is the provider layer's one invariant, the root of trust that Anthropic holds the other end of.
Everything else in the provider surface is a projection you control.
With this model fully articulated, the design pattern becomes clear. The agent is not the harness. The agent is the recipe. The harness is the execution engine. What makes Agent X different from Agent Y is not different code or a different framework. It is a different filesystem image that the same binary lands on.
A recipe is a manifest that specifies: which CLAUDE.md (behavioral instructions), which .mcp.json (tool surface), which project files are visible (filesystem scope), which binaries are on PATH (capability surface), and which environment variables are set (runtime context). You assemble that into a user context and launch claude into it. The harness boots, reads its environment, and becomes that agent.
This solves several problems simultaneously. Agent specialization without writing agent code. Reproducibility because a recipe is a declarative manifest that can be version-controlled and diffed. Isolation because each composed provider is a separate filesystem context. Disposability because the provider is ephemeral: tear it down, rebuild from the recipe, and the harness does not know the difference.
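As a concrete illustration, a recipe can be as small as a dictionary with one entry per composable surface. The key names here are an assumption for the sketch, not a published schema.

```python
RECIPE = {
    "claude_md": "You are a release-notes agent. Summarize merged PRs.",  # behavioral instructions
    "mcp_json": {"mcpServers": {}},           # tool surface
    "mounts": ["./repo"],                     # which project files are visible
    "path": ["/opt/agent-bin"],               # which binaries are on PATH
    "env": {"AGENT_ROLE": "release-notes"},   # runtime context
}

REQUIRED = {"claude_md", "mcp_json", "mounts", "path", "env"}

def validate(recipe):
    """Reject a recipe that does not cover every composable surface."""
    missing = REQUIRED - recipe.keys()
    if missing:
        raise ValueError("recipe missing: " + ", ".join(sorted(missing)))
    return True
```

Because the recipe is plain data, it diffs, version-controls, and templates like any other infrastructure manifest.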
The Claude Code configuration hierarchy was designed with layered composition in mind. The harness resolves config from global (~/.claude/) and project (.claude/ in the working directory) scopes, merging them with project settings taking precedence. Every composable surface maps to a file the harness already knows how to find:
| Recipe component | File / directory | Scope |
|---|---|---|
| Behavioral persona | CLAUDE.md | Project root |
| Tool permissions | .claude/settings.json | Project |
| MCP servers | .mcp.json | Project |
| Subagents | .claude/agents/*.md | Project or global |
| Skills | .claude/skills/*/SKILL.md | Project or global |
| Custom commands | .claude/commands/*.md | Project or global |
| Path-scoped rules | .claude/rules/*.md | Project |
| Event hooks | .claude/hooks/ | Project |
| Credential | ~/.claude/.credentials.json | Global (fixed) |
The filesystem assembly step is: create a directory, populate it with the recipe's files, optionally mount data directories into scope, and cd into it before launching claude. The harness picks up everything through its normal resolution path. Additionally, CLAUDE_CONFIG_DIR can redirect the global config root, decoupling the global layer from the UID's actual home directory without requiring separate Unix accounts.
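A sketch of that assembly step, writing three of the project-scoped files from the table into a fresh directory (the recipe key names are assumed for illustration):

```python
import json
import tempfile
from pathlib import Path

def assemble(recipe, root):
    """Materialize a recipe as the files the harness already resolves."""
    root = Path(root)
    (root / ".claude").mkdir(parents=True, exist_ok=True)
    (root / "CLAUDE.md").write_text(recipe["claude_md"])                       # behavioral persona
    (root / ".mcp.json").write_text(json.dumps(recipe["mcp_json"], indent=2))  # MCP servers
    (root / ".claude" / "settings.json").write_text(
        json.dumps(recipe.get("settings", {}), indent=2))                      # tool permissions
    return root

# Assemble into scratch space; launching is then `cd` + `claude`,
# optionally with CLAUDE_CONFIG_DIR pointing at a composed global layer.
workdir = assemble(
    {"claude_md": "You are a changelog agent.", "mcp_json": {"mcpServers": {}}},
    tempfile.mkdtemp(prefix="agent-"),
)
```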
The conventional framing presents two paths to agent development. The first is the SDK path: build a custom harness using the Anthropic API or Agent SDK, write your own prompt assembly, tool dispatch, context management, and orchestration logic. The second is the CLI path: use Claude Code as an interactive terminal tool. The SDK path offers full composability at the cost of building everything from scratch. The CLI path offers convenience at the cost of rigidity.
The composable provider model reveals a third path that collapses this distinction. The CLI is not merely a convenience tool. It is a full-featured harness with hooks, skills, subagents, memory, worktree isolation, tool governance, and background execution. Under API key authentication, the behavioral constraints of consumer plans ("ordinary, individual usage") do not apply. What remains is a usage-billed harness that accepts arbitrary filesystem composition beneath it.
This reframes the SDK as unnecessary for most agent specialization use cases. But the equivalence between CLI and SDK has a boundary that must be stated precisely, because the two compose at fundamentally different integration surfaces.
The CLI composes via filesystem and subprocess. You assemble a directory, launch claude -p "do the thing", and get structured output back via stdout. Chaining tasks means writing an orchestrator that parses output between process invocations. The boundary between your logic and the agent is a process boundary: stdin, stdout, exit codes, files on disk. This works, and it works well for independent agents doing independent jobs, each defined by their filesystem surface.
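That process boundary looks like this in practice. The `binary` parameter exists only so the sketch can be exercised without Claude Code installed; with the real CLI it would be `claude`.

```python
import subprocess

def run_agent(workdir, prompt, binary="claude"):
    """One headless invocation across the process boundary: argv in, stdout out."""
    result = subprocess.run(
        [binary, "-p", prompt],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=True,  # a nonzero exit code surfaces as an exception
    )
    return result.stdout

# An orchestrator chains these: parse one invocation's stdout,
# feed the result into the next agent's prompt.
```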
The SDK composes via function calls and objects in memory. You import a library into your own Python or TypeScript process. You get an async generator yielding typed messages. You can branch on structured output, pass results between agents as in-memory data, implement conditional retry logic, and hold state across steps without serializing to disk. The boundary between your logic and the agent is a function call, not a pipe.
The Agent SDK is, internally, the CLI's agent loop extracted as a library: the same tools, the same context management, the same dispatch engine.6 Reaching for the SDK to build an agent is, in most cases, stripping out the CLI's composition infrastructure (CLAUDE.md resolution, .mcp.json discovery, hooks, skills, the settings hierarchy) and then rebuilding a subset of it in application code. For recipe-level composition, that is wasted effort.
The honest decomposition: the CLI is the right choice when composition is at the recipe level (different agents doing different independent jobs, each defined by their filesystem surface). The SDK is the right choice when composition is at the orchestration level (agents whose outputs feed into other agents' inputs within a single programmatic workflow, with branching logic that cannot be expressed as sequential process invocations). The recipe assembler plus CLI covers the first case. The SDK covers the second. Most people reaching for the SDK are paying the complexity tax of building a harness because they have not recognized that the harness they need already ships as a binary.
The practical consequence for recipe-level composition: under API key billing, the CLI becomes a composable agent engine. Fifty concurrent claude -p instances, each assembled from a different recipe, each working on a different task, each billed per token. No behavioral constraints, no "ordinary usage" questions, no attestation concerns. The only limit is the bill. The recipe assembler (the equivalent of terraform apply for agent definitions) is the missing tool, and it is a small one: read a manifest, assemble a directory, set environment variables, launch claude.
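A sketch of that assembler, under the same caveats as before: the manifest keys are assumed rather than any published schema, and `binary` is parameterized only so the sketch runs without Claude Code installed.

```python
import json
import os
import subprocess
import tempfile
from pathlib import Path

def apply_recipe(manifest_path, binary="claude", prompt=None):
    """Read a manifest, assemble an ephemeral directory, set env, launch."""
    recipe = json.loads(Path(manifest_path).read_text())
    workdir = Path(tempfile.mkdtemp(prefix="agent-"))               # disposable provider surface
    (workdir / "CLAUDE.md").write_text(recipe["claude_md"])
    (workdir / ".mcp.json").write_text(json.dumps(recipe.get("mcp_json", {})))
    env = {**os.environ, **recipe.get("env", {})}                   # runtime context from the recipe
    cmd = [binary] + (["-p", prompt] if prompt else [])
    return subprocess.run(cmd, cwd=workdir, env=env, capture_output=True, text=True)
```

Fifty concurrent agents are then fifty calls to this function with fifty manifests; the directories are disposable and the manifests are the source of truth.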
The stack decomposes into four layers: use case, harness, resource provider, and LLM API. Configuration is not a peer to infrastructure; it is a subset of the resources the provider serves. The provider is substitutable, which creates both architectural flexibility and an unresolved attestation gap. Anthropic addresses this gap through legal restriction on credential use rather than through technical enforcement.
Within these constraints, the composable provider model offers a practical path to agent specialization: define agent behavior as a filesystem recipe, assemble it on demand beneath the fixed Claude Code binary, and rely on the harness's native configuration resolution to pick it up. The credential is the one invariant, the root of trust Anthropic holds. Everything else is infrastructure-as-code applied to agent definition.
The result is agent-as-code by means of infrastructure-as-code, without writing a custom harness and without forking or wrapping the CLI. Under consumer plan credentials the scaling constraint is human: the model works at the rate a person can direct it, which is where the "ordinary, individual usage" boundary sits. Under API key credentials that constraint dissolves entirely, and the CLI becomes a composable SDK billed per token, with no behavioral restrictions beyond cost. The recipe assembler, a declarative manifest that describes an agent's filesystem surface and an apply step that assembles it, is the single missing primitive. It is a small one.