§ substrate note

Three MCP tokens in one run. The agent never sees a single one.

2026-05-28 · by Dennis Gubsky · ~6 min read

JobEmber.ai's nightly autonomous job-search agent does three things you'd expect of an agent. It ingests fresh listings from a private jobs API; it publishes a tidy summary into the user's Slack; it DMs the highlights to the user's Telegram. Three downstream services, three MCP servers - and three different bearer tokens, all per user. For Alice, that means a JobEmber.ai-scoped jobs bearer, her xoxp- Slack OAuth token, and her Telegram bot token. Same Alice, three credentials, one agent run.

In v0.8.14, the substrate carried exactly one of those at a time. A ${run.user_bearer} template inside an mcp_servers.headers block substituted Alice's session bearer into the outgoing MCP request. Beautiful for the single-server case. Useless the moment the agent fans out across three.

That was the aha. An agentic team isn't one agent calling one tool - it's one run reaching across multiple MCP servers, each with its own per-user authorization story. The single-bearer model was a special case of a thing we hadn't generalised yet. So we generalised it.

The wrong fix that's tempting

The shortcut almost every framework takes: give the tokens to the agent. Pass {jobs, slack, telegram} in the system prompt, or as a tool result, or as memory entries the agent can read. The agent decides which token to attach where. It requires no substrate work. It is also, on reflection, indefensible:

The agent's transcript - events rows, snapshot bodies, replay debugging - now contains three secrets in plain text. Forever, until you remember to scrub them.
Every tool result the agent processes is a prompt-injection surface. A Slack message someone DM'd Alice, a job description scraped off the open web, an attacker-controlled string buried in any of dozens of tool outputs - any of them can ask the model to print its credentials. The model will sometimes oblige.
Sub-agents inherit the parent's context by design. Every fan-out broadens the exposure surface. JobEmber.ai's job-search-batch spawns five workers; that's five copies of the token set, each in its own transcript, each with its own injection surface.
OTEL spans, distributed traces, the export to your observability backend - they'd all see the secrets too. Half the point of a trust boundary is that crossing it should require intent, not amnesia.

The right primitive is the opposite stance. The agent doesn't get the tokens. The substrate holds them, substitutes them at the HTTP wire boundary right before the MCP request leaves the binary, and evicts them when the run ends. The agent sees the consequence - the Slack call succeeded, the Telegram call succeeded, the run completes - but never the values themselves.

The shape - a named map, on the wire

PR #262 extends POST /v1/runs (and the matching gRPC, MCP spawn_run, and TypeScript adapter surfaces) with a user_credentials field - a map keyed by operator-chosen names:

POST /v1/runs
Authorization: Bearer <operator-token>
{
  "agent":   "job-search-batch",
  "user_id": "<alice-user-id>",
  "user_credentials": {
    "jobs":     "<JobEmber.ai-bearer-for-alice>",
    "slack":    "xoxp-<alice-slack-oauth>",
    "telegram": "<telegram-bot-token-alice-chat>"
  },
  "segments": [ ... ]
}
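
For callers assembling this request by hand, a minimal TypeScript sketch of the payload might look like the following. The RunRequest type and the buildRunRequest helper are hypothetical names used only for illustration, the base URL is a placeholder, and segments is typed as an opaque array because its shape is not shown here.

```typescript
// Hypothetical request type mirroring the fields in the example above.
interface RunRequest {
  agent: string;
  user_id: string;
  // Keys are operator-chosen names; values are per-user secrets.
  user_credentials?: Record<string, string>;
  segments: unknown[];
}

// The pieces a fetch() call would need.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// The operator token authenticates the caller in the Authorization header;
// the per-user tokens travel only inside the JSON body.
function buildRunRequest(
  baseUrl: string,
  operatorToken: string,
  run: RunRequest,
): PreparedRequest {
  const base = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: base + "/v1/runs",
    method: "POST",
    headers: {
      Authorization: "Bearer " + operatorToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(run),
  };
}
```

The result can be handed to fetch as fetch(req.url, { method: req.method, headers: req.headers, body: req.body }). Keeping the operator token and the per-user map in separate places mirrors the split in the request above: one credential says who is calling the substrate, the other three say who the run acts for.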

The convention is to match the map keys to your mcp_servers.<name> YAML entries. Each server's header template references its credential by name:

mcp_servers:
  slack:
    url: https://slack.example/mcp
    headers:
      Authorization: "Bearer ${run.credentials.slack}"
  telegram:
    url: https://telegram.example/mcp
    headers:
      Authorization: "Bearer ${run.credentials.telegram}"
  jobs:
    url: https://jobs.internal/mcp
    headers:
      Authorization: "Bearer ${run.credentials.jobs}"

The substitution is non-mutating and per-request - the in-memory headers map is never touched; each outgoing call rebuilds its own substituted copy. The credential map is stored on the run's RunIdentityValue, propagates to every sub-agent via the same ctx channel that already carried user_bearer, and is evicted from memory when the run completes. The expansion happens inside Client.do(), exactly once, right before the request goes on the wire.
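
To make "non-mutating and per-request" concrete, here is a small TypeScript sketch of the idea. It is not the substrate's code (the real expansion lives inside Client.do()); expandHeaders, ExpandedHeaders, and the set of characters allowed in a credential name are assumptions made for illustration. Unresolved names drop the whole header, following the missing-credential behaviour described later in this post.

```typescript
// Matches ${run.credentials.<name>}. The allowed name characters are an assumption.
const CREDENTIAL_REF = /\$\{run\.credentials\.([A-Za-z0-9_-]+)\}/g;

interface ExpandedHeaders {
  headers: Record<string, string>; // fresh copy built for this one request
  missing: string[];               // credential names that did not resolve
}

// Builds a substituted copy of the header templates. The templates object is
// only read, never written, so it can be shared across requests and runs.
function expandHeaders(
  templates: Readonly<Record<string, string>>,
  credentials: Readonly<Record<string, string>>,
): ExpandedHeaders {
  const headers: Record<string, string> = {};
  const missing: string[] = [];
  for (const name of Object.keys(templates)) {
    let unresolved = false;
    const value = templates[name].replace(
      CREDENTIAL_REF,
      (_match: string, key: string) => {
        const secret = Object.prototype.hasOwnProperty.call(credentials, key)
          ? credentials[key]
          : "";
        if (!secret) {
          unresolved = true;
          missing.push(key);
          return "";
        }
        return secret;
      },
    );
    // Drop the whole header rather than send "Bearer " with nothing after it.
    if (!unresolved) headers[name] = value;
  }
  return { headers, missing };
}
```

A caller would invoke this once per outgoing request and discard the result afterwards. The missing list is what an operator-facing credential_missing event could be built from.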

What the agent can see (nothing)

The trust boundary is the point of the exercise. Five things hold by construction:

No introspection surface. Context.self doesn't return the credential map. There is no Credentials.list or Credentials.get built-in tool. The map exists in RunIdentityValue, but no agent-visible path leads to it.
No transcript persistence. Credential values never appear in the events table, in snapshot bodies, or in any other durable state. The map lives in process memory for the duration of the run; it's gone when the run finishes.
No OTEL exposure. Decision 5 of the RFC inherits the secret-exclusion posture from v0.10.0's internal/otel/recorder.go design - span attributes never carry the values. If you ship spans to Tempo, Honeycomb, or Datadog, the credentials don't ride along.
Sub-agent inheritance is automatic and invisible. Workers spawned by Agent.parallel_spawn inherit the parent's RunIdentityValue through ctx. They make their MCP calls; their ${run.credentials.<name>} templates resolve correctly. The agents themselves are not handed a token-passing protocol.
Missing credential drops the header. ${run.credentials.notconfigured} resolves to empty, and the entire header is dropped from the outgoing request. A tracing event fires with kind=credential_missing for operator triage. The upstream returns 401 or 403 the normal way; the substrate doesn't pre-validate or surface internals.
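
One way to picture the first four properties is a run identity whose credential store is held in a closure. This is an illustrative TypeScript sketch only, not the substrate's RunIdentityValue: newRunIdentity, wrapStore, agentView, forChild, and evict are hypothetical names, and letting a child share the parent's store (so that evicting the parent also clears the child) is an assumption.

```typescript
// What an introspection tool could safely hand back to the agent.
interface AgentView {
  runId: string;
  userId: string;
}

interface RunIdentity {
  readonly runId: string;
  readonly userId: string;
  resolve(name: string): string | undefined; // substrate-side lookup only
  agentView(): AgentView;
  forChild(childRunId: string): RunIdentity;
  evict(): void;
}

// The store is captured by closure: it is not a property of the returned
// object, so serialising or enumerating the identity cannot reach it.
function wrapStore(
  runId: string,
  userId: string,
  store: Record<string, string>,
): RunIdentity {
  return {
    runId,
    userId,
    resolve: (name) =>
      Object.prototype.hasOwnProperty.call(store, name) ? store[name] : undefined,
    agentView: () => ({ runId, userId }),
    // Workers inherit the same store; nothing is passed through the agent.
    forChild: (childRunId) => wrapStore(childRunId, userId, store),
    // Called when the run completes.
    evict: () => {
      for (const key of Object.keys(store)) delete store[key];
    },
  };
}

function newRunIdentity(
  runId: string,
  userId: string,
  credentials: Record<string, string>,
): RunIdentity {
  // Copy so later changes to the caller's object do not affect the run.
  return wrapStore(runId, userId, { ...credentials });
}
```

In this sketch only the HTTP client layer would ever call resolve; anything built for the agent goes through agentView, which has no credential field to leak.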

The values themselves appear in exactly one place across the whole lifecycle: in the outgoing HTTP request to the MCP server, as the substituted header value, microseconds before that request leaves the binary. They are not in our logs. They are not in our database. They are not in any export. They are on the wire to the system that's supposed to receive them, and nowhere else.

The v0.8.14 escape hatch (sugar, not a shim)

Callers on v0.8.14's single user_bearer field keep working unchanged. At RunIdentityValue construction the substrate applies one piece of sugar: if UserBearer is non-empty and UserCredentials["default"] is empty, promote UserBearer into the map under the default key. So ${run.user_bearer} still resolves identically, and YAML that migrates to ${run.credentials.default} works against either old or new callers. No coordinated upgrade, no deprecation cycle, no migration step. The map is a strict generalisation of the bearer.
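
The promotion rule is small enough to sketch in a few lines of TypeScript. RunIdentityInput and normalizeCredentials are hypothetical names, with camelCase fields standing in for the UserBearer and UserCredentials fields named above.

```typescript
// Hypothetical input shape: the legacy single bearer plus the new named map.
interface RunIdentityInput {
  userBearer?: string;
  userCredentials?: Record<string, string>;
}

// Returns a new map; the caller's objects are left untouched.
function normalizeCredentials(input: RunIdentityInput): Record<string, string> {
  const credentials: Record<string, string> = { ...(input.userCredentials ?? {}) };
  // Promote the legacy bearer only when no explicit "default" was supplied.
  if (input.userBearer && !credentials["default"]) {
    credentials["default"] = input.userBearer;
  }
  return credentials;
}
```

An explicit default entry always wins over the legacy bearer, which is why old and new callers can coexist without either one overriding the other by surprise.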

One narrow exception, stated plainly: HTTP-transport MCP servers get full per-request substitution, but stdio MCP servers are spawned once at pool start, so per-call credential injection on stdio would require pool respawning or credential-keyed pooling. We put that out of scope for this RFC; operators who need per-user auth on stdio servers can use the HTTP transport instead, or bake operator-level env credentials into each server. This is documented as a known limitation in Context.help per-run-credentials.

Why this RFC has a twin

This started as one design and ended up as two. The on-demand path - Alice clicks "Search Jobs" in JobEmber.ai's web UI, JobEmber.ai's backend POSTs /v1/runs - carries the credential map on the wire. The scheduled path - Alice's autonomous nightly auto-search at 3 a.m., no one there to POST - needs the same map stored as substrate state, on a schedule definition, so the scheduler can spawn the run with the right credentials at the right time. Same data shape, two callers, two implementations.

So we wrote two RFCs. F (this one) opens the credential map to the wire. E (next, ScheduleDef as a substrate primitive) builds on F's plumbing - its scheduled forks store the same map and feed it into RunIdentityValue.UserCredentials at spawn time, identical to how the HTTP path populates it. F shipped first because E's storage consumes what F's wire passes; the canonical surface is the same RunIdentityValue.UserCredentials field, with one shared substitution function. The first slice of E (the ScheduleDef tables, store interface, and per-name advisory locking) landed alongside F today; the agent-facing tool, sweeper, on-complete dispatch, and Web UI are the next round.

The shape that emerged is the lesson. "Agents will need exactly one bearer per run" is an assumption that ages out quickly. The moment your agentic feature wants to reach across two services with independent user-auth stories - Slack and your own API, GitHub and Notion, an internal CRM and a public LLM gateway - the single-bearer model breaks, and the right fix is to generalise the surface, not to teach the agent how to juggle secrets.

What you can do with it today

If you've shipped a loomcycle deployment on v0.12.x, the new field is in the wire shape now: pass user_credentials on your POST /v1/runs, reference ${run.credentials.<name>} in your mcp_servers.*.headers, and watch your fan-out agents authenticate cleanly against three different upstreams in the same run without learning a single token. The TS adapter has userCredentials?: Record<string, string> on runAgent and runAgentStreaming; the MCP spawn_run tool's input schema has the matching user_credentials object field. Twenty-four new tests pin the substitution matrix and the back-compat sugar across all four transports.

Three real per-user tokens, one agentic run, zero of them visible to the agent. The substrate handles the trust boundary; the agent worries about what its tools do, not which key opens which door. The next post covers the same map on the scheduled side - what forking a schedule means when Alice rotates her Slack OAuth, and how ScheduleDef (the substrate twin) makes 3 a.m. autonomous runs feel like the same wire shape, just earlier.

Companion reading: When the agent is in one container and its definition is in another (the content-addressed substrate this RFC plugs into), and Route agents by data sensitivity (the per-agent provider story that pairs with the per-run credential story).