§ war story

Our MCP server authenticated everyone as me

2026-05-12 · by Dennis Gubsky · ~5 min read

We moved to MCP to stop putting bearer tokens in our agents' system prompts. The model surface got typed, the bearer left the LLM's line of sight, hallucination rates on tool calls dropped. Clean win.

A few days later we found out our new MCP integration was quietly authenticating every user's agent calls as me. Documents that belonged to a second user were stored against my user_id. Live in production for several days. The reason we found it: the second user kept complaining, and we eventually believed her.

Why we moved to MCP

Our agents had basic tools: Read, WebFetch, WebSearch, HTTP. They could fetch data from our REST API, transform it, save results back. Useful output. Two problems showed up:

The simpler models hallucinated parameters. The HTTP tool's free-form {url, method, headers, body} input schema meant agents composed REST calls from whatever the system prompt told them about our API. Tool-trained LLMs almost never invent paths from scratch, but they would occasionally swap query-param names, use the wrong content-type on PATCH, or miss a required body field. Acceptable at v0.3.x, not at v1.0.
The bearer had to live where the model could see it. To hit authenticated endpoints, the bearer needed to end up in the HTTP tool's headers arg. That meant: in the system prompt, in the LLM provider's request, in the assistant's echoed turns, in the tool-call records in our events table, in every operator transcript replay. The model could paste the bearer into a follow-up assistant turn and we'd have no way to scrub it after the fact.

Same fix for both: wrap the REST API as an MCP server. Each REST op becomes a typed tool with its own JSON Schema. The model picks a tool by name and fills typed params, instead of composing HTTP calls. The bearer moves out of the model's view, into operator yaml as a substitution template the runtime injects at request-build time.
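To make the difference concrete, here is a minimal sketch rather than our real definitions. The saveDocument tool, its field names, the /api/documents path, and the tiny validator are all hypothetical. The point is only that a typed schema can reject a swapped parameter name before any HTTP request exists, while the free-form HTTP tool accepts almost anything, bearer included.

```python
# Hypothetical schemas for illustration; not the real tool definitions.

# Free-form HTTP tool: any url/method/headers/body combination is "valid".
HTTP_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "url": {"type": "string"},
        "method": {"type": "string"},
        "headers": {"type": "object"},
        "body": {"type": "object"},
    },
    "required": ["url", "method"],
}

# Typed MCP tool: one REST op, fixed parameter names, no auth fields at all.
SAVE_DOCUMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "content": {"type": "string"},
    },
    "required": ["title", "content"],
    "additionalProperties": False,
}

_PY_TYPES = {"string": str, "object": dict}


def validate(args: dict, schema: dict) -> list[str]:
    """Check a tiny subset of JSON Schema: required, additionalProperties, type."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unknown field: {name}")
            continue
        if not isinstance(value, _PY_TYPES[props[name]["type"]]):
            errors.append(f"wrong type for field: {name}")
    return errors


# A model that swaps "content" for "body" is caught by the typed schema...
typed_errors = validate({"title": "Q2 report", "body": "..."}, SAVE_DOCUMENT_SCHEMA)

# ...while the free-form tool accepts a malformed REST call, bearer and all,
# because everything interesting hides inside opaque objects.
free_form_errors = validate(
    {
        "url": "http://localhost:3000/api/documents",  # hypothetical path
        "method": "PATCH",
        "headers": {"Authorization": "Bearer <token>"},
        "body": {"titel": "Q2 report"},
    },
    HTTP_TOOL_SCHEMA,
)

print(typed_errors)
print(free_form_errors)
```

In this sketch the typed call fails validation with two errors and the free-form call passes with none, which is roughly the asymmetry we were after.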

Built it. Hallucinations dropped. The bearer disappeared from every model-facing surface. Agents called mcp__jobs__getAgentContext() with empty parameters and got real data back.

The developer-token shortcut

For dev simplicity, we authorized the MCP server with a single bearer. One I'd generated for my own developer account. The operator yaml looked like this:

mcp_servers:
  jobs:
    transport: http
    url: http://localhost:3000/api/mcp
    headers:
      Authorization: "Bearer <my-developer-token>"

It worked. My agents called the MCP server, the MCP server validated the bearer against our auth registry, resolved it to my developer user_id, and forwarded calls to the REST API as me. The model didn't see the token. The token authorized the calls. We shipped.

Everything worked. We were the only ones using it.

The second user

A second user joined. She started running her own agents, and she started complaining. Agents worked sometimes and not other times. Profile fetch worked sometimes and not other times. Sometimes the agent would complete the run and the resulting document would not appear in her account. Vague complaints, inconsistent failure mode.

Same week, we'd started routing some agents through deepseek-v4-pro, others through gemini-2.0-flash-lite, others on the original Anthropic baseline. So we blamed the new models. Of course they were producing inconsistent outputs: we'd just changed the model layer. We tightened prompts, narrowed allowed_tools, watched the resolver routing logs. Our test runs got reliable again.

She kept complaining. After a few days of blaming-the-model-and-tuning, with our own runs visibly working and hers visibly not, we accepted that the model hypothesis didn't explain her remaining failures.

The database transaction analysis

So we pulled a transaction log out of the database. Every INSERT and UPDATE against the documents table from the last 72 hours, joined to the user_id that owned each row.
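The shape of that check, as a sketch: compare the owner stamped on each row with the user who initiated the run that wrote it. The schema here is hypothetical (a runs table with initiated_by, a run_id column on documents), the rows are made-up sample data, and it runs against an in-memory SQLite database rather than our real store.

```python
import sqlite3

# Hypothetical schema and sample rows; names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE runs (id INTEGER PRIMARY KEY, initiated_by TEXT NOT NULL);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        run_id INTEGER NOT NULL REFERENCES runs(id),
        user_id TEXT NOT NULL,      -- owner stamped at write time
        created_at TEXT NOT NULL
    );
    INSERT INTO runs VALUES (1, 'dev'), (2, 'second_user'), (3, 'second_user');
    INSERT INTO documents VALUES
        (10, 1, 'dev', '2000-01-01T10:00:00'),
        (11, 2, 'dev', '2000-01-01T11:00:00'),
        (12, 3, 'dev', '2000-01-02T09:30:00');
    """
)

# Rows where the stamped owner is not the user who initiated the run.
MISMATCH_QUERY = """
    SELECT d.id, r.initiated_by, d.user_id AS stamped_owner
    FROM documents AS d
    JOIN runs AS r ON r.id = d.run_id
    WHERE d.user_id <> r.initiated_by
    ORDER BY d.created_at
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for doc_id, initiator, owner in mismatches:
    print(f"document {doc_id}: run by {initiator}, stored under {owner}")
```

A query like this returns nothing on a healthy system, so any row at all is worth a look.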

The pattern jumped out in thirty seconds. Documents that by every other piece of metadata should have belonged to the second user (created during runs she had initiated, referencing her profile, fetched in response to her prompts) were stamped with my user_id. Every single one. No flakiness, no inconsistency. Every agent call from every user's run was authenticating as me, the developer, and writing results into my account.

The reason was sitting in operator yaml in plain sight. The bearer was my bearer. Our MCP server's job: validate the inbound bearer, resolve to a user, forward calls to the REST API as that user. The system was working exactly as designed. It was just resolving to the wrong user on every call that wasn't mine.

For our own runs this was invisible (the bearer happened to resolve to the same user_id that triggered the run, so ownership matched). For the second user's runs: her prompts hit the agent, the agent called the MCP server, the MCP server saw a bearer that resolved to my user, and her documents landed under my account. The "flakiness" she saw was every call quietly succeeding into the wrong account.
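The failure reduces to a few lines. This is a toy model, not our server code: the registry contents, token strings, and function names are hypothetical stand-ins. What it preserves is the one thing that mattered, which is that the identity of whoever triggered the run never reaches the MCP server. Only the static header does.

```python
# Hypothetical stand-ins for the auth registry and the MCP request path.
AUTH_REGISTRY = {
    "dev-token": "user_dev",
    "second-user-token": "user_second",
}

# Static header from operator yaml: identical for every run on this runtime.
STATIC_HEADERS = {"Authorization": "Bearer dev-token"}

documents = []  # stand-in for the documents table


def resolve_user(headers: dict) -> str:
    """Validate the inbound bearer and resolve it to a user_id."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or token not in AUTH_REGISTRY:
        raise PermissionError("401: missing or unknown bearer")
    return AUTH_REGISTRY[token]


def save_document(headers: dict, title: str) -> dict:
    """The server trusts the bearer, not whoever triggered the run."""
    row = {"title": title, "user_id": resolve_user(headers)}
    documents.append(row)
    return row


def run_agent(initiated_by: str, title: str) -> dict:
    # initiated_by is never sent anywhere; the static header decides ownership.
    return save_document(STATIC_HEADERS, title)


mine = run_agent("user_dev", "my notes")
hers = run_agent("user_second", "her notes")
print(mine["user_id"], hers["user_id"])
```

Both rows come back owned by user_dev. Nothing errors, nothing logs a warning, and the first run looks exactly right.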

The fix

Fix had to be: give every agent run its own per-user bearer, one that resolves to the user who initiated the run. Took a couple of long workdays.

What we built (eventually became loomcycle's ${run.user_bearer} substitution mechanism, v0.8.14): a per-run bearer field on the POST /v1/runs request shape. The caller supplies a token bound to the authenticated user. The runtime attaches it to ctx, propagates it through sub-agent inheritance, and when an MCP tool fires, the HTTP client substitutes that per-run bearer into the outbound Authorization header per the operator yaml template:

mcp_servers:
  jobs:
    transport: http
    url: http://localhost:3000/api/mcp
    headers:
      Authorization: "Bearer ${run.user_bearer}"   # was: "Bearer <my-developer-token>"

The substitution happens per-request against a local copy of the headers map, not the shared one. So two concurrent runs against the same MCP server send distinct bearers without coordination. The model still never sees the token. The runtime still does the unsafe work outside the model's view. Only change: which token gets injected into which run's outbound header. Per-user instead of per-runtime.
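A minimal sketch of that idea, assuming a frozen run context and a plain string replace. RunContext, spawn_sub_agent, and headers_for are hypothetical names for illustration, not loomcycle's actual internals.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

PLACEHOLDER = "${run.user_bearer}"

# Shared template parsed once from operator yaml; never mutated afterwards.
HEADER_TEMPLATE = {"Authorization": "Bearer ${run.user_bearer}"}


@dataclass(frozen=True)
class RunContext:
    run_id: str
    user_bearer: str

    def spawn_sub_agent(self, sub_id: str) -> "RunContext":
        # Sub-agents inherit the parent's bearer; they cannot choose their own.
        return RunContext(f"{self.run_id}/{sub_id}", self.user_bearer)


def headers_for(ctx: RunContext) -> dict:
    """Build a fresh headers dict for one request; the template stays untouched."""
    return {
        name: value.replace(PLACEHOLDER, ctx.user_bearer)
        for name, value in HEADER_TEMPLATE.items()
    }


run_a = RunContext("run-a", "token-for-user-a")
run_b_child = RunContext("run-b", "token-for-user-b").spawn_sub_agent("child")

# Two concurrent requests against the same MCP server config.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(headers_for, [run_a, run_b_child]))

for headers in results:
    print(headers["Authorization"])
```

Because each request gets its own dict and the context is immutable, there is no shared state for two runs to race on.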

The shared-developer-token shortcut is now structurally impossible in the operator yaml schema. The substitution form ${run.user_bearer} is the only way to thread a runtime-resolved bearer through to an MCP call. If you don't pass a user_bearer on the run request, loomcycle drops the Authorization header entirely and the MCP server returns a clean 401. Easier to debug than the silent wrong-user resolution we had before.
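The fail-closed behavior can be sketched like this. It is an illustration under assumptions: render_headers and the stub mcp_status are hypothetical, and the real runtime may differ in detail. The rule it shows is the one described above: no bearer on the run means no Authorization header, which means a 401 instead of a wrong-user write.

```python
from typing import Optional

PLACEHOLDER = "${run.user_bearer}"


def render_headers(template: dict, user_bearer: Optional[str]) -> dict:
    """Fill the placeholder, or drop any header that needs a bearer we lack."""
    rendered = {}
    for name, value in template.items():
        if PLACEHOLDER in value:
            if not user_bearer:
                continue  # fail closed: omit the header rather than guess
            value = value.replace(PLACEHOLDER, user_bearer)
        rendered[name] = value
    return rendered


def mcp_status(headers: dict) -> int:
    """Stub MCP server: 401 without a bearer header, 200 otherwise."""
    has_bearer = headers.get("Authorization", "").startswith("Bearer ")
    return 200 if has_bearer else 401


template = {
    "Authorization": "Bearer ${run.user_bearer}",
    "Accept": "application/json",
}

with_bearer = render_headers(template, "token-for-user-a")
without_bearer = render_headers(template, None)

print(mcp_status(with_bearer), mcp_status(without_bearer))
```

Dropping the header outright, rather than sending an unrendered placeholder or an empty bearer, keeps the failure loud and unambiguous on the server side.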

Details of the substitution mechanism, sub-agent inheritance, and what the model sees on which surfaces: docs/MCP_INTEGRATION.md §3. The boundary table is the canonical artifact for "what surfaces does the bearer appear on" auditing.

Two lessons

Lesson 1: authorization paths in agentic systems are not the same shape as in human-facing ones. In a human-facing system the bearer is bound to the session of the person clicking the button. In an agentic system the bearer has to be bound to the run that the person triggered, which means it threads through the runtime layer, the tool dispatcher, the MCP transport, and back into your REST API. At every hop the temptation to just paste a static token "for now" is real. Each "for now" is a multi-tenancy bug waiting to be born.

Lesson 2: one worried second user is always better than one self-confident developer. Her complaints were the only signal that anything was wrong. Our own runs worked perfectly, and we had a coherent-sounding hypothesis ("must be the new models") that explained away the noise. If she'd been less persistent, or we'd been less willing to eventually believe her, the bug would have lived until we noticed our own database growing impossibly fast with content we didn't recognize.

Not the first auth-leak we'd had. See the $80 prequel for an earlier and more expensive incident in the same family: a forgotten code path that bypassed the runtime and inherited an API key it shouldn't have. Different shape, same lesson. Multi-tenant authorization is not what single-tenant authorization looks like with another user added.