We inverted a startup race — and found four static-vs-dynamic asymmetries to close.
loomcycle ships as a Go-binary sidecar. Operator app on one side, agents on the other. The two communicate over MCP. The substrate's whole proposition is that the runtime owns the loop — your app brings the business logic; loomcycle brings the agent machinery.
Which means at startup we have two containers. The agent runtime (loomcycle) and the consuming application (in JobEmber's case, the jobs-search-web Next.js service that exposes /api/mcp). They start independently. Each container's healthcheck is its own; orchestration doesn't enforce a startup order. If a substrate primitive on one side has a dependency on the other side being live, your container architecture is wrong.
Yesterday's cv/cl-adapter failure was that exact wrong. Agents started "narrating" tool calls — printing them as text instead of executing them — because mcp__jobs__* wasn't in their catalog. The cause was a static mcp_servers.jobs: block in loomcycle.yaml. The cure was inverting the dependency to dynamic MCPServerDef registration. The cure took 8 commits across two repos because inverting one consumer's wiring exposed four distinct static-vs-dynamic asymmetries the substrate had been hiding.
This post is the autopsy. The chicken-or-egg that started it, the four asymmetries we closed, and the underlying lesson worth taking to the next consumer that hits this pattern.
The chicken-or-egg the static yaml block hid
A static mcp_servers entry in loomcycle.yaml has a property you don't notice until it bites you: loomcycle takes ownership of the discovery handshake at its boot time. It reads the yaml, sees jobs, runs an MCP tools/list against /api/mcp with whatever credential the entry's header resolves to, and freezes the result in its boot-time tool map.
That handshake happens with no run context. There's no run id, no user, no bearer; loomcycle is just bootstrapping. The static yaml block had been working only because we'd been getting lucky with three things at once: the MCP-providing service happened to be up first; the header's fallback env token happened to be set; and loomcycle's boot-time tool catalog happened to capture the tool list before any real agent ran.
The luck ran out on 2026-06-02. The cv/cl-adapter run hit a deployment where:
- loomcycle came up first (no orchestration constraint forced an order).
- tools/list at http://jobs-search-web:3000/api/mcp returned a TCP error (the service was still starting); fail-soft → empty tool list.
- By the time JobEmber was live and the agent ran, loomcycle's frozen boot-time catalog said mcp__jobs__* didn't exist.
- The model, given the run's allowed_tools list with names that resolved to nothing, narrated the tool calls as plain text. The agent looked broken from the outside; it was actually behaving correctly given an empty catalog.
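The sequence above can be reduced to a small model. This is a minimal sketch, not loomcycle's code (loomcycle is Go, and its internals aren't reproduced in this post); it is written in TypeScript only to match the client snippet at the end. Upstream, bootCatalog, and registerWhenLive are hypothetical names.

```typescript
// Hypothetical model of a boot-time, fail-soft discovery handshake.
type ToolList = string[];

// Stand-in for an MCP-providing service that may not be listening yet.
class Upstream {
  live = false;
  toolsList(): ToolList {
    if (!this.live) throw new Error("connect ECONNREFUSED");
    return ["mcp__jobs__postResearchIngest"];
  }
}

// Static path: discover once at boot, swallow errors, freeze the result.
function bootCatalog(upstream: Upstream): ReadonlySet<string> {
  try {
    return new Set(upstream.toolsList());
  } catch {
    return new Set<string>(); // fail-soft: an empty catalog, no hard error
  }
}

// Dynamic path: the provider registers itself once it knows it is live.
function registerWhenLive(upstream: Upstream, catalog: Set<string>): void {
  for (const tool of upstream.toolsList()) catalog.add(tool);
}

const upstream = new Upstream();
const frozen = bootCatalog(upstream); // the runtime boots first
upstream.live = true;                 // the provider comes up later

const dynamic = new Set<string>();
registerWhenLive(upstream, dynamic);  // provider-initiated registration
```

The frozen set stays empty no matter how healthy the provider becomes afterwards, while the provider-initiated registration only runs at a moment when the handshake can succeed.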
The fix direction was obvious: invert the dependency. The MCP-providing service knows when it's live; let it register its own MCPServerDef at its startup, when the handshake will actually succeed. That's the pattern every other content-addressed Def primitive uses (AgentDef, SkillDef, ScheduleDef, WebhookDef, MemoryBackendDef, OperatorTokenDef — see yesterday's RFC L post on the seventh primitive). MCPServerDef was the only one consumers were leaving in the static-yaml path because the dynamic path "wasn't worth the ceremony." Yesterday made the ceremony worth it.
The inversion looked like one PR. It turned out to be four asymmetries the dynamic path had been silently hiding.
Asymmetry #1 — the dynamic create path checked the public host allowlist but not the private one
loomcycle · PR #340
Dynamic MCPServerDef.create only checked LOOMCYCLE_HTTP_HOST_ALLOWLIST, not LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST.
loomcycle defends against SSRF by gating outbound URLs through a host allowlist. There are two allowlists: a public one (LOOMCYCLE_HTTP_HOST_ALLOWLIST) and a private one (LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST) that operators use to bless loopback / RFC1918 callbacks. A self-hosted http://localhost:3000/api/mcp needs the private allowlist.
The static-yaml path bypassed the allowlist entirely (operator-authored yaml is trusted). The dynamic create path enforced the allowlist, but only the public one. So MCPServerDef.create refused a loopback URL the operator had already authorized, and it refused fail-soft: the mcp__jobs__* tools never registered, and the failure didn't surface as a hard error.
The fix was three lines: also check LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST at create time. The honest residual: a static yaml block can talk to a loopback URL without the operator naming it in any allowlist; a dynamic create can't. We could equalize by either tightening the static path (which would break existing operators) or relaxing the dynamic one (which would expand the SSRF surface). Equalizing through the private allowlist — operator opt-in, surfaced in doctor output — is the honest middle.
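To make the before/after concrete, here is a hedged sketch of a create-time gate. The two sets stand in for the two environment variables; how loomcycle actually parses and matches them isn't described here, so hostOf, allowedBeforeFix, allowedAfterFix, and the api.example.com entry are all illustrative assumptions. A real SSRF gate would presumably also classify address ranges.

```typescript
// Hypothetical sketch of a create-time SSRF gate. The two sets stand in for
// LOOMCYCLE_HTTP_HOST_ALLOWLIST and LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST.
type Allowlists = {
  publicHosts: ReadonlySet<string>;
  privateHosts: ReadonlySet<string>;
};

// Extract the lowercase host from an http(s) URL, or null if it does not parse.
function hostOf(url: string): string | null {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(url);
  return match?.[1]?.toLowerCase() ?? null;
}

// Pre-fix behavior: only the public allowlist is consulted.
function allowedBeforeFix(url: string, lists: Allowlists): boolean {
  const host = hostOf(url);
  return host !== null && lists.publicHosts.has(host);
}

// Post-fix behavior: an operator-blessed private host also passes.
function allowedAfterFix(url: string, lists: Allowlists): boolean {
  const host = hostOf(url);
  return host !== null && (lists.publicHosts.has(host) || lists.privateHosts.has(host));
}

const lists: Allowlists = {
  publicHosts: new Set(["api.example.com"]), // illustrative entry
  privateHosts: new Set(["localhost"]),      // operator opt-in for loopback
};
const selfHosted = "http://localhost:3000/api/mcp";
```

With these inputs the pre-fix gate rejects the self-hosted URL and the post-fix gate accepts it, while a private address the operator never named is still refused.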
Asymmetry #2 — dynamic MCP tools were registered but not resolvable
loomcycle · PR #341, then PR #345
The lazy tool resolver consulted only the static-yaml server map. Tools registered via the substrate were unknown at dispatch time.
After PR #340 unblocked create, we got a clean dynamic registration: the MCPServerDef row existed, discovered_tools was populated, the dynamic registry knew about jobs. Then the cv-adapter agent fired and got tool not found: mcp__jobs__postResearchIngest. Not "tool refused," not "scope-denied" — not found, the dispatcher's bare miss path.
Tracing it: LazyResolver.Resolve consulted a private serverConfig map[string]ServerCfg populated from cfg.MCPServers at boot. The shared DynamicRegistry that the pool's connection-building callback used was a different source. The membership check and the connection check were reading different sets. A server present in the dynamic registry but absent from the boot-time cfg map fell through to "not found."
The fix (#341) threaded DynamicRegistry into NewLazyResolver. The cleanup (#345) was the more interesting follow-up. We already had lookup.MCPServer, the MCP entry in the same static-yaml-then-dynamic-substrate lookup helper that Webhook, MemoryBackend, A2A, Agent, and Skill all routed through. It had been written for this exact pattern. It had zero callers. The MCP path had been hand-rolling its own resolver since before the shared lookup existed, and never got migrated. The drift that caused the bug was a symptom of the orphan code; #345 wired the lazy resolver through lookup.MCPServer so it's structurally impossible to drift static-only again.
Worth flagging as a pattern. When a system supports static-yaml-and-dynamic-substrate paths for a primitive, a shared resolver that handles the precedence is almost always the right factoring. If you have one and it has zero callers, that's a warning sign — someone bypassed it because their seam looked "simpler." Two months later their seam will have drifted, and the drift will be exactly the bug you can't reproduce locally.
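A rough sketch of that factoring, under the assumption that static yaml takes precedence and the dynamic registry is consulted second. This is TypeScript for illustration, not the Go lookup package; makeLookup, serverOf, and resolveTool are hypothetical names.

```typescript
type ServerCfg = { url: string };
type Lookup = (name: string) => ServerCfg | undefined;

// Shared resolver: static yaml wins, then the live dynamic registry.
function makeLookup(
  staticCfg: ReadonlyMap<string, ServerCfg>,
  dynamicRegistry: ReadonlyMap<string, ServerCfg>,
): Lookup {
  return (name) => staticCfg.get(name) ?? dynamicRegistry.get(name);
}

// "mcp__jobs__postResearchIngest" -> "jobs"; undefined for non-MCP tool names.
function serverOf(toolName: string): string | undefined {
  const parts = toolName.split("__");
  return parts.length >= 3 && parts[0] === "mcp" ? parts[1] : undefined;
}

// Membership check and connection lookup both go through the same Lookup.
function resolveTool(toolName: string, lookup: Lookup): ServerCfg {
  const server = serverOf(toolName);
  const cfg = server === undefined ? undefined : lookup(server);
  if (cfg === undefined) throw new Error(`tool not found: ${toolName}`);
  return cfg;
}

const staticCfg = new Map<string, ServerCfg>();       // yaml has no jobs entry
const dynamicRegistry = new Map<string, ServerCfg>(); // filled after boot
const shared = makeLookup(staticCfg, dynamicRegistry);
const staticOnly: Lookup = (name) => staticCfg.get(name); // the drifted seam

// The provider registers after both resolvers were constructed.
dynamicRegistry.set("jobs", { url: "http://jobs-search-web:3000/api/mcp" });
```

Because the shared lookup holds a reference to the live registry rather than a boot-time copy, the late registration resolves through it, while the static-only seam still takes the bare "tool not found" path.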
Asymmetry #3 — every registration minted a new version (the SHA-dedup pattern that AgentDef had and MCPServerDef didn't)
loomcycle · PR #343 · @loomcycle/client · PR #344
One jobs server's lineage had grown to 19 versions in days. Every consumer restart minted a new identical version.
With the resolver fixed, dynamic registration started working — and we noticed the consumer's restart cycle was spamming new MCPServerDef versions. Each boot of jobs-search-web ran ensureMcpServer, which ran create → rediscover, which minted a fresh version of the same content. In two days one jobs server's lineage had grown to 19 versions.
AgentDef and SkillDef solved this in v0.9 with content-addressed dedup: hash the canonical wire content into content_sha256; on create, if the active row already has that hash, return {deduplicated: true} instead of minting a new version. The consumer can re-create on every boot without paying a version-spam cost. MCPServerDef had content_sha256 populated but the create path never compared it.
The fix had three parts:
- Server-side dedup on the create path: if the active def already carries this exact content ({name, description, transport, url, headers}), return it with deduplicated: true instead of minting a byte-identical new version. Compare against the active row only; re-creating content that matches a non-active version still mints + promotes (re-activation is a real state change).
- Server-side dedup on rediscover: content_sha256 excludes discovered_tools (the tools are a function of the upstream, not the operator-authored content). So a fresh tools/list that matches the active def's discovered tools would otherwise mint a new version every boot. Compare canonically (order- and JSON-whitespace-insensitive) and skip the new version when unchanged.
- Consumer-side, send byte-identical content every boot: the consumer (jobs-search-web) had been baking a per-process discovery token into the header at register time. Replace it with a stable literal template (Bearer ${run.credentials.jobs:-${LOOMCYCLE_JOBS_SEARCH_API_TOKEN}}) that loomcycle substitutes at request time. Identical header content across restarts → identical content_sha256 → server-side dedup engages.
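The first two parts can be sketched as follows. To stay dependency-free the sketch compares canonical strings directly where the real path compares content_sha256 (a SHA-256 over some canonical form); that swap, the in-memory store, and the names canonicalize, makeDefStore, and sameTools are assumptions for illustration. Tools are modeled as bare names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so key order and whitespace cannot matter.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + entries.map(([k, v]) => JSON.stringify(k) + ":" + canonicalize(v)).join(",") + "}";
}

type McpServerContent = {
  name: string;
  description: string;
  transport: string;
  url: string;
  headers: { [key: string]: string };
};
type CreateResult = { version: number; deduplicated: boolean };

function makeDefStore() {
  const versions: string[] = []; // canonical content of each minted version
  let active = -1;               // index of the active version
  return {
    create(content: McpServerContent): CreateResult {
      const key = canonicalize(content);
      // Compare against the active row only.
      if (active >= 0 && versions[active] === key) {
        return { version: active + 1, deduplicated: true };
      }
      versions.push(key); // new or re-activated content mints and promotes
      active = versions.length - 1;
      return { version: active + 1, deduplicated: false };
    },
  };
}

// Rediscover dedup: order-insensitive comparison of discovered tool names.
function sameTools(a: readonly string[], b: readonly string[]): boolean {
  return canonicalize([...a].sort()) === canonicalize([...b].sort());
}

const store = makeDefStore();
const url = "http://jobs-search-web:3000/api/mcp";
const headers = { Authorization: "Bearer ${run.credentials.jobs}" };
const first = store.create({ name: "jobs", description: "", transport: "http", url, headers });
// Same content on the next boot, with the keys in a different order.
const reboot = store.create({ url, headers, transport: "http", description: "", name: "jobs" });
```

The reboot call returns the existing version with deduplicated set, which is the property that lets a consumer re-register on every boot. A per-process token baked into the header would change the canonical content each time and defeat the comparison.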
And while we were there, we shipped the typed ergonomics (#344, @loomcycle/client 0.18.0): ensureMcpServer({name, url, transport?, headers?, rediscover?}) as the typed convenience for exactly this re-register-on-every-boot pattern. Returns {defId, version, changed, discoveredToolCount?} — changed: false once the registration content is stable, so a per-boot re-register is a clean no-op the consumer can log only on real mints. Plus mcpServerDefVerify(name, content_sha256) for callers that want to check before they create.
Three coordinated PRs to close one asymmetry. The pattern AgentDef had had since v0.9 is now uniform across substrate primitives.
Asymmetry #4 — the bearer-token variable was interpreted literally
loomcycle · PR #348
Dynamic registration stored ${LOOMCYCLE_…} verbatim; the request-time substituter sent literal Bearer ${LOOMCYCLE_…} upstream → hard 401.
The dedup fix (#343) plus the consumer's stable-literal header (above) should have closed the story. Instead the upstream returned 401: Invalid bearer token on every discovery handshake. The header that left loomcycle's wire was — literally — Authorization: Bearer ${LOOMCYCLE_JOBS_SEARCH_API_TOKEN}.
This is the most subtle bug of the four. It traces to two correct behaviors composing into a wrong outcome.
Behavior 1. A yaml-configured MCP server gets its header expanded at config.Load. The whole document passes through expandEnv, which resolves any ${LOOMCYCLE_*} against the process environment using a prefix-allowlisted scope. So on the static path, a header like Bearer ${run.credentials.jobs:-${LOOMCYCLE_JOBS_SEARCH_API_TOKEN}} arrives at the substrate as Bearer ${run.credentials.jobs:-acme_…actualtoken} — the inner ${LOOMCYCLE_*} is resolved at load, only the outer ${run.*} survives to request time.
Behavior 2. The request-time substituter (substitute.go) only resolves ${run.*}. Its :- fallback regex is .*? (lazy) — and there's a comment in the code that explicitly says "this works because the inner var has already been resolved by config.Load by the time we get here." If the inner var hasn't been resolved, the lazy regex matches the shortest possible default, which means it stops at the first } — the closing brace of the inner ${LOOMCYCLE_*}, not the outer envelope. The result is a malformed substitution that emits the literal nested template.
Dynamic MCPServerDef registration bypasses config.Load entirely. The header arrives at the substrate verbatim, with the inner ${LOOMCYCLE_*} still in nested form. The request-time substituter then trips its own comment's precondition and ships the literal template upstream. Hard 401.
Two correct code paths. One precondition the documentation acknowledged but the dynamic path silently violated.
The fix is the right shape — make the dynamic path uphold the precondition the yaml path always did. Export config.ExpandEnv (a thin wrapper over the existing expandEnv + its LOOMCYCLE_* allowlist) and call it on the operator-authored connection fields (url + header values) inside MCPServerDef.buildDefinition — the shared chokepoint that both create and fork route through. The stored header becomes flat; request-time substitution now behaves identically for yaml-configured and substrate-registered servers. The outer ${run.*} token carries a . in its name that envVarRe cannot match, so it flows through untouched.
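The composition is easy to reproduce in miniature. In the sketch below only the lazy .*? default is taken from the description above; the rest of both patterns, the handling of unset variables, and the acme_token and run_token values are assumptions. The names mirror expandEnv and envVarRe for readability, but this is TypeScript, not the Go in substitute.go or config.

```typescript
// Assumed shapes for the two expanders; only the lazy `.*?` default is from the post.
const envVarRe = /\$\{(LOOMCYCLE_[A-Z0-9_]+)\}/g;
const runVarRe = /\$\{run\.([A-Za-z0-9_.]+)(?::-(.*?))?\}/g;

// Load-time expansion: resolve ${LOOMCYCLE_*}; leave unknown names untouched (assumption).
function expandEnv(s: string, env: { [name: string]: string | undefined }): string {
  return s.replace(envVarRe, (whole: string, name: string) => env[name] ?? whole);
}

// Request-time substitution: resolve ${run.*}, falling back to the :- default.
function substituteRun(s: string, run: { [name: string]: string | undefined }): string {
  return s.replace(
    runVarRe,
    (_whole: string, name: string, fallback?: string) => run[name] ?? fallback ?? "",
  );
}

const template = "Bearer ${run.credentials.jobs:-${LOOMCYCLE_JOBS_SEARCH_API_TOKEN}}";
const env = { LOOMCYCLE_JOBS_SEARCH_API_TOKEN: "acme_token" }; // illustrative value
const noRunContext = {};

// Dynamic path before the fix: no load-time expansion ran, so the lazy default
// stops at the inner closing brace and the inner template leaks out verbatim.
const broken = substituteRun(template, noRunContext);

// Yaml path, and the dynamic path after the fix: expand env first, then substitute.
const flat = expandEnv(template, env);
const fixed = substituteRun(flat, noRunContext);
const withRun = substituteRun(flat, { "credentials.jobs": "run_token" });
```

Here broken comes out as the literal Bearer ${LOOMCYCLE_JOBS_SEARCH_API_TOKEN}, matching the header seen on the wire, while flat keeps the outer ${run.*} envelope intact because the env pattern cannot match a name containing a dot.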
One residual worth naming: config.ExpandEnv at buildDefinition bakes the resolved token into the stored def content (and thus content_sha256). That's consistent with yaml semantics — the static path also bakes resolved env into the stored shape. Server-side dedup still engages as long as the env value is stable across boots (the typical case; secret rotation is a real content change and minting a new version is correct then).
With #348 landed, the post-2026-06-02 consumer had a follow-up cleanup: the original :-${LOOMCYCLE_JOBS_SEARCH_API_TOKEN} fallback was a workaround for a problem that no longer existed. PR #341's lazy resolver now threads the run ctx into the handshake, so the only context-free handshake (the eager rediscover, which we no longer trigger) is gone. The header simplified to bare Bearer ${run.credentials.jobs}; the discovery-token machinery (and the data-less svc-mcp-discovery user it required) was deleted entirely.
Bonus close-outs: tool catalog advertising + scheduler bootstrap
Once we'd named the static-vs-dynamic asymmetry class, two adjacent gaps surfaced and got closed in the same session.
Tools were callable but not advertised (PR #347)
Dynamic MCP tools were callable via the lazy resolver after #341 — but never advertised in the model's tool catalog. The per-run allowed_tools filter ran against s.tools, frozen at boot. So an agent could only use a post-boot tool if it already knew the name; the model never saw it in the catalog. Same asymmetry class, advertising-side. Added Server.SetDynamicToolEnumerator and a candidateTools(ctx) helper that folds substrate-registered tools into the boot set before the allowed_tools filter. All four run-creation paths (RunOnce, /v1/runs, /v1/sessions messages continuation, runSubAgent) route through it.
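A hedged sketch of the fold-then-filter order. The Tool shape, the exact-or-trailing-wildcard matching for allowed_tools, the boot-entry-wins collision rule, and the read_file tool name are assumptions made for illustration; the real candidateTools is Go and takes a context.

```typescript
type Tool = { name: string };

// Fold substrate-registered tools into the boot set before any filtering.
function candidateTools(
  bootTools: readonly Tool[],
  enumerateDynamic: () => readonly Tool[],
): Tool[] {
  const byName = new Map<string, Tool>();
  for (const tool of bootTools) byName.set(tool.name, tool);
  for (const tool of enumerateDynamic()) {
    // A boot entry wins on a name clash (assumption).
    if (!byName.has(tool.name)) byName.set(tool.name, tool);
  }
  return Array.from(byName.values());
}

// allowed_tools entries are treated as exact names or "prefix*" wildcards (assumption).
function matchesAllowed(name: string, allowed: readonly string[]): boolean {
  return allowed.some((p) => (p.endsWith("*") ? name.startsWith(p.slice(0, -1)) : name === p));
}

// The per-run catalog the model would see.
function advertised(candidates: readonly Tool[], allowed: readonly string[]): string[] {
  return candidates.filter((t) => matchesAllowed(t.name, allowed)).map((t) => t.name);
}

const bootTools: Tool[] = [{ name: "read_file" }]; // hypothetical boot-time tool
const dynamicTools: Tool[] = [];
const enumerate = () => dynamicTools;              // evaluated per run, not at boot
dynamicTools.push({ name: "mcp__jobs__postResearchIngest" }); // registered post-boot
const allowedTools = ["mcp__jobs__*"];
```

Filtering the boot set alone yields an empty catalog for this run; filtering the folded candidates yields the post-boot tool. Taking the enumerator as a function rather than a list is the point: it is evaluated at run creation, so nothing is frozen.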
Static yaml schedules never auto-fired (PR #346)
Same pattern, scheduler side. The sweeper's due-query is substrate-only (schedule_run_state ⨝ active ⨝ defs). Nothing seeded cfg.ScheduledRuns into it at boot, so a yaml scheduled_runs: entry that was never forked or run-now'd had no run-state row and silently never fired. Dynamically-created schedules already fired (promoted create seeds the run-state). Added ScheduleDef.BootstrapStaticSchedules — idempotent, fork-respecting, invoked once before the sweeper starts.
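The bootstrap step can be sketched as an idempotent seed into the same state the due-query reads. This is an illustration only: the run-state map, the interval-based next-fire time, and the schedule names are assumptions, and the real ScheduleDef.BootstrapStaticSchedules works against substrate tables rather than an in-memory map.

```typescript
type StaticSchedule = { name: string; intervalMs: number };
type RunState = { nextFireAt: number };

// Seed a run-state row for every yaml schedule that does not have one yet.
// Existing rows (for example from a fork) are left untouched, so the step is
// safe to run on every boot.
function bootstrapStaticSchedules(
  staticSchedules: readonly StaticSchedule[],
  runState: Map<string, RunState>,
  now: number,
): number {
  let seeded = 0;
  for (const schedule of staticSchedules) {
    if (runState.has(schedule.name)) continue;
    runState.set(schedule.name, { nextFireAt: now + schedule.intervalMs });
    seeded += 1;
  }
  return seeded;
}

// The sweeper's due-query only ever looks at run-state rows.
function dueSchedules(runState: ReadonlyMap<string, RunState>, now: number): string[] {
  const due: string[] = [];
  runState.forEach((state, name) => {
    if (state.nextFireAt <= now) due.push(name);
  });
  return due.sort();
}

const runState = new Map<string, RunState>([["forked", { nextFireAt: 5000 }]]);
const yamlSchedules: StaticSchedule[] = [
  { name: "nightly", intervalMs: 1000 },
  { name: "forked", intervalMs: 1000 },
];
const seededFirst = bootstrapStaticSchedules(yamlSchedules, runState, 0);
const seededAgain = bootstrapStaticSchedules(yamlSchedules, runState, 0);
```

Without the bootstrap call the yaml-only entry never appears in the due-query at all, which is the silent never-fires behavior described above.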
The unifying lesson — and a checklist for substrate primitives
Every substrate primitive in loomcycle now has two integration paths: yaml-loaded (operator-authored config that lives next to loomcycle.yaml) and dynamically-created (the Def primitive, content-addressed, versioned, the path the substrate is actually designed around). Each seam between the substrate and the runtime must work the same on both paths. The side nobody exercises silently rots.
Yesterday's four asymmetries were all in the dynamic path because every existing consumer had been sticking to the static path "for simplicity." The static path looked simpler because it hadn't surfaced the seams the dynamic path forced you through. Once one consumer inverted, the four seams that hadn't been kept symmetric all surfaced in 24 hours.
So the discipline going forward, for any new substrate primitive or any new integration seam:
- Host allowlists: both LOOMCYCLE_HTTP_HOST_ALLOWLIST and LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST must be checked at every create boundary; static-yaml's bypass is the privileged-trust exception, not the default.
- Runtime resolution: every primitive resolves through the shared internal/lookup helper; static-vs-dynamic precedence lives in one place and is structurally hard to drift.
- Tool catalog advertising: anything addable post-boot must fold into the per-run catalog before the allowed_tools filter.
- Content-addressed dedup: every primitive's create compares against the active row's content_sha256 and returns deduplicated: true on match. Consumers send byte-identical content (no per-boot baking of resolved tokens).
- Env expansion: wherever config.Load would run expandEnv on a static-yaml field, the dynamic create path must run config.ExpandEnv on the same field at buildDefinition. The shared chokepoint is the right home; per-callsite expansion will drift.
- Sweeper / scheduler bootstrap: anything the runtime auto-fires must have an explicit bootstrap step that seeds static-yaml entries into the same substrate state the dynamic path produces. Otherwise static entries never fire.
Each of these is a one-line discipline at design time. None of them are hard once you've named the class. The class is "static-vs-dynamic asymmetry," and the symptom is always the same: one consumer's seemingly-orthogonal change exposes a runtime seam the other consumers never made the substrate walk through.
What this means for you, today
If you're running loomcycle on the v0.17.x line and have a static mcp_servers: block in loomcycle.yaml for an MCP server provided by another container in your deployment: consider inverting it. The @loomcycle/client 0.18.0 ensureMcpServer sugar makes it a one-call registration at your startup:
// At your service's startup, when /api/mcp is live:
import { LoomcycleClient } from "@loomcycle/client";

const lc = new LoomcycleClient({ baseURL, bearer });

const { defId, version, changed } = await lc.ensureMcpServer({
  name: "jobs",
  url: "http://jobs-search-web:3000/api/mcp",
  transport: "http",
  headers: {
    Authorization: "Bearer ${run.credentials.jobs}",
  },
  rediscover: true,
});

if (changed) console.log(`registered jobs MCP v${version}`);
// `changed: false` on every subsequent boot — server-side dedup absorbs the no-op.
Add the /api/mcp host to LOOMCYCLE_HTTP_HOST_ALLOWLIST (or LOOMCYCLE_HTTP_PRIVATE_HOST_ALLOWLIST for loopback / RFC1918), keep the header as a stable literal template (don't pre-resolve tokens consumer-side), and drop the static block from loomcycle.yaml. The chicken-or-egg is gone. The consumer owns the registration; the substrate owns the dedup, the env expansion, the runtime resolution, and the tool advertising.
Companion reading: Multi-tenant authorization and the four bugs adversarial QA caught (yesterday morning's v0.17.0 ship — the seventh substrate primitive, and the four authorization gaps that surfaced in the same kind of adversarial pass) and From Go-bundled to JSON-pluggable (the move that made MCP tools a substrate concern in the first place).