Agent ensembles arrive — scheduler-driven fan-out, Channel.await fan-in, and a clock for agents.
Experiment 5 ran end-to-end last night. Five RSS collectors (one each for Hacker News, Wired, Engadget, Ars Technica, and TechCrunch) were fired in parallel by loomcycle's scheduler every five minutes. Each collector, on success, pinged a fan-in channel via the scheduler's on_complete hook. A consolidator agent, scheduled one minute after each collector fire, called the brand-new Channel.await {channels, mode: at_least, n: 5, wait_ms: 120000} combinator and woke when all five collectors had reported in (or when the 120-second budget elapsed). It loaded 25 items from user-scoped Memory, deduped them by normalized URL, wrote a consolidated record, and called mcp__telegram__send_message; Telegram returned {"ok": true, "message_id": 22}. After three cycles, max_fires=3 auto-retired every collector schedule. Zero workarounds.
That ensemble — five parallel agents converging on a shared fan-in point, terminating after a bounded number of cycles, producing one consolidated output — is the shape loomcycle's substrate now has primitives for. Today's v0.25 "agentic-ensemble" release ships RFC S: three new primitives that close the gaps the prior multi-agent experiments worked around. Three follow-up fixes — one per follow-up release across the next 24 hours — close the rough edges exp5 surfaced as it stress-tested them.
This is the fifth post in the operator-via-MCP experiment series. exp1 exercised one agent. exp2 added human-in-the-loop. exp3 ran a 3-agent refine loop. exp4 wired a closed-loop dev workflow against real Gitea + Telegram. exp5 is the first to own the term agent ensemble — multiple agents, coordinated through substrate primitives, with a clean fan-out / fan-in / self-terminate story.
What "agent ensemble" means here
Plenty of frameworks call themselves "multi-agent." Most of them mean "you can spawn a sub-agent from a parent agent and read its return value." That's worth having, and loomcycle's Agent.parallel_spawn has covered it since v0.10. But it's a sub-agent tree, not an ensemble: the parent waits for its children, the children don't talk to each other, and the whole thing collapses into the parent's run.
An agent ensemble is shaped differently:
- Each agent has its own run. Independent run_id, independent lifecycle, independent retry/cancel surface, billed and OTEL-spanned independently. No parent-as-orchestrator coupling.
- Coordination flows through the substrate, not through call stacks. Channels carry pings. Memory carries shared state. Scheduling carries the fan-out trigger. Channel.await carries the fan-in barrier. None of it requires a "manager agent" to hold the others' return values in scope.
- The ensemble outlives any single agent. A consolidator scheduled an hour later is part of the same ensemble as the collector that fired this morning. The substrate is the connective tissue.
- The ensemble can be authored declaratively. No "and then the manager calls these N functions" code. Five schedule defs + one consolidator def + one channel + one memory scope. The orchestration shape is data, not control flow.
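To make "the orchestration shape is data" concrete, here is a rough sketch of the exp5 ensemble written down as plain records. The field names (`name`, `cron`, `max_fires`, `on_complete`) and the Python representation are assumptions for illustration, not loomcycle's actual schema; only the values (five feeds, the `*/5 * * * *` cron, `max_fires: 3`, the global `exp5-pings` channel, the await parameters) come from the experiment. The consolidator's own schedule is left out because its cron expression is only shown abbreviated in this post.

```python
# Hypothetical, simplified records: the field names are illustrative,
# not loomcycle's real definition schema.
FEEDS = ["hn", "wired", "engadget", "ars", "tc"]

# One shared fan-in channel, declared global.
channel = {"name": "exp5-pings", "scope": "global"}

# Five collector schedules, generated from the feed list.
collector_schedules = [
    {
        "name": f"exp5-collect-{feed}",
        "cron": "*/5 * * * *",
        "max_fires": 3,
        "on_complete": {"channel.publish": channel["name"]},
    }
    for feed in FEEDS
]

# The fan-in barrier the consolidator asks for.
consolidator_await = {
    "channels": [channel["name"]],
    "mode": "at_least",
    "n": len(collector_schedules),
    "wait_ms": 120_000,
}

ensemble = {
    "channel": channel,
    "schedules": collector_schedules,
    "await": consolidator_await,
}
```

Nothing in the sketch calls anything: adding a sixth feed is one more list entry, and the barrier size follows from the data.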
The runtime is now ensemble-shaped, not just agent-shaped. That's the v0.25 framing change worth taking forward.
RFC S — the three primitives the ensemble shape needed
exp3's multi-agent loop and exp4's webhook-driven workflow each surfaced gaps in what the substrate could express. RFC S landed three primitives across #417 + #418 + #419 + #420 + the v0.25.0 promotion (de98558):
P0 — Context op=time (the agents finally have a clock)
Closes F34. Pre-RFC-S, Context had op=tools, op=doc, op=help, op=self — but no op=time. An agent that wanted to bucket events into 5-minute windows had to shell out to Bash date, which dragged Bash into its allowed_tools surface for no other reason. A consolidator that wanted to enforce a "this work is for cycle N, not cycle N-1" rule had no way to compute N.
Context op=time returns {unix_ms, rfc3339, monotonic_ns}. Cycle bucketing is now cycle = floor(unix_ms / 300000) directly in the agent's prompt. Deadline math is wall-clock subtraction. Both the collectors and the consolidator in exp5 compute the cycle from op=time — no Bash in either's allowed-tools list. The 5-minute bucket is exact.
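The bucketing arithmetic is small enough to spell out. The sketch below assumes only the `unix_ms` field described above; the function names and the sample timestamps are made up for illustration.

```python
WINDOW_MS = 5 * 60 * 1000  # 300000 ms, the five-minute window used in exp5


def cycle_of(unix_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Index of the fixed window containing unix_ms: floor(unix_ms / window_ms)."""
    return unix_ms // window_ms


def remaining_ms(deadline_unix_ms: int, now_unix_ms: int) -> int:
    """Deadline math as plain wall-clock subtraction, clamped at zero."""
    return max(0, deadline_unix_ms - now_unix_ms)


# Illustrative timestamps: a window boundary, a collector firing two seconds
# after it, and a consolidator firing one minute after it.
boundary_ms = 1_700_000_100_000
collector_ms = boundary_ms + 2_000
consolidator_ms = boundary_ms + 60_000

same_cycle = cycle_of(collector_ms) == cycle_of(consolidator_ms)
```

Because the consolidator fires one minute into the same five-minute window, it computes the same cycle number as the collectors it is waiting on, which is what lets it load Memory by cycle prefix.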
P1 — Channel.await (the missing fan-in combinator) + Channel.broadcast (its symmetric twin)
Closes F35. Pre-RFC-S, Channel.subscribe was single-channel. There was no any/all/at_least_n combinator across multiple channels. A consolidator waiting for N collectors had to poll one channel in a loop, count distinct ids, and burn its max_iterations budget with no clean timeout. The only native AND-barrier (Agent.parallel_spawn) couples agents into a single run — exactly the coupling an ensemble is supposed to avoid.
Channel.await is one call:
```
Channel op=await
  channels=["exp5-pings"]
  mode="at_least"        // or "any" | "all"
  n=5
  wait_ms=120000
→ { satisfied: true,     // n_reached OR all/any satisfied
    timed_out: false,
    fired: ["exp5-pings"],
    results: { "exp5-pings": [...] },
    total_messages: 5 }
```
One round-trip, one budget, three modes. mode: any for "wake on the first ping from any channel" (race semantics). mode: all for "wake when every channel has at least one ping" (rendezvous semantics). mode: at_least with n=k for "wake when k pings have arrived across these channels" (the exp5 case). Each mode pairs with wait_ms for a hard deadline — timed_out: true is a real result, not an error.
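One way to read the three modes is as a predicate over the messages that have arrived per channel. The sketch below is a toy model of the semantics as described in this post, evaluated at the moment the `wait_ms` budget runs out; it is not loomcycle's implementation, and the function name and message shape are assumptions.

```python
from typing import Dict, List


def snapshot_at_deadline(results: Dict[str, List[dict]], mode: str, n: int = 0) -> dict:
    """Model the await result if evaluated when wait_ms expires.

    results maps channel name -> messages received so far on that channel.
    """
    fired = [name for name, msgs in results.items() if msgs]
    total = sum(len(msgs) for msgs in results.values())

    if mode == "any":          # race: at least one channel has a ping
        satisfied = len(fired) >= 1
    elif mode == "all":        # rendezvous: every channel has a ping
        satisfied = len(results) > 0 and len(fired) == len(results)
    elif mode == "at_least":   # k pings in total across the channels
        satisfied = total >= n
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    return {
        "satisfied": satisfied,
        "timed_out": not satisfied,  # a result, not an error
        "fired": fired,
        "total_messages": total,
    }


full = snapshot_at_deadline({"exp5-pings": [{}] * 5}, mode="at_least", n=5)
partial = snapshot_at_deadline({"exp5-pings": [{}] * 3}, mode="at_least", n=5)
```

In this model `full` is satisfied with five messages, while `partial` comes back with `satisfied` false, `timed_out` true, and a count of three, which is the shape the consolidator needs in order to decide on a partial digest.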
The bonus symmetric primitive shipped at the same time: Channel.broadcast publishes one payload to N channels in one call. The pair is the natural shape — await is fan-in, broadcast is fan-out, and an ensemble that wakes on one event and signals N consumers needs both halves.
P2 — schedule max_fires (self-terminating ensembles)
Closes F36. Pre-RFC-S, a ScheduledRun fired indefinitely until the operator manually called retire. A test ensemble that wanted to run "3 cycles then stop" needed an external watcher — a cron, a tool call, a human. No way to express "this is a bounded experiment."
max_fires: 3 on the schedule def is enough. The sweeper increments a fire_count column on each successful fire; when fire_count == max_fires, the def is auto-retired on the same tick. exp5's six schedule defs (five collectors + one consolidator) all set max_fires: 3; after three cycles the boot log emits scheduler: reached max_fires=3 — retired def ×6 and the ensemble is done. No watcher, no manual cleanup.
The escape hatch for forks: a forked def explicitly setting max_fires: 0 lifts the cap (so a long-running production ensemble can fork a bounded test ensemble without inheriting its cap). Reviewed and added in #418.
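The counting rule can be sketched in a few lines. This is a simplified model, not the sweeper's code: the `ScheduleDef` class and `record_successful_fire` function are hypothetical, and treating `max_fires == 0` as "no cap" on any def is an assumption extrapolated from the fork behaviour described above.

```python
from dataclasses import dataclass


@dataclass
class ScheduleDef:
    name: str
    max_fires: int = 0   # assumption: 0 means "no cap"
    fire_count: int = 0
    retired: bool = False


def record_successful_fire(d: ScheduleDef) -> ScheduleDef:
    """Count one successful fire and retire the def once the cap is reached."""
    if d.retired:
        raise RuntimeError(f"{d.name} is retired and should not fire")
    d.fire_count += 1
    # The post describes the check as fire_count == max_fires; >= is used
    # here only as a defensive choice in the sketch.
    if d.max_fires > 0 and d.fire_count >= d.max_fires:
        d.retired = True
    return d


bounded = ScheduleDef(name="exp5-collect-hn", max_fires=3)
for _ in range(3):
    record_successful_fire(bounded)

uncapped = ScheduleDef(name="forked-production-def", max_fires=0)
for _ in range(10):
    record_successful_fire(uncapped)
```

After the loops, `bounded` is retired on its third fire and `uncapped` is still live after ten.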
The exp5 pipeline — how the ensemble actually runs
Here's the topology in one diagram:
```
SCHEDULER (cron "*/5 * * * *", max_fires:3 each)
  ├── exp5-collect-hn        ──┐
  ├── exp5-collect-wired     ──┤  fire in PARALLEL
  ├── exp5-collect-engadget  ──┤  every 5 minutes
  ├── exp5-collect-ars       ──┤
  └── exp5-collect-tc        ──┘
        │
        │  each: Context op=time → cycle
        │        HTTP GET feed-url → pick ≤5 AI/agentic items
        │        Memory.set scope=user key=digest:<cycle>:<feed>:<n>
        │
        └── on_complete: channel.publish → exp5-pings (scope: global)

SCHEDULER (cron "1,6,11,…", max_fires:3)
  └── exp5-consolidate
        │
        │  Channel op=await
        │    channels=[exp5-pings]
        │    mode=at_least n=5 wait_ms=120000
        │    → satisfied: true (5/5, n_reached)
        │
        │  Memory.list prefix="digest:<cycle>:" → 25 items
        │  dedup by normalized URL → 23 unique items
        │  Memory.set consolidated:<cycle>
        │
        └── mcp__telegram__send_message → {"ok":true,"message_id":22}
```
The collectors don't talk to each other. They don't know about the consolidator. They don't even need the Channel tool — the scheduler's on_complete hook does the ping on their behalf, stamping each message with the firing schedule's name (so the consolidator can count distinct collectors by schedule_name, not by publisher identity).
The consolidator doesn't know the collectors' run_ids. It awaits the channel, loads memory by cycle prefix, deduplicates, and sends the message. If two collectors fail (host outage, allowlist refusal), the await returns satisfied=false, timed_out=true, total_messages=3; the consolidator sends a partial digest with k/5 sources in the header and the ensemble survives. No retry storm, no half-state in the run table.
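The consolidator's two bookkeeping steps, counting distinct collectors by schedule_name and deduplicating by normalized URL, might look roughly like this. The normalization rules (lowercase scheme and host, drop the fragment and `utm_*` parameters, strip a trailing slash) are assumptions, since the post does not spell out what exp5's prompt actually does; the item and ping shapes are hypothetical too.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

EXPECTED_SOURCES = 5


def normalize_url(url: str) -> str:
    """Assumed normalization: lowercase scheme/host, no fragment, no utm_* params."""
    parts = urlsplit(url.strip())
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.lower().startswith("utm_")
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def dedupe(items: list) -> list:
    """Keep the first item seen for each normalized URL, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = normalize_url(item["url"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def digest_header(pings: list, expected: int = EXPECTED_SOURCES) -> str:
    """Count distinct collectors by the schedule_name stamped on each ping."""
    reported = {p["schedule_name"] for p in pings}
    return f"{len(reported)}/{expected} sources"


items = [
    {"url": "https://Example.com/post/?utm_source=rss#top"},
    {"url": "https://example.com/post"},
    {"url": "https://example.com/other"},
]
pings = [{"schedule_name": f"exp5-collect-{f}"} for f in ("hn", "wired", "ars")]
unique_items = dedupe(items)
header = digest_header(pings)
```

Counting by `schedule_name` rather than by message count also means a collector that pinged twice would not inflate the k/5 figure.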
Three follow-up fixes — what stress-testing the ensemble surfaced
The first complete exp5 run landed on v0.25.0 with the three RFC S primitives wired in. It worked, with three rough edges. Each one was a real gap the experiment was the first thing to exercise, and each one shipped in the next release within hours.
F37 — on_complete channel.publish honors declared scope (#422, v0.25.1, RFC T)
F37 · scope mismatch on the scheduler hook
The collectors ran under user_id: exp5, so the scheduler's on_complete: channel.publish hook published the ping under user/exp5, even though the channel was declared scope: global. The consolidator's Channel.await on the global channel read 0 messages. The collectors had succeeded; the consolidator timed out.
Root cause: the scheduler's dispatch path inherited the run's user scope into the publish call, ignoring the channel's declared scope. The v0.25.0 workaround was to declare the fan-in channel scope: user, which sort-of worked because all five collectors shared user_id: exp5. But that's wrong — a fan-in channel is shared, the operator declared it global for a reason, and the hook is operator-authored config (not a model-authored choice that needs principal-overriding discipline).
The fix: a new ResolveChannelScope resolver in internal/api/http, wired into the scheduler so the publish hook resolves the channel's declared scope and uses that, not the run's scope. The same resolver also catches the static-vs-dynamic channel mismatch the existing channel CRUD code already handled correctly. exp5's global fan-in channel now works as authored.
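The before-and-after behaviour can be modelled in a few lines. This is an illustrative sketch only, not the ResolveChannelScope code: the function names, the dict shape of the channel declaration, and the scope strings are assumptions based on the `user/exp5` and `global` values mentioned above.

```python
def publish_scope_before_fix(channel_decl: dict, run_user_id: str) -> str:
    """v0.25.0 behaviour as described: the run's user scope always wins."""
    return f"user/{run_user_id}"


def publish_scope_after_fix(channel_decl: dict, run_user_id: str) -> str:
    """Fixed behaviour as described: the channel's declared scope wins."""
    declared = channel_decl["scope"]
    if declared == "global":
        return "global"
    if declared == "user":
        return f"user/{run_user_id}"
    raise ValueError(f"unknown scope: {declared!r}")


fan_in = {"name": "exp5-pings", "scope": "global"}
before = publish_scope_before_fix(fan_in, "exp5")
after = publish_scope_after_fix(fan_in, "exp5")
```

With the old path the ping lands under `user/exp5` while the consolidator listens on the global channel, so the two never meet; with the new path both sides agree on `global`.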
F38 — scheduled runs resolve their agent in the def's tenant (#424, v0.25.2, RFC U)
F38 · scheduler tenant resolution
In the fully-dynamic exp5 variant (every entity created at runtime via REST), every scheduled collector fire failed with unknown agent: exp5-collector. The scheduler's fire path was resolving agents from the yaml-static registry only, not the AgentDef substrate store the dynamic variant lived in.
Same shape as the F30 webhook-spawn fix from exp4 — a runtime entity (webhook there, schedule here) needed to resolve a runtime agent under a consistent tenant, and the resolution path wasn't following the AgentDef store. The fix: scheduled runs now stamp the def's tenant onto the spawn, and the agent resolves against both static and dynamic registries.
With #424 in place, the dynamic exp5 collectors ran cleanly — the scheduler-fired runs spawned, executed, pinged, and the consolidator's fan-in landed at 5/5.
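A toy model of the lookup shows why the dynamic variant failed before the fix. Everything here is hypothetical: the registry shapes, the tenant key, the tenant name `tenant-a`, and the static-then-dynamic lookup order are assumptions for illustration, not the scheduler's actual resolution code.

```python
def resolve_static_only(name: str, static_registry: dict) -> dict:
    """Pre-fix behaviour as described: only the yaml-static registry is consulted."""
    if name in static_registry:
        return static_registry[name]
    raise LookupError(f"unknown agent: {name}")


def resolve_agent(name: str, tenant: str, static_registry: dict, dynamic_store: dict) -> dict:
    """Post-fix behaviour as described: static registry, then the tenant's dynamic defs."""
    if name in static_registry:
        return static_registry[name]
    key = (tenant, name)  # assumed keying: dynamic defs live under a tenant
    if key in dynamic_store:
        return dynamic_store[key]
    raise LookupError(f"unknown agent: {name}")


static_registry = {}  # the dynamic variant declares nothing in yaml
dynamic_store = {("tenant-a", "exp5-collector"): {"name": "exp5-collector"}}

try:
    resolve_static_only("exp5-collector", static_registry)
    old_error = None
except LookupError as exc:
    old_error = str(exc)

resolved = resolve_agent("exp5-collector", "tenant-a", static_registry, dynamic_store)
```

The old path reproduces the `unknown agent: exp5-collector` failure; the new path finds the def because the spawn carries the def's tenant.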
F39 — dynamic stdio MCP env interpolation (#426, v0.25.3, RFC V)
F39 · env not interpolated at spawn time
The fully-dynamic Telegram leg failed: mcp__telegram__send_message returned HTTP 404 because the bot received the literal string ${LOOMCYCLE_TELEGRAM_BOT_TOKEN} as its API token instead of the resolved value. Dynamic stdio MCPServerDef env values weren't being interpolated against the runtime's env at spawn time, only at YAML-load (which the dynamic path bypasses by design).
The same envelope-of-references pattern we hit in MCPServerDef header expansion back at v0.18 — there for HTTP MCP headers, here for stdio MCP env. The fix: expand ${ENV} references in the dynamic stdio MCP def's env map at spawn time, using the same prefix-allowlisted resolver the rest of the substrate uses. The Telegram bot now receives the resolved token; send_message returns {"ok": true, "message_id": 22} on the dynamic variant with no workaround.
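Spawn-time expansion with a prefix allowlist can be sketched as below. This is a minimal model, not loomcycle's resolver: the `LOOMCYCLE_` prefix is inferred from the variable name above, the `BOT_TOKEN` key is hypothetical, and leaving non-allowlisted references untouched (rather than rejecting them) is an assumption.

```python
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(env_map: dict, runtime_env: dict, allowed_prefixes=("LOOMCYCLE_",)) -> dict:
    """Expand ${NAME} references in each value, only for allowlisted prefixes."""

    def replace(match):
        name = match.group(1)
        if not name.startswith(tuple(allowed_prefixes)):
            return match.group(0)  # assumption: leave other references literal
        if name not in runtime_env:
            raise KeyError(f"environment variable not set: {name}")
        return runtime_env[name]

    return {key: _REF.sub(replace, value) for key, value in env_map.items()}


# Hypothetical dynamic stdio MCP env map and runtime environment.
declared_env = {
    "BOT_TOKEN": "${LOOMCYCLE_TELEGRAM_BOT_TOKEN}",
    "HOME_DIR": "${HOME}",
}
runtime_env = {"LOOMCYCLE_TELEGRAM_BOT_TOKEN": "t0ken", "HOME": "/root"}

spawn_env = expand_env(declared_env, runtime_env)
```

The allowlist is the important design choice: a dynamically created def can reference substrate-owned secrets by name, but it cannot use interpolation to read arbitrary variables from the host environment.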
Three fixes, three releases, 24 hours. v0.25.1 closed F37, v0.25.2 closed F38, v0.25.3 closed F39. Each had a regression test that fails on the prior tag — no "we shipped a fix, did it work?" ambiguity. The dynamic exp5 variant is the soundness test: it ran end-to-end on v0.25.3 with zero workarounds.
Companion change — every experiment now ships as a self-contained example
The five experiments aren't just sandbox reports anymore. Every prior experiment now lives as a self-contained directory under loomcycle/examples/, with everything an operator needs to clone and exercise the substrate in minutes:
| Example | What it exercises | External deps |
|---|---|---|
| exp1-tools-usage/ | Built-in tools (Read / Write / Edit / Bash / Web): a coding agent writes, runs, and verifies a program in a sandbox. | loomcycle + a provider |
| exp2-interruption/ | Interruption (human-in-the-loop): Yes/No gating with REST-resolved branches. | loomcycle + a provider |
| exp3-multiagent-loop/ | Channel + Memory + Evaluation + Context: a 3-agent, 5-hop refine/evaluate loop. | loomcycle + a provider |
| exp4-gitea-telegram/ | Inbound webhooks + 3rd-party MCP + Telegram: coder → PR → reviewer-merge → advisor → Telegram. | Gitea, Telegram, gitea-mcp binary |
| exp5-agent-ensemble/ | (joining shortly) Scheduler-driven news-digest ensemble with Channel.await fan-in. | RSS feed allowlist + Telegram |
Each directory carries its own loomcycle.yaml, a run.sh launcher that boots the substrate from ./work, a .env.local.example template (empty values; first run.sh copies it to .env.local for you to fill in), and a README.md with step-by-step reproduction + verification instructions. Every example routes Anthropic OAuth primary → DeepSeek v4-pro fallback via tier: middle, so an operator with either credential exercises the same agent.
The point of shipping the experiments as examples: the sandbox reports are how we test loomcycle from the outside; the examples are how operators learn from the same code path. Same primitives, same configs, same wire surface — minus the experiment-specific narration, plus the documentation and the reproducibility scaffolding.
The lesson worth keeping
A multi-agent system is not an ensemble. An ensemble needs primitives that survive the loss of any single agent's run. Agent.parallel_spawn ties N children's lifecycles to one parent. Channel.await ties N agents' coordination to a substrate channel that lives independently. The shift from "the parent waits for the children" to "the substrate is the connective tissue" is what makes the ensemble shape work — and it's what unlocks the scheduler-driven, cron-fireable, self-terminating, partial-tolerant pipelines real production agentic systems need.
What this means for the runtime going forward: parallel agent ensembles are now the design center, not a sub-case of multi-agent. The substrate primitives are sized for it. The examples directory shows it. The next experiments (long-running async, ensemble-of-ensembles, RAG-backed) build on this base.
Companion reading from the operator-via-MCP series: exp1 + exp2 — tool access and interruption · exp3 side analysis — the MCP wedge · exp3 main — the multi-agent refine loop · exp4 — Gitea + Telegram + secret redaction. Plus the upstream design: doc-internal/rfcs/agent-ensemble-primitives.md for the locked RFC S decisions.