§ release note

Self-evolving agents. Genes that drive real temperature, and an experiment that snapshots mid-run and resumes on another instance.

2026-06-11 · by Dennis Gubsky · ~12 min read · updated 2026-06-16 with exp6.8 (the GA on local Ollama models)

The sixth experiment in the operator-via-MCP series turned the substrate inward. A meta-agent ran a genetic algorithm over a population of solver agents that rewrote their own system prompts based on an advisor's evaluation, generation by generation, until one crossed a fitness threshold. That worked on v0.25.2 (static) and v0.26.2 (fully-dynamic, after the F40 fix that let a runtime-authored meta-agent fork). But the evolution was prompt-only: the gene values were baked into the prompt text, and the model's actual sampling didn't move when the genes did. And while the experiment could be paused and snapshotted, "snapshot anytime" was a hopeful suggestion - only quiescent boundaries were genuinely safe.

Three releases later, the experiment is clean. exp6.5 closes both gaps across v0.28.0 and v0.30.0:

Per-agent model tunings (v0.28.0, #447) make AgentDef.sampling a real fork-overlay field. The creativity gene now sets sampling.temperature = round(creativity/10, 2): 0.0 focused, 1.0 wild. A gene mutation actually changes the model's sampling, not just the prompt's literal text. Real evolution, not pretend.
Cooperative pause (v0.28.0, #446, F41) parks in-flight sub-runs at iteration boundaries, gates new POST /v1/runs with HTTP 503 during the quiesce window, and emits a precise warning when a fan-out parent can't yet park. Boundary snapshots are now reliable.
Cross-instance resume of mid-run snapshots (v0.30.0, #456, F42) - the killer demo. ResumePausedRuns reconstructs a paused run's loop from its restored transcript and re-enters loop.Run under the same run_id. Fired after a snapshot restore and at boot (crash recovery). Pause/snapshot truly anytime.

The headline result: a breeder was snapshotted mid-mutate (91 transcript events captured, gen-2 half-seeded), the DB was wiped, and the snapshot was restored on a fresh loomcycle instance. The re-dispatched breeder finished its work autonomously, with no external driver. Then the experiment continued to completion: five generations, mean score 0.763 → 0.865, winner promoted at best_score: 0.91. Cross-instance lineage proven across the mid-run boundary.

Worth saying up front: this is still the experiment in the series that needed no new substrate primitives. Forkable AgentDefs, parent-pointer lineage, parallel fan-out, an evaluation primitive, shared user-scope memory, pause/snapshot/restore - all of these had shipped over the past several months for other reasons. exp6.5 didn't need a single new wire-shape primitive. What it needed were three closures that prior work had set up but not completed. All three landed in five days.

The design - persona as genotype, gene → real model knob

A "self-evolving agent" here is one that rewrites its own system prompt to score better, and whose improved prompt is inherited by its offspring. The unit of selection is the AgentDef version. The unit of inheritance is the parent_def_id lineage chain. The unit of variation is whatever changes between a parent's def and its forked child's.

Three integer genes (0-10), each affecting the agent in a different way:

Gene	Low (0-3)	High (7-10)	Runtime effect
creativity	literal, conventional	bold, lateral, vivid leaps	`sampling.temperature = creativity/10` (real model knob, v0.28.0) + `effort` tier
courage	hedge, qualify	commit decisively, no hedging	prompt text only (baked into `system_prompt`)
caution	confident, minimal caveats	self-critical, piles on caveats	prompt text only (baked into `system_prompt`)

The advisor's rubric implicitly rewards creativity↑, courage↑, caution↓. The population has to drift there from deliberately sub-optimal gen-0 seeds.

The v0.28.0 change is what re-shapes this from a demo into a real evolutionary substrate. Pre-v0.28.0, loomcycle exposed no per-agent sampling controls - temperature and top_p were tier-level only. So "creativity" was prompt text and an effort knob, but the model's literal sampling didn't change when the gene moved. With #447, every forked solver variant carries its own sampling.temperature on its def. AgentDef.fork(overlay={system_prompt: …, sampling: {temperature: 0.7}}) mints a child def whose temperature genuinely differs from its parent's. The solver can read its own resolved temperature via Context op=self (which now also reports sampling, per #448) and write it into its self-report.
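A minimal sketch of the gene-to-knob mapping described above, assuming the overlay is a plain dict shaped like the one in the text. The helper names (creativity_to_temperature, build_fork_overlay) are hypothetical and not part of loomcycle; only the formula and the overlay keys come from the article.

```python
def creativity_to_temperature(creativity: float) -> float:
    """Map the 0-10 creativity gene to a sampling temperature in [0.0, 1.0]."""
    if not 0 <= creativity <= 10:
        raise ValueError("creativity gene must be in 0..10")
    return round(creativity / 10, 2)


def build_fork_overlay(genes: dict, system_prompt: str) -> dict:
    """Assemble an overlay like the one passed to AgentDef.fork (shape assumed)."""
    return {
        "system_prompt": system_prompt,
        "sampling": {"temperature": creativity_to_temperature(genes["creativity"])},
    }


overlay = build_fork_overlay(
    {"creativity": 7, "courage": 5, "caution": 4}, "<rebuilt prompt>"
)
print(overlay["sampling"])
```

The point of the split is that courage and caution only ever reach the prompt text, while creativity reaches both the prompt and the sampler.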

Observed end-to-end in the v0.28.0 re-run: gen-0 creativity 3.0 → temperature 0.3; winner creativity 10 → temperature 1.0. The forked def for gen1:var1 persists sampling.temperature: 0.6, and the winner def persists sampling.temperature: 1.0. After snapshot → wipe → restore on a fresh instance, those exact values come back - the sampling knob survived the file round-trip intact. The temperature gene survived the v0.30.0 mid-run restore too (the v0.30.0 winner has sampling.temperature: 0.8).

Topology - spawn-based, no channels, no scheduler

Three agents, one driver per generation, lineage carried by the substrate:

            POST /v1/runs  (driver: one breeder run per generation g)
                    │
              exp6-breeder  (depth 0 - GA controller)
        gen 0 ─ author task (advisor) → Memory task:spec
                seed 4 diverse variants via AgentDef.fork
                (each fork carries sampling.temperature from its gene)
        ┌──────────────────────────────────────────────────────────┐
   each generation g:
     1. SOLVE   Agent.parallel_spawn → 4 × exp6-solver (depth 1),
                pinned by def_id, prompt = task + g/i.
                each solver: Context.self → run_id, sampling.temperature
                              Memory.set gen:g:var:i:result
     2. SCORE   spawn exp6-advisor (depth 1): reads each result,
                Evaluation.submit run_id score∈[0,1]
                dimensions={novelty, decisiveness, correctness}
                writes gen:g:var:i:eval
     3. SELECT  breeder reads the evals →
                gen:g:summary {scores, mean, best_var, best_def_id}
     4. STOP?   best ≥ THRESHOLD OR g == MAX_GEN-1 →
                AgentDef.promote(best) + result:summary → STOP
     5. MUTATE  elitism keeps the best; each other child = survivor
                spawned in REFLECT mode (proposes its own child
                genes, Δ ≤ ±3) → AgentDef.fork(
                  parent_def_id=survivor,
                  overlay={system_prompt:<rebuilt>,
                           sampling:{temperature: genes.creativity/10},
                           effort:…})
                → gen:g+1:var:j
        └──────────────────────────────────────────────────────────┘

Constants: POP=4, MAX_GEN=5, user_id=exp65. The external driver (work/exp6_5_run.sh) is a thin generation-stepper - it loops g, blocks on each breeder run over SSE, and reads the memory ledger to detect the stop signal. All mutation, scoring, and forking are done by agents. The "self-evolving" content (each survivor proposing its own gene tweak) is agent-authored; the breeder applies it as a fork.
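The five steps in the diagram can be sketched as an ordinary loop. This is a toy, in-process stand-in, not the agent-driven implementation: score() replaces the advisor, reflect() replaces the solver's REFLECT mode, and THRESHOLD, the gen-0 seed ranges, and the "top half survive" rule are assumptions made for illustration. POP, MAX_GEN, elitism, and the Δ ≤ ±3 bound come from the text.

```python
import random

POP, MAX_GEN = 4, 5   # from the article
THRESHOLD = 0.9       # assumed; the article does not state the value


def clamp(x: int) -> int:
    return max(0, min(10, x))


def score(genes: dict) -> float:
    """Stand-in for the advisor: rewards creativity and courage, penalises caution."""
    total = genes["creativity"] + genes["courage"] + (10 - genes["caution"])
    return round(total / 30, 3)


def reflect(genes: dict, rng: random.Random) -> dict:
    """Stand-in for REFLECT mode: child genes stay within ±3 of the parent."""
    return {k: clamp(v + rng.randint(-3, 3)) for k, v in genes.items()}


def evolve(seed: int = 0) -> list:
    rng = random.Random(seed)
    # Deliberately sub-optimal gen-0 seeds (ranges are illustrative).
    population = [
        {"creativity": rng.randint(0, 4), "courage": rng.randint(0, 4),
         "caution": rng.randint(6, 10)}
        for _ in range(POP)
    ]
    history = []
    for g in range(MAX_GEN):
        ranked = sorted(population, key=score, reverse=True)   # SOLVE + SCORE
        best = ranked[0]                                        # SELECT
        mean = round(sum(score(p) for p in population) / POP, 3)
        history.append({"gen": g, "best": score(best), "mean": mean})
        if score(best) >= THRESHOLD or g == MAX_GEN - 1:        # STOP?
            break
        survivors = ranked[: POP // 2]                          # assumed rule
        # MUTATE: elitism keeps the best; other slots are children of survivors.
        population = [best] + [
            reflect(rng.choice(survivors), rng) for _ in range(POP - 1)
        ]
    return history


for row in evolve():
    print(row)
```

Because the best variant is carried over unchanged, the best score in this sketch can never decrease between generations; the mean has no such guarantee.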

The three role agents

All three are runtime-authored via POST /v1/_agentdef in the fully-dynamic variant (the breeder needs the F40 fix to fork; before v0.26.2 that didn't work, see below). Memory is shared via scope: user, user_id: exp65.

exp6-solver - the evolving lineage. allowed_tools: [Context, Memory] (permanent ceiling; fork can only narrow). Two prompt-driven modes: SOLVE (answer in-character per the persona baked into the prompt) and REFLECT (given its score + advisor feedback, propose child genes as JSON, Δ ≤ ±3). Each forked variant carries its own sampling.temperature.
exp6-advisor - task-giver + fitness judge. allowed_tools: [Evaluation, Memory, Context], evaluation_scopes: [submit_any, read_any]. Authors the task + rubric (rewarding novelty + decisiveness + correctness). Judges the output, never the genes.
exp6-breeder - GA controller, the meta-agent. allowed_tools: [Agent, AgentDef, Evaluation, Memory, Context], agent_def_scopes: [named:exp6-solver] (the capability gate that lets it fork the solver lineage and only the solver lineage), evaluation_scopes: [read_any], max_concurrent_children: 6.

F40 - the fix that let a runtime-authored meta-agent fork (v0.26.2)

[Figure: the loomcycle Web UI's Edit (fork) agent modal, rendered over the Library view with 27 agents in the left rail. Visible fields: name (code-guru), description, provider (anthropic), model (claude-opus-4-7), tier (standard), effort (medium), system prompt textarea, allowed_tools (Read, Write, WebFetch, Bash, WebSearch, Memory, Channel, Context, Agent, AgentDef), skills (briefing-format, citation-style), max_tokens, max_iterations, memory_quota_bytes, memory_scopes (agent + user).]
Caption: The Edit (fork) agent modal in the Library view. Every field shown here is content-addressed and overlay-aware - `AgentDef.fork(overlay={…})` mints a child def whose fields differ from its parent's, with the rest inherited. The F40 fix made the `*_def_scopes` capability family round-trip through this seam too, which is what unblocked a runtime-authored meta-agent (like exp6-breeder) from forking another agent's lineage.

The original exp6's fully-dynamic variant was blocked on v0.25.2. The dynamic exp6-advisor worked (its evaluation_scopes round-tripped - F14 holds). The dynamic exp6-breeder was created fine with allowed_tools and evaluation_scopes intact. But agent_def_scopes: null. Every AgentDef op returned is_error: "agent has no agent_def_scopes (default-deny)".

Root cause: the AgentDef create/fork overlay's mergedDef struct round-tripped channels, evaluation_scopes, interruption (the F14 closure) but had no field for agent_def_scopes nor the rest of the *_def_scopes capability family. A runtime-authored meta-agent was impossible.

Fix in v0.26.2 (#436, RFC W): round-trip the five *_def_scopes through mergedDef + applyOverlay + lookup.SubstrateAgentDef. Deliberately not part of content_sha256 - ACLs are authority, not content. The fix persists through snapshot too: the breeder's agent_def_scopes survives the file, so a restored breeder can still fork. exp6.5 relies on this every time it crosses an instance boundary.
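To make the failure mode concrete, here is a toy Python model of the bug class, not loomcycle's actual code: a merge struct that copies only the fields it knows about silently drops agent_def_scopes, and a default-deny check then rejects every fork. The class and function names are hypothetical; the field names, the named:exp6-solver scope, and the error string are taken from the text. The hash helper illustrates the stated design choice that *_def_scopes stay out of content_sha256.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class MergedDefBefore:
    """Toy pre-fix shape: no slot for agent_def_scopes."""
    system_prompt: str = ""
    allowed_tools: list = field(default_factory=list)
    evaluation_scopes: list = field(default_factory=list)


@dataclass
class MergedDefAfter(MergedDefBefore):
    """Toy post-fix shape: the scope list now has a field to land in."""
    agent_def_scopes: list = field(default_factory=list)


def apply_overlay(cls, overlay: dict):
    """Copy only the overlay keys the target struct knows about."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in overlay.items() if k in known})


def can_fork(agent_def, target: str):
    """Default-deny: no scopes means no AgentDef ops at all."""
    scopes = getattr(agent_def, "agent_def_scopes", None)
    if not scopes:
        return False, "agent has no agent_def_scopes (default-deny)"
    return f"named:{target}" in scopes, ""


def content_sha256(agent_def) -> str:
    """Hash content only; *_def_scopes are authority and are excluded."""
    content = {
        k: v for k, v in asdict(agent_def).items() if not k.endswith("_def_scopes")
    }
    return hashlib.sha256(json.dumps(content, sort_keys=True).encode()).hexdigest()


breeder = {
    "system_prompt": "GA controller",
    "allowed_tools": ["Agent", "AgentDef", "Evaluation", "Memory", "Context"],
    "evaluation_scopes": ["read_any"],
    "agent_def_scopes": ["named:exp6-solver"],
}
before = apply_overlay(MergedDefBefore, breeder)
after = apply_overlay(MergedDefAfter, breeder)
print(can_fork(before, "exp6-solver"))
print(can_fork(after, "exp6-solver"))
```

In this sketch the two shapes hash identically, which is the property the fix relies on: granting or revoking a scope does not mint a new content version.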

exp6.5 - pause, snapshot, continue on another instance

Once the substrate fully participates in the evolution, an obvious question becomes: can the whole experiment be checkpointed? Long-running experiments are interrupted (the host reboots, the operator goes home, a different developer wants to take over). loomcycle's pause + snapshot + restore have been on the wire since v0.8.17. exp6.5 is the first test that subjects a real multi-agent, multi-generation experiment to a full lifecycle.

 INSTANCE #1  (3 role agents authored at runtime · data dir A)
   gen 0  ── full breeder run ────────────────────────────────────────┐
   LEG 1: gen 1 breeder spawned in background; ~35s in:                 │
        loomcycle pause → resume ; gen 1 then completes normally       │
   LEG 2: at the gen-0/gen-1 boundary (experiment quiescent):            │
        pause → snapshot → export ────────────────────► exp6_5.json (76 KB)
   STOP instance #1  +  rm -rf data dir A   ("clear the database")     │
                                                                        ▼
 INSTANCE #2  (SAME config · FRESH EMPTY data dir)
   loomcycle restore exp6_5.json
   gen 1, 2, 3, 4 ── driver continues against the RESTORED substrate ──► complete

Leg 1 - mid-run pause/resume on the same instance

A POST /v1/_pause issued ~35 seconds into the gen-1 breeder run did not corrupt or lose it. POST /v1/_resume woke it; gen-1 completed; the experiment carried on. Continuity preserved.

The v0.26.2 run surfaced one behavioral wart while doing this:

F41 · v0.26.2 · pause is a soft quiesce
The pause call returned immediately (duration_ms: 0) with paused_runs_count: 0. A run blocked inside Agent.parallel_spawn never reached the iteration boundary that would set pause_state='paused'. New spawns weren't 503'd.

Continuity was unaffected (the run completed across pause/resume), but "snapshot anytime" was only reliable at a quiescent boundary - which is why Leg 2 snapshots between generations rather than mid-fan-out.

F41 fixed · v0.28.0 · #446 RFC X Phase 1
Cooperative pause: in-flight sub-runs park at iteration boundaries; new POST /v1/runs during the pause window are gated with HTTP 503; a Phase-2 warning is emitted when a fan-out parent can't yet park.

Re-tested on v0.28.0: pausing ~35 s into a breeder generation returned {state: paused, duration_ms: 30002, paused_runs_count: 4} - the four solver sub-runs parked, Pause() waited the full timeout, a concurrent POST /v1/runs got HTTP 503. The runtime also emitted the precise Phase-2 warning when a fan-out parent can't yet park (the breeder during parallel_spawn).
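The cooperative-pause contract can be illustrated with a small threading sketch. This is an assumption-laden toy, not the runtime's implementation: the Runtime class, its method names, and the 202 status for an accepted run are invented for the example. What it borrows from the text is the behaviour: runs check a flag at iteration boundaries and park, pause() waits (up to a timeout) for them, and new submissions get 503 while the pause holds.

```python
import threading
import time


class Runtime:
    def __init__(self):
        self._lock = threading.Lock()
        self._pausing = threading.Event()  # set while a pause is requested or held
        self._resume = threading.Event()   # set to release parked runs
        self._active = 0
        self._parked = 0

    def submit(self, iterations: int):
        """Start a run, or refuse with 503 during the quiesce window."""
        if self._pausing.is_set():
            return 503, None
        with self._lock:
            self._active += 1
        t = threading.Thread(target=self._run, args=(iterations,))
        t.start()
        return 202, t

    def _run(self, iterations: int):
        for _ in range(iterations):
            time.sleep(0.01)               # one unit of work
            if self._pausing.is_set():     # iteration boundary: cooperative check
                with self._lock:
                    self._parked += 1
                self._resume.wait()        # parked until resume()
                with self._lock:
                    self._parked -= 1
        with self._lock:
            self._active -= 1

    def pause(self, timeout: float) -> dict:
        self._resume.clear()
        self._pausing.set()
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._parked == self._active:
                    break                  # every in-flight run has parked
            time.sleep(0.005)
        with self._lock:
            return {"state": "paused", "paused_runs_count": self._parked}

    def resume(self):
        self._pausing.clear()
        self._resume.set()


rt = Runtime()
threads = [rt.submit(iterations=30)[1] for _ in range(4)]
result = rt.pause(timeout=2.0)
status, _ = rt.submit(iterations=1)   # refused: pause is held
rt.resume()
for t in threads:
    t.join()
print(result, status)
```

A run that is blocked waiting on its children never reaches the boundary check in a design like this, so pause() would wait out its full timeout. That matches the observation above that the pause ran for the whole window while the breeder sat inside parallel_spawn.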

Leg 2 - snapshot → fresh instance → restore → continue → complete

The portable file: 76,667 bytes, schema_version: 1. Contents: 13 agent_defs (the breeder, the advisor, and the 11-version solver lineage), 3 agent_def_active rows, 31 memory keys, 7 evaluations. Secret-free by audit: no resolved API-key / bearer / token values appear anywhere - secrets live in the process environment (per loomcycle's longstanding "store names, never values" discipline, made structural by v0.23.4's value-based redaction).

Fresh instance #2, booted on an empty database. After loomcycle restore: {agent_defs_restored: 13, agent_def_active_restored: 3, memory_restored: 31, evaluations_restored: 7}. The solver was back at v11 with its active pointer intact, the ledger present, and - critically - the breeder's agent_def_scopes survived the restore (the F40 fix persists through the file). The driver continued from where it left off; genes converged across the instance boundary (creativity 3.0 → 5.8, courage 3.5 → 8.0, caution 7.0 → 2.2). The v0.28.0 re-run reached {generations: 5, best_score: 0.93, winner: exp6-solver v8, stopped: max_gen}.

Cross-instance lineage - the killer check

A gen-2 variant produced on instance #2 (post-restore) has parent_def_id = def_f5bdf90f…. That def resolves on instance #2 to exp6-solver v7 - a def that was created on instance #1 and exists on #2 only because it was restored from the file. The genetic lineage chain is unbroken across the DB-wipe and cross-instance restore. A handed-off experiment isn't just a copy; it's a continuation.
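The lineage check amounts to walking parent_def_id pointers and confirming every ancestor resolves on the current instance. A minimal sketch, assuming the restored defs are available as a dict keyed by def id; the ids and the dict layout below are made up for illustration, and only the parent_def_id field name comes from the text.

```python
def lineage(agent_defs: dict, def_id: str) -> list:
    """Follow parent_def_id pointers from def_id back to the root def."""
    chain, seen = [], set()
    while def_id is not None:
        if def_id in seen:
            raise ValueError(f"cycle in lineage at {def_id}")
        if def_id not in agent_defs:
            raise KeyError(f"broken lineage: {def_id} not present on this instance")
        seen.add(def_id)
        chain.append(def_id)
        def_id = agent_defs[def_id].get("parent_def_id")
    return chain


# Hypothetical ids: a root def, a restored ancestor, and a post-restore child.
defs_on_instance_2 = {
    "def_root": {"parent_def_id": None},
    "def_restored": {"parent_def_id": "def_root"},
    "def_new_child": {"parent_def_id": "def_restored"},
}
print(lineage(defs_on_instance_2, "def_new_child"))
```

If the restore had dropped the ancestor, the walk would fail with a KeyError instead of returning a chain, which is the condition the check above rules out.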

F42 - the v0.30.0 fix that makes "snapshot anytime" real

The v0.28.0 Leg-2 worked because we snapshotted at the gen-0/gen-1 boundary - the breeder wasn't mid-anything. A more honest test is: can you snapshot the breeder while it's in the middle of mutating, hand the file to a fresh instance, and have the experiment continue from where the half-finished generation left off?

On v0.28.0, the answer was no:

F42 · v0.28.0 · in-flight runs don't migrate
Snapshotting mid-fan-out captured the breeder as a paused_runs row with its full transcript, but the restored paused run was not re-dispatched on the fresh instance - resume returned 409 not_paused, and a process restart didn't relaunch it either. Its half-finished generation stalled. The experiment's overall continuation still worked because the driver could spawn fresh runs against the restored substrate, but an individual in-flight run did not migrate.

The substrate moved across the boundary; the running process didn't. The honest discipline at v0.28.0 was: snapshot at a quiescent boundary, not mid-fan-out. Acceptable for many workflows, but a real residual.

F42 fixed · v0.30.0 · #456 RFC X Phase 2
Cross-instance resume of a snapshotted mid-run. ResumePausedRuns reconstructs a paused run's loop from its restored transcript and re-enters loop.Run under the same run_id - fired after a snapshot restore (response reports paused_runs_resumed) and at boot (crash recovery). Pause/snapshot truly anytime.

The killer demo - snapshot mid-mutate, restore, autonomously continue

Targeting exactly the v0.28.0 failure mode: the breeder mid-MUTATE phase, with gen:1:summary already written but only gen:2:var:0 seeded. Three of its four planned forks for gen 2 hadn't happened yet. Pause; snapshot; observe.

The snapshot captured paused_runs: 1, agent: exp6-breeder, 91 transcript events - the breeder mid-fork, gen 2 half-seeded. Wipe the DB. Boot a fresh loomcycle instance on the empty data directory. Restore the file:

$ loomcycle restore exp6_5_midrun.snapshot.json
{
  "agent_defs_restored": 10,
  "agent_def_active_restored": 3,
  "memory_restored": 18,
  "evaluations_restored": 4,
  "paused_runs_restored": 1,
  "paused_runs_resumed": 1
}

[boot log]
resume: re-dispatched 1 paused run(s); 0 skipped/flagged

The re-dispatched breeder finished its work autonomously - no driver. Right after restore, gen 2 had only var0; with nothing else running, the resumed breeder seeded var1-3 on its own, completing the very mutate it was parked in the middle of. The external driver wasn't even attached. The substrate played back the breeder's transcript, found the unfinished fork loop, and finished it.
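As a rough mental model of "resume from the transcript", the sketch below diffs completed work against planned work and finishes the remainder under the same run_id. It is a deliberate simplification: in the real system the model re-enters its loop with the restored transcript as context, rather than a deterministic diff. The event shape ("fork_done"), the "planned" list, and the function names are all hypothetical.

```python
def remaining_forks(transcript: list, planned: list) -> list:
    """Planned memory keys with no completed fork event in the transcript."""
    done = {e["key"] for e in transcript if e.get("type") == "fork_done"}
    return [k for k in planned if k not in done]


def resume_paused_run(paused_run: dict, fork) -> dict:
    """Finish the unfinished forks and keep the original run_id."""
    todo = remaining_forks(paused_run["transcript"], paused_run["planned"])
    for key in todo:
        fork(key)  # stand-in for the fork the breeder would issue
        paused_run["transcript"].append({"type": "fork_done", "key": key})
    return {"run_id": paused_run["run_id"], "completed": todo}


paused = {
    "run_id": "run_example",  # hypothetical id
    "planned": [f"gen:2:var:{i}" for i in range(4)],
    "transcript": [{"type": "fork_done", "key": "gen:2:var:0"}],
}
forked = []
outcome = resume_paused_run(paused, forked.append)
print(outcome)
```

Running the same resume a second time finds nothing left to do, which is the idempotence a boot-time crash-recovery path would want.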

The driver then re-attached and continued from gen 2. Final result:

result:summary = {
  generations: 5,
  best_score:  0.91,
  winner_def_id: def_e7a3…,  // exp6-solver v2 (from the gen-2 wave forged on instance #2)
  stopped: max_gen
}

mean score climbed:    0.763 → 0.865 across the boundary
genes converged:       creativity 3.0 → 8.8
                       courage    3.5 → 8.2
                       caution    7.0 → 4.5
winner sampling.temperature: 0.8  (the temperature gene survived the mid-run restore)

Cross-instance lineage check, even sharper this time: a gen-2 variant that was forged on the fresh instance by the re-dispatched breeder has parent_def_id = exp6-solver v8 - a gen-1 def that exists on instance #2 only because it was restored from the file. The genetic lineage chain crossed not just the instance boundary, but the mid-run instance boundary. The new gen-2 variants weren't just runs the driver spawned afterward; they were the breeder's own work, picked up from where it had been parked, and committed back to the substrate of the new machine.

exp6.8 - the same GA on a local Ollama solver population (2026-06-16)

Five days after the 133-minute slow-local-model stress test (the v0.34.3 → v0.37.0 robustness work documented in 133 minutes on a local Qwen), I took the exp6 genetic algorithm and reran it on local Ollama models. The point was not "does the substrate work with local models" (the 133-minute run already answered that). The point was: what does a multi-generation evolutionary loop look like when the population is small local models, and where does the wall come from when it does come?

The model split: gemma4:max solver population, cloud sonnet meta-agents

Three roles, two deployment shapes:

Role	Model	Where
`exp6-solver` ×4 (the evolving population)	`gemma4:max`	LOCAL via `ollama-local`
`exp6-advisor` (grounded judge)	`claude-sonnet-4-6`	cloud (OAuth or API)
`exp6-breeder` (GA controller)	`claude-sonnet-4-6`	cloud (OAuth or API)

A local model as the breeder does not work. I tried qwen3.6:max as the breeder first; it mis-formatted the nested Agent.parallel_spawn argument (passed the spawns array as a JSON string) and terminated its turn early. The 80-step GA orchestration is beyond a local small model's reliability ceiling for structured tool calls. Keeping both meta-agents on cloud also avoids the secondary problem that surfaced in the same attempt: Ollama swapping two large models (qwen ↔ gemma) in VRAM mid-loop was corrupting context and dropping solver self-reports. The resulting split: one local model at a time, used for the population only, with cloud models handling the meta layer.

The temperature cap

gemma4:max hallucinates above ~0.8 temperature. The creativity gene maps to a capped real temperature round(min(creativity/10, 0.7), 2), so every variant stays in the grounded band (≤0.7, below the cliff) and produces scoreable answers. The earlier uncapped 0.0 → 1.0 run hallucinated too much to converge. The advisor's rubric rewards vivid + decisive + factually grounded, so hallucinated specifics score low. The cap is a per-population calibration, not a substrate change. (The substrate already lets you cap a sampling gene at the AgentDef overlay; the cap is in the breeder's prompt.)
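The capped mapping is a one-liner; the sketch below just restates the formula from the paragraph above so the effect of the cap is visible. The function and constant names are hypothetical.

```python
TEMPERATURE_CAP = 0.7  # cap used for the gemma4:max population, per the text


def capped_temperature(creativity: float, cap: float = TEMPERATURE_CAP) -> float:
    """round(min(creativity/10, cap), 2): the creativity gene under a temperature cap."""
    return round(min(creativity / 10, cap), 2)


for creativity in (3, 7, 9, 10):
    print(creativity, capped_temperature(creativity))
```

One consequence worth noticing: every creativity value from 7 to 10 lands on the same temperature, so within that band the gene can only differentiate variants through the prompt text and effort tier.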

The honest finding: the GA completes, but the population mean does not improve

Five generations ran cleanly. Lineage and promote intact. Genes drift toward the rubric optimum (creativity 4.8 → 9.0, the advisor rewards higher creativity inside the capped band). Best output climbs 0.82 → 0.87. But the population mean does not follow: it falls from 0.75 to 0.54.

gen | mean | max  | mean creativity
 0  | 0.75 | 0.82 | 4.8
 4  | 0.54 | 0.87 | 9.0   ← winner promoted; stopped at MAX_GEN

The mean DROPS over five generations even though the max climbs. Why: ~35% of variant-slots produce no usable score. gemma4:max silently skips the structured self-report ~20% of the time (the agent finishes a generation without emitting the expected {score, rationale, gene_proposal} JSON block) and hallucinates ~15% (specifics the advisor's grounded rubric scores zero on). Those are 0.0 scores in the ledger. The winner is a genuine artifact; the population is not converging because most of the population isn't reporting.
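A back-of-envelope check of how the zeros drag the mean, under two assumptions that are mine rather than measured: unusable slots are recorded as exactly 0.0, and the ~35% rate applies to the final generation. With only four slots per generation the real per-generation rate moves in steps of 25%, so treat this as a consistency check, not a measurement.

```python
def implied_reporting_mean(population_mean: float, unusable_rate: float) -> float:
    """Mean score of the slots that did report, if the rest scored 0.0."""
    return population_mean / (1 - unusable_rate)


print(round(implied_reporting_mean(0.54, 0.35), 2))
```

The implied mean among reporting variants is about 0.83, which sits just under the generation's best of 0.87. That fits the reading that the variants which report are scoring well and the missing reports are what pull the population mean down.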

The limiting factor for local agentic evolution is the small model's per-run reliability, not the loomcycle substrate. Every substrate primitive performed as designed: AgentDef.fork + promote + lineage round-tripped through the substrate, Agent.parallel_spawn with N=4 dispatched cleanly across generations, Evaluation scored every reported result, Memory held the gen:* ledger across the full run, and the per-agent sampling.temperature overlay reached the ollama model on every fork. The v0.37 local-model robustness (heartbeat ticker + compaction tail-cap + 300s local timeouts) kept the long, slow, multi-generation run alive without false-timeout deaths.

The substrate is ready for local agentic evolution. The 4B-7B class of local model is not - not yet, not at the structured-tool-call reliability bar a multi-step agentic loop demands. exp6.8 is the experiment that surfaces the ceiling honestly, with the substrate as the constant.

What it takes to push past the wall

Two clean paths. Both add cost or change the dependency shape rather than the loomcycle code:

N-of-M retry per solver. If a variant returns no usable score (the silent skip or the hallucinated specifics), re-spawn it once. The breeder agent can do this in-prompt; no substrate change needed. Cost: roughly 35% more solver calls per generation at the observed failure rate.
A larger or more reliable local solver model. A 14B-30B class local model (such as Qwen3.6:27b, the model used in the 133-minute run), or something larger such as a local Llama 3.3 70B, should hit the structured-tool-call reliability bar much more consistently. Cost: reintroduces the VRAM model-swap question - if your meta agents are cloud-sonnet you avoid it, but a larger solver model alone takes the VRAM budget that the previous local-meta attempt also wanted.
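The retry path could look roughly like this if it were written as code instead of breeder prompt text. Everything here is a sketch: spawn is a stand-in callable, the helper names are hypothetical, and the usability check only catches the structural failure (a missing {score, rationale, gene_proposal} report). A hallucinated answer still parses, so catching it would need the advisor's score as a second signal.

```python
REQUIRED_KEYS = {"score", "rationale", "gene_proposal"}


def is_usable(report) -> bool:
    """True if the solver emitted the expected structured self-report."""
    if not isinstance(report, dict) or not REQUIRED_KEYS.issubset(report):
        return False
    score = report["score"]
    return isinstance(score, (int, float)) and 0.0 <= score <= 1.0


def solve_with_retry(spawn, slot: str, max_attempts: int = 2):
    """Spawn a solver for slot; re-spawn once if the report is unusable."""
    for attempt in range(1, max_attempts + 1):
        report = spawn(slot)
        if is_usable(report):
            return report, attempt
    return None, max_attempts


# Fake solver that skips its report on the first call only.
calls = {"n": 0}


def flaky_spawn(slot: str):
    calls["n"] += 1
    if calls["n"] == 1:
        return None
    return {"score": 0.8, "rationale": "example", "gene_proposal": {"creativity": 6}}


report, attempts_used = solve_with_retry(flaky_spawn, "gen:1:var:2")
print(attempts_used, report["score"])
```

If failures were independent at ~35% per attempt (an assumption), a single retry would leave roughly 0.35 × 0.35 ≈ 12% of slots without a usable score.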

RFC candidate from this experiment: num_ctx is currently global-only (LOOMCYCLE_OLLAMA_LOCAL_NUM_CTX). You can't give gemma4 a small window and another local model a larger window in the same loomcycle instance. The fix is a per-model num_ctx override (substrate-side, opt-in, gauge-aware). Not blocking exp6.8; surfaced by it.

The exp6.8 directory ships as loomcycle/examples/exp6.8-local-evolution/ alongside the original exp6. Same driver, same verifier, different routing. Clone, set OLLAMA_BASE_URL and a sonnet provider in .env.local, run ./run.sh, then run ./work/exp6_run.sh evolve in another terminal.

The engineering lessons

An evolutionary substrate needs more than forkable defs. It needs gene-to-runtime mappings that go past prompt text. Pre-v0.28.0, "creativity" was a number baked into a string. The model didn't see it; only the prompt did. Once sampling.temperature rides the AgentDef overlay, the gene reaches the model - the substrate is now expressive enough for the evolution to actually mean something at the inference layer.

An experiment that fits in a 76 KB file - and resumes from any pause point - is a different kind of artifact than one that lives in a process. Loomcycle's substrate-shaped design (every entity is a content-addressed Def, every piece of state has a stable JSON shape, secrets live by reference not value) made the snapshot file a natural unit of work. The v0.30.0 mid-run resume made the unit of work survive arbitrary boundaries. The experiment is the file. The file is replayable. The replay starts wherever the snapshot was taken.

"Pause anytime" is a contract that requires both halves of the runtime to participate. The F41 fix (v0.28.0, Phase 1) made the pause call wait for in-flight runs to park and gated new spawns; the F42 fix (v0.30.0, Phase 2) made paused runs come back to life on the other side of a snapshot. Either alone would have been a soft promise; together they make "checkpoint and continue, anytime, anywhere" a real guarantee.

The substrate can hold open the door for a model class that isn't ready to walk through it yet. exp6.8 ran the same GA on a local gemma4:max population with cloud-sonnet meta-agents. The substrate worked flawlessly across generations - per-agent sampling.temperature reached the local model on every fork, parallel_spawn dispatched cleanly, the v0.37 robustness kept the long slow run alive. But the population mean did not improve (0.75 → 0.54) because ~35% of variant-slots produced no usable score: the small local model either skipped its structured self-report or hallucinated. The substrate is ready for local agentic evolution; the 4B-7B class of model is not, not yet. The right reading: this is the substrate as a measuring instrument. The wall is a model-class ceiling, named precisely, with the substrate as the constant.

The one remaining open item (tracked in RFC X, not blocking the experiment): a mid-SOLVE snapshot still parks only the solver children. The fan-out parent (the breeder while it's inside parallel_spawn waiting for its children) doesn't yet park - that needs a separate cooperative-pause point at the parent's parallel_spawn wait, not just at the children's iteration boundaries. Capturing the breeder requires pausing during its MUTATE phase (which the v0.30.0 demo does cleanly) rather than during the SOLVE phase. Orthogonal to the F42 restore-side fix; doesn't block any real workflow, since the breeder's MUTATE phase is where the interesting mid-run state lives anyway. Phase-3 of RFC X.

What this unlocks: the substrate is ensemble-shaped (RFC S, v0.25), meta-agent-capable (RFC W, v0.26.2), sampling-aware per agent (#447, v0.28.0), and now genuinely portable across mid-run instance boundaries (#446 + #456, v0.28.0 + v0.30.0). Long-running production experiments - prompt evolution, A/B routing of competing agent versions, auto-tuning loops, durable workflows that span days - can be checkpointed at any moment, handed to another developer or another machine, and continue from exactly where they were. That's what loomcycle's "the substrate is the artifact" framing has been building toward.

Companion reading from the operator-via-MCP series: exp1+2 - tool access and interruption · exp3 side - the MCP wedge · exp3 main - the multi-agent refine loop (where F14 closed channels/eval/interruption) · exp4 - Gitea + Telegram + secret redaction · exp5 - agent ensembles + RFC S. Plus the upstream design locks: doc-internal/rfcs/meta-agent-def-scopes.md (RFC W) and doc-internal/rfcs/cooperative-pause-and-spawn-gate.md (RFC X).

Reproduce this experiment yourself: exp6 ships as a self-contained directory under loomcycle/examples/exp6-self-evolving-agents/ - own loomcycle.yaml, run.sh, .env.local.example, and a reproducible README. Anthropic-OAuth primary with a DeepSeek fallback. Clone, cd, run ./run.sh.