loomcycle
§ release note

Loomcycle speaks A2A — server, client, and the INPUT_REQUIRED bridge that wasn't supposed to ship.

A week ago we wrote that the right way for loomcycle to be reachable from the Microsoft Agent Framework and Google ADK enterprise stacks was the Agent2Agent (A2A) protocol — Google's open standard, donated to the Linux Foundation, current stable v1.0.1. We named it as the lowest-risk interop bet on the table and locked an RFC (G) the same week. This week it shipped. Loomcycle now speaks A2A on both sides: a served AgentCard at the well-known URI, three protocol bindings, signed cards, multi-tenant routing, and synthetic peer tools. And — honest about the locked RFC — one explicit deferral got reversed during implementation, because the substrate turned out to already be there.

What landed

Eight slices on a feature branch, built against a2a-go v2.3.1 — the official Go SDK. The RFC committed to adopting the SDK as a direct dependency rather than re-implementing ~10K LOC of protocol code, and that bet paid off twice: we got three wire bindings (REST, JSON-RPC, gRPC) for free, and the SDK's real API forced honest implementation around iterator semantics and per-request context lifecycles that the RFC sketch had glossed over.

The server surface is off by default. Operators turn it on with three env vars, plus a fourth that selects the tenancy-routing mode:

LOOMCYCLE_A2A_ENABLED=1
LOOMCYCLE_A2A_SERVER_CARD=loomcycle-fleet
LOOMCYCLE_A2A_PUBLIC_BASE_URL=https://agents.example
LOOMCYCLE_A2A_TENANCY_ROUTING=none      # none | host | path

Once enabled, loomcycle mounts additively — nothing about /v1/*, MCP, or /ui changes:

Path                            Binding
/.well-known/agent-card.json    AgentCard discovery URI (unauthenticated)
/a2a/v1 + subpaths              REST (HTTP+JSON)
/a2a/jsonrpc                    JSON-RPC 2.0
/a2a/grpc                       gRPC, on loomcycle's existing gRPC server (HTTP/2 needs no path mount)

The card advertises all three interfaces; an A2A client picks whichever transport it prefers (the reference SDK defaults to JSON-RPC). Each exposed loomcycle agent becomes one AgentCard skill; inbound messages carry a skillId in metadata that routes to the mapped agent. A request that names an unknown or unexposed skill is rejected as FAILED — a peer cannot reach an agent that isn't in exposed_agents by guessing the skill id.
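The routing rule above can be sketched in a few lines of Go. This is a minimal illustration, not loomcycle's code: the skillRouter type, the route method, and ErrSkillNotExposed are hypothetical names, and treating a missing skillId the same as an unknown one is an assumption the text does not confirm.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSkillNotExposed is a hypothetical error for any skill id outside the exposed set.
var ErrSkillNotExposed = errors.New("skill not exposed")

// skillRouter maps an AgentCard skill id to the agent behind it (illustrative type).
type skillRouter struct {
	exposed map[string]string // skill id -> agent name, built from exposed_agents
}

// route resolves the agent for the skillId carried in message metadata.
// Anything not explicitly listed is rejected, so guessing an id cannot reach an agent.
func (r skillRouter) route(metadata map[string]any) (string, error) {
	id, ok := metadata["skillId"].(string)
	if !ok || id == "" {
		return "", ErrSkillNotExposed // assumption: a missing id is rejected too
	}
	agent, ok := r.exposed[id]
	if !ok {
		return "", ErrSkillNotExposed
	}
	return agent, nil
}

func main() {
	r := skillRouter{exposed: map[string]string{"research": "company-researcher"}}

	agent, err := r.route(map[string]any{"skillId": "research"})
	fmt.Println(agent, err)

	_, err = r.route(map[string]any{"skillId": "internal-admin"})
	fmt.Println(err)
}
```

The lookup is an allowlist rather than a denylist, which is what makes the "cannot reach by guessing" property hold by construction.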

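Two new substrate primitives carry the configuration, content-addressed and versioned like every other substrate Def (AgentDef / SkillDef / MCPServerDef / ScheduleDef): A2AServerCardDef for the served card, and a companion Def for remote peers. They are configured under a2a_server_cards and a2a_agents respectively.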

The deferral we reversed

The locked RFC's Decision 9 was emphatic: "TASK_STATE_INPUT_REQUIRED and TASK_STATE_AUTH_REQUIRED are out of scope for v1." The reasoning was that run-loop state machine changes for interactive task states would be 1–2 PRs of their own — honest scoping, defer to v2.

Then implementation hit the question and produced a different answer. Loomcycle's Interruption tool (shipped v0.8.16) is already the human-in-the-loop primitive: a run that calls Interruption.ask parks on a channel-bus wait, surfaces the question to the operator surface (Web UI, MCP, CLI), and resumes the same run with the operator's answer. That's exactly what A2A's INPUT_REQUIRED state is asking for — a peer-facing version of "this task is waiting for more input."

So slice A2A-6 shipped the bridge instead of the deferral. A loomcycle run that parks on Interruption.ask surfaces TASK_STATE_INPUT_REQUIRED over A2A; a follow-up message on the same taskId resolves the interruption via the same notification bus the Interruption tool already waits on, and the same run resumes to its real terminal state. HTTP-side resume and A2A-side resume converge on one mechanism. A follow-up on an unknown taskId starts a fresh run instead, so the peer-facing surface stays well-defined when state has been forgotten.
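A rough sketch of the bridge's shape, under stated assumptions: the bus type, park, resolve, and followUp below are hypothetical stand-ins for the notification bus and the A2A follow-up handler, not loomcycle's actual types. It shows a parked run being resumed by a follow-up on its taskId, and a follow-up on an unknown taskId falling through to a fresh run.

```go
package main

import (
	"fmt"
	"sync"
)

// bus is an illustrative stand-in for the notification bus a parked run waits on.
type bus struct {
	mu     sync.Mutex
	parked map[string]chan string // taskId -> channel the parked run reads from
}

func newBus() *bus { return &bus{parked: make(map[string]chan string)} }

// park registers a waiting run; the peer would see INPUT_REQUIRED at this point.
func (b *bus) park(taskID string) <-chan string {
	ch := make(chan string, 1) // buffered so resolve never blocks
	b.mu.Lock()
	b.parked[taskID] = ch
	b.mu.Unlock()
	return ch
}

// resolve hands the answer to the parked run, removing the entry so it is taken once.
func (b *bus) resolve(taskID, answer string) bool {
	b.mu.Lock()
	ch, ok := b.parked[taskID]
	delete(b.parked, taskID)
	b.mu.Unlock()
	if !ok {
		return false
	}
	ch <- answer
	return true
}

// followUp models an inbound A2A message that names an existing taskId.
func followUp(b *bus, taskID, text string) string {
	if b.resolve(taskID, text) {
		return "resumed"
	}
	return "fresh-run" // nothing parked under this id: start a new run instead
}

func main() {
	b := newBus()
	answer := b.park("task-1")

	fmt.Println(followUp(b, "task-1", "use the EU region"))
	fmt.Println(<-answer)
	fmt.Println(followUp(b, "task-unknown", "hello"))
}
```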

AUTH_REQUIRED stays deferred — loomcycle expects peer credentials supplied up front through the per-run credentials map, not negotiated interactively. The honest scope split here is "we did the half we already had substrate for."

The bug the unit tests missed

Whole-feature code review caught a real correctness defect, one that shows why integration tests against the real SDK matter. The pattern: inside its REST and JSON-RPC handlers, the a2a-go SDK creates a per-streaming-request context with requestCtx, cancel := context.WithCancel(ctx); defer cancel(), and hands that ctx to Execute. Our first cut of startRun drove runner.RunOnce with the request context, so the instant our iterator parked and yielded to wait for INPUT_REQUIRED, the handler returned and the SDK's deferred cancel() tore the run down. The Interruption tool's Bus.Wait honors ctx.Done(), which meant the parked run died with context.Canceled before any resume could arrive.

Unit tests missed it because they passed an uncancelled context.Background() rather than the per-request ctx the real SDK uses. The fix was to detach the run lifetime from the request lifetime via context.WithoutCancel + an executor-owned cancel — cancellation now flows only through Executor.Cancel (the Connector's existing cascade pathway) and explicit stream.cancel on abandon or unresumable park. A regression test (TestExecutor_RunContextDetachedFromRequestContext) pins the invariant.
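The failure and the fix can be reproduced with the standard library alone. The sketch below is not the executor's code: detachRun and the two helper functions are hypothetical names. It relies only on context.WithoutCancel (Go 1.21+), which keeps the parent's values but drops its cancellation, wrapped in a fresh context.WithCancel so the run still has an owner that can stop it.

```go
package main

import (
	"context"
	"fmt"
)

// detachRun derives a run context that ignores the request's cancellation
// and returns a cancel func owned by the caller (the executor, in the post's terms).
func detachRun(requestCtx context.Context) (context.Context, context.CancelFunc) {
	return context.WithCancel(context.WithoutCancel(requestCtx))
}

// runSurvivesRequestCancel reports whether the run context is still live
// after the per-request context has been cancelled.
func runSurvivesRequestCancel(detach bool) bool {
	requestCtx, cancelRequest := context.WithCancel(context.Background())
	runCtx := requestCtx // first cut: the run shares the request's lifetime
	var stopRun context.CancelFunc = func() {}
	if detach {
		runCtx, stopRun = detachRun(requestCtx)
	}
	defer stopRun()

	cancelRequest() // stands in for the SDK handler's deferred cancel()
	return runCtx.Err() == nil
}

// executorCancelStillWorks checks that the detached run can still be stopped explicitly.
func executorCancelStillWorks() bool {
	runCtx, stopRun := detachRun(context.Background())
	stopRun()
	return runCtx.Err() != nil
}

func main() {
	fmt.Println("shared ctx survives:", runSurvivesRequestCancel(false))
	fmt.Println("detached ctx survives:", runSurvivesRequestCancel(true))
	fmt.Println("explicit cancel works:", executorCancelStillWorks())
}
```

A test that passes context.Background() as the request context never exercises the first case, which is the gap described above.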

Two other bugs surfaced from the end-to-end integration test that drives the real a2aclient against the mounted server: the REST binding mounted at /a2a/v1/ without prefix-strip, so the SDK's relative routes (/message:send, /tasks/{id}) all 404'd; and the executor emitted a bare status update first on a rejected brand-new message, but the SDK aggregation requires the first event to be a Task or Message, which surfaced rejections as opaque transport errors instead of terminal FAILED. Both got regression-test pins on the way out.
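The prefix-strip bug is easy to reproduce with net/http. In this sketch, sdkRoutes is a hypothetical stand-in for the SDK's REST handler, which registers routes relative to its own root; mount and status are illustrative helpers. Only http.StripPrefix and the /a2a/v1/ mount point correspond to anything named in the text.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sdkRoutes stands in for a handler that registers routes relative to its root.
func sdkRoutes() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/message:send", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// mount attaches the handler under /a2a/v1/, with or without stripping the prefix.
func mount(strip bool) http.Handler {
	h := sdkRoutes()
	if strip {
		h = http.StripPrefix("/a2a/v1", h)
	}
	outer := http.NewServeMux()
	outer.Handle("/a2a/v1/", h)
	return outer
}

// status sends one POST through the handler and returns the response code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("without strip:", status(mount(false), "/a2a/v1/message:send"))
	fmt.Println("with strip:", status(mount(true), "/a2a/v1/message:send"))
}
```

Without the strip, the inner mux sees the full /a2a/v1/message:send path and has no matching route.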

Signed cards, honest about what they prove

Loomcycle signs its served card over RFC 8785 JSON Canonicalization Scheme with ES256 (JWS detached), when the operator names a signing-key env var that's on loomcycle's existing allowlist (the same one the scheduler and per-run credentials use). The signature embeds the matching public key as a self-contained JWS so a verifier needs no separate key fetch.

Signing is best-effort and never fails card serving. No key configured, a non-allowlisted env name, an unset var, or a malformed key — all serve the card unsigned with a single trace line for operator triage (the key value is never logged). The allowlist is a real floor: a substrate-authored A2AServerCardDef cannot name an arbitrary env var and exfiltrate it into a signature.
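One way to picture the fall-through is a resolver that returns either key material or a reason to trace, and never an error that could block serving. This is a sketch under assumptions: resolveSigningKey, the reason strings, and the env var names are hypothetical, and the malformed-key check (which would need real key parsing) is left out.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSigningKey returns key material and an empty reason on success,
// or nil plus a short reason suitable for a trace line. It never logs the value.
func resolveSigningKey(envName string, allowlist map[string]bool) ([]byte, string) {
	switch {
	case envName == "":
		return nil, "no signing key configured"
	case !allowlist[envName]:
		// The allowlist is checked before the environment is read, so a card
		// definition cannot pull an arbitrary variable into a signature.
		return nil, "env var not on allowlist"
	}
	value, ok := os.LookupEnv(envName)
	if !ok || value == "" {
		return nil, "env var unset"
	}
	return []byte(value), ""
}

func main() {
	allow := map[string]bool{"A2A_SIGNING_KEY": true} // hypothetical allowlist entry

	for _, name := range []string{"", "HOME", "A2A_SIGNING_KEY"} {
		key, reason := resolveSigningKey(name, allow)
		if key == nil {
			fmt.Printf("%q: serving unsigned (%s)\n", name, reason)
			continue
		}
		fmt.Printf("%q: signing enabled\n", name)
	}
}
```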

Inbound peer-card verification is the symmetric story but with a deliberately honest caveat:

What verify_signed_card: true proves, and what it doesn't. The JWS is self-contained — it verifies against a public key in the card's own protected header, with no external trust anchor. So a verified signature proves the card's integrity (not altered after signing), but not the peer's identity. Peer identity rests on TLS: the card is fetched over HTTPS from the peer's own well-known URI, so the transport authenticates the origin. Treat verify_signed_card as tamper-evidence on top of TLS, not as a replacement for it. Pinning a peer's key out-of-band is a future enhancement, not a current capability.
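The integrity-versus-identity distinction can be demonstrated with plain ECDSA over P-256. This is a simplified model, not JWS: the signedCard type, sign, and verify are hypothetical, the signature is ASN.1-encoded rather than the raw R||S form JWS uses, and the payload is signed as-is with no canonicalization. What it does show accurately is that a signature verified against a key shipped alongside it detects tampering but accepts any signer.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// signedCard bundles a payload, its signature, and the signer's own public key,
// mimicking a self-contained signature with no external trust anchor.
type signedCard struct {
	payload []byte
	sig     []byte
	pub     *ecdsa.PublicKey
}

// sign generates a fresh P-256 key and signs the SHA-256 digest of the payload.
func sign(payload []byte) (signedCard, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return signedCard{}, err
	}
	digest := sha256.Sum256(payload)
	sig, err := ecdsa.SignASN1(rand.Reader, key, digest[:])
	if err != nil {
		return signedCard{}, err
	}
	return signedCard{payload: payload, sig: sig, pub: &key.PublicKey}, nil
}

// verify checks the signature against the public key embedded in the card itself.
func verify(c signedCard) bool {
	digest := sha256.Sum256(c.payload)
	return ecdsa.VerifyASN1(c.pub, digest[:], c.sig)
}

func main() {
	genuine, err := sign([]byte(`{"name":"partner-research"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("genuine card verifies:", verify(genuine))

	tampered := genuine
	tampered.payload = []byte(`{"name":"someone-else"}`)
	fmt.Println("tampered card verifies:", verify(tampered))

	// An impostor signs their own card with their own key; it verifies just as well.
	impostor, err := sign([]byte(`{"name":"partner-research"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("impostor card verifies:", verify(impostor))
}
```

The third case is why the origin check has to come from somewhere else, which in the scheme described above is TLS on the well-known URI.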

What's deliberately not yet wired

Two pieces of advertised functionality are accepted in config but explicitly not enforceable today, named loudly so operators don't trust them by accident:

One latent caveat worth recording: the JCS canonicalization implementation isn't fully RFC 8785-faithful for fractional or very large numbers. AgentCards in practice carry only integers, so the gap is latent rather than active, but it's tracked.

The most useful engineering insight from shipping this is the substrate-credit story. We locked the RFC with INPUT_REQUIRED deferred to v2 because we believed interactive state would mean new run-loop machinery. Implementation revealed that the Interruption tool — shipped 18 months ago for human-in-the-loop on the operator side — already was the right substrate, with the bus, the resume seam, and the take-once mechanics we'd need. Sometimes the right scope cut isn't "build less now"; it's "look harder at what's already there." We'd defer less if we credited our own primitives more honestly at lock time.

What you can do with it today

Expose a loomcycle agent:

# loomcycle.yaml
a2a_server_cards:
  loomcycle-fleet:
    name: loomcycle-fleet
    capabilities: { streaming: true }
    exposed_agents:
      - { agent_name: company-researcher, skill_id: research, skill_name: Research }

$ LOOMCYCLE_A2A_ENABLED=1 \
  LOOMCYCLE_A2A_SERVER_CARD=loomcycle-fleet \
  LOOMCYCLE_A2A_PUBLIC_BASE_URL=https://agents.example \
  ./bin/loomcycle --config loomcycle.yaml

An external A2A client — Microsoft Agent Framework, Google ADK, LangGraph's /a2a/{assistant_id} endpoint, or any conformant peer — fetches https://agents.example/.well-known/agent-card.json, sees the research skill, sends a message with Message.Metadata["skillId"] = "research", and gets the run streamed back as A2A Task events ending in COMPLETED.

Call a remote peer from a loomcycle agent:

# loomcycle.yaml
a2a_agents:
  partner-research:
    agent_card_url: https://partner.example/.well-known/agent-card.json
    auth: { scheme: http, bearer_credential_ref: partner_token }
    expected_skills:
      - { id: research, required: true }

agents:
  orchestrator:
    allowed_tools: [Read, a2a__partner-research__research]

The orchestrator agent can now call a2a__partner-research__research; loomcycle fetches the peer card (verifying the signature if configured), resolves the partner_token bearer from the run's UserCredentials via the existing RFC F channel, sends the message, and returns the peer's answer to the model.
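The synthetic tool name in the example reads as a2a__<peer>__<skill>. That layout is inferred from the single name shown here, not from a documented grammar, so the parser below is a sketch under that assumption, and parsePeerTool is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePeerTool splits a synthetic tool name of the assumed form
// a2a__<peer>__<skill> into its peer and skill parts.
func parsePeerTool(name string) (peer, skill string, ok bool) {
	parts := strings.SplitN(name, "__", 3)
	if len(parts) != 3 || parts[0] != "a2a" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	for _, tool := range []string{"a2a__partner-research__research", "Read"} {
		peer, skill, ok := parsePeerTool(tool)
		fmt.Printf("%s -> peer=%q skill=%q ok=%v\n", tool, peer, skill, ok)
	}
}
```

Splitting on the double underscore leaves single hyphens and underscores inside peer names intact, which matters for a name like partner-research.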

Full operator docs are available via Context.help a2a-integration on any loomcycle build from this week onward.

Companion reading: Seven frameworks and the row that's missing (the framework-survey landscape post that named A2A as the lowest-risk interop bet), Three MCP tokens in one run (RFC F, the per-run credentials channel that carries A2A peer bearers), and Scheduled runs at 30,000 fires (RFC E, the substrate-primitive pattern A2A's two new Defs inherit from).