Skip to main content
loomcycle
§ architecture note

Becoming OpenAI-shaped without becoming OpenAI

Loomcycle grew an OpenAI-shaped front door this week. Three releases:

The shape these endpoints take is the obvious one — if you've configured the OpenAI SDK in your stack, change the base_url, swap the API key, point it at your loomcycle. Done.

The shape these endpoints don't take is the part worth writing about.

Why the native gateway came first

The natural assumption when you hear "loomcycle now does OpenAI Chat Completions" is that we wrote the OpenAI shim and then added a native endpoint on top. The order was the other way around, and the reason matters.

Loomcycle's first job is the agentic loop: tool calls, MCP dispatch, hook chains, per-tenant fairness, prompt-cache control. Those things live behind /v1/runs and friends; they're stateful, multi-turn, opinionated about what the model gets to see. None of that fits inside the OpenAI Chat Completions request shape, which is fundamentally a single-turn primitive over messages and tools.

But the loop has a hot inner step — "call this provider with these messages and tools, get back a response" — that is fundamentally a single-turn primitive over messages and tools. It's the thing every agent loop in the world does twenty times per run. And it turns out that the same security policy you want on the agent loop's inner step — provider routing via resolver, single auth surface (one n8n credential to all providers), retry, host allowlist, per-user quota tracking, audit logging — you also want on any direct LLM call that an upstream tool (LangChain, n8n's Chat Model, a custom workflow node, your own code) makes through loomcycle.

So the native gateway, POST /v1/_llm/chat, came first. It exposes loomcycle's loop-step primitive over HTTP with the loomcycle wire format. The compatibility shims sit in front of it, translating OpenAI's wire format into loomcycle's and back. The internal helper that does the pre-dispatch work is shared:

// internal/api/http/gateway.go

// prepareGatewayDispatch performs validation, resolver
// pinning, semaphore acquisition, and providers.Request
// construction — everything that has to happen before the
// dispatch, and that's identical for the loomcycle-native
// and OpenAI-shaped front doors. The caller serves the
// returned dispatch handle in its own wire format.
func prepareGatewayDispatch(
    ctx context.Context, req GatewayRequest,
) (*gatewayDispatch, error) { … }

handleLLMChat (the native handler) and handleOpenAIChat (the OpenAI-shim handler) both end up parse-then-delegate. Any future bug fix to the pre-dispatch security path lands once, takes effect on both front doors. The chat-completions shim refactor was about twenty lines of net new code; the rest was the prepareGatewayDispatch extraction.

What the shim translates

The OpenAI Chat Completions request shape has a lot of fields. The shim handles the ones that map cleanly to loomcycle primitives and ignores or rejects the ones that don't:

What the shim deliberately drops

Three things the shim does not pretend to support:

Function-call schemas in the legacy functions field. OpenAI deprecated this in favour of tools two years ago. The shim accepts tools, rejects functions with a clear error. There's no value in carrying two ways to do the same thing.

The logprobs response field. Providers that aren't OpenAI mostly don't have a comparable surface; the providers that do have one don't agree on the shape. We'd rather return an honest "not supported" than a lossy translation that quietly silently drops or misrepresents.

Provider-specific OpenAI fields. response_format: { type: "json_object" } is a good example — OpenAI maps it to a specific server-side grammar-constraint mechanism. The closest analogue in Anthropic is a system-prompt instruction; in Ollama it's a grammar file; for OpenAI itself there's the constrained generation mode. The shim passes the field through; what happens with it is provider-dependent. We don't normalize.

What the shim adds (that OpenAI doesn't)

The interesting direction. Even if you're using the OpenAI-shaped front door, you get loomcycle's policy layer for free:

Embeddings: simpler, narrower, just as useful

POST /v1/embeddings is the simpler of the two shims. No resolver path, no tier overlay, no streaming, no tool routing. Just take an array of input strings, dispatch to the single configured providers.Embedder, and return the OpenAI-shaped response.

The interesting bit is who the providers.Embedder is: it's the same instance the loomcycle Memory tool uses internally when you set embed: true on a memory write. The Memory tool went through the same provider family last week (v0.9.0, Vector Memory — semantic search on the Memory tool); the embedder is bound to the runtime once, and both the inbound HTTP shim and the inner-loop Memory tool share it. One model, one provider account, one quota, one audit trail.

Every RAG tool we've tried — LangChain's OpenAIEmbeddings, llama-index's OpenAIEmbedding, the various vector-DB embedders that default to OpenAI — works against the shim by changing only the base URL and the auth token. The cottage industry of "use OpenAI embeddings" tutorials becomes a cottage industry of "use whatever loomcycle is configured to embed with" tutorials, without anyone having to rewrite the tutorials.

The sneaky benefit: consumers can switch embedding backends — Voyage, Cohere, OpenAI text-embedding-3, a self-hosted nomic-embed — by changing one line of loomcycle config, and nothing in the consumer code needs to know. The "OpenAI shape" becomes a stable contract; the actual model becomes an operator decision.

When to use which front door

Pragmatic split:

The same security policy applies to all three. Any future hardening — better quota enforcement, smarter retries, additional audit fields — lands in prepareGatewayDispatch and shows up on every front door simultaneously.

What this unlocks

The honest reason we built the LLM Gateway, in the order we built it, is that the next piece of work needed both shims and the dispatch helper. The next piece is the @loomcycle/n8n-nodes-loomcycle package — a collection of n8n nodes that lets workflow authors put a loomcycle in front of their n8n agents and get the policy layer, multi-tenant fairness, OTel traces, MCP tools, and everything else for free.

n8n's Tools Agent uses LangChain's BaseChatModel under the hood. To plug into it, you implement a Chat Model sub-node — which the n8n package now does, against /v1/_llm/chat. Nothing the workflow author writes needs to change. The model selector, the tool wiring, the system message — all of it stays the way n8n's Tools Agent already expects. The only thing that changes is which gateway is on the other end of the wire.

That post is up next: What it took to make loomcycle a first-class n8n citizen. Three releases of @loomcycle/n8n-nodes-loomcycle in two days; one LangChain @langchain/core/messages/ai.js:178 rejection trail; a defence-in-depth synthetic tool-call-id story that took longer to debug than the original integration.

Companion writeups: When the agent is in one container and its definition is in another (the substrate that lets policy and agent defs cross deployment boundaries), and Scrubbing the model's incoming mail (the content-scrubber PostTool hook that lives on top of the same hook contract any OpenAI-shim caller can plug into).