Status: Stable · v1.1 (2026-05-22). Normative spec for conformance-only host-sample test seams under
/v1/host/sample/*. Keywords MUST, SHOULD, MAY follow RFC 2119. See auth.md for the status legend.
OpenWOP's conformance suite verifies behavioral contracts that v1 cannot probe through the production wire surface alone. Examples:
- "the prompt resolution chain layered correctly" can be observed end-to-end via prompt.composed event payloads, but isolating layer-by-layer precedence requires a synchronous resolver endpoint
- "the LLM cache-key recipe produced byte-identical output across hosts" can only be asserted if hosts expose their canonicalize → SHA-256 → hex computation
- "OTel span attributes don't carry BYOK canaries" requires an introspection endpoint scoped to a run
These contracts ship as conformance-only test seams under the host-extensions.md §"Canonical prefixes" namespace /v1/host/sample/*. They are NOT part of the v1 wire surface — production hosts SHOULD return 404 or 403 from these seams unless an env-gate (named per-seam below) is set.
This doc is the canonical reference for the test-seam contracts. Per-seam normative content also appears in the RFC + spec doc that introduces the seam; this doc is the consolidated index hosts implement against.
Capability advertisement (normative)
Hosts that expose any test seam MUST advertise it under /.well-known/openwop per capabilities.md. The advertising flags are tabulated below per seam. Conformance scenarios capability-gate on the matching flag; hosts that don't advertise skip cleanly.
Test seams
1. POST /v1/host/sample/prompt/resolve — Prompt resolution chain (RFC 0029)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/prompt/resolve |
| Capability gate | capabilities.prompts.supported: true |
| Env gate (reference impl) | seam registered when capabilities.prompts.supported is asserted |
| Introduced | RFC 0029 §C |
Request body:
{
kind: 'system' | 'user' | 'few-shot' | 'schema-hint',
node: {
nodeId: string,
config?: {
systemPromptRef?: string | PromptRef,
userPromptRef?: string | PromptRef,
schemaHintPromptRef?: string | PromptRef,
fewShotPromptRefs?: Array<string | PromptRef>,
agentId?: string,
},
},
agentManifest?: {
agentId: string,
systemPrompt?: string,
systemPromptRef?: string,
promptOverrides?: Partial<Record<PromptKind, string | PromptRef>>,
promptLibraryRef?: string,
},
workflowDefaults?: { promptRefs?: Partial<Record<PromptKind, string | PromptRef>> },
hostDefaults?: Partial<Record<PromptKind, string | PromptRef>>,
agentBindingsSupported?: boolean, // overrides capabilities.prompts.agentBindings for this probe
}
Response body:
{
resolved: string | null, // rendered prompt text after variable substitution, or null if all 4 layers yielded null
resolvedAt: 'node' | 'agent-intrinsic' | 'workflow' | 'host' | null, // which layer won
chain: Array<{ // every layer attempted, in priority order
layer: 'node' | 'agent-intrinsic' | 'workflow' | 'host',
ref: string | null,
resolved: string | null,
}>,
}
Hosts that advertise capabilities.prompts.supported: true MUST serve this seam with the documented shape. The chain[] array MUST list every layer attempted even when an earlier layer wins — conformance scenarios assert the full traversal record.
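The full-traversal rule can be sketched as follows. This is a minimal illustration, not the reference resolver: it assumes the four layers are tried in the order node, agent-intrinsic, workflow, host, that each layer has already been reduced to at most one candidate ref, and that a hypothetical `render` callback performs lookup and variable substitution. The names `resolveChain`, `PRIORITY`, and `render` are illustrative only.

```typescript
type Layer = "node" | "agent-intrinsic" | "workflow" | "host";

interface ChainEntry {
  layer: Layer;
  ref: string | null;
  resolved: string | null;
}

interface Resolution {
  resolved: string | null;
  resolvedAt: Layer | null;
  chain: ChainEntry[];
}

// Assumed priority order, highest first.
const PRIORITY: Layer[] = ["node", "agent-intrinsic", "workflow", "host"];

function resolveChain(
  refs: Partial<Record<Layer, string>>,
  render: (ref: string) => string | null, // hypothetical lookup + substitution step
): Resolution {
  // Every layer is evaluated and recorded, even after an earlier layer has won.
  const chain: ChainEntry[] = PRIORITY.map((layer) => {
    const ref = refs[layer] ?? null;
    return { layer, ref, resolved: ref === null ? null : render(ref) };
  });
  // The winner is the first layer in priority order that produced text.
  const winner = chain.find((entry) => entry.resolved !== null) ?? null;
  return {
    resolved: winner ? winner.resolved : null,
    resolvedAt: winner ? winner.layer : null,
    chain,
  };
}
```

With only workflow and host refs supplied, the sketch reports `resolvedAt: 'workflow'` while still recording the host entry in `chain[]`, which is the traversal record the conformance scenarios inspect.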
Conformance: prompt-resolution-chain-{node-wins,agent-intrinsic,fallback-cascade}.test.ts.
Production-path equivalent (preferred). The same layer-by-layer precedence record is carried by the durable agent.promptResolved event (schemas/run-event-payloads.schema.json agentPromptResolved — a REQUIRED chain[] with one applied: true entry + the full-traversal MUST). A host that emits the event is provable black-box via prompt-resolution-chain-event.test.ts, which creates a run and reads chain[] from the NORMATIVE GET /v1/runs/{runId}/events/poll endpoint — no seam. This synchronous seam remains the convenience for hosts that have not yet wired event emission (RFC 0029 staging); the production-path event is what graduates RFC 0029 prompt-chain precedence into the openwop-core-standard floor.
2. GET /v1/host/sample/test/otel/spans?runId=<id> — OTel span scrape (RFC 0034)
| Field | Value |
|---|---|
| Method + path | GET /v1/host/sample/test/otel/spans?runId=<id> |
| Capability gate | capabilities.observability.testSeams.otelScrape: true |
| Env gate (reference impl) | OPENWOP_TEST_OTEL_SCRAPE=true |
| Introduced | RFC 0034 §B |
Returns recorded OTel spans for the named run. When otelScrape: true, the host MUST return 200 OK with body:
{
spans: Array<{
name: string, // span name, e.g., "openwop.run", "openwop.dispatch"
attributes: Record<string, unknown>, // span attributes including any openwop.*-prefixed keys
events: Array<{ name: string, attributes?: Record<string, unknown> }>,
}>,
}
The spans[] array MUST include every span produced by the host's instrumentation for the named run, including any openwop.*-prefixed attributes added to span context. Hosts MAY redact span content using the canonical [REDACTED:<secretId>] marker per agent-memory.md §"SR-1 secret-redaction invariant"; that redaction behavior is the contract the conformance suite tests.
The seam graduates two SECURITY invariants from reference-impl to protocol tier:
- secret-leakage-otel-attribute — BYOK plaintexts MUST NOT appear as values on any openwop.* OTel attribute
- (paired) secret-leakage-debug-bundle-otel — same invariant on debug-bundle exports
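A conformance-side check for the first invariant might look like the sketch below. It is an illustration under assumptions: the `Span` type mirrors the response shape above, `findCanaryLeaks` and `containsCanary` are hypothetical helper names, and the scan is deliberately stricter than the invariant (it inspects every attribute and event attribute, not only openwop.*-prefixed keys).

```typescript
interface Span {
  name: string;
  attributes: Record<string, unknown>;
  events: Array<{ name: string; attributes?: Record<string, unknown> }>;
}

// Recursively reports whether any string inside `value` contains the canary.
function containsCanary(value: unknown, canary: string): boolean {
  if (typeof value === "string") return value.includes(canary);
  if (Array.isArray(value)) return value.some((v) => containsCanary(v, canary));
  if (value !== null && typeof value === "object") {
    return Object.values(value).some((v) => containsCanary(v, canary));
  }
  return false;
}

// Returns "<spanName>:<attributeKey>" for every attribute that leaks the canary.
function findCanaryLeaks(spans: Span[], canary: string): string[] {
  const leaks: string[] = [];
  for (const span of spans) {
    const bags = [span.attributes, ...span.events.map((e) => e.attributes ?? {})];
    for (const bag of bags) {
      for (const [key, value] of Object.entries(bag)) {
        if (containsCanary(value, canary)) leaks.push(`${span.name}:${key}`);
      }
    }
  }
  return leaks;
}
```

A scenario would seed a BYOK secret whose plaintext is the canary, run a workflow, fetch the spans from the seam, and expect `findCanaryLeaks` to return an empty array.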
Conformance: envelope-reasoning-secret-redaction.test.ts (capability-gated on the seam).
3. POST /v1/host/sample/test/debug-bundle/export — Debug-bundle export probe (RFC 0034)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/test/debug-bundle/export |
| Capability gate | capabilities.observability.testSeams.debugBundleExport: true |
| Env gate (reference impl) | OPENWOP_TEST_DEBUG_BUNDLE_EXPORT=true |
| Introduced | RFC 0034 §B |
Synchronous debug-bundle export for conformance scenarios that need to assert canary redaction without first triggering an interrupt → debug bundle workflow.
Request body:
{
runId: string,
}
Response body: same shape as GET /v1/runs/{runId}/debug-bundle per spec/v1/debug-bundle.md — DebugBundle with bundleVersion, host, run, events, redactionMode, redactionApplied, truncated, truncatedReason.
When advertised, the host MUST serve a 200 OK with the documented shape.
Conformance: gates on capabilities.observability.testSeams.debugBundleExport: true.
4. POST /v1/host/sample/test/llm-cache-key — LLM cache-key recipe (RFC 0041)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/test/llm-cache-key |
| Capability gate | capabilities.multiAgent.executionModel.replayDeterminism.supported: true (RFC 0041 Phase 4 hosts); MAY be implemented earlier without advertising |
| Env gate (reference impl) | implicit — seam registered alongside the cache-key implementation |
| Introduced | RFC 0041 §A |
Computes the canonical LLM cache key per replay.md §"LLM cache-key recipe" §A + §B. Conformance scenarios drive the seam to assert (a) intra-host reproducibility, (b) non-recipe-field invariance, and (c) cross-host parity when two hosts both expose the seam.
Request body — an LLMCacheKeyInput-shaped object per replay.md §A. Non-recipe fields are accepted and ignored (the test exercises that the host's recipe correctly drops them):
{
// Recipe fields (per replay.md §A — only these influence the key):
provider: string, // canonical provider id, lowercase ASCII
model: string, // provider-stamped model id
messages: Array<{ role, content, name?, toolCallId? }>,
tools?: Array<{ name, description?, parameters }>,
temperature?: number,
topP?: number,
topK?: number,
responseFormat?: { type: 'text' | 'json' | 'tool_call', schema? },
// Non-recipe fields (host MUST ignore for key computation):
max_tokens?: number,
stop?: string[],
stream?: boolean,
seed?: number,
metadata?: Record<string, unknown>,
user?: string,
'x-request-id'?: string,
// ... any other field
}
Response body:
{
cacheKey: string, // 64 lowercase-hex chars (SHA-256 of canonicalize(projectRecipe(input)))
}
Hosts MUST:
1. Drop non-recipe fields from the input before canonicalization (§A closed-set rule)
2. Canonicalize per replay.md §B (RFC 8785 JCS-style: sorted keys recursively, no whitespace, preserve array order, UTF-8 NFC strings)
3. Return SHA-256 over the canonical bytes as lowercase hex
A missing or malformed provider/model/messages field MUST return 400 invalid_argument.
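The three steps can be sketched as below. This is a simplified illustration, not the normative recipe: the recipe field list is copied from the request shape above, `projectRecipe` and `canonicalize` borrow their names from the response comment, and the canonicalizer relies on `JSON.stringify` for number and string escaping, which is assumed (not verified here) to agree with replay.md §B for ordinary inputs. NFC normalization is applied to string values only.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Closed set of recipe fields, taken from the request shape above.
const RECIPE_FIELDS = [
  "provider", "model", "messages", "tools",
  "temperature", "topP", "topK", "responseFormat",
] as const;

// Step 1: keep only recipe fields; everything else is dropped.
function projectRecipe(input: Record<string, Json>): Record<string, Json> {
  const out: Record<string, Json> = {};
  for (const field of RECIPE_FIELDS) {
    if (input[field] !== undefined) out[field] = input[field];
  }
  return out;
}

// Step 2: sorted keys (recursively), no whitespace, array order preserved.
function canonicalize(value: Json): string {
  if (typeof value === "string") return JSON.stringify(value.normalize("NFC"));
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const members = Object.keys(value)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
  return `{${members.join(",")}}`;
}

// Step 3: SHA-256 over the canonical UTF-8 bytes, as lowercase hex.
function llmCacheKey(input: Record<string, Json>): string {
  const canonical = canonicalize(projectRecipe(input));
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

Because non-recipe fields are removed before hashing and keys are sorted, two requests that differ only in `seed`, `stream`, or property order yield the same key, which is the invariance the conformance scenarios probe.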
Conformance: replay-llm-cache-key.test.ts, replay-llm-cache-key-portable.test.ts.
5. Staged-refusal seam — POST /v1/host/sample/test/mock-ai/program mode refusal (RFC 0041 §B)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/test/mock-ai/program |
| Capability gate | capabilities.multiAgent.executionModel.replayDeterminism.refusalDivergenceEmission: true (RFC 0041 Phase 4) |
| Env gate (reference impl) | OPENWOP_TEST_SEAM_ENABLED=true |
| Introduced | RFC 0041 §B; reuses the existing mock-AI program seam introduced by RFC 0032 §C |
The replay.divergedAtRefusal behavioral assertion requires staging the mock-AI provider to return a valid envelope on the original run and a refusal on the replay (or vice-versa). Phase 4 hosts that advertise refusalDivergenceEmission: true MUST honor the following program shape on POST /v1/host/sample/test/mock-ai/program:
{
nodeId: string,
program: [
{ mode: 'envelope', envelope: { /* valid LLM envelope */ } }, // original run gets this
{ mode: 'refusal', refusalReason: string }, // replay gets this
],
}
The host's mock-AI provider MUST honor the program deterministically by attempt index: the first call (original run) returns the first entry; the second call (replay) returns the second entry. The seam is callable BEFORE the run is created — each conformance scenario uses a unique fixture (and therefore unique nodeId).
When the replay's mock-AI call hits the refusal entry, the host MUST:
1. Emit a replay.divergedAtRefusal event with payload per schemas/run-event-payloads.schema.json §replayDivergedAtRefusal
2. Fail the replay with HTTP 422 + error.code: "replay_diverged_at_refusal"
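The attempt-index rule can be illustrated with a small in-memory stand-in for the mock-AI provider. This is a sketch under assumptions: `ProgrammedMockAi` and its method names are hypothetical, and throwing once the program is exhausted is a choice made here, not something the text above specifies.

```typescript
type ProgramEntry =
  | { mode: "envelope"; envelope: Record<string, unknown> }
  | { mode: "refusal"; refusalReason: string };

class ProgrammedMockAi {
  private programs = new Map<string, ProgramEntry[]>();
  private attempts = new Map<string, number>();

  // Mirrors the seam body: stage a program for a node before the run exists.
  program(nodeId: string, entries: ProgramEntry[]): void {
    this.programs.set(nodeId, entries);
    this.attempts.set(nodeId, 0);
  }

  // Returns the entry at the current attempt index, then advances the index.
  call(nodeId: string): ProgramEntry {
    const entries = this.programs.get(nodeId);
    const attempt = this.attempts.get(nodeId) ?? 0;
    if (!entries || attempt >= entries.length) {
      throw new Error(`no programmed response for ${nodeId} at attempt ${attempt}`);
    }
    this.attempts.set(nodeId, attempt + 1);
    return entries[attempt];
  }
}
```

In this model the original run's call consumes index 0 (the envelope) and the replay's call consumes index 1 (the refusal), at which point a host would emit the divergence event and fail the replay as listed above.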
Conformance: replay-divergence-at-refusal.test.ts (advertisement-shape probe lives now; the 2 behavioral it.todo assertions light up when this seam is wired).
6. Multi-region idempotency simulator — POST /v1/host/sample/test/multi-region/simulate-partition (RFC 0036 §C)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/test/multi-region/simulate-partition |
| Capability gate | capabilities.idempotency.multiRegion.supported: true OR capabilities.idempotency.crossRegion ∈ {best-effort, strict} (RFC 0036) |
| Env gate (reference impl) | OPENWOP_TEST_MULTI_REGION_SIMULATOR=true |
| Introduced | RFC 0036 §C — closes the CF-12 / OPS-5 multi-region simulation gap named in docs/KNOWN-LIMITS.md |
The convergence rule in spec/v1/idempotency.md §"Multi-region idempotency annex" §"Convergence rule" is a pure-function MUST: given ≥2 conflicting ConflictClaim records sharing (tenantId, endpoint, key), the resolver MUST return the lex-min runId as the winner deterministically without coordination. This seam exposes that algorithm directly so conformance can mechanically verify the property against synthetic partitions (no actual multi-region replication required).
Request:
{
claims: Array<{
runId: string, // engine-assigned id; lex-sort determines winner
tenantId: string, // claims with different tenantId MUST be rejected (400)
endpoint: string, // claims with different endpoint MUST be rejected (400)
key: string, // claims with different key MUST be rejected (400)
region: string, // identifies which region produced this claim
}> // length ≥ 2; length < 2 MUST be rejected (400)
}
Response (200 OK):
{
winner: ConflictClaim, // lex-min runId
losers: ConflictClaim[], // N-1 entries
cacheRedirects: Array<{ // N entries (one per region)
region: string,
cacheKey: string, // `${endpoint}:${key}`
redirectToRunId: string, // winner.runId
}>,
loserCancelReason: 'cross_region_dedup_loss', // canonical literal
}
Idempotency: the resolver is a pure function with no side effects. Same inputs → same outputs across calls. Hosts MAY cache results but the seam itself doesn't persist state.
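A pure-function resolver matching the request and response shapes above might look like this sketch. Assumptions are labeled: `resolvePartition` and `InvalidClaimsError` are hypothetical names (a host would map the error to HTTP 400), "lex-min" is taken to mean plain string comparison on runId, runIds are assumed distinct, and redirects are emitted in runId order so the output does not depend on input order.

```typescript
interface ConflictClaim {
  runId: string;
  tenantId: string;
  endpoint: string;
  key: string;
  region: string;
}

interface PartitionResolution {
  winner: ConflictClaim;
  losers: ConflictClaim[];
  cacheRedirects: Array<{ region: string; cacheKey: string; redirectToRunId: string }>;
  loserCancelReason: "cross_region_dedup_loss";
}

// Hypothetical error type; a host would translate it into a 400 response.
class InvalidClaimsError extends Error {}

function resolvePartition(claims: ConflictClaim[]): PartitionResolution {
  if (claims.length < 2) throw new InvalidClaimsError("at least 2 claims are required");
  const first = claims[0];
  const mismatch = claims.some(
    (c) => c.tenantId !== first.tenantId || c.endpoint !== first.endpoint || c.key !== first.key,
  );
  if (mismatch) throw new InvalidClaimsError("claims must share (tenantId, endpoint, key)");

  // Sort a copy so the caller's array is untouched; the smallest runId wins.
  const byRunId = [...claims].sort((a, b) => (a.runId < b.runId ? -1 : a.runId > b.runId ? 1 : 0));
  const winner = byRunId[0];
  return {
    winner,
    losers: byRunId.slice(1),
    cacheRedirects: byRunId.map((c) => ({
      region: c.region,
      cacheKey: `${first.endpoint}:${first.key}`,
      redirectToRunId: winner.runId,
    })),
    loserCancelReason: "cross_region_dedup_loss",
  };
}
```

Because the function reads only its arguments and sorts before deriving every output field, calling it with the same claims in any order produces the same result.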
Conformance: multi-region-idempotency-behavior.test.ts (6 assertions covering lex-min winner, multi-region cache redirects, canonical cancel reason, order-invariance, and 400-on-tuple-mismatch).
7. Cross-engine append-ordering harness — /v1/host/sample/test/cross-engine/{append,read,reset} (RFC 0036 §B)
| Field | Value |
|---|---|
| Method + path | 3 endpoints (see below) |
| Capability gate | capabilities.eventLog.crossEngineOrdering.supported: true (RFC 0036 §B) |
| Env gate (reference impl) | OPENWOP_TEST_CROSS_ENGINE_HARNESS=true |
| Introduced | RFC 0036 §B — closes the CF-8 cross-engine append-ordering gap named in docs/KNOWN-LIMITS.md |
The cross-engine ordering invariant in spec/v1/channels-and-reducers.md §"Cross-engine ordering" requires that two engine instances writing to the same shared channel converge to a single globally-ordered linearization on read. This seam exposes a synthetic two-engine harness so conformance can verify the property without standing up two real engine instances.
Endpoints:
POST /v1/host/sample/test/cross-engine/append
Body: { engineId: string, channelId: string, value: unknown, lamport?: number }
Returns: { engineId, value, lamport, seq } — the assigned timestamp + sequence
GET /v1/host/sample/test/cross-engine/read?channelId=<id>
Returns: { entries: AppendEntry[] } — linearized by (lamport, engineId, seq)
POST /v1/host/sample/test/cross-engine/reset
Body: {}
Returns: { ok: true } — clears the in-memory log
Lamport-clock semantics (the host's advertised orderingModel: 'lamport'):
- Each append advances the engine's clock to max(local, incoming) + 1
- The lamport? field on append is the engine's view of the OTHER engine's clock (incoming hint); honored per the lamport receive rule
- read linearizes by (lamport ASC, engineId ASC, seq ASC) — a deterministic total order
- Hosts advertising a different orderingModel (vector-clock, global-sequencer, or x-host-<host>-<key>) MAY substitute their own algorithm but MUST honor the same append/read/reset contract
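The Lamport rules above fit in a short in-memory model. This sketch makes two assumptions the text does not settle: each engine's clock starts at 0, and `seq` is a single arrival counter shared by the whole harness. The class name `CrossEngineHarness` is illustrative.

```typescript
interface AppendEntry {
  engineId: string;
  value: unknown;
  lamport: number;
  seq: number;
}

class CrossEngineHarness {
  private clocks = new Map<string, number>();      // per-engine Lamport clock
  private log = new Map<string, AppendEntry[]>();  // entries per channel
  private nextSeq = 0;                             // assumed: harness-wide arrival counter

  append(engineId: string, channelId: string, value: unknown, lamport?: number): AppendEntry {
    const local = this.clocks.get(engineId) ?? 0;
    // Receive rule: jump past the larger of the local clock and the incoming hint.
    const assigned = Math.max(local, lamport ?? 0) + 1;
    this.clocks.set(engineId, assigned);
    const entry: AppendEntry = { engineId, value, lamport: assigned, seq: this.nextSeq++ };
    const entries = this.log.get(channelId) ?? [];
    entries.push(entry);
    this.log.set(channelId, entries);
    return entry;
  }

  // Linearizes by (lamport ASC, engineId ASC, seq ASC).
  read(channelId: string): AppendEntry[] {
    return [...(this.log.get(channelId) ?? [])].sort(
      (a, b) =>
        a.lamport - b.lamport ||
        (a.engineId < b.engineId ? -1 : a.engineId > b.engineId ? 1 : 0) ||
        a.seq - b.seq,
    );
  }

  reset(): void {
    this.clocks.clear();
    this.log.clear();
    this.nextSeq = 0;
  }
}
```

Two engines that append concurrently at the same Lamport time are ordered by engineId, so repeated reads return the same sequence regardless of arrival order.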
Conformance: cross-engine-append-behavior.test.ts (4 assertions covering global linearization, lamport monotonicity, receive-rule advancement, and read-determinism).
8. Sandbox MVP — POST /v1/host/sample/test/sandbox-{load,invoke} (RFC 0035)
| Field | Value |
|---|---|
| Method + path | 2 endpoints (see below) |
| Capability gate | capabilities.sandbox.supported: true (RFC 0035 §A) |
| Env gate (reference impl) | OPENWOP_TEST_SANDBOX_MVP=true |
| Introduced | RFC 0035 §B — exercises the 8 sandbox failure-mode invariants against a synthetic misbehaving-pack registry |
The sandbox seam exists so conformance can drive the §B failure-mode invariants without a real pack runtime + real misbehaving pack tarballs. Each sandbox-invoke request names a synthetic typeId from the host's pre-populated misbehaving-pack registry; the host executes the matching code body inside its sandbox and returns either the result or a typed error envelope per host-capabilities.md §"Error codes".
Endpoints:
POST /v1/host/sample/test/sandbox-load
Body: { packId: string }
Returns: 200 { ok: true, packId } | 400 validation_error | 404 sandbox_pack_not_found
POST /v1/host/sample/test/sandbox-invoke
Body: {
typeId: string, // e.g. 'misbehave.fs-escape-read'
args?: Record<string, unknown>, // available as `args` inside the sandboxed code
packId?: string, // identifies the pack containing typeId
allowedHostCalls?: string[], // capability-gate whitelist for this invocation
}
Returns: 200 { result: unknown } | 200 { error: SandboxError }
SandboxError shape (canonical per host-capabilities.md §"Error codes"):
{
code:
| 'sandbox_escape_attempt' // forbidden-syscall escape (fs/env/network/process)
| 'sandbox_capability_denied' // host call not in allowedHostCalls
| 'sandbox_memory_exceeded' // memoryLimitBytes overflow
| 'sandbox_timeout' // wallClockLimitMs overflow
| 'sandbox_invocation_error', // fallback for thrown errors not in the canonical catalog
details: {
escapeKind?: // SET when code === 'sandbox_escape_attempt'
| 'host-fs-escape'
| 'host-env-leak'
| 'network-escape'
| 'host-process-escape',
requestedCapability?: string, // REQUIRED when code === 'sandbox_capability_denied'
requestedBytes?: number, // MAY appear when code === 'sandbox_memory_exceeded'
message: string,
},
}
Synthetic misbehaving-pack typeIds the conformance suite exercises:
| typeId | Failure mode it probes |
|---|---|
| misbehave.fs-escape-read | sandbox_escape_attempt + escapeKind: host-fs-escape |
| misbehave.fs-escape-write | sandbox_escape_attempt + escapeKind: host-fs-escape |
| misbehave.env-leak | sandbox_escape_attempt + escapeKind: host-env-leak |
| misbehave.network-escape | sandbox_escape_attempt + escapeKind: network-escape |
| misbehave.process-escape | sandbox_escape_attempt + escapeKind: host-process-escape |
| misbehave.timeout | sandbox_timeout |
| misbehave.memory-bomb | sandbox_memory_exceeded |
| misbehave.cross-pack-mutate | (no failure; result.shared MUST equal 1 on every invocation — cross-pack mutation MUST NOT leak across fresh contexts) |
| misbehave.capability-gate-violation | sandbox_capability_denied + details.requestedCapability |
| well-behaved.echo | (no failure; result.echoed === args.input) |
| well-behaved.host-fetch | (no failure when allowedHostCalls includes 'fetch') |
Conformance: sandbox-mvp-behavior.test.ts (10 assertions covering 5 escape kinds + timeout + memory + cross-pack isolation + capability-gate + 2 well-behaved baselines).
9. Workspace cross-owner driver — POST /v1/host/sample/workspace/op (RFC 0059)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/workspace/op |
| Capability gate | capabilities.workspace.supported: true (RFC 0059 §A) |
| Env gate (reference impl) | none (the in-memory host enables it unconditionally; production hosts gate per the §"Production safety" rule below) |
| Introduced | RFC 0059 §E — drives host.workspace CRUD against an EXPLICIT {tenant, workspace} owner so the workspace-cross-tenant-isolation (WCT-1) invariant is exercisable on a single-credential host (mirrors the blob/kv/queue/table cross-tenant seams) |
The production §C endpoints (/v1/host/workspace/files) bind every request to one authenticated owner, so a single-credential host cannot demonstrate cross-owner isolation through them. This seam takes the {tenant, workspace} owner in the body — letting a conformance scenario write as owner A and attempt a read as owner B — and routes through the SAME owner-scoped store the §C endpoints use. The host MUST still scope strictly by the supplied owner triple (WCT-1); the seam only supplies the triple that production resolves from the authenticated identity.
POST /v1/host/sample/workspace/op
Body: {
tenant: string, // owner tenant (RFC 0048)
workspace: string, // owner workspace
op: 'list' | 'get' | 'put' | 'delete',
path?: string, // required for get/put/delete
content?: string, // required for put
contentType?: string, // optional for put
ifMatch?: string, // optional optimistic-concurrency token for put
prefix?: string, // optional filter for list
version?: number, // optional historical read for get
}
Returns: the same body/status as the matching §C endpoint
(200 WorkspaceFile | 200 { files } | 204 | 404 not_found
| 409 workspace_conflict | 413 workspace_too_large)
Conformance: workspace-cross-tenant-isolation.test.ts (WCT-1 — write as owner A, then assert a different workspace AND a different tenant both fail closed on get/list, while the owner still reads its own file).
10. Connection-pack install/resolve/consent driver — POST /v1/host/sample/connection-packs/{install,resolve,consent-plan} (RFC 0095)
| Field | Value |
|---|---|
| Method + path | POST /v1/host/sample/connection-packs/install · POST /v1/host/sample/connection-packs/resolve · POST /v1/host/sample/connection-packs/consent-plan |
| Capability gate | capabilities.connections.packsSupported: true (RFC 0095 §C) |
| Env gate (reference impl) | seam registered when connections.packsSupported is asserted; production hosts gate per §"Production safety" |
| Introduced | RFC 0095 §Conformance — drives connection-packs.md §Manifest clauses 2/4/6/8 black-box on hosts whose install path is otherwise boot-time or publish-time |
Connection packs install through host-specific channels (a boot-time loader on the reference app; a registry publish path on other hosts), so the §Manifest clause 2/6/8 behaviors need a uniform driver for black-box conformance. The seams route through the SAME validation + resolution code paths the host's production install channel uses; they only supply the manifest (and, for resolve, an optional simulated built-in) that production sources elsewhere.
POST /v1/host/sample/connection-packs/install
Body: { manifest: <connection-pack manifest JSON> }
Returns: 200 {
installed: boolean,
errors?: Array<{ code: string, path?: string }>,
// code ∈ connection_pack_credential_material | pack_kind_invalid
// | schema-validation identifiers (host-specific)
}
POST /v1/host/sample/connection-packs/resolve
Body: {
provider: string, // the RFC 0045/0047 provider id
simulateBuiltinVersion?: string, // optional: behave as if a built-in
// definition of `provider` at this
// version existed (SemVer §11 probe)
}
Returns: 200 {
resolved: boolean,
source?: 'pack' | 'builtin',
version?: string,
code?: 'connection_provider_unresolved' | 'connection_provider_conflict',
}
POST /v1/host/sample/connection-packs/consent-plan
Body: { provider: string, requested: Array<'read' | 'write'> }
Returns: 200 {
steps: Array<{
groups?: Array<{ key: string, access: 'read' | 'write' }>,
includesWrite?: boolean,
}>,
}
The install seam MUST run the clause-2 credential-material scan BEFORE generic schema validation (the specific code wins); a rejected manifest is NOT installed and MUST NOT disturb other installed packs (clause 8). The resolve seam applies the clause-6 precedence rule (installed ≥ built-in per SemVer §11, else connection_provider_conflict). The consent-plan seam returns the host's planned consent sequence; write groups MUST occupy a separate step from the initial read authorization (clause 4).
Conformance: connection-pack-no-credential-material.test.ts (specific-code leg), connection-provider-resolution.test.ts (clauses 6 + 8), connection-pack-write-reconsent.test.ts (clause 4).
11. Reviewable-learning / goals / portability surfaces — /v1/host/sample/{proposals,goals,export,import} (RFCs 0096/0097/0098)
| Field | Value |
|---|---|
| Method + path | /v1/host/sample/proposals[...] (RFC 0096) · /v1/host/sample/goals[...] (RFC 0097) · GET /v1/host/sample/export · POST /v1/host/sample/import[?dryRun=] (RFC 0098) |
| Capability gate | capabilities.agents.proposals · capabilities.agents.goals · capabilities.portability |
| Env gate (reference impl) | seam registered when the matching capability is asserted; production hosts gate per §"Production safety". These are the floor surfaces, promotable to the normative /v1/{proposals,goals,export,import} paths at graduation (RFC 0086 precedent). |
| Introduced | RFCs 0096/0097/0098 §Conformance — black-box drivers for the inertness / bounded-continuation / no-secret-values behavioral legs |
# RFC 0096 — proposals
GET /v1/host/sample/proposals[?state=&kind=] → 200 { proposals: Proposal[] }
GET /v1/host/sample/proposals/{id} → 200 Proposal
PATCH /v1/host/sample/proposals/{id} → 200 Proposal # revise; MUST NOT activate
POST /v1/host/sample/proposals/{id}/apply → 200 { installedArtifactRef } | 403 (no scope) | 422 (malformed-for-kind)
POST /v1/host/sample/proposals/{id}/reject → 200 Proposal
DELETE /v1/host/sample/proposals/{id} → 200 Proposal # archive (soft)
# RFC 0097 — goals (no `complete`/`satisfy` write: completion is the judge's verdict)
GET /v1/host/sample/goals[?state=] → 200 { goals: Goal[] }
GET /v1/host/sample/goals/{id} → 200 Goal
POST /v1/host/sample/goals → 200 Goal | 422 (requiresBounds advertised + no bounds)
PATCH /v1/host/sample/goals/{id} → 200 Goal | 4xx (client-set state:satisfied refused)
POST /v1/host/sample/goals/{id}/{pause,resume,abandon} → 200 Goal
# RFC 0098 — portability
GET /v1/host/sample/export[?kinds=] → 200 ExportBundle # refs only, no secret values
POST /v1/host/sample/import?dryRun=true → 200 ImportPlan # no writes
POST /v1/host/sample/import → 200 ImportResult | 422 (literal credential value | dependsOn cycle) | 403 (no import scope)
The proposals/{id}/apply seam MUST install the byte image last persisted on the proposal (no re-synthesis — proposal-no-resynthesis) and MUST route activation through the advertised agents.proposals.activation mode. The goals POST seam MUST reject a bounds-less goal 422 when requiresBounds is advertised, and a client-supplied state: satisfied is refused on PATCH (goal-completion-judge-only). The import seam MUST reject a bundle whose connection-ref payload carries a literal credential value 422 BEFORE applying (export-bundle-no-credential-material), and ?dryRun=true MUST make zero writes.
Conformance: proposal-reviewable-learning.test.ts, goal-standing-continuation.test.ts, export-bundle-portability.test.ts (each soft-skips on 404 when the seam is unwired).
Production safety (normative)
All seams under /v1/host/sample/* are conformance-only. Hosts deployed in production:
- SHOULD return 404 Not Found from every seam unless an env-gate explicitly enables it
- MUST NOT honor the seams under default deployment configuration
- MUST document which env-gates were set for the conformance run in the host's conformance.md evidence file
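One way a host could enforce these rules is a default-deny gate in front of the seam routes. The sketch below is illustrative only: the path-to-variable table copies four env-gate names from the per-seam tables above, while `isSeamEnabled`, `gateSeam`, and the choice to treat only the literal string "true" as enabling are assumptions.

```typescript
// Env-gate names copied from the per-seam tables; the map itself is illustrative.
const SEAM_ENV_GATES: Record<string, string> = {
  "/v1/host/sample/test/otel/spans": "OPENWOP_TEST_OTEL_SCRAPE",
  "/v1/host/sample/test/debug-bundle/export": "OPENWOP_TEST_DEBUG_BUNDLE_EXPORT",
  "/v1/host/sample/test/multi-region/simulate-partition": "OPENWOP_TEST_MULTI_REGION_SIMULATOR",
  "/v1/host/sample/test/sandbox-invoke": "OPENWOP_TEST_SANDBOX_MVP",
};

type Env = Record<string, string | undefined>;

// Default deny: a seam is enabled only when its gate is known and set to "true".
function isSeamEnabled(path: string, env: Env): boolean {
  const gate = SEAM_ENV_GATES[path];
  return gate !== undefined && env[gate] === "true";
}

// Returns 404 when the request must be refused, or null to let it through.
function gateSeam(path: string, env: Env): 404 | null {
  return isSeamEnabled(path, env) ? null : 404;
}
```

With an empty environment every listed seam answers 404, which matches the default-deployment rule; the variables that were set for a conformance run are then the ones to record in the evidence file.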
The host-extension namespace /v1/host/sample/* is per host-extensions.md §"Canonical prefixes" — it is host-private space and does not affect the v1 wire-shape stability contract.
Canonical-endpoint conformance hooks
A handful of conformance assertions exercise wire-surface contracts that ride the canonical OpenWOP REST endpoints rather than a dedicated /v1/host/sample/ seam. These hooks need an operator-provided seed runId (or equivalent) communicated via an OPENWOP_TEST_ environment variable so the conformance driver can target a known refusal-eligible state without smuggling a host-private endpoint.
10. POST /v1/runs/{runId}:fork mode:replay against a past-retention runId (RFC 0039 §B MAE-3)
The MAE-3 contract is: a fork from a past event-log index MUST either serve memory-as-of that index OR refuse with 422 replay_memory_snapshot_unavailable per rest-endpoints.md §"Common error codes" — silent substitution of current memory is non-conformant.
The conformance driver targets the canonical fork endpoint with mode: "replay". The host's pre-flight order is normative for distinguishing this refusal from neighboring 422s:
1. checkFromSeqBounds(fromSeq, maxSeq) runs FIRST and returns 422 invalid_from_seq for fromSeq > maxSeq + 1. An impossible-fromSeq driver hits this gate, NOT MAE-3.
2. checkReplayMemorySnapshotPreflight(...) runs AFTER bounds-check and returns 422 replay_memory_snapshot_unavailable ONLY when the memory snapshot for an in-bounds fromSeq cannot be served — details.reason MUST be one of {"retention_expired", "event_log_unavailable"}.
Driving MAE-3 from outside therefore requires an actually-realized refusal-eligible state. Conventions:
| Hook | Env var | Realizes |
|---|---|---|
| Past-retention run | OPENWOP_TEST_EXPIRED_REPLAY_RUN_ID | A known runId whose event log has aged past the host's retention window; forking with mode: "replay" returns details.reason: "retention_expired". Operator provides the runId via env (parallel naming to the existing OPENWOP_TEST_EXPIRED_RUN_ID used by production-retention-expiry). |
| Event-log-unavailable run | (host-side fault-injection seam) | Not deterministically reproducible from outside — requires a host-side fault-injection seam to mark a run's event log unavailable. Documented here for completeness; no env-var convention yet. |
Envelope shape (normative; covered behaviorally in multi-agent-memory-lifecycle.test.ts):
{
"error": "replay_memory_snapshot_unavailable",
"details": {
"fromSeq": 0,
"sourceRunId": "<runId from the URL>",
"reason": "retention_expired"
}
}
details.reason MUST be one of {"retention_expired", "event_log_unavailable"}. The host MAY add additional optional fields under details; fromSeq MUST echo the requested fromSeq and sourceRunId MUST echo the runId from the URL.
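The pre-flight ordering and the envelope can be combined in one sketch. It is a simplified model under assumptions: `preflightReplayFork` and `SnapshotStatus` are hypothetical names, the snapshot availability is passed in rather than looked up, and the body of the invalid_from_seq refusal is reduced to its error code because its full shape is not given here.

```typescript
type SnapshotStatus = "available" | "retention_expired" | "event_log_unavailable";

interface Refusal {
  status: 422;
  body: {
    error: string;
    details?: { fromSeq: number; sourceRunId: string; reason: string };
  };
}

// Returns a refusal, or null when the fork may proceed.
function preflightReplayFork(
  sourceRunId: string,
  fromSeq: number,
  maxSeq: number,
  snapshot: SnapshotStatus,
): Refusal | null {
  // Gate 1: the bounds check runs first, so an impossible fromSeq never reaches MAE-3.
  if (fromSeq > maxSeq + 1) {
    return { status: 422, body: { error: "invalid_from_seq" } };
  }
  // Gate 2: an in-bounds fromSeq whose memory snapshot cannot be served.
  if (snapshot !== "available") {
    return {
      status: 422,
      body: {
        error: "replay_memory_snapshot_unavailable",
        details: { fromSeq, sourceRunId, reason: snapshot }, // echoes the request
      },
    };
  }
  return null;
}
```

An out-of-bounds fromSeq against an expired run therefore yields invalid_from_seq, not the MAE-3 refusal, which is why the driver needs a seeded runId with an in-bounds fromSeq.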
Conformance: multi-agent-memory-lifecycle.test.ts (the MAE-3 behavioral assertion soft-skips when OPENWOP_TEST_EXPIRED_REPLAY_RUN_ID is unset OR the host does not advertise multiAgent.executionModel.version >= 2 + memory.supported: true).
Open seams (light up when fixtures ship)
- Memory cross-run TTL roundtrip seam (RFC 0039 MAE-2) —
POST /v1/host/sample/test/memory/cross-run-ttl-roundtrip. Contract: drive a parent → child → parent memory write/read sequence with controlled wall-clock skew to assert child-write-time TTL anchoring. Behavioral assertion inmulti-agent-memory-lifecycle.test.tsstaysit.todountil a memory-advertising Phase 2 host wires the seam. - Credential resolution + redaction seam (RFC 0046) —
POST /v1/host/sample/credentials/echo. Gated oncapabilities.credentials.supported. Contract: resolve a seeded credential whose plaintext is a known canary, run an echo node, and return the run's observable surfaces (events + inputs + variables + channels + snapshot + debug bundle). The behavioral assertion incredential-payload-redaction.test.tsasserts the canary is absent from every returned surface (SECURITY invariantcredential-payload-redaction); soft-skips on404until a credentials-advertising host wires the seam. - OAuth connector-echo seam (RFC 0047) —
POST /v1/host/sample/oauth/connector-echo. Gated oncapabilities.oauth.supported. Contract: a synthetic provider issues a token whose value is a known canary; a connector node runs; the run's observable surfaces (including theconnector.authorizedevent) are returned.oauth-connector-redaction.test.tsasserts the token canary is absent from every surface and thatconnector.authorizedcarries the credential reference, not the token (reuses thecredential-payload-redactioninvariant); soft-skips on404. - Run-ownership seam (RFC 0048) —
GET /v1/host/sample/identity/owned-run. Contract: return aRunSnapshotthat carries anownertriple.cross-workspace-isolation.test.tsasserts the owner echo carries a non-emptytenant; soft-skips on404(or whenowneris omitted by a single-tenant host). - Cross-workspace isolation seam (RFC 0048 §D) —
POST /v1/host/sample/identity/cross-workspace-read. Contract: aprincipalscoped to workspace A attempts to read a run owned by workspace B.cross-workspace-isolation.test.tsasserts the read fails closed withrun_forbidden/not_found(no existence leak); soft-skips on404until a workspace-ownership host wires the seam. - Authorization-decision seam (RFC 0049 §C) —
POST /v1/host/sample/authorization/decide. Gated oncapabilities.authorization.supported. Contract: request a decision ({ principal, action, resource }) for a principal whose role is absent/unseeded; the host MUST return{ allowed: false }(fail-closed).authorization-fail-closed.test.tsasserts the deny (SECURITY invariantauthorization-fail-closed); soft-skips on404until an authorization-advertising host wires the seam. - SAML assertion-validation seam (RFC 0050) —
POST /v1/host/sample/auth/saml/validate. Gated oncapabilities.auth.profiles[]includesopenwop-auth-saml+ an operator-supplied synthetic IdP (OPENWOP_TEST_SAML_IDP_URL). Contract: present an assertion of a namedvariant(valid,alg-none,bad-signature,unsigned,expired,not-yet-valid,signature-wrapping); the host MUST acceptvalidand reject every negative withunauthenticated.auth-saml-profile.test.tsdrives the negatives — the 1-positive + 6-negative assertions are minted by the bundled synthetic IdP harness (conformance/src/lib/saml-idp.ts), which also runs the negative reference suite server-free; the host-ACS path soft-skips on404/ absent env. - SCIM provisioning seam (RFC 0050) —
`POST /v1/host/sample/auth/scim/provision`. Gated on `capabilities.auth.profiles[]` including `openwop-auth-scim`, plus an operator-supplied SCIM endpoint (`OPENWOP_TEST_SCIM_URL`). Contract: drive a SCIM `create-user` / `assign-group` / `deactivate-user` op; the host MUST upsert an RFC 0048 principal / RFC 0049 role and deny a deactivated principal's subsequent decisions. `auth-scim-profile.test.ts` drives the roundtrip; soft-skips on `404` / absent env.
- Approval-gate seam (RFC 0051):
`POST /v1/host/sample/governance/approval-gate`. Gated on `capabilities.authorization.supported`. Contract: drive a named `scenario` (`unauthorized-grant`, `grant`, `reject`, `override`, `quorum`) against a `core.openwop.governance.approvalGate` node; the host returns `{ released, event }` reflecting the outcome (an unauthorized principal MUST NOT release; `override` MUST emit `approval.overridden` with a `reason` + an audit entry). `approval-gate-flow.test.ts` drives unauthorized + override-audited; soft-skips on `404` until a governance-advertising host wires the seam.
- Scheduling tick seam (RFC 0052):
`POST /v1/host/sample/scheduling/tick`. Gated on `capabilities.scheduling.supported` + `cron: true`. Contract: advance a deterministic clock for a named `scenario` (`single-tick`, `missed-window` with `missedTicks`) and return `{ runsFired }`, the count of runs a cron schedule produced. The host MUST report `runsFired === 1` for a single tick (once-per-tick) and `runsFired <= 1` for a missed window (no backlog flood). `scheduling-cron-fires-once.test.ts` drives both; soft-skips on `404` until a scheduling host wires the seam. (Delayed-execution horizon + calendar scenarios deferred.)
- Heartbeat tick seam (RFC 0060):
`POST /v1/host/sample/heartbeat/tick`. Gated on `capabilities.heartbeat.supported`. Contract: evaluate a heartbeat predicate once for a request `{ heartbeatId, observedState, simulateSlowMs? }` (`simulateSlowMs` asks the predicate to overrun `maxRuntimeMs`, exercising the §B.2 timeout path) and return `{ evaluated: HeartbeatEvaluated[], stateChanged: HeartbeatStateChanged[], enqueuedRuns: number }`: exactly one `evaluated` per tick (§B.1); `stateChanged` + `enqueuedRuns` non-empty/non-zero ONLY when `observedState` differs from the prior tick's persisted state (§B.5, the anti-spam guarantee); `evaluated[].status === "timeout"` when `simulateSlowMs` exceeds the budget (§B.2). `heartbeat-fires-once-per-tick.test.ts` / `heartbeat-idempotent-no-spam.test.ts` / `heartbeat-runtime-bound.test.ts` drive these; soft-skip on `404` until a heartbeat host wires the seam.
- Tool-hooks invoke seam (RFC 0064):
`POST /v1/host/sample/toolhooks/invoke`. Gated on `capabilities.toolHooks.supported`. Contract: evaluate the per-tool authorization + rate-limit gate for one call `{ principal, toolName, requiredScopes?, args?, simulateRateLimitExhausted? }` and return the `{ toolCalled, toolReturned }` payload pair the host would emit (the additive RFC 0064 fields on the existing `agent.toolCalled` / `agent.toolReturned` events). `toolReturned.status` MUST be `forbidden` when the principal lacks a `requiredScopes` entry (or authz is unevaluable: fail-closed, RFC 0049), `rate_limited` when `simulateRateLimitExhausted` is set, else `ok` with a non-negative `durationMs`; `toolCalled.argsHash` MUST be a secret-redacted (SR-1) JCS+SHA-256 hash carrying no raw secret material. `tool-hooks-content-free.test.ts` / `tool-hooks-authorization-fail-closed.test.ts` / `tool-hooks-rate-limit.test.ts` / `tool-hooks-secret-redaction.test.ts` drive these; soft-skip on `404` until a tool-hooks host wires the seam.
- Sub-run attestation seam (RFC 0063):
`POST /v1/host/sample/subrun/attest`. Gated on `capabilities.agents.subRunAttestation`. Contract: drive one sub-workflow harvest-then-merge for a request `{ childOutputs, outputAttestation: { checksum?, algorithm?, requireApproval?, principalScope? }, approvalAction? }` and return `{ attestation, harvestedEvent, merged, mergedValues? }`: the `attestation { checksum, algorithm }` the host would surface on `core.workflowChain.event { phase: 'output.harvested' }`, whether the merge proceeded, and the merged values. The `checksum` MUST be the RFC 8785 JCS + SHA-256 digest of `childOutputs` (byte-stable for identical inputs, host-independent). When `requireApproval: true`, `merged` MUST be `true` only for `approvalAction` `accept` / `edit-accept` and MUST be `false` (fail-closed) for `reject` or an absent/expired approval. `subrun-checksum-stable.test.ts` / `subrun-approval-gate.test.ts` / `subrun-approval-fail-closed.test.ts` drive these; soft-skip on `404` until a sub-run-attestation host wires the seam.
- Memory-distillation seam (RFC 0062):
`POST /v1/host/sample/memory/distill`. Gated on `capabilities.memory.distillation.supported`. Contract: run one budgeted distillation for a request `{ memoryRef, tokenBudget?, sources?, indexEmitted?, includeSecretCanary? }` and return `{ event, archiveChecksum, indexUpdated, indexFile? }`: the `memory.compacted` event the host would emit (carrying the additive `distillation { tokenBudget, tokensUsed, indexUpdated }` sub-object) plus the stable archive's checksum. `event.distillation.tokensUsed` MUST be ≤ the resolved `tokenBudget`; an un-meetable budget MUST return `token_budget_exceeded` with no partial archive (atomic). The same `sources` + `tokenBudget` MUST yield an identical `archiveChecksum` (byte-stable). When `indexEmitted` is set, a `MEMORY-INDEX.json` workspace file MUST be retrievable and a `workspace.updated` event fired. When `includeSecretCanary` is set, a redacted secret in the sources MUST stay redacted in the archive (SR-1). `distillation-token-budget.test.ts` / `distillation-stable-archive.test.ts` / `distillation-index-roundtrip.test.ts` / `distillation-secret-carryforward.test.ts` drive these; soft-skip on `404` until a distillation host wires the seam.
- Dead-letter exhaustion seam (RFC 0053):
`POST /v1/host/sample/deadletter/exhaust`. Gated on `capabilities.deadLetter.supported`. Contract: drive a node that deterministically exhausts a short retry policy for a named `scenario` (`exhaust-retries`, `fork-after-dead-letter`); the host returns `{ event, forkEligible }`: the `run.dead_lettered` event (carrying `attempts`) and whether the dead-lettered run is forkable. `deadletter-retry-exhaustion.test.ts` drives both; soft-skips on `404` until a dead-letter host wires the seam. (Retention-purge scenario deferred; it needs a clock seam.)
- Agent-loop seam (RFC 0061):
`POST /v1/host/sample/agentloop/run`. Gated on `capabilities.multiAgent.executionModel.version >= 5`. Contract: drive a bounded stateful loop for a request `{ turns, workspaceWriteAtTurn?, suspendAtTurn?, resume? }` and return `{ decisions, workspaceVisible?, resumedIteration? }`: the ordered `runOrchestrator.decided` payloads the host would emit (each carrying the `iteration` counter). `decisions[k].iteration` MUST equal `k+1` (1-based, monotonic, one per turn). When `workspaceWriteAtTurn: i` is set (requires `host.workspace.supported`), `workspaceVisible` MUST report the write invisible to turn _i_'s snapshot and visible to turn _i+1_ (§C input 2). When `suspendAtTurn` + `resume` are set (requires `statefulResume: true`), `resumedIteration` MUST equal the suspend iteration: the counter does not reset or skip (§D). `agent-loop-iteration-monotonic.test.ts` / `agent-loop-workspace-snapshot.test.ts` / `agent-loop-stateful-resume.test.ts` drive these; soft-skip on `404` until a version-5 host wires the seam.
- Runtime-requirement install-gate seam (RFC 0076 §A):
`POST /v1/host/sample/packs/install-gate`. No capability flag (RFC 0076 §A adds a manifest field + host behavior, not an advertisement); soft-skips on `404`. Contract: evaluate a candidate manifest's `runtime.requires[]` against a simulated host grant-set for a request `{ manifest, grantSet?, gating? }` and return the install-time outcome. When `gating !== false` (sandbox host): if every `runtime.requires` entry is in `grantSet` the host MUST return `200 { outcome: "installed" }`; if any entry is not granted the host MUST refuse at install with `400 { error: "pack_runtime_requirement_unmet", unmet: [...], manifest: "<name>@<version>", advice? }` (the `capability_not_provided` envelope shape), NOT install-and-fail-at-first-invocation. When `gating: false` (non-sandbox host) the host installs unconditionally and SHOULD return `200 { outcome: "installed", requiresProjected: [...] }`, the declared requirements projected onto the inventory entry for operator visibility. `runtime-requires-install-gate.test.ts` drives install-grant / install-refuse / non-sandbox-projection; soft-skip on `404` until a runtime-requires-gating host (MyndHyve is the first adopter) wires the seam. The pure-schema vocabulary rejection (`runtime.requires: ["node:dns/promises"]` → `invalid_manifest`) is covered server-free by `runtime-requires-shape.test.ts`.
- Safe-fetch seam (RFC 0076 §B):
`POST /v1/host/sample/http/safe-fetch`. Gated on `capabilities.httpClient.safeFetch.supported`; soft-skips on `404`. Contract: evaluate one `ctx.http.safeFetch` call for a request `{ url, init?, simulateRebindTo? }` and return `{ outcome, status?, blocked?, toolCalled?, toolReturned? }`; the host applies the §host.http SSRF guard (resolve→pin→connect). The host MUST return `{ outcome: "blocked", blocked: "ssrf" }` for a loopback / RFC 1918 / link-local / cloud-metadata target AND for a `simulateRebindTo` that re-resolves a public name to a blocked address (DNS-rebinding); MUST return `{ outcome: "blocked", blocked: "upgrade" }` when `init.headers` requests `Connection: upgrade`; else `{ outcome: "fetched", status }`. When `capabilities.toolHooks.prePostEvents` is also advertised, a fetched call MUST include the `{ toolCalled, toolReturned }` pair (`transport: "http"`). `safefetch-behavior.test.ts` drives SSRF-block / rebinding / upgrade-refusal / audit-when-both; soft-skip on `404` until a `safeFetch` host wires the seam.
- Safe-fetch live-run audit seam (RFC 0076 §B / RFC 0064 §B):
`POST /v1/host/sample/http/safe-fetch-run`. Gated on `capabilities.httpClient.safeFetch.supported` + `capabilities.toolHooks.prePostEvents` (both); soft-skips on `404`. Distinct from the inline `safe-fetch` seam above: this seam executes one `ctx.http.safeFetch` call inside a real run through the host's _production_ per-ctx injection path (the same `ctx.http.safeFetch` a node receives at dispatch), then returns `{ runId, outcome }`. Contract: for a request `{ url, init? }` the host MUST run one `ctx.http.safeFetch` in a real run and return `200 { runId, outcome }` where `outcome` is `"fetched"` (public target the guard allowed) or `"blocked"` (link-local / RFC 1918 / cloud-metadata target the SSRF guard refused); the conformance driver then reads the run's durable event log via `GET /v1/host/sample/test/runs/:runId/events` and asserts a `callId`-paired `agent.toolCalled` (`transport: "http"`) / `agent.toolReturned` was persisted. The audit pair MUST be persisted for _every_ invocation, `blocked` as well as `fetched` (per §host.http "for every safeFetch invocation"; a refused egress attempt is itself a security-relevant event the durable log must capture). `safefetch-live-audit.test.ts` exploits this: it drives a guaranteed-blocked metadata URL as an egress-independent floor (reachable on any host with no outbound connectivity, so the bar can never pass vacuously on an egress-blocked host) plus a best-effort public fetch for success-path coverage. This closes the seam-vs-production gap in `safefetch-behavior.test.ts` (whose audit assertion reads only the inline seam echo): a host can pass the inline seam yet ship a production `createSafeFetch()` with no audit hooks, the "quiet bypass" §host.http forbids. `safefetch-live-audit.test.ts` drives it via `behaviorGate('openwop-safefetch-live-audit', …)` so a host advertising both flags but not emitting to the durable log FAILS under `OPENWOP_REQUIRE_BEHAVIOR=true`; the seam itself soft-skips on `404` (host-pending) until a `safeFetch` host wires it. This is the RFC 0076 §B → Accepted bar.
Load-bearing host note: the audit pair MUST be emitted through the host's _durable_ run-event-log append path (the same path production tool calls use, e.g. `getEventLog().append(runId, 'agent.toolCalled'|'agent.toolReturned', …)` with RFC 0002 §B `callId` pairing + `causationId`), not captured-and-echoed inline like the non-run `safe-fetch` seam above; otherwise the scenario reads the durable log, finds nothing, and correctly fails while the inline seam stays green.
- Run event-log read seam (companion to the live-run seams above; used by
`event-log-query.ts` → `queryTestEvents`): `GET /v1/host/sample/test/runs/:runId/events`. Conformance-only, env-gated; soft-skips on `404` (`isEventLogSeamAvailable()`). Contract: return the run's persisted events as `{ events: TestEvent[] }` (each `{ eventId, runId, type, payload, timestamp, sequence, causationId?, nodeId?, contentTrust? }`), optionally filtered by `?type=&correlationId=&causationId=&nodeId=`. The host MUST workspace-scope the read: refuse (or return empty for) a `runId` outside the caller's `{tenant, workspace}`, so the test seam is never a weaker cross-tenant disclosure path than production (matches the `identity/*` / credential-echo seam RBAC precedent + the WCT-1 posture). Enforcement scope: like every `/v1/host/sample/test/*` seam this is _reference-host-honored_, not protocol-tier. `check-security-invariants.sh` covers production surfaces, not conformance-only test seams, so no protocol-tier invariant gates this MUST; it inherits the same cross-tenant intent as the production `workspace-cross-tenant-isolation` (WCT-1) invariant and is the host operator's responsibility to uphold when wiring the seam. Read-only; no side effects. Already consumed by the RFC 0021 aiEnvelope engine-projection scenarios and now by `safefetch-live-audit.test.ts`; a host that wires it un-soft-skips that whole cohort.
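As a rough illustration of the `callId`-paired audit assertion that the live-run safe-fetch seam and the event-log read seam describe, the check over a `TestEvent[]` could look like the sketch below. The `TestEvent` field names follow the contract above, but the type of `timestamp`, the placement of `callId` and `transport` inside `payload`, and the helper name `pairedHttpAuditCallIds` are assumptions made for illustration, not part of the spec or of the conformance suite.

```typescript
// Hypothetical shape: the field names mirror the TestEvent contract above.
// The timestamp type and the payload layout are assumptions.
interface TestEvent {
  eventId: string;
  runId: string;
  type: string;
  payload: Record<string, unknown>;
  timestamp: string;
  sequence: number;
  causationId?: string;
  nodeId?: string;
  contentTrust?: unknown;
}

// Returns every callId that has an http agent.toolCalled followed (by
// sequence) by an agent.toolReturned carrying the same callId.
function pairedHttpAuditCallIds(events: TestEvent[]): string[] {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...events].sort((a, b) => a.sequence - b.sequence);
  const open = new Set<string>();
  const paired: string[] = [];
  for (const ev of ordered) {
    const callId = ev.payload["callId"]; // assumed location of callId
    if (typeof callId !== "string") continue;
    if (ev.type === "agent.toolCalled" && ev.payload["transport"] === "http") {
      open.add(callId);
    } else if (ev.type === "agent.toolReturned" && open.has(callId)) {
      open.delete(callId);
      paired.push(callId);
    }
  }
  return paired;
}
```

A driver written along these lines would treat an empty result as a failure for both `fetched` and `blocked` outcomes, since the audit pair is required for every invocation.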
- Roster portfolio fire seam (RFC 0086 §C) —
`POST /v1/host/sample/roster/fire`. Gated on `capabilities.agents.roster.supported`; soft-skips on `404`. Contract: fire one workflow in a roster member's portfolio for a request `{ rosterId?, triggerSource?, asWorkItem? }` (host picks a default member when `rosterId` is omitted) and return `{ runId, rosterId, triggerSubscriptionId? }`. The fired run MUST emit `roster.run.initiated` as its FIRST attribution event, immediately after `run.started` and BEFORE any `agent.invocation.*` / `agent.*` event (§C ordering), content-free per the `roster-attribution-no-content` invariant (ids + persona + trigger source ONLY; never the work-item body/prompt/credential). When `asWorkItem: true` the fire takes the RFC 0083 durable-work-item path and the event MUST carry `triggerSubscriptionId` (so trigger→run→roster is traceable via `/ancestry`, RFC 0040). The conformance driver reads the run's durable events via the run event-log read seam and asserts the ordering + content-free payload + work-item `triggerSubscriptionId`. `agent-roster-attribution.test.ts` drives it via `behaviorGate('openwop-roster-attribution', …)`; the normative `GET /v1/agents/roster` read leg runs black-box on any roster host regardless of this seam. This is the RFC 0086 → Accepted bar (first adopter: MyndHyve `agents.roster`).
- Live manifest-invocation seam (RFC 0077 §B/§E/§F):
`POST /v1/host/sample/agents/live-invoke`. Gated on `capabilities.agents.liveRuntime.supported`; soft-skips on `404`. Contract: drive one live manifest invocation for a request `{ agentId?, source?, returnSchemaRef?, forceInvalidResult?, attemptTool? }` (host picks a default agent when `agentId` is omitted) and return `{ runId, invocationId, outcome? }`. The invocation MUST bracket its `agent.*` family with `agent.invocation.started` as the FIRST agent-scoped event and `agent.invocation.completed` as the LAST (§E), sharing one `invocationId`, with `source` ∈ {`workflow-node`, `run-api`, `chat-mention`} and `outcome` ∈ {`completed`, `handed-off`, `escalated`, `refused`, `failed`}; both events are content-free (identifiers + selection/outcome metadata only, never prompt or result body). When `returnSchemaRef` + `forceInvalidResult: true` are set (requires `liveRuntime.structuredOutput`), the host MUST fail the invocation (`completed.outcome === "failed"`, `schemaValidated !== true`) rather than ship a result that violates `handoff.returnSchemaRef` (§B step 6). When `attemptTool` names a tool OUTSIDE the agent's `toolAllowlist`, the host MUST NOT call it (no `agent.toolCalled` for that tool: the §F-1 / RFC 0002 §A14 allowlist floor). The conformance driver reads the durable run events via the run event-log read seam. `agent-live-invocation-bracket.test.ts` / `agent-live-structured-output.test.ts` / `agent-live-allowlist-enforced.test.ts` drive these via `behaviorGate('openwop-live-invocation-bracket' | 'openwop-live-structured-output' | 'openwop-live-allowlist-enforced', …)`. This is the RFC 0077 → Accepted bar (first adopter: MyndHyve `agents.liveRuntime`).
- Trigger-bridge delivery seam (RFC 0083 §C):
`POST /v1/host/sample/trigger-bridge/deliver`. Profile-gated on `openwop-trigger-bridge` (derived from discovery per §D: the bridge advertised + a dead-letter sink + a durable source); soft-skips on `404`. Contract: drive one delivery through the durable bridge for a request `{ scenario, dedupKey?, source? }` and return `{ runId?, subscriptionId?, outcome?, deliveredCount? }`, persisting the `trigger.delivery.attempted` + `trigger.subscription.state.changed` events to the durable run-event log (read back via the run event-log read seam). `scenario: "dedup"` delivers the same `dedupKey` twice and MUST be effectively-once (≤1 `trigger.delivery.attempted { outcome:"delivered" }` for that key, §C-1); `scenario: "exhaust"` exhausts the retry policy and MUST terminate in `trigger.delivery.attempted { outcome:"dead-lettered" }` + `trigger.subscription.state.changed { toState:"dead-lettered" }` (§C-2 + RFC 0053); `scenario: "deliver"` performs one successful delivery whose resulting run's `run.started` MUST carry `causationId` == the delivery id (§C / RFC 0040, resolvable via `/ancestry`). Both `trigger.*` events MUST be content-free (SR-1: ids/states/counters only, never inbound body/headers/credentials). `trigger-bridge-delivery.test.ts` drives all three legs via `behaviorGate('openwop-trigger-bridge', …)`; the normative `GET /v1/trigger-subscriptions` read runs black-box regardless of this seam. This is the RFC 0083 → Accepted bar.
- Eval-run seam (RFC 0081 §B/§C):
`POST /v1/host/sample/agents/eval-run`. Gated on `capabilities.agents.evalSuite.supported`; soft-skips on `404`. Contract: drive one `mode:"eval"` projection for a request `{ agentId?, modes?, taskCount? }` (host picks a default manifest agent + a built-in golden suite when omitted) and return `{ runId, suiteId?, suiteVersion?, taskCount?, passed?, aggregateScore? }`, persisting the `eval.*` family to the durable run-event log (read back via the run event-log read seam). The eval run MUST emit `eval.started` as the FIRST eval event, one `eval.scored` PER TASK (after that task's terminal `agent.decided`), and `eval.completed` ONCE before `run.completed` (§C ordering: `eval.started.sequence` < every `eval.scored.sequence` < `eval.completed.sequence`; the `eval.scored` count == `eval.completed.taskCount`). Every `eval.scored` MUST be content-free (`score` ∈ 0..1, `passed` boolean, ids/scalars ONLY; NEVER task output, rubric prose, or model completion; SR-1 / `eval-summary-no-content-leak`). The terminal run output MUST be a schema-valid `EvalSummary` (`eval-summary.schema.json`) readable via the NORMATIVE `GET /v1/runs/{runId}/eval-summary`, with `passedCount <= taskCount` and no per-task output body. `agent-eval-run.test.ts` drives it via `behaviorGate('openwop-eval-run', …)`; the normative eval-summary read runs black-box regardless of this seam. This is the RFC 0081 → Accepted bar (first adopter: MyndHyve `agents.evalSuite`).
- Deployment-transition seam (RFC 0082 §B/§E):
`POST /v1/host/sample/agents/deployment-transition`. Gated on `capabilities.agents.deployment.supported`; soft-skips on `404`. Contract: drive one deployment transition for a request `{ scenario, agentId?, version?, channel?, evalRunId? }` and return `{ runId?, record?, allowed?, error?, resolvedAgentVersion? }`, persisting the `deployment.*` family (+ `agent.invocation.started`) to the durable run-event log (read back via the run event-log read seam). `scenario: "promote"` runs the §E contract (authorize RFC 0049 `deploy:promote` → RFC 0051 approvalGate → RFC 0081 eval-verify when `evalRunId` is set) and MUST emit a content-free `deployment.promoted` whose `toState` is in the seven-state vocabulary and which carries `toVersion`; the returned `record` MUST validate against `agent-deployment.schema.json`. `scenario: "unauthorized"` drives a principal lacking `deploy:promote` and MUST fail closed (`allowed:false`, NO `deployment.promoted`: the `deployment-promotion-fail-closed` invariant). `scenario: "eval-gate-unmet"` drives a promote whose `evalRunId` has `EvalSummary.passed:false` and MUST deny with `error:"eval_gate_unmet"` + NO `deployment.promoted` (§E-3). `scenario: "channel-pin"` starts a `@channel`-bound run whose resolved version is recorded as `resolvedAgentVersion` on `agent.invocation.started` (§B: the recorded fact a replay re-reads rather than re-resolving). All `deployment.*` events MUST be content-free (SR-1: ids/state/scalars only, never a manifest body/prompt/credential). `agent-deployment-lifecycle.test.ts` drives all four legs via `behaviorGate('openwop-deployment-lifecycle', …)`; the normative `GET /v1/agents/{agentId}/deployments` read runs black-box regardless of this seam. This is the RFC 0082 → Accepted bar (first adopter: MyndHyve `agents.deployment`).
- Tool-session seam (RFC 0078 §D):
`POST /v1/host/sample/tools/session-run`. Gated on `capabilities.toolCatalog.sessionLifecycle`; soft-skips on `404`/`405`. Contract: drive one tool-session interaction for a request `{ toolId? }` (host picks a default catalog tool when omitted) and return `{ runId, sessionId?, toolId? }`, persisting `tool.session.opened` → the RFC 0064 call events (`agent.toolCalled` / `agent.toolReturned`) → `tool.session.closed` to the durable run-event log (read back via the run event-log read seam). `tool.session.opened` MUST precede the FIRST call event and `tool.session.closed` MUST follow the LAST (§D bracket ordering), both sharing one `sessionId`, each carrying a `toolId`, with `tool.session.closed.outcome` ∈ {`completed`, `failed`, `aborted`, `expired`}. Both events MUST be content-free (SR-1: ids/outcome ONLY, never tool args/result/credential). `tool-session-lifecycle.test.ts` drives it via `behaviorGate('openwop-tool-session-lifecycle', …)`; the normative `GET /v1/tools` catalog read runs black-box regardless of this seam. This is part of the RFC 0078 → Accepted bar (first adopter: MyndHyve `toolCatalog`).
- Egress-decision seam (RFC 0079 §C):
`POST /v1/host/sample/egress/decide`. Gated on `capabilities.httpClient.egressPolicy.supported`; soft-skips on `404`/`405`. Contract: drive one egress-policy decision for a request `{ scenario }` and return `{ decision?, reason?, destination?, credentialAttached?, canaryLeaked? }`; the host evaluates a host-issued credential's RFC 0079 §A `audiences[]` provenance against the egress destination. `scenario: "out-of-audience"` (credential bound to audience A, egress to B ∉ A) MUST return `decision` ∈ {`denied`, `downgraded`} + `reason: "out-of-audience"` and MUST NOT attach the credential (`credentialAttached !== true`: the §C confused-deputy MUST backing the `egress-credential-audience-bound` invariant). `scenario: "provenance-unevaluable"` MUST return `decision: "denied"` + `reason: "provenance-unevaluable"` (fail-closed). `scenario: "in-audience"` is the control (MAY return `allowed`). `scenario: "canary"` seeds a credential whose value is a known sentinel and the host MUST NOT surface it (`canaryLeaked !== true`) nor spill the blocked URL/host/header into the decision (SR-1); `decision` ∈ the closed enum + `reason` ∈ the CLOSED vocabulary throughout. `egress-audience-binding.test.ts` (keystone) + `egress-decision-content-free.test.ts` drive these via `behaviorGate('openwop-egress-audience-binding' | 'openwop-egress-decision-content-free', …)`. This is the RFC 0079 → Accepted bar (first adopter: MyndHyve `httpClient.egressPolicy`). Egress policy layers over the RFC 0076 §B `safeFetch` SSRF guard; there is no new normative read endpoint.
- Memory-consolidation seam (RFC 0068 §D):
`POST /v1/host/sample/memory/consolidate`. Gated on `capabilities.agents.memoryConsolidation.supported`; soft-skips on `404`/`501`. Contract: run one background-consolidation pass for a request `{ memoryRef, includeSecretCanary? }` and return `{ event: { inputCount, outputCount }, secretLeaked? }`, emitting the `agent.memory.consolidated` event (durable-append, like the live-run seams). A merge/dedup pass MUST have `outputCount <= inputCount` (§D.1); a second pass over the unchanged corpus MUST be a no-op (`inputCount == outputCount`: the §D.2 idempotence MUST that bounds runaway consolidation); when `includeSecretCanary` is set, a redacted secret in a source entry MUST stay redacted in the consolidated entry (`secretLeaked: false`; §D.3 / `agent-memory.md` §SR-1 carry-forward). `memory-consolidation-idempotent.test.ts` drives it via the capability gate. This is part of the RFC 0068 → Accepted bar (first adopter: MyndHyve `agents.memoryConsolidation`).
- Commitment-fire seam (RFC 0068 §C):
`POST /v1/host/sample/commitment/fire`. Gated on `capabilities.agents.commitments.supported`; soft-skips on `404`/`501`. Contract: fire one inferred standing commitment for a request `{ memoryRef, condition, includeIntentionCanary? }` and return `{ event: { commitmentId, memoryRef, condition }, fireCount?, intentionCanary? }`, emitting the `commitment.fired` event (durable-append). The event MUST carry `commitmentId` + the source `memoryRef` (§C.1 CTI-1 provenance) + `condition`; it MUST be content-free: the inferred intention text MUST NOT appear anywhere on the event payload (§C.3; the seam MAY echo the plaintext as the top-level `intentionCanary` ONLY so the driver can assert its absence from `event`); a commitment MUST fire at most once per satisfied condition (`fireCount <= 1`, §C.2). `commitment-fired.test.ts` drives it via the capability gate. This is part of the RFC 0068 → Accepted bar (first adopter: MyndHyve `agents.commitments`).
- Budget-run seam (RFC 0084 §C/§D):
`POST /v1/host/sample/budget/run`. Gated on `capabilities.budget.supported`; soft-skips on `404`/`501`. Contract: drive one budgeted run for a request `{ scenario }` and return `{ runId?, outcome?, error?, modelCalled? }`, persisting the `budget.*` + `cap.breached` + `run.failed` family to the durable run-event log (read back via the run event-log read seam). Budget consumption is tracked OFF the existing RFC 0026 `provider.usage` stream (no double-counting). `scenario: "hard-cost-exhaust"` (requires `enforce:"hard"`, `dimensions:["cost"]`) MUST emit, in strict sequence, `budget.reserved {effectiveBudget, scope}` → `budget.consumed {dimension:"cost", consumed, limit, remaining}` → `budget.threshold.crossed {dimension:"cost", percent}` → `budget.exhausted {dimension:"cost"}` → `cap.breached {kind:"budget-cost", limit, observed}` → `run.failed {error:"budget_exhausted"}` (the §D hard-stop, reusing the unified `cap.breached` overflow event per the RFC 0058 precedent). `scenario: "model-denied"` drives a run whose resolved model violates `budget.modelDeny` / `modelAllow`; the host MUST refuse with `budget_model_denied` BEFORE the provider call (`modelCalled !== true`, `modelDeny` wins on conflict; fail-closed, composing RFC 0031 + RFC 0067 at the dispatch seam). `scenario: "advisory"` (requires `enforce:"advisory"`) MUST emit the `budget.*` events but MUST NOT stop the run (no `cap.breached{budget-*}`, no `run.failed{budget_exhausted}`). Every `budget.*` payload MUST be content-free (SR-1 / `budget-no-pricing-leak`: dimension/limit/consumed/remaining/percent scalars only; NEVER provider pricing tables / per-token rates / cost-model internals). The §E orthogonality with RFC 0058 is normative: `budget` has no wall-time/iteration dimension. `budget-enforcement.test.ts` drives it via `behaviorGate('openwop-budget-enforcement', …)`. This is the RFC 0084 → Accepted bar (first adopter: MyndHyve `budget`).
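Several seams above assert that events appear in a required order in the durable log; the budget-run `hard-cost-exhaust` chain is the longest such sequence. A minimal sketch of that ordering check follows, assuming events carry `type` and `sequence` as in the event-log read seam. The helper name `firstMissingInOrder` is hypothetical, and the sketch only verifies relative order: whether "strict sequence" also forbids unrelated events interleaved between the six is not stated here, so interleaving is tolerated.

```typescript
// The six event types of the hard-cost-exhaust chain, in required order.
const HARD_COST_EXHAUST_SEQUENCE = [
  "budget.reserved",
  "budget.consumed",
  "budget.threshold.crossed",
  "budget.exhausted",
  "cap.breached",
  "run.failed",
] as const;

// Minimal event view; only the two fields this check needs.
interface SeqEvent {
  type: string;
  sequence: number;
}

// Walks the log in sequence order and returns -1 when every expected type
// is found in order, otherwise the index of the first expected type that
// could not be matched.
function firstMissingInOrder(
  events: SeqEvent[],
  expected: readonly string[],
): number {
  const ordered = [...events].sort((a, b) => a.sequence - b.sequence);
  let next = 0;
  for (const ev of ordered) {
    if (next < expected.length && ev.type === expected[next]) {
      next += 1;
    }
  }
  return next === expected.length ? -1 : next;
}
```

Returning the index rather than a boolean lets a failing scenario name the first event that was missing or out of order, which tends to make conformance failures easier to triage.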
Open spec gaps
- Capability flag for the prompt resolver seam is implicit (always-on when `prompts.supported: true`). A future minor revision MAY add `capabilities.prompts.testSeams.promptResolve` if hosts want to advertise the seam without committing to the full RFC 0029 behavior.
- The staged-refusal seam shape extends the existing RFC 0032 mock-AI program shape with a new `mode: "refusal"` entry. A future revision MAY split this out as a dedicated `capabilities.multiAgent.executionModel.testSeams` block.
Cross-references
- `host-extensions.md` §"Canonical prefixes": the `/v1/host/sample/*` namespace contract
- `capabilities.md` §"Truthful advertisement": the host's commitment when it advertises any of the above flags
- `host-capabilities.md` §"capabilities.observability.testSeams": the OTel scrape + debug-bundle export capability sub-block
- `observability.md` §"OTel collector test seam (RFC 0034)": the canonical RFC 0034 §B normative text the OTel + debug-bundle seams implement
- `replay.md` §"LLM cache-key recipe": the canonical recipe the §4 LLM cache-key seam computes
- `prompts.md` §"Resolution chain (normative)": the canonical RFC 0029 resolver semantics the §1 seam exposes