OpenWOP openwop.dev

Status: Stable · v1.1 (2026-04-27). Comprehensive coverage of the canonical openwop.* attribute namespace, span naming conventions, and metric kinds. Stable surface for external review. Keywords MUST, SHOULD, MAY follow RFC 2119. See auth.md for the status legend.


Why this exists

External implementers and operators need a shared vocabulary for tracing and metrics so that:

1. Dashboards built against one openwop server work against another.
2. SDKs can correlate client-side spans with server-side spans without per-vendor mapping tables.
3. Conformance tests can verify "the server emits a span named openwop.node.<typeId> with attribute openwop.node_id" without ambiguity.

openwop defines the canonical openwop.* attribute namespace. Implementations MAY alias to vendor-specific taxonomies (e.g., langgraph.* for LangSmith integration, dd.* for Datadog) per deployment, but the spec does not prescribe a mapping. Vendor bridges are the deployer's responsibility.


Trace context propagation

An OpenWOP-compliant server MUST honor and emit W3C Trace Context headers on every HTTP request and SSE event:

| Header | Direction | Purpose |
| --- | --- | --- |
| traceparent | Both | Standard W3C trace + span ID + sampled flag |
| tracestate | Both | Vendor-specific trace state (opaque to openwop) |

Servers SHOULD propagate traceparent through the engine into:

  • Every NodeModule execution span
  • Every external API call (AI providers, webhooks)
  • Every event log append (so durable events carry the originating trace)

Clients SHOULD include traceparent on outbound requests so server-side spans nest under the client's parent span.
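As an illustration of the propagation rule, the sketch below parses an incoming traceparent header and derives the header a server or client would attach to its own outbound call. It assumes the version-00 layout from W3C Trace Context (version, 32-hex trace ID, 16-hex span ID, 2-hex flags). The function names are hypothetical, and the choice to sample when no valid parent exists is an assumption, not something this spec mandates.

```python
import re
import secrets

# Version-00 layout: version "-" trace-id "-" parent-id "-" trace-flags
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return (trace_id, span_id, sampled), or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # W3C Trace Context forbids version ff and all-zero IDs.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)


def child_traceparent(incoming=None):
    """Build the traceparent for a new span nested under `incoming`."""
    parsed = parse_traceparent(incoming) if incoming else None
    if parsed is None:
        # No usable parent: start a new trace (sampling it is an assumption).
        trace_id, sampled = secrets.token_hex(16), True
    else:
        trace_id, _parent_span_id, sampled = parsed
    span_id = secrets.token_hex(8)  # fresh 16-hex span ID for this hop
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"
```

The trace ID and sampled flag carry over unchanged, while the span ID is regenerated at each hop, which is what lets server-side spans nest under the client's parent span.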


Export protocols

Hosts that emit OTLP telemetry MUST support at least http/json and http/protobuf and MAY additionally support grpc. Hosts MAY advertise the supported transports under capabilities.observability.otel.exportProtocols (an array drawn from {"http/json", "http/protobuf", "grpc"}). Collectors MAY use this to pick a compatible exporter without probing.

| Transport | Wire | Content-Type | Notes |
| --- | --- | --- | --- |
| http/json | HTTP/1.1 POST | application/json | Default. Required of all OTel-emitting hosts. |
| http/protobuf | HTTP/1.1 POST | application/x-protobuf | Required of all OTel-emitting hosts. Compact binary; same shape as http/json decoded. |
| grpc | HTTP/2 unary RPC (h2c or TLS) | application/grpc+proto | Optional. Service paths: /opentelemetry.proto.collector.trace.v1.TraceService/Export and /opentelemetry.proto.collector.metrics.v1.MetricsService/Export. Messages are length-prefixed protobuf per the gRPC HTTP/2 protocol. |

The openwop conformance suite ships a collector for all three transports under conformance/src/lib/otel-collector.ts (hand-rolled, zero npm-dep). See conformance/README.md §"Optional environment flags" for the relevant OPENWOP_OTEL_COLLECTOR* env vars.
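A minimal sketch of how a collector might use the advertised transports to choose an exporter without probing. The function name and the shape of the `capabilities` argument (a dict mirroring the capabilities.observability.otel.exportProtocols path) are assumptions for illustration. When the host advertises nothing, the sketch falls back to the two transports that every OTel-emitting host is required to support.

```python
CONTENT_TYPES = {
    "http/json": "application/json",
    "http/protobuf": "application/x-protobuf",
    "grpc": "application/grpc+proto",
}
# Required of every OTel-emitting host, so safe to assume when unadvertised.
REQUIRED_TRANSPORTS = ("http/json", "http/protobuf")


def pick_transport(capabilities, collector_preference):
    """Return (transport, content_type) for the first mutually supported
    transport in the collector's preference order, or None."""
    advertised = (
        capabilities.get("observability", {})
        .get("otel", {})
        .get("exportProtocols")
    )
    if not advertised:
        advertised = list(REQUIRED_TRANSPORTS)
    for transport in collector_preference:
        if transport in advertised and transport in CONTENT_TYPES:
            return transport, CONTENT_TYPES[transport]
    return None
```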


Span attributes

An OpenWOP-compliant server emitting OTel spans for engine activity MUST use the following canonical attributes. Implementations MAY add their own attributes outside the openwop.* namespace; the spec only constrains what's inside it.

Run-level attributes

Set on every span emitted during a run's lifecycle:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.run_id | string | MUST | Run ID (e.g., run_abc123) |
| openwop.workflow_id | string | MUST | Workflow ID the run is executing |
| openwop.protocol_version | string | SHOULD | Server's openwop protocol version |
| openwop.tenant_id | string | MAY | Tenant/workspace scoping (if applicable) |
| openwop.scope_id | string | MAY | Project/scope correlation (if applicable) |

Node-level attributes

Set on spans scoped to a single node execution:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.node_id | string | MUST | Node ID within the workflow |
| openwop.node_type | string | MUST | Node typeId (e.g., core.ai.callPrompt) |
| openwop.node_attempt | number | MUST | Zero-based retry counter |
| openwop.event_seq | number | SHOULD | Sequence number of the most recent event for this node |

Event-level attributes

Set on spans that emit a specific event:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.event_type | string | MUST | Event type (e.g., node.completed, approval.received) |
| openwop.event_seq | number | MUST | Sequence number assigned on append |

HITL attributes

Set on spans involving human-in-the-loop suspensions:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.interrupt_kind | string | MUST | One of approval, clarification, external-event, custom |
| openwop.interrupt_id | string | MUST | Suspension ID |
| openwop.interrupt_count | number | SHOULD | Per-(run, node) counter for replay determinism |

Capability-limit attributes

Set on spans where a CapabilityLimitExceededError was thrown:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.cap_kind | string | MUST | One of clarification, schema, envelopes, node-executions; wasm-memory / wasm-fuel / wasm-execution-time (RFC 0008 §K); run-duration / loop-iterations (RFC 0058) |
| openwop.cap_limit | number | MUST | The limit value |
| openwop.cap_observed | number | MUST | The observed value when the limit fired |
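The MUST rows of the run-, node-, event-, HITL-, and capability-limit tables above lend themselves to a mechanical check. The following is a sketch of such a check, not the conformance suite's implementation: the scope labels ("run", "node", and so on) and the function name are hypothetical, and span attributes are assumed to arrive as a flat dict.

```python
NUMBER = (int, float)

# MUST-level attributes per scope, copied from the tables above.
MUST_ATTRS = {
    "run": {"openwop.run_id": str, "openwop.workflow_id": str},
    "node": {
        "openwop.node_id": str,
        "openwop.node_type": str,
        "openwop.node_attempt": NUMBER,
    },
    "event": {"openwop.event_type": str, "openwop.event_seq": NUMBER},
    "interrupt": {"openwop.interrupt_kind": str, "openwop.interrupt_id": str},
    "cap": {
        "openwop.cap_kind": str,
        "openwop.cap_limit": NUMBER,
        "openwop.cap_observed": NUMBER,
    },
}


def must_attribute_problems(attributes, scopes):
    """List missing or mistyped MUST attributes for the given scopes."""
    problems = []
    for scope in scopes:
        for key, expected in MUST_ATTRS[scope].items():
            if key not in attributes:
                problems.append(f"missing {key}")
                continue
            value = attributes[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                problems.append(f"wrong type for {key}")
    return problems
```

A node-execution span would be checked with `scopes=["run", "node"]`, since run-level attributes apply to every span emitted during the run.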

Replay / branch attributes

Set on the openwop.run span of a run created via POST /v1/runs/{runId}:fork:

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.replay.source_run_id | string | MUST | RunId of the run this fork was derived from. |
| openwop.replay.from_seq | number | MUST | Sequence number the fork starts at (inclusive; events < from_seq are fixed history). |
| openwop.replay.mode | string | MUST | replay (re-execute exactly) or branch (re-execute with runOptionsOverlay). |

Span linkage: the forked run's openwop.run span MUST carry an OTel Link to the source run's openwop.run span (via the source's traceId + spanId). This is the OTel-canonical way to express "this new trace was derived from that other trace without a parent-child causal relationship" — replays are NOT causal children of the original (the user's :fork request causes them, not the original run). Trace viewers (Honeycomb, Tempo, Jaeger) render the Link natively.

Operators can answer questions like "show me all replay-mode forks of run X" or "show me runs that diverged at sequence > 100" by aggregating on the three openwop.replay.* attributes — no trace-graph query required.
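To make the "no trace-graph query required" point concrete, here is a sketch of answering both example questions with a flat filter over exported spans. The span shape (a dict with "name" and "attributes") and the function name are assumptions for illustration; a real backend would express the same filter in its own query language.

```python
def forks_of(spans, source_run_id, mode=None, min_from_seq=None):
    """Return run IDs of forks derived from `source_run_id`.

    mode:         optional filter on openwop.replay.mode ("replay" or "branch")
    min_from_seq: optional filter keeping forks with from_seq > min_from_seq
    """
    result = []
    for span in spans:
        if span.get("name") != "openwop.run":
            continue  # replay attributes live on the run span only
        attrs = span.get("attributes", {})
        if attrs.get("openwop.replay.source_run_id") != source_run_id:
            continue
        if mode is not None and attrs.get("openwop.replay.mode") != mode:
            continue
        if min_from_seq is not None:
            from_seq = attrs.get("openwop.replay.from_seq")
            if from_seq is None or from_seq <= min_from_seq:
                continue
        result.append(attrs.get("openwop.run_id"))
    return result
```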

Privacy classification attributes (closes O5)

Set on spans / events / metric records carrying potentially sensitive data, so observability collectors can apply the deployer's policy (retention, masking, export gating) before forwarding to long-term storage.

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.pii_present | boolean | SHOULD | Computed aggregate. true when ANY input, output, variable, channel write, or activity payload on this span / event has a sensitivity marker (per Privacy classification §below). Servers SHOULD set on every span where the answer is determinable; MAY omit when uncertain. |
| openwop.compliance_class | string | SHOULD | Top-level workflow classification from WorkflowMetadata.complianceClass. One of public, pii, phi, pci, regulated. Single string per run; applies to ALL spans the run produces. |
| openwop.sensitive_fields | string[] | MAY | Names of sensitive fields touched by this span (e.g., ["variables.userEmail", "channels.feedback"]). Useful for fine-grained audit; high cardinality so collectors typically drop in aggregation. |

Aggregate computation rules for openwop.pii_present:

  • The engine MUST set openwop.pii_present: true on the openwop.run span when the workflow declares metadata.complianceClass !== 'public' OR any variable.sensitive, channel.sensitive, or pack-level node.outputs[port].sensitive is true.
  • On openwop.node.<typeId> spans: true when the node consumes from OR writes to a sensitive variable / channel / output port.
  • On openwop.activity.<provider> spans: true when the activity payload contains a sensitive field (e.g., a userEmail flowing into an LLM call).
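The first rule above (the openwop.run span) can be sketched as a pure function over the workflow definition. The workflow and pack-manifest shapes follow the JSON examples elsewhere in this document; the "nodes" key on the workflow and the decision to consider only pack nodes the workflow actually uses are assumptions made for this illustration.

```python
def run_span_pii_present(workflow, pack_manifests=()):
    """Compute openwop.pii_present for the openwop.run span."""
    # complianceClass defaults to 'public' when the workflow omits it.
    compliance = workflow.get("metadata", {}).get("complianceClass", "public")
    if compliance != "public":
        return True
    if any(v.get("sensitive") is True for v in workflow.get("variables", [])):
        return True
    if any(c.get("sensitive") is True
           for c in workflow.get("channels", {}).values()):
        return True
    # Pack-level markers: nodes[].outputs[<port>].sensitive
    used_types = {n.get("typeId") for n in workflow.get("nodes", [])}
    for manifest in pack_manifests:
        for node in manifest.get("nodes", []):
            if node.get("typeId") not in used_types:
                continue  # assumption: unused pack nodes do not taint the run
            if any(port.get("sensitive") is True
                   for port in node.get("outputs", {}).values()):
                return True
    return False
```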

Compliance class semantics:

| Class | Meaning | Typical retention |
| --- | --- | --- |
| public | No sensitivity; default. Trace data may be retained indefinitely. | Per deployer's standard policy. |
| pii | Personal data per GDPR/CCPA scope (names, emails, behavioral data). | Shorter retention; right-to-erasure tooling MUST be aware. |
| phi | Protected Health Information per HIPAA. | Encrypted at rest; access-logged. |
| pci | Payment card data per PCI DSS. | Tokenized; raw values MUST NOT appear in observability. |
| regulated | Other regulated categories the deployer manages (export-controlled, attorney-client, etc.). | Deployer-defined policy. |

The spec doesn't enforce retention or storage rules — those are the deployer's collector / backend policy. The spec only guarantees the _signal_: a collector inspecting a span's attributes can route / mask / drop based on openwop.pii_present + openwop.compliance_class without parsing payload contents.

See the "Privacy classification" section below for the underlying field-marker layer.

Sub-workflow attributes (closes O2)

Set on the openwop.run span of a child run started by a parent workflow's invoke-style node (sub-workflow dispatch, cross-canvas-invoke, etc.):

| Attribute | Type | Required | Notes |
| --- | --- | --- | --- |
| openwop.parent.run_id | string | MUST | RunId of the parent run. |
| openwop.parent.workflow_id | string | MUST | WorkflowId of the parent run. |
| openwop.parent.node_id | string | MUST | NodeId of the invoke node in the parent that spawned this child. |

Span linkage: parent-child causal nesting. The child run's openwop.run span MUST be set as a _child span_ of the parent's invoke-node openwop.node.<typeId> span (via OTel parentSpanId). Sub-workflow invocation IS causal — the parent's invoke-node spawns the child — so parent-child nesting is semantically correct AND is what operators want visually. Clicking the parent's invoke-node span in Honeycomb / Tempo / Jaeger drills into the child run, exactly like clicking a function call drills into the function body.

This contrasts with the replay/branch case above (Span Link, sibling-style) because replays are NOT causal children of the source run — the user's :fork request is the cause. For sub-workflows, the parent's invoke-node IS the cause.

Propagation mechanism. The parent engine emits the invoke-node span with traceparent set; when starting the child run (via REST POST /v1/runs, MCP tools/call, or A2A invoke), the parent MUST forward that traceparent to the child engine. The child engine's first span (openwop.run) MUST use the forwarded traceparent as its parent reference — the same W3C Trace Context propagation flow already specced in §Trace context propagation.

The child engine SHOULD also emit the three openwop.parent.* attributes alongside the parent reference — letting dashboards filter / aggregate ("show me all child runs spawned by workflow X" or "show me invoke-node failures by parent.node_id") without graph queries.
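A sketch of what the child engine does with the forwarded traceparent: reuse the trace ID, record the invoke-node span ID as parentSpanId, and attach the three openwop.parent.* attributes. The span is modeled as a plain dict and the function name is hypothetical; a real engine would create the span through its OTel SDK. The header is assumed to be already validated.

```python
import secrets


def start_child_run_span(forwarded_traceparent, parent, child):
    """Build the child's openwop.run span from the parent's traceparent.

    parent: {"run_id", "workflow_id", "node_id"} of the invoking side
    child:  {"run_id", "workflow_id"} of the run being started
    """
    # Version-00 traceparent: version-traceid-spanid-flags (pre-validated).
    _version, trace_id, invoke_span_id, _flags = forwarded_traceparent.split("-")
    return {
        "name": "openwop.run",
        "traceId": trace_id,              # same trace as the parent run
        "spanId": secrets.token_hex(8),   # fresh ID for the child's run span
        "parentSpanId": invoke_span_id,   # causal nesting under the invoke node
        "attributes": {
            "openwop.run_id": child["run_id"],
            "openwop.workflow_id": child["workflow_id"],
            "openwop.parent.run_id": parent["run_id"],
            "openwop.parent.workflow_id": parent["workflow_id"],
            "openwop.parent.node_id": parent["node_id"],
        },
    }
```

Contrast this with the fork case, where the new run span would start its own trace and carry a Link to the source span instead of a parentSpanId.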

Cross-link with channels-and-reducers.md §Distributed reducers: a child run's channel.written events carry sourceEngineId + sourceRunId (from C2's cross-engine writes). When operators trace from a parent's channel-write trigger fire back to the child write that caused it, the trace's parent-child span structure makes the connection one click — no manual run-ID correlation required.


Canonical run lifecycle event names

An OpenWOP-compliant server emits run-lifecycle events through the event log (GET /v1/runs/{runId}/events*) and through structured logs / OTel spans. The wire-level event-type names form a closed vocabulary that external clients and SDKs can rely on:

| Event type | When | Default severity | Required |
| --- | --- | --- | --- |
| run.started | Run transitions from pending to running | info | MUST |
| run.completed | Run reaches terminal completed | info | MUST |
| run.failed | Run reaches terminal failed | error | MUST |
| run.cancelled | Run reaches terminal cancelled | info | MUST |
| node.started | Node execution begins | debug | SHOULD |
| node.completed | Node execution succeeds | debug | SHOULD |
| node.failed | Node execution fails (terminal for the node) | error | SHOULD |
| node.cancelled | Node execution stops via cancel | info | SHOULD |
| approval.requested | HITL approval gate opens | info | SHOULD (if openwop-interrupts) |
| approval.received | HITL approval resolved | info | SHOULD (if openwop-interrupts) |
| clarification.requested | LLM emits a clarification envelope | info | SHOULD (if openwop-interrupts) |
| clarification.resolved | Client provides clarification answer | info | SHOULD (if openwop-interrupts) |
| cap.breached | Engine-enforced limit exceeded | error | SHOULD (if maxNodeExecutions enforced) |
| channel.written | Channel write succeeds | debug | SHOULD (if channels supported) |
| run.replay.started | Replay/fork is initiated | info | SHOULD (if openwop-replay) |
| memory.compacted | A MemoryAdapter compaction run completes (RFC 0012) | info | SHOULD (if capabilities.memory.compaction.supported: true) |
| memory.written | A run writes a memory entry; attributes it to the node/agent (identifiers only, never content) (RFC 0057) | info | MUST (if capabilities.memory.attribution.emitsWriteEvents: true) |

Severity vocabulary. OpenWOP adopts the standard four-tier severity model: debug / info / warn / error. Severities are advisory — observability platforms apply their own escalation rules — but the defaults above let a downstream consumer treat unrecognized events with conservative severity policy.

Closed vocabulary. Hosts MUST NOT emit additional event types under the run.*, node.*, approval.*, clarification.*, cap.*, channel.*, or replay.* prefixes without an RFC. Vendor-specific events are permitted under namespaced prefixes (e.g., openwop.audit.*) per spec/v1/host-extensions.md.

Terminal events. A run MUST emit exactly one of run.completed / run.failed / run.cancelled and that event MUST be the last event in the stream. The conformance scenario eventOrdering.test.ts pins this contract.

Forward-compat. Clients consuming the event stream MUST treat unknown event types as opaque and continue reading. Hosts MAY add new event types in v1.x if they're additive (no behavior change for clients that ignore the new type) per COMPATIBILITY.md §2.1.
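The terminal-event and forward-compat rules can be checked together over a finished run's event stream. This is a sketch under the assumption that the stream is available as an ordered list of event-type strings; it is not the eventOrdering.test.ts scenario itself, and the function name is hypothetical.

```python
TERMINAL_EVENTS = {"run.completed", "run.failed", "run.cancelled"}


def terminal_contract_violation(event_types):
    """Return a description of the violation, or None if the stream of a
    finished run honors the terminal-event contract."""
    positions = [i for i, t in enumerate(event_types) if t in TERMINAL_EVENTS]
    if len(positions) != 1:
        return f"expected exactly one terminal event, found {len(positions)}"
    if positions[0] != len(event_types) - 1:
        return "terminal event is not the last event in the stream"
    # Unknown event types are deliberately not inspected: clients must
    # treat them as opaque and keep reading.
    return None
```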


Span naming

An OpenWOP-compliant server SHOULD use these canonical span names. Implementations MAY use additional names outside the openwop.* prefix.

| Span name | When emitted | Parent |
| --- | --- | --- |
| openwop.run | Top-level span for an entire run | none (or client trace) |
| openwop.node.<typeId> | Wraps a single node execution | openwop.run |
| openwop.node.<typeId>.attempt | Wraps one retry attempt within a node | openwop.node.<typeId> |
| openwop.event.append | Wraps EventLog.appendAtomic | nearest active span |
| openwop.interrupt | Wraps a HITL suspension (open until resumed) | openwop.node.<typeId> |
| openwop.activity.<provider> | Wraps an external API call (e.g., openwop.activity.openai) | nearest active span |

Span names with <typeId> substitute the actual node type — e.g., openwop.node.core.ai.callPrompt.


Structured-log metric records (lightweight)

In addition to OTel metrics (defined in the next section), an OpenWOP-compliant server SHOULD emit structured-log records with the following metricKind field. These are the cheap-to-emit complement: logs-based, ingested by most observability platforms natively, useful for ad-hoc querying when a full metrics pipeline isn't deployed.

| metricKind | When | Required fields |
| --- | --- | --- |
| openwop.run.created | After successful POST /v1/runs | runId, workflowId, tenantId? |
| openwop.run.completed | On terminal status (completed/failed/cancelled) | runId, status, durationMs |
| openwop.run.claim.conflict | On X-Dedup 409 conflict | transport, projectId, activeRunId, activeHost, retryAfterSeconds |
| openwop.node.completed | Per node completion | runId, nodeId, nodeType, status, durationMs, attempt |
| openwop.activity.invoked | Per external API call | runId, nodeId, provider, status, latencyMs, idempotencyHit? |
| openwop.cap.exceeded | When CapabilityLimitExceededError fires | runId, kind, limit, observed |
| openwop.cost.recorded | After every billable AI activity (closes O4; see "Cost attribution attributes" §) | runId, nodeId, provider, tokensInput, tokensOutput, usd?, currency?, estimated? |
| openwop.mcp.invocation | Per MCP tool call | invocationId, tenantId, moduleId, uid?, status, errorCode?, latencyMs |
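A structured-log record of this kind is just one JSON object per line with a metricKind discriminator. The sketch below builds such a line and rejects records that lack a required field. Only a subset of the table is encoded, fields marked with "?" are treated as optional, and the function name and exact log envelope are assumptions; the spec fixes the field names, not the serialization.

```python
import json

# Required (non-"?") fields for a subset of the metricKind table above.
REQUIRED_FIELDS = {
    "openwop.run.created": ("runId", "workflowId"),
    "openwop.run.completed": ("runId", "status", "durationMs"),
    "openwop.node.completed": (
        "runId", "nodeId", "nodeType", "status", "durationMs", "attempt",
    ),
    "openwop.cap.exceeded": ("runId", "kind", "limit", "observed"),
}


def metric_record(metric_kind, **fields):
    """Serialize one structured-log metric record as a JSON line."""
    missing = [f for f in REQUIRED_FIELDS.get(metric_kind, ()) if f not in fields]
    if missing:
        raise ValueError(f"{metric_kind} is missing fields: {', '.join(missing)}")
    return json.dumps({"metricKind": metric_kind, **fields}, sort_keys=True)
```

For example, `print(metric_record("openwop.run.created", runId="run_abc123", workflowId="wf_1"))` writes a single line that a logs pipeline can filter on metricKind.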

OpenTelemetry metrics (full)

Format follows OpenTelemetry Semantic Conventions style: each metric declares an instrument, unit (UCUM code), description, applicable attributes, recommended histogram boundaries (when applicable), and a stability tier.

An OpenWOP-compliant server SHOULD emit all Stable metrics. Experimental metrics MAY be emitted; consumers MUST tolerate their addition or removal in v1.x patch releases.

Attribute cardinality conventions

The metric attribute tiers below reuse the canonical openwop.* span attributes from §Span attributes. Cardinality bounds:

| Attribute | Cardinality | Use as metric attribute? |
| --- | --- | --- |
| openwop.run_id | UNBOUNDED (1 per run) | NEVER. Use exemplars to link metric points back to traces. |
| openwop.workflow_id | Tenant-bounded (typically <100 per tenant) | Recommended. |
| openwop.node_id | Workflow-bounded (typically <50 per workflow) | Opt-in; may explode at scale. Aggregations SHOULD prefer openwop.node_type. |
| openwop.node_type | Pack-bounded (typically <50 globally; <500 with vendor packs) | Recommended. |
| openwop.tenant_id | Platform-bounded (one per tenant) | Required for multi-tenant deployments. Consumers MAY drop at aggregation if cardinality budget is tight. |
| openwop.scope_id | Tenant-bounded | Opt-in. |
| provider (activities) | Bounded enum (openai, anthropic, google, …) | Required for activity metrics. |
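One way to apply these conventions is to derive metric attributes from span attributes through a filter that always drops the unbounded key and keeps opt-in keys only when the deployer asks for them. This is a sketch; the function name and the `opt_in` parameter are hypothetical.

```python
NEVER_ON_METRICS = {"openwop.run_id"}                      # unbounded cardinality
OPT_IN_ON_METRICS = {"openwop.node_id", "openwop.scope_id"}


def metric_attributes(span_attributes, opt_in=frozenset()):
    """Project span attributes down to a cardinality-safe metric attribute set."""
    kept = {}
    for key, value in span_attributes.items():
        if key in NEVER_ON_METRICS:
            continue  # never a metric attribute, even if opted in by mistake
        if key in OPT_IN_ON_METRICS and key not in opt_in:
            continue
        kept[key] = value
    return kept
```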

Run lifecycle metrics

openwop.run.created

| Field | Value |
| --- | --- |
| Instrument | Counter |
| Unit | 1 (count) |
| Description | Number of runs accepted by POST /v1/runs. Increments BEFORE the run begins executing, so it covers both runs that complete and runs that fail to start. |
| Attributes (Required) | openwop.workflow_id, openwop.tenant_id (if multi-tenant) |
| Attributes (Recommended) | openwop.scope_id |
| Stability | Stable |

openwop.run.completed

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of runs that reached a terminal status. Discriminate via the openwop.run_status attribute.
Attributes (Required)openwop.run_status (completed \
Attributes (Recommended)openwop.tenant_id
StabilityStable

openwop.run.duration

| Field | Value |
| --- | --- |
| Instrument | Histogram |
| Unit | s (seconds) |
| Description | Wall-clock duration from POST /v1/runs accept to terminal status. Includes time suspended on HITL interrupts; operators wanting "active execution time only" should pair with openwop.node.duration aggregations. |
| Attributes (Required) | openwop.run_status, openwop.workflow_id |
| Attributes (Recommended) | openwop.tenant_id |
| Recommended buckets (s) | [0.5, 1, 2.5, 5, 10, 30, 60, 300, 600, 1800, 3600] (0.5 s to 1 h) |
| Stability | Stable |
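To show how the recommended boundaries translate into histogram counts, the sketch below assigns run durations to buckets using OpenTelemetry's explicit-bucket convention, where each boundary is an inclusive upper bound and one extra overflow bucket catches everything above the last boundary. The function name is hypothetical; an OTel SDK performs this bucketing internally when the boundaries are configured on the instrument or a view.

```python
from bisect import bisect_left

# Recommended boundaries for openwop.run.duration, in seconds.
RUN_DURATION_BOUNDS_S = [0.5, 1, 2.5, 5, 10, 30, 60, 300, 600, 1800, 3600]


def bucket_counts(durations_s, bounds=RUN_DURATION_BOUNDS_S):
    """Count samples per bucket: (-inf, b0], (b0, b1], ..., (b_last, +inf)."""
    counts = [0] * (len(bounds) + 1)
    for duration in durations_s:
        # bisect_left makes each boundary an inclusive upper bound.
        counts[bisect_left(bounds, duration)] += 1
    return counts
```

With eleven boundaries there are twelve buckets; a 45-second run lands in the (30, 60] bucket and a two-hour run in the overflow bucket.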

openwop.run.active

| Field | Value |
| --- | --- |
| Instrument | UpDownCounter |
| Unit | 1 (count) |
| Description | Number of in-flight runs (status NOT in completed/failed/cancelled). Increments on POST /v1/runs accept; decrements on terminal transition. |
| Attributes (Required) | openwop.tenant_id (if multi-tenant) |
| Attributes (Recommended) | openwop.workflow_id |
| Stability | Stable |

Node lifecycle metrics

openwop.node.completed

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of node executions that reached a terminal node status.
Attributes (Required)openwop.node_type, openwop.run_status (completed \
Attributes (Recommended)openwop.workflow_id, openwop.tenant_id
StabilityStable

openwop.node.duration

| Field | Value |
| --- | --- |
| Instrument | Histogram |
| Unit | s (seconds) |
| Description | Per-node execution duration. Per-attempt (a node with 3 retries records 3 samples). |
| Attributes (Required) | openwop.node_type, openwop.run_status |
| Attributes (Recommended) | openwop.node_attempt (zero-based) |
| Recommended buckets (s) | [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 30, 60] (1 ms to 1 min) |
| Stability | Stable |

openwop.node.attempts

| Field | Value |
| --- | --- |
| Instrument | Counter |
| Unit | 1 (count) |
| Description | Number of retry attempts on a node. Counts only attempts strictly after the first; a node that succeeds first try contributes 0. |
| Attributes (Required) | openwop.node_type |
| Attributes (Recommended) | openwop.workflow_id |
| Stability | Stable |

Activity (external API call) metrics

openwop.activity.invocations

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of external API calls (LLM, payment, webhook). Discriminates by provider.
Attributes (Required)provider (e.g., openai, anthropic, google), openwop.run_status (success \
Attributes (Recommended)openwop.node_type
StabilityStable

openwop.activity.duration

| Field | Value |
| --- | --- |
| Instrument | Histogram |
| Unit | s (seconds) |
| Description | Wall-clock duration of a single external API call. |
| Attributes (Required) | provider, openwop.run_status |
| Recommended buckets (s) | [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30, 60, 120] (10 ms to 2 min) |
| Stability | Stable |

openwop.activity.tokens

FieldValue
InstrumentCounter
Unit{token} (UCUM custom unit; OTel-style annotated count)
DescriptionLLM tokens billed. Pairs with observability.md §Cost attribution attributes (O4) — same numbers, different aggregation level.
Attributes (Required)provider, direction (input \
Attributes (Recommended)openwop.cost.estimated (boolean — true when computed server-side rather than provider-returned)
StabilityStable

Capability-limit metrics

openwop.cap.exceeded

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of CapabilityLimitExceededError occurrences, broken down by limit kind. Useful for "are we tuning limits too tight?" SLOs.
Attributes (Required)openwop.cap_kind (clarification \
Attributes (Recommended)openwop.workflow_id, openwop.node_type
StabilityStable

Run-claim metrics

openwop.run.claim.conflicts

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of X-Dedup: enforce 409 conflicts. Useful for "are clients retrying too aggressively?" SLOs.
Attributes (Required)transport (rest \
StabilityStable

HITL metrics

openwop.interrupt.requested

FieldValue
InstrumentCounter
Unit1 (count)
DescriptionNumber of HITL suspensions emitted.
Attributes (Required)openwop.interrupt_kind (approval \
Attributes (Recommended)openwop.workflow_id, openwop.node_type
StabilityStable

openwop.interrupt.duration

FieldValue
InstrumentHistogram
Units (seconds)
DescriptionWall-clock time from suspension request to resolution (or timeout). Note the wide bucket range — HITL is slow by nature.
Attributes (Required)openwop.interrupt_kind, openwop.run_status (resolved \
Recommended buckets (s)[60, 300, 900, 1800, 3600, 14400, 86400, 604800] (1min — 1week)
StabilityStable

Queue depth and backlog metrics

These metrics apply to hosts that surface their execution queue to OTel. They are REQUIRED for hosts claiming the production scale tier (see scale-profiles.md).

openwop.queue.depth

| Field | Value |
| --- | --- |
| Instrument | Gauge |
| Unit | runs |
| Description | Instantaneous count of runs in the host's pending/runnable queue (excludes paused, waiting-*, terminal runs). Sample at scrape time. |
| Attributes (Required) | openwop.tenant_id (or sentinel "all" for aggregate) |
| Attributes (Recommended) | openwop.queue_name (host-defined; e.g., "default", "priority-high") |
| Stability | Stable |

openwop.run.backlog

| Field | Value |
| --- | --- |
| Instrument | Histogram |
| Unit | seconds |
| Description | Time between run.created and run.started (queue-wait duration). Captures backlog independent of openwop.queue.depth so dashboards see tail latency, not just average depth. |
| Attributes (Required) | openwop.tenant_id, openwop.workflow_id |
| Recommended buckets | [0.05, 0.1, 0.5, 1, 5, 30, 60, 300, 1800] |
| Stability | Stable |

openwop.queue.enqueued

| Field | Value |
| --- | --- |
| Instrument | Counter (monotonic) |
| Unit | runs |
| Description | Cumulative count of runs enqueued. Pair with openwop.run.completed (existing) to compute drain ratio. |
| Attributes (Required) | openwop.tenant_id |
| Stability | Stable |

Orchestrator decision metrics (RFC 0006)

Gated on capabilities.orchestrator.supported: true.

openwop.orchestrator.decisions

| Field | Value |
| --- | --- |
| Instrument | Counter (monotonic) |
| Unit | decisions |
| Description | Count of runOrchestrator.decided events emitted, partitioned by decision.kind. Lets operators see the orchestrator's behavior shape (mostly delegating, mostly asking the user, mostly terminating) at a glance. |
| Attributes (Required) | openwop.tenant_id, openwop.workflow_id, openwop.orchestrator.decision_kind ∈ {"next-worker","ask-user","terminate"} |
| Attributes (Recommended) | openwop.orchestrator.agent_id |
| Stability | Stable |

openwop.orchestrator.iterations

| Field | Value |
| --- | --- |
| Instrument | Histogram |
| Unit | iterations |
| Description | Per-run distribution of runOrchestrator.decisionsTaken at terminal. Operators use this to tune iterationCap per RFC 0006 §A. |
| Attributes (Required) | openwop.tenant_id, openwop.workflow_id |
| Recommended buckets | [1, 2, 5, 10, 25, 50, 100, 250] |
| Stability | Stable |

Idempotency / cross-region metrics

Gated on capabilities.idempotency.crossRegion ∈ {"best-effort","strict"}.

openwop.idempotency.cross_region_conflicts_total

| Field | Value |
| --- | --- |
| Instrument | Counter (monotonic) |
| Unit | conflicts |
| Description | Count of cross-region idempotency conflicts resolved per idempotency.md §"Multi-region idempotency". A non-zero rate indicates partition-time divergence. |
| Attributes (Required) | openwop.tenant_id, openwop.route, openwop.region_pair (string id like "us-east:eu-west") |
| Stability | Stable |

openwop.idempotency.partition_seconds

| Field | Value |
| --- | --- |
| Instrument | Gauge |
| Unit | seconds |
| Description | Estimated cache divergence in seconds. Operators alert when this exceeds the route's tolerance. |
| Attributes (Required) | openwop.region_pair |
| Stability | Stable |

Cost attribution metrics

The cost-attribution metrics below pair with the openwop.cost.* attributes (see "Cost attribution attributes" §). Promoted from Experimental → Stable on 2026-04-27 alongside O4 closure.

openwop.cost.usd

| Field | Value |
| --- | --- |
| Instrument | Counter (monotonic) |
| Unit | USD |
| Description | Cumulative cost in USD. Use only when the server can derive cost from a published rate card; omit rather than guess. |
| Attributes (Required) | provider, openwop.cost.estimated |
| Attributes (Recommended) | openwop.tenant_id, openwop.workflow_id |
| Stability | Stable |

Privacy classification (closes O5)

The privacy classification surface gives workflow authors + NodeModule packs explicit ways to mark fields as sensitive. The engine reads those markers to compute the openwop.pii_present / openwop.compliance_class / openwop.sensitive_fields span attributes (defined in §Span attributes above) AND to apply masking when persisting events.

Workflow-level: metadata.complianceClass

WorkflowMetadata.complianceClass declares the top-level sensitivity tier of the entire workflow:

{
  "metadata": {
    "complianceClass": "phi"   // 'public' (default) | 'pii' | 'phi' | 'pci' | 'regulated'
  }
}

This is the workflow-author's claim about what kind of data flows through. Sets openwop.compliance_class on every span the run produces. Persists with the workflow definition; reviewable at workflow-register time.

Field-level markers

Four places authors can mark individual fields as sensitive:

1. Workflow variables (WorkflowVariable.sensitive: boolean):

{
  "variables": [
    { "name": "userEmail", "type": "string", "sensitive": true },
    { "name": "totalScore", "type": "number" }
  ]
}

When true, the engine masks the variable's value in persisted variable.changed events, state.snapshot projections, and the projected RunSnapshot.variables returned by GET /v1/runs/{runId}. Reads inside the workflow's NodeModule executors work normally — only persistence and external surfaces mask.

2. Per-node output overrides (WorkflowNode.outputSensitivity):

{
  "id": "ai-1",
  "typeId": "core.ai.callPrompt",
  "outputSensitivity": {
    "draftEmail": true,
    "tokensUsed": false
  }
}

When a normally-non-sensitive node receives sensitive data IN THIS WORKFLOW (e.g., a generic core.ai.callPrompt rendering a PHI-bearing prompt template), the workflow author marks specific output ports without changing the underlying NodeModule. Engine masks the marked output-port values in node.completed event payloads.

3. Pack-level output declaration (pack manifest nodes[].outputs[<port>].sensitive: boolean):

{
  "name": "vendor.acme.salesforce-tools",
  "nodes": [
    {
      "typeId": "vendor.acme.salesforce.upsert",
      "outputs": {
        "ssn": { "sensitive": true }
      }
    }
  ]
}

When a NodeModule ALWAYS handles sensitive data (a Salesforce upsert always touches PII), the pack author declares it once in the manifest. Workflows using this typeId inherit the markers automatically; outputSensitivity overrides at the workflow level if needed.
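The precedence between the pack-level declaration and the workflow-level override can be sketched as a small lookup. The function name is hypothetical, and the rule that an explicit outputSensitivity entry wins in both directions (including `false` overriding a pack-level `true`) is an assumption read from "overrides at the workflow level if needed".

```python
def output_port_sensitive(workflow_node, pack_node, port):
    """Resolve whether `port` on this workflow node is treated as sensitive."""
    overrides = workflow_node.get("outputSensitivity", {})
    if port in overrides:
        # Workflow-level override takes precedence over the pack manifest.
        return bool(overrides[port])
    declared = pack_node.get("outputs", {}).get(port, {})
    return bool(declared.get("sensitive", False))
```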

4. Channel sensitivity (ChannelDeclaration.sensitive: boolean):

{
  "channels": {
    "phiNotes": { "reducer": "feedback", "sensitive": true }
  }
}

When true, channel.written event payloads have their value field masked. The reduced channel state in RunSnapshot.channels is also masked when read via the REST surface.

Masking behavior

The engine's masking mode is server policy, advertised via Capabilities.compliance.defaultMode:

| Mode | Behavior |
| --- | --- |
| mask (default) | Replace value with the literal string "[REDACTED]". |
| omit | Drop the field entirely from the persisted payload. |
| hash | Replace with "sha256:<hex>" so audit trails can detect equality without revealing the value. |
| passthrough | Record values as-is. Use only when a downstream collector handles masking. NOT recommended for production. |

An OpenWOP-compliant server SHOULD:

1. Default to mask for any field marked sensitive.
2. Apply masking BEFORE the event reaches the durable event log (so leaks via the log itself are prevented).
3. Apply the same mode consistently within a single run (so replays produce identical event logs).

Servers MAY allow per-workflow overrides via metadata.complianceConfig.maskingMode — useful when a workflow needs hash-based audit but the server default is mask.
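The four masking modes can be sketched as a transformation applied to an event payload before it is appended to the event log. The function names are hypothetical. For hash mode, the spec fixes only the "sha256:<hex>" output shape; hashing a canonical JSON encoding of the value (sorted keys, compact separators, UTF-8) is an assumption made here so that equal values produce equal digests.

```python
import hashlib
import json


def mask_value(value, mode):
    """Apply mask / hash / passthrough to a single sensitive value."""
    if mode == "mask":
        return "[REDACTED]"
    if mode == "hash":
        # Assumption: hash a canonical JSON encoding of the value.
        canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
        return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    if mode == "passthrough":
        return value
    raise ValueError(f"unsupported masking mode: {mode}")


def mask_payload(payload, sensitive_fields, mode="mask"):
    """Return a copy of `payload` with sensitive fields masked per `mode`."""
    masked = {}
    for key, value in payload.items():
        if key not in sensitive_fields:
            masked[key] = value
        elif mode == "omit":
            continue  # drop the field entirely
        else:
            masked[key] = mask_value(value, mode)
    return masked
```

Because the function returns a new dict, in-workflow reads of the original payload are unaffected, matching the rule that only persistence and external surfaces mask.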

Replay implications

Sensitive fields are NOT replay-deterministic by default — replays can't see the original values, so any execution path that branches on a masked field MAY diverge. Authors who need replay-deterministic sensitive data SHOULD:

  • Use external secret storage (vault) and re-resolve during replay via a deterministic key.
  • OR use hash masking mode (audit-only equality) instead of mask / omit (which lose information).

Replay tooling MUST surface a warning when a :fork operation re-executes from a sequence that depended on a masked field — the replay may produce different outputs than the original. The replay.diverged event (already in the RunEvent enum) is the structured signal.

What this is NOT

  • The spec does NOT enforce retention or storage rules — those are deployer's collector / backend policy.
  • The spec does NOT detect PII automatically. Authors and pack maintainers MUST annotate fields. Auto-detection (regex-based, ML-based) is a vendor-pack feature, not a spec feature.
  • The classification class enum is intentionally small (5 values). Industry-specific subdivisions (HIPAA's 18 PHI identifiers, GDPR's "special categories") are NOT modeled at the spec level — those are domain-specific extensions in metadata.complianceConfig.

Reference implementation status (non-normative)

Non-normative. This section describes how operators can bridge legacy or host-private attribute names into the canonical openwop.* namespace. It does NOT modify the canonical openwop.* requirement above. New implementations SHOULD emit openwop.* directly.

Deployments consuming traces from a legacy implementation that used dotted attribute names such as openwop.workflow.id, openwop.run.id, or openwop.pauseRun.outcome can apply a per-deployment OTel collector aliasing rule:

# OTel collector config — alias host-private attributes to canonical openwop.*
processors:
  attributes/openwop_canonical:
    actions:
      - key: openwop.workflow_id
        from_attribute: openwop.workflow.id
        action: insert
      - key: openwop.run_id
        from_attribute: openwop.run.id
        action: insert
      # ... per-attribute mapping

Spec-compliant implementations MUST emit the canonical attributes directly; the aliasing pattern above is for migration only and is not normative.

Vendor aliasing (out of scope)

Operators who deploy OpenWOP-compliant servers and also use commercial observability platforms (Datadog, Honeycomb, LangSmith, etc.) typically need to alias openwop.* attributes to vendor-specific taxonomies. This is per-deployment configuration, NOT spec'd. Recommended pattern:

  • Run an OpenTelemetry Collector between the server and the vendor backend.
  • Apply an attributes processor that copies/renames openwop.* to the vendor's namespace.

Example aliasing rule (collector config snippet):

processors:
  attributes/aliasing:
    actions:
      - key: langgraph.thread_id
        from_attribute: openwop.run_id
        action: insert
      - key: langgraph.checkpoint_ns
        from_attribute: openwop.workflow_id
        action: insert

Spec compliance does NOT require any such mapping. A server that emits only openwop.* attributes is fully compliant; the operator chooses whether to bridge.


Implementer guidance

An OpenWOP-compliant server SHOULD:

1. Use a single OTel SDK instance for the lifetime of the process.
2. Configure the OTel resource with service.name matching the implementation's published name (e.g., @your-org/openwop-engine).
3. Set service.version to the published implementation version.
4. Sample spans according to the OTEL_TRACES_SAMPLER env conventions; default to parentbased_traceidratio with a ratio of 0.1 (OTEL_TRACES_SAMPLER_ARG=0.1, i.e., 10% sampling).
5. Emit logs at info level for openwop.* metricKind records and at error level for CapabilityLimitExceededError and unhandled failures.
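
As a rough illustration of the resource and sampling guidance above, the sketch below builds the two resource attributes and makes a parent-based ratio decision. The helper names and the placeholder version string are hypothetical, and the ratio check is a simplified stand-in, not the OTel SDK's own sampler; a real server would configure the SDK through the standard environment variables instead.

```typescript
// Illustrative only: names are hypothetical, and the ratio check is a
// simplified stand-in for the OTel SDK's trace-ID-ratio sampler.
const resourceAttributes: Record<string, string> = {
  "service.name": "@your-org/openwop-engine", // published implementation name
  "service.version": "1.0.0", // placeholder; use the published version
};

/** Ratio decision derived deterministically from the trace ID. */
function traceIdRatioDecision(traceId: string, ratio: number): boolean {
  // Treat the low 32 bits of the trace ID as a uniformly distributed number.
  const low32 = parseInt(traceId.slice(-8), 16);
  return low32 < Math.floor(ratio * 0x100000000);
}

/** Parent-based sampling: follow the parent if present, else apply the ratio. */
function shouldSample(
  traceId: string,
  parentSampled: boolean | undefined,
  ratio = 0.1,
): boolean {
  if (parentSampled !== undefined) return parentSampled;
  return traceIdRatioDecision(traceId, ratio);
}
```

Deriving the decision from the trace ID (rather than a fresh random number) means every service that sees the same trace reaches the same root decision.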

An OpenWOP-compliant client (CLI, SDK) SHOULD:

1. Generate a traceparent for every command that issues a request.
2. Display the trace ID in error messages so operators can search backend traces.
3. Surface openwop.run.claim.conflict events as user-actionable retry prompts.
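
The first two client recommendations can be sketched as follows: generate a W3C traceparent value and surface its trace ID in an error message. The function names are hypothetical, and Math.random is used only to keep the sketch dependency-free; a real client would more likely draw IDs from its OTel SDK or a cryptographic random source.

```typescript
/** Returns `byteCount` random bytes as lowercase hex (injectable RNG for testing). */
function randomHex(byteCount: number, random: () => number = Math.random): string {
  let hex = "";
  for (let i = 0; i < byteCount; i++) {
    hex += Math.floor(random() * 256).toString(16).padStart(2, "0");
  }
  return hex;
}

/** Builds a version-00 traceparent: 00-<trace-id>-<parent-id>-<flags>. */
function newTraceparent(sampled = true, random: () => number = Math.random): string {
  let traceId = randomHex(16, random);
  let spanId = randomHex(8, random);
  // W3C Trace Context treats all-zero trace and parent IDs as invalid.
  if (/^0+$/.test(traceId)) traceId = traceId.slice(0, -1) + "1";
  if (/^0+$/.test(spanId)) spanId = spanId.slice(0, -1) + "1";
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

/** Extracts the 32-hex-digit trace ID, or undefined if the header is malformed. */
function traceIdOf(traceparent: string): string | undefined {
  const match = /^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$/.exec(traceparent);
  return match ? match[1] : undefined;
}

/** Appends the trace ID so operators can search backend traces. */
function formatError(message: string, traceparent: string): string {
  return `${message} (trace ID: ${traceIdOf(traceparent) ?? "unknown"})`;
}
```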


Cost attribution attributes (closes O4)

For AI-driven activities (core.ai.callPrompt, core.ai.generateFromPrompt, openwop.activity.<provider> spans), servers SHOULD attach the following attributes when the underlying provider call returns billable usage info:

| Attribute | Type | Required | Notes |
|---|---|---|---|
| openwop.cost.tokens.input | number | SHOULD | Input/prompt tokens billed. |
| openwop.cost.tokens.output | number | SHOULD | Output/completion tokens billed. |
| openwop.cost.tokens.total | number | MAY | Convenience sum; consumers can compute themselves. |
| openwop.cost.usd | number | MAY | Estimated cost in USD. Servers SHOULD use a published rate card per model; if pricing is unavailable, omit rather than guess. |
| openwop.cost.currency | string | MAY | ISO 4217 code when openwop.cost.<currency> is non-USD (default usd). |
| openwop.cost.estimated | boolean | MAY | True when the cost was server-side computed rather than returned by the provider. |
| openwop.cost.provider | string | SHOULD | Provider name for cost attribution roll-up (e.g., openai, anthropic, google). Same value as the provider in openwop.activity.<provider> span name. |
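
A minimal sketch of how a server might assemble these attributes from a provider's usage report. The ProviderUsage and RateCard shapes and the function name are assumptions made for illustration; only the attribute names come from the table.

```typescript
type CostAttributes = Record<string, string | number | boolean>;

// Hypothetical shapes: what a provider call might report, and a per-model rate card.
interface ProviderUsage {
  provider: string;
  inputTokens: number;
  outputTokens: number;
  reportedUsd?: number; // set only when the provider itself returns a cost
}
interface RateCard {
  inputUsdPerMillion: number;
  outputUsdPerMillion: number;
}

function buildCostAttributes(usage: ProviderUsage, rate?: RateCard): CostAttributes {
  const attrs: CostAttributes = {
    "openwop.cost.provider": usage.provider,
    "openwop.cost.tokens.input": usage.inputTokens,
    "openwop.cost.tokens.output": usage.outputTokens,
    "openwop.cost.tokens.total": usage.inputTokens + usage.outputTokens,
  };
  if (usage.reportedUsd !== undefined) {
    attrs["openwop.cost.usd"] = usage.reportedUsd;
    attrs["openwop.cost.estimated"] = false; // returned by the provider
  } else if (rate !== undefined) {
    attrs["openwop.cost.usd"] =
      (usage.inputTokens * rate.inputUsdPerMillion +
        usage.outputTokens * rate.outputUsdPerMillion) / 1e6;
    attrs["openwop.cost.estimated"] = true; // computed server-side
  }
  // No provider cost and no rate card: omit openwop.cost.usd rather than guess.
  return attrs;
}
```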

Aggregation guidance: dashboards SHOULD roll up openwop.cost.tokens.* and openwop.cost.usd by openwop.workflow_id, openwop.tenant_id, openwop.scope_id, and openwop.cost.provider. The dimension cardinality is bounded by tenant/project counts and the (small) provider list; safe for OTel histograms.
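
The roll-up described above could look like the following on the consumer side. The CostSample shape and function name are hypothetical; the grouping dimensions mirror the four attributes named in the guidance.

```typescript
// Hypothetical flattened view of one span's cost attributes plus its dimensions.
interface CostSample {
  workflowId: string;
  tenantId: string;
  scopeId: string;
  provider: string;
  tokensInput: number;
  tokensOutput: number;
  usd?: number;
}
interface CostTotals {
  tokensInput: number;
  tokensOutput: number;
  usd: number;
}

/** Sums tokens and USD per (workflow, tenant, scope, provider) combination. */
function rollUpCosts(samples: CostSample[]): Map<string, CostTotals> {
  const totals = new Map<string, CostTotals>();
  for (const s of samples) {
    // JSON-encode the dimension tuple so it can serve as an unambiguous map key.
    const key = JSON.stringify([s.workflowId, s.tenantId, s.scopeId, s.provider]);
    const current = totals.get(key) ?? { tokensInput: 0, tokensOutput: 0, usd: 0 };
    current.tokensInput += s.tokensInput;
    current.tokensOutput += s.tokensOutput;
    current.usd += s.usd ?? 0; // samples without a cost contribute tokens only
    totals.set(key, current);
  }
  return totals;
}
```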

metricKind extension:

| metricKind | When | Required fields |
|---|---|---|
| openwop.cost.recorded | After every billable AI activity | runId, nodeId, provider, tokensInput, tokensOutput, usd?, currency?, estimated? |

Privacy: cost attributes MUST NOT include the prompt/response text (use openwop.cost.tokens.* for billable counts, never substring excerpts).

Allowlist enforcement: hosts that emit openwop.cost.* attributes onto OTel spans MUST route the emission through an allowlist sanitizer that drops any attribute name outside the canonical set enumerated in the table above (openwop.cost.tokens.input, openwop.cost.tokens.output, openwop.cost.tokens.total, openwop.cost.usd, openwop.cost.currency, openwop.cost.estimated, openwop.cost.provider). The sanitizer MUST also drop non-primitive values (objects, arrays, null, undefined, functions, symbols) — cost attributes are flat primitives. The intent is defense-in-depth: a buggy upstream that smuggles a credential-shaped value into an unfamiliar key name (e.g., openwop.cost.leaked_token) MUST NOT see that value reach observability. Enforced by SECURITY/invariants.yaml row cost-attribution-allowlist-redaction + the public cost-attribution.test.ts conformance scenario.
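
One possible shape for such a sanitizer is sketched below. The function and constant names are hypothetical; the allowlist itself is the canonical set from the table. As a simplifying assumption, this sketch keeps only string, number, and boolean values, so it also drops bigint, which the text above does not mention.

```typescript
const COST_ATTRIBUTE_ALLOWLIST: ReadonlySet<string> = new Set([
  "openwop.cost.tokens.input",
  "openwop.cost.tokens.output",
  "openwop.cost.tokens.total",
  "openwop.cost.usd",
  "openwop.cost.currency",
  "openwop.cost.estimated",
  "openwop.cost.provider",
]);

type AttributeValue = string | number | boolean;

/** Returns a copy containing only allowlisted keys with flat primitive values. */
function sanitizeCostAttributes(raw: Record<string, unknown>): Record<string, AttributeValue> {
  const clean: Record<string, AttributeValue> = {};
  for (const [key, value] of Object.entries(raw)) {
    if (!COST_ATTRIBUTE_ALLOWLIST.has(key)) continue; // unknown key: drop
    if (
      typeof value === "string" ||
      typeof value === "number" ||
      typeof value === "boolean"
    ) {
      clean[key] = value;
    }
    // Objects, arrays, null, undefined, functions, and symbols fall through and are dropped.
  }
  return clean;
}
```

Building a fresh object, rather than deleting keys from the input, means a value that was never explicitly approved cannot reach the span by accident.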


Provider usage events (RFC 0026)

The OTel openwop.cost.* attribute group above is the observability sibling; the durable event-log sibling is the provider.usage event type added by RFC 0026. Hosts that advertise capabilities.providerUsage.supported: true MUST emit exactly ONE provider.usage event per LLM provider invocation, BEFORE the corresponding node.completed. The event carries required {provider, model, inputTokens, outputTokens} plus optional {totalTokens, costEstimateUsd, currency, cacheHit, nodeId, traceId}. Hosts that don't advertise the capability omit the event entirely; old consumers that ignore unknown event types are unaffected per COMPATIBILITY.md §2.1.

The event is REPLAY-DETERMINISTIC for inputTokens + outputTokens (drawn from the cached provider response on replay); costEstimateUsd MAY be omitted on replay even when the original emission included it, since the host's rate table may have changed between runs. The OTel projection (§"Cost attribution attributes" above) is RECOMMENDED but NOT REQUIRED — hosts MAY emit only the event when they don't run an OTel exporter.

The payload MUST NOT carry credentialRefs, hashed credential identifiers, or prompt/response substrings — same redaction posture as the OTel attributes per SECURITY/threat-model-secret-leakage.md §SR-1. Enforced by SECURITY/invariants.yaml row provider-usage-no-credential-leak.
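
The payload and ordering rules in this section can be checked with a small validator like the sketch below. The RunEvent shape, the assumption that node.completed carries a nodeId in its payload, and both function names are illustrative only; the check also assumes each node completes at most once in the inspected log and cannot detect prompt/response substrings.

```typescript
// Hypothetical minimal event shape; events are assumed to arrive in log order.
interface RunEvent {
  type: string;
  payload: Record<string, unknown>;
}

/** Lists problems with a provider.usage payload (empty array means none found). */
function providerUsageProblems(payload: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of ["provider", "model"]) {
    if (typeof payload[field] !== "string") problems.push(`missing or non-string ${field}`);
  }
  for (const field of ["inputTokens", "outputTokens"]) {
    if (typeof payload[field] !== "number") problems.push(`missing or non-numeric ${field}`);
  }
  if ("credentialRefs" in payload) problems.push("credentialRefs must not appear in the payload");
  return problems;
}

/** True when no provider.usage event follows node.completed for the same nodeId. */
function usagePrecedesCompletion(events: RunEvent[]): boolean {
  const completedNodes = new Set<string>();
  for (const event of events) {
    const nodeId = event.payload["nodeId"];
    if (typeof nodeId !== "string") continue; // nodeId is optional on provider.usage
    if (event.type === "provider.usage" && completedNodes.has(nodeId)) return false;
    if (event.type === "node.completed") completedNodes.add(nodeId);
  }
  return true;
}
```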


Envelope-reliability events (RFC 0032)

RFC 0032 defines six cross-kind operational RunEventType entries that standardize the protocol vocabulary for envelope-emission reliability behavior: retry attempts, retry exhaustion, refusals, truncations, NL-to-Format fallback engagement, and lenient-parsing recovery. See ai-envelope.md §"Envelope-reliability events" for the normative spec.

Hosts that advertise capabilities.envelopes.reliability.supported: true MUST emit envelope.retry.exhausted and envelope.refusal (the two MUST-tier events). The other four (envelope.retry.attempted, envelope.truncated, envelope.nlToFormat.engaged, envelope.recovery.applied) are SHOULD/MAY-tier and listed in events[] only when the host actually emits them.

Hosts SHOULD project the events into the existing OTel attribute group on the envelope-emitting node's span:

| Event | OTel attribute group |
|---|---|
| envelope.retry.attempted | openwop.envelope.retry.attempt (integer) + openwop.envelope.retry.reason (string) |
| envelope.retry.exhausted | openwop.envelope.retry.total_attempts + openwop.envelope.retry.final_reason |
| envelope.refusal | openwop.envelope.refusal.safety_category (string, when present). refusalText is omitted from OTel by default; see §"Trust boundary + redaction" below |
| envelope.truncated | openwop.envelope.truncated.stop_reason + openwop.envelope.truncated.output_token_count |
| envelope.nlToFormat.engaged | openwop.envelope.nl_to_format.fallback_calls |
| envelope.recovery.applied | openwop.envelope.recovery.path + openwop.envelope.recovery.byte_offset (when present) |
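
The mapping in the table can be expressed as a single projection function, sketched below. The OTel attribute names come from the table; the event payload field names (attempt, reason, totalAttempts, safetyCategory, stopReason, outputTokenCount, fallbackCalls, path, byteOffset) are assumptions for illustration, since the normative payload shapes live in ai-envelope.md.

```typescript
type OtelAttributes = Record<string, string | number>;

// Hypothetical payload shapes, one per event type in the table.
type EnvelopeEvent =
  | { type: "envelope.retry.attempted"; attempt: number; reason: string }
  | { type: "envelope.retry.exhausted"; totalAttempts: number; finalReason: string }
  | { type: "envelope.refusal"; safetyCategory?: string; refusalText?: string }
  | { type: "envelope.truncated"; stopReason: string; outputTokenCount: number }
  | { type: "envelope.nlToFormat.engaged"; fallbackCalls: number }
  | { type: "envelope.recovery.applied"; path: string; byteOffset?: number };

function projectEnvelopeEvent(event: EnvelopeEvent): OtelAttributes {
  switch (event.type) {
    case "envelope.retry.attempted":
      return {
        "openwop.envelope.retry.attempt": event.attempt,
        "openwop.envelope.retry.reason": event.reason,
      };
    case "envelope.retry.exhausted":
      return {
        "openwop.envelope.retry.total_attempts": event.totalAttempts,
        "openwop.envelope.retry.final_reason": event.finalReason,
      };
    case "envelope.refusal":
      // refusalText is deliberately never copied onto the span.
      return event.safetyCategory !== undefined
        ? { "openwop.envelope.refusal.safety_category": event.safetyCategory }
        : {};
    case "envelope.truncated":
      return {
        "openwop.envelope.truncated.stop_reason": event.stopReason,
        "openwop.envelope.truncated.output_token_count": event.outputTokenCount,
      };
    case "envelope.nlToFormat.engaged":
      return { "openwop.envelope.nl_to_format.fallback_calls": event.fallbackCalls };
    case "envelope.recovery.applied": {
      const attrs: OtelAttributes = { "openwop.envelope.recovery.path": event.path };
      if (event.byteOffset !== undefined) {
        attrs["openwop.envelope.recovery.byte_offset"] = event.byteOffset;
      }
      return attrs;
    }
  }
}
```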

The event log is the load-bearing surface (for replay determinism + webhook subscribers); the OTel projection is supplementary. Hosts that don't run an OTel exporter MAY emit only the events.

Trust boundary + redaction

Event payloads that carry diagnostic strings (previousError, finalError, refusalText) MUST be passed through the same SR-1 redaction harness applied to envelope payloads per ai-envelope.md §"Redaction (SR-1 carry-forward)". The envelope.refusal.refusalText field is particularly load-bearing — provider safety-refusal messages can echo offending prompt content. The OTel projection of envelope.refusal omits refusalText by default; operators who want refusal text in dashboards plumb it through their own pipeline where they own the redaction policy.

SECURITY invariants envelope-refusal-no-prompt-leak (high severity) and envelope-recovery-no-content-leak (high severity) enforce this discipline (gate timing: lands with reference-host implementation, per the RFC 0027 §G staging precedent).


Envelope-completion retry routing (RFC 0033)

Companion to the envelope-reliability event vocabulary above. RFC 0033 specifies the retry-routing semantics, in particular the distinction between truncation and schema violation. Hosts that advertise capabilities.envelopes.reliability.completion.distinguishesTruncation: true MUST honor that distinction:

  • Truncation (stop_reason: max_tokens or equivalent) → retry with INCREASED output budget (RECOMMENDED 2× multiplier, configurable via capabilities.envelopes.reliability.completion.truncationBudgetMultiplier); MUST NOT include a corrective schema fragment in the retry's system prompt.
  • Schema violation (clean stop + payload doesn't validate) → retry with corrective system fragment describing the validator's failure; MUST NOT increase the output budget.

Both paths count against capabilities.limits.schemaRounds. Exhaustion in the truncation path emits envelope.retry.exhausted { finalReason: "truncation" } + cap.breached { kind: "schema" } + node fails with error code envelope_truncation_unrecoverable. Exhaustion in the schema-violation path emits envelope.retry.exhausted { finalReason: "schema-violation" } + cap.breached + node fails with envelope_invalid (renamed from envelope_payload_invalid per the 2026-05-21 RFC adoption-feedback amendment). Refusal path (RFC 0032 §B.3) is terminal — NO retry — and fails with envelope_refusal (renamed from envelope_refused_by_provider).
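
The routing rules above reduce to a small decision function. The sketch below is one way to write it; the type names, the roundsUsed/maxRounds bookkeeping (with maxRounds standing in for capabilities.limits.schemaRounds), and the rounding of the enlarged budget are assumptions, while the error codes and the truncation versus schema-violation behavior follow the text.

```typescript
type FailureKind = "truncation" | "schema-violation" | "refusal";

interface RetryState {
  roundsUsed: number; // retries already consumed for this node
  maxRounds: number; // assumed to come from capabilities.limits.schemaRounds
  outputBudget: number; // current max output tokens
}

type RetryDecision =
  | { action: "retry"; outputBudget: number; correctiveFragment?: string }
  | { action: "fail"; errorCode: string };

function routeRetry(
  kind: FailureKind,
  state: RetryState,
  validatorMessage = "",
  truncationBudgetMultiplier = 2, // RECOMMENDED default per the text
): RetryDecision {
  // Refusals are terminal: never retried.
  if (kind === "refusal") return { action: "fail", errorCode: "envelope_refusal" };

  // Both retry paths draw from the same round budget.
  if (state.roundsUsed >= state.maxRounds) {
    return {
      action: "fail",
      errorCode: kind === "truncation" ? "envelope_truncation_unrecoverable" : "envelope_invalid",
    };
  }

  if (kind === "truncation") {
    // Bigger budget, and no corrective schema fragment.
    return { action: "retry", outputBudget: Math.ceil(state.outputBudget * truncationBudgetMultiplier) };
  }
  // Schema violation: same budget, plus a fragment describing the validator failure.
  return { action: "retry", outputBudget: state.outputBudget, correctiveFragment: validatorMessage };
}
```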

See spec/v1/rest-endpoints.md §"Common error codes" for the two new codes; ai-envelope.md §"Envelope-completion criteria" for the normative completion criteria.


OTel collector test seam (RFC 0034)

Per RFC 0034 (Active 2026-05-21).

Cross-host conformance scenarios need an introspection endpoint to verify that BYOK canaries do not leak into OTel span attributes or debug-bundle exports. The two protocol-tier SECURITY invariants secret-leakage-otel-attribute and secret-leakage-debug-bundle-otel (SECURITY/invariants.yaml) graduate from reference-impl to protocol tier on the strength of this test seam.

The seams live under the host-extensions.md §"Canonical prefixes" namespace /v1/host/sample/test/* and are NOT part of the v1 wire surface. Production hosts SHOULD return 404 or 403 from the seam unless an env-gate (e.g., OPENWOP_TEST_OTEL_SCRAPE=true) is set.

GET /v1/host/sample/test/otel/spans?runId=<id>

When capabilities.observability.testSeams.otelScrape: true, the host MUST return 200 OK with body { spans: Array<{ name, attributes, events }> }. The spans array MUST include every OTel span produced by the host's instrumentation for the named run, including any openwop.*-prefixed attributes added to span context. Hosts MAY redact span content using the canonical [REDACTED:<secretId>] marker per agent-memory.md §"SR-1 secret-redaction invariant"; that redaction is the contract being tested.
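
A conformance scenario might scan the scraped body for a BYOK canary roughly as sketched below. The ScrapedSpan typing (in particular treating attributes and events as opaque JSON) and the function name are assumptions; the sketch also assumes the canary is a plain alphanumeric string, so JSON escaping cannot hide it.

```typescript
// Assumed typing of one entry in the `spans` array of the seam response.
interface ScrapedSpan {
  name: string;
  attributes: Record<string, unknown>;
  events: unknown[];
}

interface CanaryLeak {
  spanName: string;
  location: "attributes" | "events";
}

/** Reports every span whose attributes or events contain the canary string. */
function findCanaryLeaks(spans: ScrapedSpan[], canary: string): CanaryLeak[] {
  const leaks: CanaryLeak[] = [];
  for (const span of spans) {
    // Serialize so nested values (arrays, objects) are searched too.
    if (JSON.stringify(span.attributes).includes(canary)) {
      leaks.push({ spanName: span.name, location: "attributes" });
    }
    if (JSON.stringify(span.events).includes(canary)) {
      leaks.push({ spanName: span.name, location: "events" });
    }
  }
  return leaks;
}
```

A span whose secret was replaced by the [REDACTED:<secretId>] marker produces no leak entry, which is the passing outcome.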

POST /v1/host/sample/test/debug-bundle/export

When capabilities.observability.testSeams.debugBundleExport: true, the host MUST return 200 OK with the same payload shape as GET /v1/runs/{runId}/debug-bundle per spec/v1/debug-bundle.md. The seam exists to give conformance scenarios a synchronous endpoint they can hit without first triggering an interrupt → debug bundle workflow.

Capability advertisement (normative)

Hosts that implement either seam advertise it under /.well-known/openwop:

{
  "capabilities": {
    "observability": {
      "testSeams": {
        "otelScrape": true,
        "debugBundleExport": true
      }
    }
  }
}

A host that advertises testSeams.otelScrape: true but returns 404 / 5xx from the seam is non-conformant. Hosts that do NOT implement the seam MUST omit the field (or set it to false); conformance scenarios skip cleanly when the capability is absent.
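
The skip-or-fail rule above can be sketched as a pure function over the discovery document and the status code returned by the seam. The Discovery typing and function names are hypothetical; fetching /.well-known/openwop and probing the seam are left to the caller.

```typescript
// Assumed partial typing of the /.well-known/openwop document.
interface Discovery {
  capabilities?: {
    observability?: {
      testSeams?: { otelScrape?: boolean; debugBundleExport?: boolean };
    };
  };
}

type SeamVerdict = "skip" | "pass" | "non-conformant";

/** An absent field and an explicit false are treated the same: not advertised. */
function otelScrapeAdvertised(doc: Discovery): boolean {
  return doc.capabilities?.observability?.testSeams?.otelScrape === true;
}

/** Classifies a host given its advertisement and the seam's HTTP status. */
function otelScrapeVerdict(doc: Discovery, seamStatus: number): SeamVerdict {
  if (!otelScrapeAdvertised(doc)) return "skip"; // scenario skips cleanly
  return seamStatus === 200 ? "pass" : "non-conformant";
}
```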

Quality signals (RFC 0056)

Observability above covers _what an agent did_; annotations cover _whether a human (or a supervisor agent) judged it good_. RFC 0056 defines a non-blocking quality signal — rating / correction / label / flag — attached to a run, event, or node, recorded via POST /v1/runs/{runId}/annotations and surfaced live via the run.annotated SSE notification.

Annotations are a per-run side-resource, NOT entries in the replayable run event log (so they never enter fork/replay; see replay.md). A host that advertises capabilities.feedback.supported: true MUST:

  • record annotations tenant-scoped — an annotation is visible only within its run's tenant (SECURITY invariant annotation-cross-tenant-isolation);
  • redact secret-shaped material in signal.correction and note before persistence, listing, and export, per SR-1 (SECURITY invariant annotation-content-redaction);
  • audit-log each recording with the acting principal (auth.md).

Consumers derive quality metrics (correction rate, mean rating, flag rate) from this surface; they complement — but are distinct from — the openwop.* telemetry spans/metrics above. See RFCS/0056.
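
As an illustration of deriving those metrics, the sketch below computes correction rate, flag rate, and mean rating over a list of annotations. The Annotation and Signal shapes are assumptions (RFC 0056 holds the normative schema), and the rates are defined here as a fraction of all annotations, which is only one of several reasonable denominators.

```typescript
// Hypothetical signal shapes for the four kinds named above.
type Signal =
  | { kind: "rating"; value: number }
  | { kind: "correction"; correction: string }
  | { kind: "label"; label: string }
  | { kind: "flag" };

interface Annotation {
  runId: string;
  signal: Signal;
  note?: string;
}

interface QualityMetrics {
  correctionRate: number; // corrections / all annotations
  flagRate: number; // flags / all annotations
  meanRating: number | undefined; // undefined when nothing was rated
}

function computeQualityMetrics(annotations: Annotation[]): QualityMetrics {
  let corrections = 0;
  let flags = 0;
  let ratingSum = 0;
  let ratingCount = 0;
  for (const annotation of annotations) {
    const signal = annotation.signal;
    switch (signal.kind) {
      case "correction":
        corrections++;
        break;
      case "flag":
        flags++;
        break;
      case "rating":
        ratingSum += signal.value;
        ratingCount++;
        break;
      case "label":
        break; // labels do not feed these three metrics
    }
  }
  const total = annotations.length;
  return {
    correctionRate: total === 0 ? 0 : corrections / total,
    flagRate: total === 0 ? 0 : flags / total,
    meanRating: ratingCount === 0 ? undefined : ratingSum / ratingCount,
  };
}
```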

Open spec gaps

| # | Gap | Owner |
|---|---|---|
| O1 | Full OTel metric definitions: done (2026-04-27: 13 metrics defined in semconv style under "OpenTelemetry metrics (full)" §, with instrument / unit / attributes / recommended histogram buckets / stability tier per metric. All 13 Stable as of O4 promotion). Cardinality bounds documented per attribute. | |
| O2 | Sub-workflow span linkage: done (2026-04-27: child openwop.run is a parent-child span of the invoke-node's openwop.node.<typeId> (causal nesting); three required attributes openwop.parent.run_id, openwop.parent.workflow_id, openwop.parent.node_id. Parent forwards W3C traceparent on REST/MCP/A2A invocation. See "Sub-workflow attributes" §). | |
| O3 | Replay/branch span linkage: done (2026-04-27: forked openwop.run carries an OTel Link to the source span + three required attributes openwop.replay.source_run_id, openwop.replay.from_seq, openwop.replay.mode. See "Replay / branch attributes" §). | |
| O4 | Cost attribution attributes: done (2026-04-27: typed openwop.cost.tokens.* + openwop.cost.usd + openwop.cost.estimated attributes; openwop.cost.recorded log metric; openwop.cost.usd OTel metric promoted Experimental → Stable). | |
| O5 | Privacy classification: done (2026-04-27: full surface with three span attributes (openwop.pii_present, openwop.compliance_class, openwop.sensitive_fields) + workflow-level metadata.complianceClass + field markers on variables/nodes/channels/pack outputs + four masking modes (mask/omit/hash/passthrough) advertised via Capabilities.compliance.defaultMode. See "Privacy classification" §). | |

References

  • auth.md: auth model + status legend
  • rest-endpoints.md: endpoint catalog (canonical traceparent/tracestate headers)
  • idempotency.md: openwop.activity.invoked.idempotencyHit? field
  • capabilities.md: CapabilityLimitExceededError shape (powering openwop.cap.exceeded)
  • W3C Trace Context: <https://www.w3.org/TR/trace-context/>
  • OpenTelemetry semantic conventions: <https://opentelemetry.io/docs/specs/semconv/>
  • schemas/debug-bundle.schema.json — portable diagnostic export shape