UIAP Conformance Suite

Field         Value
Status        Draft
Version       0.1
Date          2026-03-27
Dependencies  All UIAP specifications
Editors       Patrick

UIAP Conformance Suite v0.1 defines how implementations of UIAP specifications are tested, evaluated and classified as conformant or non-conformant in a standardized manner.

The Conformance Suite covers these specification areas:

  • UIAP Core v0.1
  • UIAP Capability Model v0.1
  • UIAP Web Profile v0.1
  • UIAP Action Runtime Spec v0.1
  • UIAP Policy Extension v0.1
  • UIAP SDK API v0.1
  • UIAP Workflow Extension v0.1
  • UIAP Discovery Mapper Spec v0.1
  • UIAP Authoring/Manifest Spec v0.1

The suite defines:

  • Conformance classes and claims
  • Test modules and test cases
  • Harness and fixture model
  • Execution and evaluation rules
  • Result and report format
  • Composite profiles for real-world product claims

The suite does not define:

  • Product certifications by a specific organization
  • Commercial labels or trademarks
  • Model quality benchmarking
  • LLM intelligence tests in the sense of “does the agent seem cool enough”

  1. Conformance is claims-based. Only what an implementation actually claims is tested.
  2. Required tests are binding. A passed claim requires passing all associated required cases.
  3. Composite profiles build on base profiles.
  4. Negative tests carry the same weight as positive tests.
  5. Determinism beats romance. Repeatable inputs MUST produce reproducible results, insofar as the specification demands it.
  6. Conformance measures specification adherence, not product quality.
  7. Self-asserted compatibility without suite results is not proof of conformance.

The key words MUST, MUST NOT, SHOULD, MAY in this document are to be interpreted as described in BCP 14 (RFC 2119, RFC 8174) when, and only when, they appear in all capitals.


An implementation MAY claim one or more of these roles:

type ConformanceRole =
  | "CoreEndpoint"
  | "CapabilityProvider"
  | "WebPublisher"
  | "WebConsumer"
  | "RuntimeExecutor"
  | "RuntimeController"
  | "PolicyEvaluator"
  | "WebSDK"
  | "WorkflowEngine"
  | "DiscoveryMapper"
  | "AuthoringCompiler";

  • CoreEndpoint: speaks UIAP Core correctly
  • CapabilityProvider: delivers capability documents
  • WebPublisher: publishes PageGraph, deltas and web signals
  • WebConsumer: consumes web states correctly
  • RuntimeExecutor: executes actions according to the Runtime Spec
  • RuntimeController: controls actions, confirmations, cancel
  • PolicyEvaluator: delivers and enforces policy decisions
  • WebSDK: implements the SDK API in a browser/host context
  • WorkflowEngine: executes workflows
  • DiscoveryMapper: explores apps in a controlled manner and produces discovery packages
  • AuthoringCompiler: compiles authoring manifests into normalized bundles

Every implementation under test MUST declare its claims in a machine-readable format.

interface ConformanceClaimSet {
  suiteVersion: "0.1";
  implementation: ImplementationDescriptor;
  claims: ProfileClaim[];
}

interface ImplementationDescriptor {
  vendor: string;
  product: string;
  version: string;
  buildId?: string;
  components: ImplementedComponent[];
  metadata?: Record<string, unknown>;
}

interface ImplementedComponent {
  id: string;
  roles: ConformanceRole[];
  uiapCore?: string;
  profiles?: string[];
  extensions?: Array<{
    id: string;
    version: string;
  }>;
  metadata?: Record<string, unknown>;
}

interface ProfileClaim {
  componentId: string;
  profileId: ConformanceProfileId;
}
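
Before any test runs, a harness could sanity-check a declared claim set. The sketch below is illustrative only: `checkClaimSet` and the `*Like` types are hypothetical, `ConformanceProfileId` is treated as a plain string because its definition is not shown here, and the individual checks (unique component IDs, claims pointing at declared components) are assumptions rather than rules stated by the suite.

```typescript
// Trimmed-down local copies of the claim types; names ending in "Like" are
// hypothetical and exist only to keep this sketch self-contained.
type ConformanceProfileId = string; // assumption: the real definition is not shown

interface ComponentLike {
  id: string;
  roles: string[];
}

interface ClaimLike {
  componentId: string;
  profileId: ConformanceProfileId;
}

interface ClaimSetLike {
  suiteVersion: string;
  implementation: { components: ComponentLike[] };
  claims: ClaimLike[];
}

// Returns human-readable problems; an empty array means nothing was found.
function checkClaimSet(set: ClaimSetLike): string[] {
  const problems: string[] = [];
  if (set.suiteVersion !== "0.1") {
    problems.push("unsupported suiteVersion: " + set.suiteVersion);
  }
  const componentIds = new Set<string>();
  for (const component of set.implementation.components) {
    if (componentIds.has(component.id)) {
      problems.push("duplicate component id: " + component.id);
    }
    componentIds.add(component.id);
    if (component.roles.length === 0) {
      problems.push("component without roles: " + component.id);
    }
  }
  for (const claim of set.claims) {
    if (!componentIds.has(claim.componentId)) {
      problems.push("claim references unknown component: " + claim.componentId);
    }
  }
  return problems;
}
```

A harness might call this from `loadClaims` and refuse to continue when the returned list is not empty.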

No dependencies.

Depends on:

Depends on:

Depends on:

Depends on:

Depends on:

Depends on:

Depends on:

Depends on:

No runtime dependency, but package/bundle conformance.

Depends on:

Depends on all profiles.


type TestStatus =
  | "pass"
  | "fail"
  | "skip"
  | "warn"
  | "invalid";

  • pass: test fulfills all normative assertions
  • fail: test violates at least one normative assertion
  • skip: test is not applicable for the claim or depends on an unclaimed optional capability
  • warn: test case is not required or is informational only, but indicates a relevant issue
  • invalid: test could not be correctly evaluated due to a harness, fixture or environment error

A profile is considered passed when:

  1. all required test cases of the profile and its dependencies have status pass,
  2. no required test has status fail or invalid.

A profile is considered failed when:

  • at least one required test is fail or invalid.

A profile MAY still pass despite warn.
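
The pass and fail rules above can be expressed as a small predicate. This is an illustrative reading, not part of the suite: `EvaluatedCase` and `profilePassed` are hypothetical names, the caller is assumed to pass in the cases of the profile together with those of its dependencies, and a required case that was skipped counts as not passed because condition 1 is read literally.

```typescript
type TestStatus = "pass" | "fail" | "skip" | "warn" | "invalid";
type TestSeverity = "required" | "optional" | "informational";

// Hypothetical pairing of a test case's severity with its outcome.
interface EvaluatedCase {
  id: string;
  severity: TestSeverity;
  status: TestStatus;
}

// A profile passes only when every required case has status "pass".
// Optional and informational cases never block the profile.
function profilePassed(cases: EvaluatedCase[]): boolean {
  return cases
    .filter((c) => c.severity === "required")
    .every((c) => c.status === "pass");
}
```

A `warn` on an optional case leaves the result untouched, while a single `invalid` on a required case is enough to fail the profile.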


type TestSeverity =
  | "required"
  | "optional"
  | "informational";

  • required: determines conformance
  • optional: enhances the expressiveness of results but does not block base conformance
  • informational: provides metrics or hints but has no pass/fail effect on profiles

type TestMethod =
  | "blackbox"
  | "graybox"
  | "whitebox";

  • blackbox: verification only through the public protocol/API surface
  • graybox: verification with insight into declared artifacts, bundles or SDK hooks
  • whitebox: verification with internal instrumentation

For UIAP conformance, the emphasis SHOULD be on blackbox and graybox. Whitebox tests MAY supplement base conformance testing but MUST NOT replace it.


A conformance harness is the system that executes tests against the implementation under test.

interface ConformanceHarness {
  loadClaims(claims: ConformanceClaimSet): Promise<void>;
  prepareFixtures(fixtures: FixtureRef[]): Promise<void>;
  runTest(testId: string): Promise<TestCaseResult>;
  runProfile(profileId: ConformanceProfileId): Promise<ProfileResult>;
}

A conformant harness SHOULD have these functional parts:

  • Transport Adapter for UIAP messages
  • Fixture Loader for web app, bundle and discovery fixtures
  • Oracle/Assertion Engine
  • Artifact Collector
  • Result Aggregator
  • Clock/Timeout Controller

interface FixtureRef {
  id: string;
  type:
    | "message-sequence"
    | "web-app"
    | "compiled-bundle"
    | "policy-context"
    | "workflow-catalog"
    | "discovery-environment"
    | "authoring-package";
  version: string;
  metadata?: Record<string, unknown>;
}

The suite SHOULD define at least these reference fixtures:


The harness MUST be able to store evidence for test decisions.

type ArtifactType =
  | "message-trace"
  | "snapshot"
  | "delta"
  | "policy-decision"
  | "policy-audit"
  | "workflow-history"
  | "compiled-bundle"
  | "discovery-package"
  | "screenshot"
  | "log"
  | "diff";

interface ArtifactRef {
  id: string;
  type: ArtifactType;
  uri?: string;
  note?: string;
}

  • Every fail or invalid case MUST have at least one referenceable artifact.
  • message-trace SHOULD be required for core, runtime and workflow cases.
  • compiled-bundle SHOULD be required for authoring cases.
  • discovery-package SHOULD be required for discovery cases.
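
The first rule above (every fail or invalid case needs evidence) lends itself to an automated check. The following is a minimal sketch under stated assumptions: `missingEvidence` and `ResultLike` are hypothetical names, and the artifact type is simplified to a plain string to keep the example short.

```typescript
// Simplified local shapes; the artifact type is a plain string in this sketch.
interface ArtifactLike {
  id: string;
  type: string;
  uri?: string;
  note?: string;
}

interface ResultLike {
  id: string;
  status: "pass" | "fail" | "skip" | "warn" | "invalid";
  artifacts?: ArtifactLike[];
}

// Returns the IDs of fail/invalid cases that carry no referenceable artifact.
function missingEvidence(results: ResultLike[]): string[] {
  return results
    .filter((r) => r.status === "fail" || r.status === "invalid")
    .filter((r) => r.artifacts === undefined || r.artifacts.length === 0)
    .map((r) => r.id);
}
```

A result aggregator could run this before emitting a report and reject the run when the returned list is not empty.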

interface TestCaseDefinition {
  id: string;
  moduleId: string;
  title: string;
  targetRoles: ConformanceRole[];
  severity: TestSeverity;
  method: TestMethod;
  prerequisites?: string[];
  fixtures?: string[];
  purpose: string;
  assertions: AssertionDefinition[];
  metadata?: Record<string, unknown>;
}

interface AssertionDefinition {
  id: string;
  description: string;
  required: boolean;
}

interface TestCaseResult {
  id: string;
  status: TestStatus;
  durationMs: number;
  assertions: AssertionResult[];
  artifacts?: ArtifactRef[];
  notes?: string[];
}

interface AssertionResult {
  id: string;
  status: "pass" | "fail" | "skip";
  message?: string;
}

type ConformanceModuleId =
  | "core"
  | "capabilities"
  | "web.profile"
  | "action.runtime"
  | "policy"
  | "sdk.web"
  | "workflow"
  | "discovery"
  | "authoring";

Goal: Envelope required fields are correctly validated. Assertions:

  • Envelope without uiap is rejected
  • Envelope without kind, type, id, ts, source, payload is rejected
  • payload MUST be an object
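
A harness-side oracle for the envelope assertions above might look like the following sketch. The field names come from the assertions themselves; `envelopeProblems` is a hypothetical helper, and treating arrays and null as non-objects is an assumption, since the text only says that payload MUST be an object.

```typescript
// Required envelope fields as listed in the assertions above.
const REQUIRED_ENVELOPE_FIELDS = ["uiap", "kind", "type", "id", "ts", "source", "payload"];

// Returns a list of violations; an empty list means the envelope is accepted.
function envelopeProblems(candidate: unknown): string[] {
  if (typeof candidate !== "object" || candidate === null) {
    return ["envelope is not an object"];
  }
  const envelope = candidate as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of REQUIRED_ENVELOPE_FIELDS) {
    if (envelope[field] === undefined) {
      problems.push("missing field: " + field);
    }
  }
  const payload = envelope["payload"];
  // Assumption: arrays and null do not count as an object payload.
  if (
    payload !== undefined &&
    (typeof payload !== "object" || payload === null || Array.isArray(payload))
  ) {
    problems.push("payload is not an object");
  }
  return problems;
}
```

A negative test would feed in envelopes with one field removed at a time and expect the implementation under test to reject each of them.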

Goal: Response and error correlation. Assertions:

  • response MUST carry correlationId
  • error MUST carry correlationId

Goal: Successful session handshake. Assertions:

  • session.initialize leads to session.initialized
  • selectedVersion comes from the offered version set
  • Session becomes active

Goal: Error on incompatible version. Assertions:

  • Handshake ends with unsupported_version

Goal: Error on missing required extension. Assertions:

  • Missing required extension leads to unsupported_extension or unsupported_profile

Goal: session.ping / session.pong. Assertions:

  • pong correlates semantically to the ping
  • Session remains active

Goal: Orderly termination. Assertions:

  • session.terminate leads to session.terminated
  • No further normal messages are processed afterwards

Goal: capabilities.get / capabilities.list. Assertions:

  • Capability response is formally valid
  • capabilities.list contains capabilities

Goal: Error object and error codes. Assertions:

  • error.payload.code is set
  • message is set

Goal: Unknown message type. Assertions:

  • Unknown type returns unknown_message_type

Goal: Negotiated version remains binding. Assertions:

  • After handshake, messages carry the selected version

Goal: Capability Document is formally valid. Assertions:

  • modelVersion, profile, roles, stateKeys, affordances, actions, riskLevels present

Goal: Action IDs are unique. Assertions:

  • No duplicate action.id

Goal: Used success signal kinds are declared. Assertions:

  • All success.kind values appear in successSignalKinds

Goal: Risk levels are valid. Assertions:

  • Only safe, confirm, blocked, unless a documented extension exists

Goal: Consistency between role, affordances and actions. Example: textbox with ui.enterText, button with ui.activate.


Goal: Snapshot basic structure. Assertions:

  • PageGraph contains revision, rootDocumentId, documents, elements, viewport

Goal: Revisions increase monotonically. Assertions:

  • Later snapshots and deltas never carry a lower revision than earlier ones

Goal: Deltas reference a known base revision. Assertions:

  • baseRevision points to a known previous revision
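
The two revision checks (monotonic revisions and known base revisions) can be combined into one pass over a recorded trace. This is a sketch under assumptions: revisions are taken to be numbers, equal consecutive revisions are tolerated, and `RevisionEvent` and `revisionProblems` are hypothetical names rather than part of the Web Profile.

```typescript
// Hypothetical trace entry for a published snapshot or delta.
interface RevisionEvent {
  kind: "snapshot" | "delta";
  revision: number; // assumption: revisions are numeric
  baseRevision?: number; // only meaningful for deltas
}

// Walks the trace in publication order and reports rule violations.
function revisionProblems(events: RevisionEvent[]): string[] {
  const problems: string[] = [];
  const known = new Set<number>();
  let highest = -Infinity;
  events.forEach((event, index) => {
    if (event.revision < highest) {
      problems.push(`event ${index}: revision ran backward`);
    }
    if (
      event.kind === "delta" &&
      (event.baseRevision === undefined || !known.has(event.baseRevision))
    ) {
      problems.push(`event ${index}: unknown baseRevision`);
    }
    known.add(event.revision);
    highest = Math.max(highest, event.revision);
  });
  return problems;
}
```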

Goal: Cross-origin opaque frames are correct. Assertions:

  • Opaque frames are marked as access="opaque"
  • No inner elements are falsely published

Goal: Closed Shadow DOM is handled correctly. Assertions:

  • Closed shadow contents are not published as freely accessible elements

Goal: Stable IDs take precedence. Assertions:

  • When stableId is present, it remains the canonical target identity

Goal: Core signals are publishable. Assertions:

  • At least route, dialog or status/toast signals are emitted correctly

Goal: Focus state is consistent. Assertions:

  • focus.target points to an existing element or is empty

Goal: Open Shadow DOM is modeled correctly.

Goal: Bridged cross-origin frames are correctly published as bridged.


Goal: Valid action requests are accepted or cleanly rejected. Assertions:

  • action.request leads to action.accepted or error

Goal: Target resolution via stableId. Assertions:

  • Unique stableId is resolved correctly

Goal: Ambiguity is not concealed. Assertions:

  • Ambiguous targets lead to target_ambiguous

Goal: Actionability check. Assertions:

  • Non-interactable targets lead to target_not_interactable or documented recovery

Goal: Confirmation flow. Assertions:

  • confirm requirement leads to action.confirmation.request
  • Without grant, no execution occurs
  • deny leads to cancelled or failed with an appropriate cause

Goal: Cancel behavior. Assertions:

  • action.cancel cleanly terminates an active action

Goal: Verification policy all / any / none. Assertions:

  • all requires all signals
  • any requires at least one
  • none requires no signal verification
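
The three verification policies reduce to a short decision function. The sketch below assumes each success signal has already been evaluated to a boolean; `verificationSatisfied` is a hypothetical name, and the behavior of `all` on an empty signal list (vacuously satisfied) is an assumption the text does not settle.

```typescript
type VerificationPolicy = "all" | "any" | "none";

// observed[i] is true when the i-th declared success signal was seen.
function verificationSatisfied(
  policy: VerificationPolicy,
  observed: boolean[]
): boolean {
  if (policy === "all") {
    return observed.every((seen) => seen);
  }
  if (policy === "any") {
    return observed.some((seen) => seen);
  }
  // "none": no signal verification is required.
  return true;
}
```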

Goal: User activation cases are handled correctly. Assertions:

  • An action requiring activation leads to waiting_for_user, handoff or user_activation_required

Goal: No unsafe retry. Assertions:

  • For non-idempotent actions with sideEffectState="unknown", no silent retry occurs

Goal: Action result is complete. Assertions:

  • status, verification, optionally resolvedTarget, chosenExecutionMode present

Goal: Policy document can be delivered. Assertions:

  • uiap.policy.get leads to a valid uiap.policy.document

Goal: Safe action with sufficient grant. Assertions:

  • Result is allow

Goal: Confirm risk. Assertions:

  • confirm risk leads to confirm, unless a stricter rule applies

Goal: Secret/credential without grant. Assertions:

  • Result is deny or handoff per policy, but never silent allow

Goal: Redaction rules take effect. Assertions:

  • Fields subject to redaction are masked in affected outputs

Goal: Audit behavior. Assertions:

  • For an audit-required decision, a valid audit record is produced

Goal: User activation / human actor rule. Assertions:

  • With a matching obligation, handoff or equivalent non-autonomy is enforced

Goal: Stricter rule wins. Assertions:

  • Conflicting rules are deterministically evaluated in favor of the stricter effect
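
One way to make "stricter wins" deterministic is to rank the effects and pick the highest rank among all matching rules. The ranking below is an assumption made for illustration (in particular where `handoff` sits relative to `confirm` and `deny`); the Policy Extension itself defines the real order, and `strictestEffect` is a hypothetical helper.

```typescript
type PolicyEffect = "allow" | "confirm" | "handoff" | "deny";

// Assumed strictness order, from least to most strict.
const STRICTNESS: PolicyEffect[] = ["allow", "confirm", "handoff", "deny"];

// Combines the effects of all matching rules; input order does not matter.
function strictestEffect(effects: PolicyEffect[]): PolicyEffect {
  if (effects.length === 0) {
    throw new Error("no effects to combine");
  }
  return effects.reduce((current, next) =>
    STRICTNESS.indexOf(next) > STRICTNESS.indexOf(current) ? next : current
  );
}
```

Because the result depends only on the set of effects, a conformance test can shuffle the rule order and expect the same decision every time.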

Goal: Lifecycle methods exist and function. Assertions:

  • start, stop, destroy are usable

Goal: bindElement / bindScope shape the snapshot. Assertions:

  • Bound IDs appear in the published state

Goal: Registered actions are executable. Assertions:

  • registerAction makes the action available
  • Runtime prefers the registered app action when appropriate

Goal: Policy hook takes effect. Assertions:

  • A local deny decision prevents execution

Goal: Signal emission. Assertions:

  • emitSignal publishes a valid web signal

Goal: Frame bridge validates channel and origin. Assertions:

  • Messages from an incorrect origin or incorrect channel are discarded

Goal: Overlay is purely presentational. Assertions:

  • Overlay success does not replace action.result

Goal: Workflow catalog is formally valid. Assertions:

  • initialStepId exists
  • Step IDs are unique

Goal: Starting a valid workflow. Assertions:

  • uiap.workflow.start leads to uiap.workflow.started

Goal: Missing inputs are handled correctly. Assertions:

  • collect leads to uiap.workflow.input.request when values are missing
  • input.provide is correctly accepted or rejected

Goal: Action step delegates to runtime. Assertions:

  • Workflow action leads internally to runtime execution
  • Result flows into workflow history

Goal: Policy integration. Assertions:

  • confirm leads to waiting_confirmation
  • handoff leads to waiting_user
  • deny leads to failed or defined recovery

Goal: Checkpoint/resume. Assertions:

  • Checkpoints are created at permitted points
  • resume continues correctly

Goal: Explicit handoff step. Assertions:

  • handoff step sets waiting_user

Goal: Recovery rules. Assertions:

  • Defined onError strategy is applied

Goal: Workflow completion. Assertions:

  • complete plus global success criteria lead to uiap.workflow.result

Goal: Planning phase is formally correct. Assertions:

  • Discovery plan accepts valid seeds and environment

Goal: Lifecycle from start to result. Assertions:

  • start leads to progress and result or a clean error

Goal: At least one discovery state per seed. Assertions:

  • Initial state contains PageGraph and fingerprint

Goal: Deduplication of equivalent states. Assertions:

  • Repeatedly visited identical states do not generate arbitrarily many new state nodes

Goal: Safety policy is respected. Assertions:

  • Forbidden or destructive actions are not executed autonomously

Goal: Core catalogs are present. Assertions:

  • Route, scope, element and action catalogs are produced

Goal: Candidates carry evidence and confidence. Assertions:

  • Every candidate has discoveredBy and confidence

Goal: Review queue. Assertions:

  • Ambiguities, opaque boundaries or missing success signals produce review items

Goal: Workflow candidates are synthesized.


Goal: Package manifest is valid. Assertions:

  • Exactly one package
  • Exactly one effective app manifest after resolution

Goal: Imports and aliases. Assertions:

  • Aliases are unique
  • Imported IDs are correctly namespaced

Goal: Overlay application is deterministic. Assertions:

  • Identical inputs and build context produce identical results
  • Conflicts are not silently hidden

Goal: Locale references are resolved. Assertions:

  • Missing required texts are marked as errors or review blocks

Goal: Review gates take effect. Assertions:

  • prod bundle does not violate defined review rules

Goal: Bundle compilation. Assertions:

  • Compiled bundle contains no unresolved imports/overlays
  • Build is reproducible

Goal: Conflict handling. Assertions:

  • Duplicate IDs or incompatible overlays lead to an error or explicit review warning

Goal: Discovery imports are explicit. Assertions:

  • Discovery content is only incorporated into authoring via acceptance rules

This profile is passed when the following profiles are passed:

This profile is passed when all base profiles are passed:

  • Core
  • Capabilities
  • Web Publisher
  • Web Consumer
  • Action Runtime
  • Policy
  • SDK Web
  • Workflow
  • Discovery
  • Authoring

A required test MAY be automatically retried exactly once if its first run was invalid because of an evidently external environmental disturbance.

A test that fluctuates between pass and fail without changes MUST be marked as invalid until the cause is resolved.
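
The retry and flakiness rules could be implemented along these lines. This is a sketch, not the reference harness: `runWithRetry` and `reportedStatus` are hypothetical names, and how an external disturbance is detected is left to a caller-supplied callback.

```typescript
type TestStatus = "pass" | "fail" | "skip" | "warn" | "invalid";

// Runs a case and retries at most once, and only when the first run was
// invalid because of an external disturbance.
function runWithRetry(
  run: () => TestStatus,
  wasExternalDisturbance: () => boolean
): TestStatus[] {
  const first = run();
  const history: TestStatus[] = [first];
  if (first === "invalid" && wasExternalDisturbance()) {
    history.push(run());
  }
  return history;
}

// Given all runs against an unchanged implementation, a case that shows
// both pass and fail is reported as invalid; otherwise the latest run counts.
function reportedStatus(history: TestStatus[]): TestStatus {
  let last: TestStatus | undefined;
  let sawPass = false;
  let sawFail = false;
  for (const status of history) {
    if (status === "pass") sawPass = true;
    if (status === "fail") sawFail = true;
    last = status;
  }
  if (last === undefined) {
    throw new Error("no runs recorded");
  }
  return sawPass && sawFail ? "invalid" : last;
}
```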

Every test SHOULD have a timeout. Timeout violations in required cases lead to fail or invalid, depending on the cause.

For authoring, policy and certain discovery cases, reproducibility MUST be verified when identical inputs and identical context are present.


interface ConformanceReport {
  suiteVersion: "0.1";
  generatedAt: string;
  implementation: ImplementationDescriptor;
  claims: ProfileClaim[];
  environment: ConformanceEnvironment;
  results: ProfileResult[];
  summary: ConformanceSummary;
  artifacts?: ArtifactRef[];
}

interface ConformanceEnvironment {
  harnessVersion: string;
  runnerId?: string;
  os?: string;
  browser?: string;
  nodeVersion?: string;
  metadata?: Record<string, unknown>;
}

interface ProfileResult {
  profileId: ConformanceProfileId;
  status: "pass" | "fail" | "invalid";
  testResults: TestCaseResult[];
}

interface ConformanceSummary {
  passedProfiles: string[];
  failedProfiles: string[];
  invalidProfiles: string[];
  passCount: number;
  failCount: number;
  skipCount: number;
  warnCount: number;
  invalidCount: number;
}

  • Every declared claim MUST appear in the report.
  • failedProfiles MUST also contain profiles that fail solely due to dependency violations.
  • Reports SHOULD be serializable in a machine-readable format.
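
Result aggregation and the "every claim appears" rule can be sketched as two small functions. The names `summarize` and `unreportedClaims` and the `*Like` types are hypothetical, profile IDs are treated as plain strings, and counting each test result once per profile it appears in is an assumption about how the summary counters are meant to work.

```typescript
type TestStatus = "pass" | "fail" | "skip" | "warn" | "invalid";

interface CaseResultLike {
  id: string;
  status: TestStatus;
}

interface ProfileResultLike {
  profileId: string;
  status: "pass" | "fail" | "invalid";
  testResults: CaseResultLike[];
}

interface SummaryLike {
  passedProfiles: string[];
  failedProfiles: string[];
  invalidProfiles: string[];
  passCount: number;
  failCount: number;
  skipCount: number;
  warnCount: number;
  invalidCount: number;
}

// Builds the summary block from the per-profile results.
function summarize(results: ProfileResultLike[]): SummaryLike {
  const counts: Record<TestStatus, number> = { pass: 0, fail: 0, skip: 0, warn: 0, invalid: 0 };
  const passed: string[] = [];
  const failed: string[] = [];
  const invalid: string[] = [];
  for (const profile of results) {
    if (profile.status === "pass") passed.push(profile.profileId);
    else if (profile.status === "fail") failed.push(profile.profileId);
    else invalid.push(profile.profileId);
    for (const test of profile.testResults) {
      counts[test.status] += 1;
    }
  }
  return {
    passedProfiles: passed,
    failedProfiles: failed,
    invalidProfiles: invalid,
    passCount: counts.pass,
    failCount: counts.fail,
    skipCount: counts.skip,
    warnCount: counts.warn,
    invalidCount: counts.invalid,
  };
}

// Returns the profile IDs that were claimed but have no result in the report.
function unreportedClaims(
  claims: Array<{ componentId: string; profileId: string }>,
  results: ProfileResultLike[]
): string[] {
  const reported = new Set<string>(results.map((r) => r.profileId));
  return claims.filter((c) => !reported.has(c.profileId)).map((c) => c.profileId);
}
```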

A reference conformance suite v0.1 MUST provide at least:

  • Profile definitions
  • Test case catalog
  • Reference fixtures
  • Report schema
  • Result aggregation
  • Documentation for harness execution

It SHOULD additionally provide:

  • Reference harness
  • CI-compatible runner
  • Golden artifacts
  • Example claims
  • Example reports

{
  "suiteVersion": "0.1",
  "implementation": {
    "vendor": "Acme",
    "product": "Acme Web Agent Host",
    "version": "1.2.0",
    "components": [
      {
        "id": "host",
        "roles": [
          "CoreEndpoint",
          "CapabilityProvider",
          "WebPublisher",
          "RuntimeExecutor",
          "PolicyEvaluator",
          "WebSDK"
        ],
        "uiapCore": "0.1",
        "profiles": ["[email protected]"],
        "extensions": [
          { "id": "uiap.policy", "version": "0.1" }
        ]
      }
    ]
  },
  "claims": [
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" },
    { "componentId": "host", "profileId": "[email protected]" }
  ]
}

{
  "suiteVersion": "0.1",
  "generatedAt": "2026-03-26T18:20:00Z",
  "implementation": {
    "vendor": "Acme",
    "product": "Acme Web Agent Host",
    "version": "1.2.0",
    "components": [
      {
        "id": "host",
        "roles": [
          "CoreEndpoint",
          "CapabilityProvider",
          "WebPublisher",
          "RuntimeExecutor",
          "PolicyEvaluator",
          "WebSDK"
        ]
      }
    ]
  },
  "claims": [
    { "componentId": "host", "profileId": "[email protected]" }
  ],
  "environment": {
    "harnessVersion": "0.1.0",
    "browser": "Chromium 126"
  },
  "results": [
    {
      "profileId": "[email protected]",
      "status": "pass",
      "testResults": [
        {
          "id": "CORE-SES-001",
          "status": "pass",
          "durationMs": 48,
          "assertions": [
            { "id": "a1", "status": "pass" },
            { "id": "a2", "status": "pass" }
          ]
        }
      ]
    },
    {
      "profileId": "[email protected]",
      "status": "fail",
      "testResults": [
        {
          "id": "SDK-FRAME-001",
          "status": "fail",
          "durationMs": 33,
          "assertions": [
            {
              "id": "a1",
              "status": "fail",
              "message": "Message with incorrect origin was processed."
            }
          ],
          "artifacts": [
            {
              "id": "art_17",
              "type": "message-trace",
              "note": "Frame bridge accepted untrusted origin"
            }
          ]
        }
      ]
    }
  ],
  "summary": {
    "passedProfiles": ["[email protected]"],
    "failedProfiles": ["[email protected]"],
    "invalidProfiles": [],
    "passCount": 11,
    "failCount": 1,
    "skipCount": 3,
    "warnCount": 0,
    "invalidCount": 0
  }
}

An implementation MAY only publicly claim:

If only partial areas are passed, the statement MUST be precise. In other words, do not claim “we are UIAP-compatible” when in reality only the envelope parsing works and the rest consists of wishful thinking.


Beyond this specification, three further deliverables are needed in practice before the suite is production-ready:

  1. Reference Fixture Pack v0.1 with demo web app, policy cases, workflow cases, discovery environment and authoring sample pack.

  2. Conformance Report JSON Schema v0.1 so that reports are machine-validatable.

  3. CI Runner Spec v0.1 so that teams can automatically verify claims in CI and block bundle/SDK/runtime releases.


  • [UIAP-CORE] UIAP Core v0.1
  • [UIAP-CAP] UIAP Capability Model v0.1
  • [UIAP-WEB] UIAP Web Profile v0.1
  • [UIAP-ACTION] UIAP Action Runtime v0.1
  • [UIAP-POLICY] UIAP Policy Extension v0.1
  • [UIAP-SDK] UIAP SDK API v0.1
  • [UIAP-WORKFLOW] UIAP Workflow Extension v0.1
  • [UIAP-DISCOVERY] UIAP Discovery Mapper v0.1
  • [UIAP-MANIFEST] UIAP Authoring/Manifest v0.1
  • [RFC2119] Key words for use in RFCs to Indicate Requirement Levels, BCP 14

  • Conformance tests MUST NOT be executed against production systems without explicit authorization.
  • Test fixtures MAY contain sensitive patterns; test results SHOULD be redacted before publication.
  • Conformance results SHOULD be cryptographically signed when used as evidence toward third parties.

Version  Date        Changes
0.1      2026-03-27  Initial draft