UIAP Action Runtime

UIAP Action Runtime Spec v0.1

Field	Value
Status	Draft
Version	0.1
Date	2026-03-27
Dependencies	[UIAP-CORE], [UIAP-CAP]
Editors	Patrick

1. Purpose

UIAP Action Runtime v0.1 defines how a UIAP counterpart accepts a requested Action, resolves the target, checks actionability, selects the execution mode, verifies success, and reports structured results.

The Runtime is not limited to the Web, but this document includes a normative Web binding via [email protected].

Normative Terms

The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are to be interpreted as described in RFC 2119 and BCP 14, when and only when they appear in ALL CAPS.

2. Conformance Classes

2.1 Runtime Executor

The Executor receives action.request and carries out execution.

2.2 Runtime Controller

The Controller sends action.request, action.cancel, and Confirmation responses.

2.3 Runtime Presenter

Optional. Displays Ghost Cursor, Highlights, Focus Rings, or Narration.

3. Core Principles

An Action MUST have a well-defined lifecycle.
Non-idempotent Actions MUST NOT be blindly retried.
Target resolution MUST be deterministic and verifiable.
Success MUST be verified through observable signals.
User Activation and Trusted Event boundaries MUST be explicitly modeled.
Presentation is optional and MUST NOT falsify execution outcomes.

4. Runtime States

An Action is in exactly one of these states:

RECEIVED
RESOLVING_TARGET
CHECKING_PRECONDITIONS
AWAITING_CONFIRMATION
EXECUTING
VERIFYING
WAITING_FOR_USER
RECOVERING
SUCCEEDED
FAILED
CANCELLED
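
The state list can be modeled as a string-literal union. The following is a minimal sketch, not a normative definition: the `ActionStateTracker` class is hypothetical, and treating SUCCEEDED, FAILED, and CANCELLED as terminal is an assumption, since this document lists the states without defining transitions.

```typescript
// Runtime states from Section 4, as a string-literal union.
type RuntimeState =
  | "RECEIVED"
  | "RESOLVING_TARGET"
  | "CHECKING_PRECONDITIONS"
  | "AWAITING_CONFIRMATION"
  | "EXECUTING"
  | "VERIFYING"
  | "WAITING_FOR_USER"
  | "RECOVERING"
  | "SUCCEEDED"
  | "FAILED"
  | "CANCELLED";

// Assumption: these three states are terminal. The spec does not label them as such.
const TERMINAL_STATES: ReadonlySet<RuntimeState> = new Set<RuntimeState>([
  "SUCCEEDED",
  "FAILED",
  "CANCELLED",
]);

function isTerminal(state: RuntimeState): boolean {
  return TERMINAL_STATES.has(state);
}

// Hypothetical tracker: holds exactly one state and refuses to leave a terminal one.
class ActionStateTracker {
  private current: RuntimeState = "RECEIVED";

  get state(): RuntimeState {
    return this.current;
  }

  transition(next: RuntimeState): void {
    if (isTerminal(this.current)) {
      throw new Error(`Action already finished in state ${this.current}`);
    }
    this.current = next;
  }
}
```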

5. Runtime Message Types

5.1 `action.request`

Request

interface ActionRequestPayload {
  actionId: ActionId;

  target?: ActionTarget;
  args?: Record<string, unknown>;

  preferredExecutionModes?: ExecutionMode[];
  verification?: VerificationSpec;
  presentation?: PresentationHints;

  timeoutMs?: number;
  idempotencyKey?: string;

  metadata?: Record<string, unknown>;
}

interface ActionTarget {
  ref?: TargetRef;

  expectedRole?: UIRole;
  expectedName?: string;
  expectedScopeId?: ScopeId;
  expectedDocumentId?: DocumentId;

  allowAmbiguous?: boolean;     // default false
}

interface PresentationHints {
  ghostCursor?: boolean;
  highlight?: "none" | "outline" | "spotlight";
  narration?: string;
  pace?: "instant" | "humanized";
}

Response: `action.accepted`

interface ActionAcceptedPayload {
  actionHandle: string;
  actionId: ActionId;
  status: "accepted";
}

Rules

A valid action.request MUST be answered with action.accepted or error.
actionHandle MUST be unique within the session.
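
One simple way to satisfy the uniqueness rule is a per-session counter. This is only a sketch: `SessionHandleAllocator` and `AcceptedSketch` are hypothetical names, and the `act_` prefix merely mirrors the handle shown in the example in Section 17; no handle format is mandated here.

```typescript
// Simplified stand-in for ActionAcceptedPayload (ActionId reduced to string).
interface AcceptedSketch {
  actionHandle: string;
  actionId: string;
  status: "accepted";
}

// Hypothetical allocator: one instance per session, so handles never repeat within it.
class SessionHandleAllocator {
  private next = 1;

  accept(actionId: string): AcceptedSketch {
    const actionHandle = `act_${this.next}`;
    this.next += 1;
    return { actionHandle, actionId, status: "accepted" };
  }
}
```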

5.2 `action.progress`

interface ActionProgressPayload {
  actionHandle: string;
  stage:
    | "resolving_target"
    | "checking_preconditions"
    | "awaiting_confirmation"
    | "executing"
    | "verifying"
    | "waiting_for_user"
    | "recovering";

  chosenExecutionMode?: ExecutionMode;
  resolvedTarget?: ResolvedTarget;
  note?: string;
  detail?: Record<string, unknown>;
}

5.3 `action.confirmation.request`

interface ActionConfirmationRequestPayload {
  actionHandle: string;
  actionId: ActionId;

  risk: RiskDescriptor;
  preview?: {
    summary?: string;
    target?: ResolvedTarget;
    args?: Record<string, unknown>;
  };
}

5.4 `action.confirmation.grant`

interface ActionConfirmationGrantPayload {
  actionHandle: string;
}

5.5 `action.confirmation.deny`

interface ActionConfirmationDenyPayload {
  actionHandle: string;
  reason?: string;
}

Rules

When risk.level="confirm" or policy requires it, the Executor MUST send action.confirmation.request and pause.
Without action.confirmation.grant, the Action MUST NOT proceed.
A denial MUST result in action.result.status="cancelled".
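
The three rules above can be read as a small decision function. The sketch below assumes a reduced `RiskSketch` shape (the real RiskDescriptor is defined elsewhere) and hypothetical names `ConfirmationReply`, `GateDecision`, and `confirmationGate`; any risk level other than "confirm" is illustrative only.

```typescript
// Assumed minimal shape; the full RiskDescriptor is not defined in this document.
interface RiskSketch {
  level: string;
}

// undefined means no confirmation reply has arrived yet.
type ConfirmationReply =
  | { kind: "grant" }
  | { kind: "deny"; reason?: string }
  | undefined;

type GateDecision = "proceed" | "await_confirmation" | "cancelled";

function confirmationGate(
  risk: RiskSketch,
  policyRequiresConfirmation: boolean,
  reply: ConfirmationReply,
): GateDecision {
  const needsConfirmation = risk.level === "confirm" || policyRequiresConfirmation;
  if (!needsConfirmation) return "proceed";
  // Without a grant the Action must not proceed, so the Executor stays paused.
  if (reply === undefined) return "await_confirmation";
  // A denial maps to action.result.status = "cancelled".
  return reply.kind === "grant" ? "proceed" : "cancelled";
}
```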

5.6 `action.cancel`

interface ActionCancelPayload {
  actionHandle: string;
  reason?: string;
}

5.7 `action.cancelled`

interface ActionCancelledPayload {
  actionHandle: string;
  status: "cancelled";
  reason?: string;
}

5.8 `action.result`

interface ActionResultPayload {
  actionHandle: string;
  actionId: ActionId;

  status: "succeeded" | "failed" | "cancelled";

  chosenExecutionMode?: ExecutionMode;
  resolvedTarget?: ResolvedTarget;

  verification: VerificationOutcome;
  sideEffectState?: "none" | "applied" | "unknown";

  stateRevision?: RevisionId;
  returnValue?: Record<string, unknown>;
  error?: RuntimeErrorDescriptor;

  metadata?: Record<string, unknown>;
}

6. Runtime Helper Types

6.1 ResolvedTarget

interface ResolvedTarget {
  by: "stableId" | "instanceId" | "semantic" | "annotation" | "runtimeHint";
  instanceId: ElementInstanceId;
  stableId?: StableId;
  documentId: DocumentId;
  scopeId?: ScopeId;
  role: UIRole;
  name?: string;
  bbox?: DOMRectLike;
}

6.2 VerificationSpec

interface VerificationSpec {
  policy?: "capability-default" | "any" | "all" | "none";
  signals?: SuccessSignal[];
  timeoutMs?: number;
  requireRevisionAdvance?: boolean;
}

6.3 VerificationOutcome

interface VerificationOutcome {
  passed: boolean;
  policy: "capability-default" | "any" | "all" | "none";
  observed: SuccessSignal[];
  missing?: SuccessSignal[];
  timeoutMs?: number;
}

6.4 RuntimeErrorDescriptor

type RuntimeErrorCode =
  | "action_unsupported"
  | "target_required"
  | "target_not_found"
  | "target_ambiguous"
  | "stale_target"
  | "target_not_interactable"
  | "confirmation_denied"
  | "user_activation_required"
  | "cross_origin_unavailable"
  | "closed_shadow_unavailable"
  | "execution_mode_unavailable"
  | "verification_failed"
  | "unsafe_retry_refused"
  | "cancelled"
  | "internal_runtime_error";

interface RuntimeErrorDescriptor {
  code: RuntimeErrorCode;
  message: string;
  retryable?: boolean;
  detail?: Record<string, unknown>;
}

7. Recommended Execution Order

The Runtime SHOULD use the following mode fallback by default:

appAction
semanticUi
externalDriver
inputSynthesis
visionAssist

Rationale

appAction is the most stable from a domain perspective.
semanticUi leverages existing Web semantics.
externalDriver can provide real browser-level control.
inputSynthesis in the page context is often only partially trustworthy.
visionAssist is expensive and more fragile.

The specific order MAY be adjusted per app or policy.
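
The default fallback can be sketched as an ordered lookup. This example assumes that `preferredExecutionModes` from the request, when present, are tried first in the given order before the default order; the document does not state that precedence explicitly. `ExecutionModeName` and `chooseExecutionMode` are hypothetical names.

```typescript
type ExecutionModeName =
  | "appAction"
  | "semanticUi"
  | "externalDriver"
  | "inputSynthesis"
  | "visionAssist";

// Default fallback order from Section 7.
const DEFAULT_MODE_ORDER: readonly ExecutionModeName[] = [
  "appAction",
  "semanticUi",
  "externalDriver",
  "inputSynthesis",
  "visionAssist",
];

// Returns the first available mode, or undefined when none is available
// (a caller would then report execution_mode_unavailable).
function chooseExecutionMode(
  available: ReadonlySet<ExecutionModeName>,
  preferred: readonly ExecutionModeName[] = [],
): ExecutionModeName | undefined {
  // Assumption: preferred modes take precedence over the default order.
  const order = [...preferred, ...DEFAULT_MODE_ORDER];
  return order.find((mode) => available.has(mode));
}
```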

8. Target Resolution

8.1 Resolution Order

An Executor MUST resolve targets in this order:

TargetRef.by = "stableId"
TargetRef.by = "semantic" within expected scopes/documents
App-specific annotations such as meaning or defaultAction
Local runtime hints (css, xpath) only as a last technical fallback

8.2 Disambiguation

When multiple candidates are found, the Executor MUST compute a score based on:

Scope match
Role match
Name match
StableId match
Visual proximity to the current focus
Declared default action

If no clear winner emerges, target_ambiguous MUST be returned.
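
A scoring pass over the candidates might look like the sketch below. The weights, the `MIN_MARGIN` threshold for a "clear winner", and the `CandidateSketch` fields are all assumptions made for illustration; the document names the scoring criteria but does not assign values to them.

```typescript
// Hypothetical per-candidate match flags, one per criterion in Section 8.2.
interface CandidateSketch {
  id: string;
  scopeMatch: boolean;
  roleMatch: boolean;
  nameMatch: boolean;
  stableIdMatch: boolean;
  nearFocus: boolean;
  isDefaultAction: boolean;
}

// Assumed weights; an implementation would tune these.
const WEIGHTS = {
  stableIdMatch: 8,
  scopeMatch: 4,
  roleMatch: 3,
  nameMatch: 3,
  isDefaultAction: 2,
  nearFocus: 1,
} as const;

// Assumed minimum lead the best candidate needs over the runner-up.
const MIN_MARGIN = 2;

function scoreCandidate(c: CandidateSketch): number {
  let score = 0;
  if (c.stableIdMatch) score += WEIGHTS.stableIdMatch;
  if (c.scopeMatch) score += WEIGHTS.scopeMatch;
  if (c.roleMatch) score += WEIGHTS.roleMatch;
  if (c.nameMatch) score += WEIGHTS.nameMatch;
  if (c.isDefaultAction) score += WEIGHTS.isDefaultAction;
  if (c.nearFocus) score += WEIGHTS.nearFocus;
  return score;
}

type Disambiguation =
  | { ok: true; winnerId: string }
  | { ok: false; error: "target_not_found" | "target_ambiguous" };

function disambiguate(candidates: readonly CandidateSketch[]): Disambiguation {
  if (candidates.length === 0) return { ok: false, error: "target_not_found" };
  const ranked = candidates
    .map((c) => ({ id: c.id, score: scoreCandidate(c) }))
    .sort((a, b) => b.score - a.score);
  // No clear winner: the top two scores are too close together.
  if (ranked.length > 1 && ranked[0].score - ranked[1].score < MIN_MARGIN) {
    return { ok: false, error: "target_ambiguous" };
  }
  return { ok: true, winnerId: ranked[0].id };
}
```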

8.3 Stale Targets

If an already-resolved target is no longer attached or no longer consistent before execution, the Runtime MUST re-resolve once. If that also fails, stale_target or target_not_found MUST be returned.
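
The "re-resolve once" rule can be sketched as follows. `TargetHandleSketch`, `ensureFreshTarget`, and the `resolve` callback are hypothetical, and the choice of which error applies in which case (nothing found versus found but still detached) is an assumption, since the text permits either code.

```typescript
// Hypothetical reduced view of a resolved target.
interface TargetHandleSketch {
  instanceId: string;
  attached: boolean;
}

type StaleOutcome =
  | { ok: true; target: TargetHandleSketch }
  | { ok: false; error: "stale_target" | "target_not_found" };

// resolve is a caller-supplied function that runs target resolution again.
function ensureFreshTarget(
  current: TargetHandleSketch,
  resolve: () => TargetHandleSketch | undefined,
): StaleOutcome {
  if (current.attached) return { ok: true, target: current };
  // Exactly one re-resolution attempt, never a loop.
  const retry = resolve();
  if (retry === undefined) return { ok: false, error: "target_not_found" };
  if (!retry.attached) return { ok: false, error: "stale_target" };
  return { ok: true, target: retry };
}
```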

9. Precondition and Actionability Checks

Playwright does not treat robust interaction as “just click on it” but checks states such as visibility, stability, event receivability, and enabled status before performing actions. UIAP explicitly adopts this mental model because browser UIs would otherwise start acting like offended prima donnas on every re-render. scrollIntoView() is the standard mechanism to bring targets into view. ([Playwright][4])

Before execution, the Executor MUST check at minimum:

9.1 General

Session is active
Action is supported
Policy permits the Action
Target is present
Target does not belong to an opaque region

9.2 For Pointer-like Actions (`ui.activate`, `ui.toggle`, `ui.choose`)

attached = true
visible = true
enabled != false
blocked != true
stable != false
obscured != true or a safe recovery exists
Target is in the viewport or can be scrolled into view

9.3 For Text Input (`ui.enterText`, `ui.clearText`, `ui.setValue`)

editable = true or role is textually editable
readonly != true
enabled != false
Target can be focused or is already focused

9.4 For `nav.navigate`

Target route or link must be present or directly executable

If a check fails, either recovery MUST be attempted or an error MUST be returned.
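
As an illustration of the pointer-like checks in 9.2, the sketch below collects the names of failed checks. The `PointerStateSketch` field names are assumptions that mirror the list; note that conditions written as `!= false` or `!= true` let an unknown (undefined) value pass, while `= true` conditions do not.

```typescript
// Hypothetical snapshot of a target's actionability state.
interface PointerStateSketch {
  attached: boolean;
  visible: boolean;
  enabled?: boolean;
  blocked?: boolean;
  stable?: boolean;
  obscured?: boolean;
  safeRecoveryAvailable?: boolean;
  inViewport: boolean;
  canScrollIntoView: boolean;
}

// Returns the names of failed checks; an empty array means the target is actionable.
function pointerActionabilityFailures(s: PointerStateSketch): string[] {
  const failures: string[] = [];
  if (!s.attached) failures.push("attached");
  if (!s.visible) failures.push("visible");
  if (s.enabled === false) failures.push("enabled");
  if (s.blocked === true) failures.push("blocked");
  if (s.stable === false) failures.push("stable");
  // An obscured target is tolerated only when a safe recovery exists.
  if (s.obscured === true && !s.safeRecoveryAvailable) failures.push("obscured");
  if (!s.inViewport && !s.canScrollIntoView) failures.push("viewport");
  return failures;
}
```

A caller would map a non-empty result either to a recovery attempt (for example scrolling into view) or to target_not_interactable.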

10. Execution Modes

10.1 `appAction`

Direct domain-level execution via an App Registry.

Rules

SHOULD be preferred when available.
MUST respect the domain-level semantics of the Action.
MAY be executed without a concrete DOM target if the Action permits it.

10.2 `semanticUi`

Semantic execution in the Web context.

Rules

SHOULD prefer platform methods such as focus(), click(), scrollIntoView(), or control-specific setters.
SHOULD only produce synthetic event sequences when no semantically richer method exists.
MUST NOT rely on raw dispatchEvent() tricks as the default approach.

HTMLElement.click() simulates a click and fires the click event, provided the element is not disabled. Events created via dispatchEvent(), however, are not isTrusted, and features requiring activation demand trusted triggering input events according to the platform. Therefore, the Runtime MUST treat user_activation_required as a first-class case instead of pretending that JavaScript alone can conjure a genuine user interaction. ([MDN Web Docs][5])

10.3 `externalDriver`

Execution via external browser control.

Rules

MAY be used when an out-of-process driver is available.
MUST adhere to the same Action lifecycle and Verification rules.
WebDriver BiDi or equivalent mechanisms MAY be used as implementations.

10.4 `inputSynthesis`

Low-level input synthesis.

Rules

MAY be used when semanticUi is insufficient.
An in-page Executor MUST treat this mode as not user-activation-safe.
This mode SHOULD NOT be used for security-critical or browser-gated flows.

10.5 `visionAssist`

Visual fallback.

Rules

MUST NOT be used unless other modes are unavailable or unsuccessful.
MUST document the chosen target point and subsequent verification.

11. Semantics of Primitive Actions

11.1 `ui.read`

Reads the state, text, value, or description of a target.

Result

returnValue SHOULD carry the read state or content.

11.2 `ui.focus`

Focuses the target.

Success

focus.target == instanceId

11.3 `ui.highlight`

Presentation-only action.

Success

Need not change the Web state.
verification.policy="none" is permissible.

11.4 `ui.activate`

Activates a target.

Web Binding

Prefer appAction
otherwise semanticUi with scrollIntoView() and activation
other modes only when necessary

Success

capability-default or requested Success Signals

11.5 `ui.enterText`

Writes text into an editable field.

Web Binding

Focus the target
Optionally clear the existing value
Set the value
Trigger app-compatible input semantics
Verify the resulting value

Success

value.equals or element.state.textValue
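
The Web Binding steps can be sketched against an abstract field. `EditableFieldSketch` is a hypothetical abstraction, not a DOM type, and `notifyInput()` stands in for whatever "app-compatible input semantics" means for a given framework. Appending to the existing value when it is not cleared first is also an assumption.

```typescript
// Hypothetical abstraction over an editable control.
interface EditableFieldSketch {
  focus(): void;
  getValue(): string;
  setValue(next: string): void;
  notifyInput(): void;
}

interface EnterTextOutcome {
  passed: boolean;
  finalValue: string;
}

function enterText(
  field: EditableFieldSketch,
  text: string,
  clearFirst: boolean,
): EnterTextOutcome {
  field.focus();                                   // 1. focus the target
  const baseValue = clearFirst ? "" : field.getValue();
  if (clearFirst) field.setValue("");              // 2. optionally clear
  const expected = baseValue + text;
  field.setValue(expected);                        // 3. set the value
  field.notifyInput();                             // 4. trigger input semantics
  const finalValue = field.getValue();             // 5. verify the result
  return { passed: finalValue === expected, finalValue };
}
```

Reading the value back in the last step matters because the control itself may alter the input, for example by truncating it to a maximum length.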

11.6 `ui.clearText`

Sets textual content to empty.

11.7 `ui.choose`

Selects an option in combobox, listbox, select, or similar controls.

11.8 `ui.toggle`

Changes a binary or tri-state condition.

11.9 `ui.expand` / `ui.collapse`

Opens or closes expandable targets.

11.10 `ui.open` / `ui.close`

Opens or closes modal or temporary surfaces.

11.11 `ui.scrollIntoView`

Brings a target into the visible area.

11.12 `ui.scroll`

Scrolls an element or the viewport.

11.13 `ui.submit`

Completes a form-like interaction.

11.14 `nav.navigate`

Changes the route or URL.

11.15 `app.invoke`

Executes a declared Domain Action.

12. Verification

12.1 Default Behavior

When verification is not set, the Runtime MUST use:

success signals from the ActionDescriptor, if present
otherwise success signals of the target, if present
otherwise a profile-typical minimal verification

12.2 Minimal Verification on the Web

For the Web, minimal verification SHOULD look like this:

For ui.focus: Focus is on the target
For ui.enterText: Target value matches the expected value
For ui.activate: At least one plausible state change, e.g., route change, dialog transition, toast, state delta
For nav.navigate: Route or URL has changed
For ui.highlight: No state verification required

12.3 Verification Policies

none: No formal verification
any: At least one signal must occur
all: All signals must occur
capability-default: App- or target-side defaults
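
The `none`, `any`, and `all` policies can be evaluated with a few lines of code. In this sketch `capability-default` is assumed to have been resolved to a concrete policy and signal list beforehand, and two signals are assumed to match when their JSON serializations are identical (which depends on property order); both are simplifications, and all type names are hypothetical.

```typescript
type ConcretePolicy = "none" | "any" | "all";

// Reduced stand-in for SuccessSignal, which is defined elsewhere.
interface SignalSketch {
  kind: string;
  [key: string]: unknown;
}

interface OutcomeSketch {
  passed: boolean;
  policy: ConcretePolicy;
  observed: SignalSketch[];
  missing: SignalSketch[];
}

function evaluateVerification(
  policy: ConcretePolicy,
  expected: readonly SignalSketch[],
  seen: readonly SignalSketch[],
): OutcomeSketch {
  // Assumption: signals are equal when their JSON serializations are equal.
  const seenKeys = new Set(seen.map((s) => JSON.stringify(s)));
  const observed = expected.filter((s) => seenKeys.has(JSON.stringify(s)));
  const missing = expected.filter((s) => !seenKeys.has(JSON.stringify(s)));

  let passed: boolean;
  if (policy === "none") {
    passed = true;                    // no formal verification
  } else if (policy === "all") {
    passed = missing.length === 0;    // every expected signal occurred
  } else {
    passed = observed.length > 0;     // at least one expected signal occurred
  }
  return { passed, policy, observed, missing };
}
```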

12.4 Revision Advance

When requireRevisionAdvance=true, at least one new PageGraph revision MUST be observed before success is declared.

13. Recovery

The Runtime MAY attempt recovery but MUST remain cautious.

13.1 Permissible Automatic Recovery

Re-resolve on stale target
scrollIntoView on offscreen target
Short re-waits during transient busy/loading states
Re-verification after a documented UI update

13.2 Impermissible Automatic Recovery

Re-triggering non-idempotent Actions without reliable knowledge of the side-effect status
Skipping required confirmations
Accessing opaque cross-origin or closed shadow regions

13.3 Side-Effect Status

On failure, sideEffectState MUST be set to one of:

none
applied
unknown

For non-idempotent Actions, unknown MUST be treated as a hard warning signal.
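
A retry decision that respects these rules could be sketched as below. The function name and the mapping are assumptions: idempotent Actions are treated as always retryable, and a non-idempotent Action is retried only when it is reliably known that no side effect was applied.

```typescript
type SideEffectStateSketch = "none" | "applied" | "unknown";
type RetryDecision = "retry" | "unsafe_retry_refused";

function decideRetry(
  idempotent: boolean,
  sideEffectState: SideEffectStateSketch,
): RetryDecision {
  // Repeating an idempotent Action cannot duplicate its effect.
  if (idempotent) return "retry";
  // Non-idempotent: retry only when nothing was applied.
  if (sideEffectState === "none") return "retry";
  // "applied" would duplicate the effect; "unknown" is a hard warning signal.
  return "unsafe_retry_refused";
}
```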

14. User Handoff and Activation

The Web intentionally restricts APIs that could cause poor user experiences or abuse; some features only work with active or previously completed User Activation, and the triggering input events must be trusted. Therefore, UIAP MUST be able to cleanly pause an Action and report waiting_for_user or user_activation_required, rather than trying to circumvent browser security boundaries with script trickery. ([MDN Web Docs][2])

When an Action fails due to this, the Executor MUST:

Send action.progress with stage="waiting_for_user",
Publish a comprehensible note text,
Resume after actual user interaction or fail cleanly.

15. Runtime Error Rules

15.1 `target_not_found`

No matching target found.

15.2 `target_ambiguous`

Multiple plausible targets, no clear winner.

15.3 `target_not_interactable`

Target exists but is not actionable.

15.4 `user_activation_required`

Execution requires genuine user interaction.

15.5 `cross_origin_unavailable`

Target resides in an opaque cross-origin region.

15.6 `closed_shadow_unavailable`

Target resides in a closed Shadow Root without a bridge.

15.7 `verification_failed`

Execution ran but expected signals did not occur.

15.8 `unsafe_retry_refused`

Retry would be impermissible for domain or security reasons.

16. Minimal Runtime Conformance Scope

A conforming Runtime Executor MUST:

Accept action.request,
Support target resolution from stableId and semantic targets,
Perform precondition checks,
Support at least appAction or semanticUi,
Support Confirmation Flows,
Verify Success Signals,
Send action.result with verification and optionally error.

17. Example: `ui.activate` on a Submit Button

Request

{
  "uiap": "0.1",
  "kind": "request",
  "type": "action.request",
  "id": "msg_77",
  "sessionId": "sess_123",
  "ts": "2026-03-26T14:03:00.000Z",
  "source": { "role": "agent", "id": "onboarding-agent" },
  "payload": {
    "actionId": "ui.activate",
    "target": {
      "ref": { "by": "stableId", "value": "video.submit" },
      "expectedRole": "button",
      "expectedName": "Video erstellen"
    },
    "verification": {
      "policy": "all",
      "signals": [
        { "kind": "route.changed", "pattern": "/videos/:id" },
        { "kind": "toast.contains", "text": "erstellt" }
      ],
      "timeoutMs": 8000,
      "requireRevisionAdvance": true
    },
    "presentation": {
      "ghostCursor": true,
      "highlight": "spotlight",
      "pace": "humanized"
    },
    "timeoutMs": 12000
  }
}

Immediate Response

{
  "uiap": "0.1",
  "kind": "response",
  "type": "action.accepted",
  "id": "msg_78",
  "correlationId": "msg_77",
  "sessionId": "sess_123",
  "ts": "2026-03-26T14:03:00.015Z",
  "source": { "role": "bridge", "id": "web-runtime" },
  "payload": {
    "actionHandle": "act_991",
    "actionId": "ui.activate",
    "status": "accepted"
  }
}

Progress

{
  "uiap": "0.1",
  "kind": "event",
  "type": "action.progress",
  "id": "msg_79",
  "sessionId": "sess_123",
  "ts": "2026-03-26T14:03:00.020Z",
  "source": { "role": "bridge", "id": "web-runtime" },
  "payload": {
    "actionHandle": "act_991",
    "stage": "resolving_target",
    "resolvedTarget": {
      "by": "stableId",
      "instanceId": "el_submit",
      "stableId": "video.submit",
      "documentId": "doc_root",
      "scopeId": "scope_form",
      "role": "button",
      "name": "Video erstellen",
      "bbox": { "x": 120, "y": 420, "width": 180, "height": 40 }
    }
  }
}

Result

{
  "uiap": "0.1",
  "kind": "event",
  "type": "action.result",
  "id": "msg_80",
  "sessionId": "sess_123",
  "ts": "2026-03-26T14:03:01.220Z",
  "source": { "role": "bridge", "id": "web-runtime" },
  "payload": {
    "actionHandle": "act_991",
    "actionId": "ui.activate",
    "status": "succeeded",
    "chosenExecutionMode": "semanticUi",
    "resolvedTarget": {
      "by": "stableId",
      "instanceId": "el_submit",
      "stableId": "video.submit",
      "documentId": "doc_root",
      "scopeId": "scope_form",
      "role": "button",
      "name": "Video erstellen"
    },
    "verification": {
      "passed": true,
      "policy": "all",
      "observed": [
        { "kind": "route.changed", "pattern": "/videos/:id" },
        { "kind": "toast.contains", "text": "erstellt" }
      ],
      "timeoutMs": 8000
    },
    "sideEffectState": "applied",
    "stateRevision": "rev_20"
  }
}

Normative References

[UIAP-CORE] UIAP Core v0.1
[UIAP-CAP] UIAP Capability Model v0.1
[RFC2119] Key words for use in RFCs to Indicate Requirement Levels, BCP 14

Security Considerations

Action handlers MUST validate input parameters before executing domain operations.
sideEffectState MUST be reported correctly to make unintended state changes traceable.
Non-idempotent Actions SHOULD NOT be retried automatically.
The Confirmation flow MUST be tamper-proof; an action.confirmation.grant MUST NOT be accepted from an unauthorized source.

Changelog

Version	Date	Changes
0.1	2026-03-27	Initial draft
