UIAP Capability Model

Field         Value
Status        Draft
Version       0.1
Date          2026-03-27
Dependencies
Editors       Patrick

The Capability Model describes what an application can semantically present, observe, and execute.

It cleanly separates:

  • Roles: what a UI entity is
  • States: what state it is in
  • Affordances: which interactions it fundamentally offers
  • Actions: which standardized requests the agent can send
  • Risk: how strictly an action must be regulated
  • Success Signals: how success can be observed

This is the difference between “there is some button” and “this is video.submit, activatable, requires confirmation, success = route changes and a toast appears”.


  1. Capability values SHOULD be stable and machine-readable.
  2. Core values are reserved and carry no prefix.
  3. Vendor- or app-specific extensions SHOULD begin with x., e.g. x.videoland.asset-card.
  4. A capability document MAY support only a subset of the core vocabulary; the actually supported values MUST be explicitly declared.
  5. Roles and affordances describe semantics, not concrete DOM structures or CSS.
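The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the model: the helper name `classifyCapabilityValue`, the result labels, and the exact extension pattern (lowercase segments with optional hyphens) are assumptions made for illustration.

```typescript
// Hypothetical helper: sorts a capability value into core, extension, or unprefixed.
type ValueClass = "core" | "extension" | "unprefixed";

// Assumption: extension values look like "x.<segment>(.<segment>)*",
// where each segment is lowercase alphanumeric with optional hyphens.
const EXTENSION_PATTERN = /^x(\.[a-z0-9][a-z0-9-]*)+$/;

function classifyCapabilityValue(
  value: string,
  coreVocabulary: ReadonlySet<string>,
): ValueClass {
  if (coreVocabulary.has(value)) return "core";
  if (EXTENSION_PATTERN.test(value)) return "extension";
  // Rule 3 is a SHOULD, so an unprefixed non-core value is flagged, not rejected.
  return "unprefixed";
}

// A small subset of core roles, for illustration only.
const coreRoles: ReadonlySet<string> = new Set(["button", "dialog", "textbox"]);

const classified = ["button", "x.videoland.asset-card", "asset-card"].map((v) =>
  classifyCapabilityValue(v, coreRoles),
);
```

Here `classified` holds one label per input, so a validator could warn on every "unprefixed" entry while still accepting the document.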

type ProfileId = string; // e.g. "[email protected]"
type StableId = string; // stable, app-defined target ID
type ScopeId = string; // logical scope, e.g. "create-video-dialog"
type ActionId = string; // e.g. "ui.activate" or "video.create"
type TargetRef =
| { by: "stableId"; value: StableId }
| { by: "scope"; value: ScopeId }
| { by: "route"; value: string }
| { by: "semantic"; role: UIRole; name?: string; scope?: ScopeId }
| { by: "custom"; value: string };
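To illustrate how the TargetRef variants might be resolved, here is a hedged sketch against an in-memory list of nodes. The `UINode` record and `resolveTargets` function are hypothetical and exist only for this example; the treatment of `custom` references is left open because the model does not define it.

```typescript
type StableId = string;
type ScopeId = string;
type UIRole = "button" | "textbox" | "dialog" | "custom"; // subset of the core roles, for brevity

type TargetRef =
  | { by: "stableId"; value: StableId }
  | { by: "scope"; value: ScopeId }
  | { by: "route"; value: string }
  | { by: "semantic"; role: UIRole; name?: string; scope?: ScopeId }
  | { by: "custom"; value: string };

// Hypothetical node record; the model itself does not prescribe this shape.
interface UINode {
  stableId: StableId;
  role: UIRole;
  name: string;
  scope: ScopeId;
  route: string;
}

function resolveTargets(ref: TargetRef, nodes: UINode[]): UINode[] {
  switch (ref.by) {
    case "stableId": {
      const id = ref.value;
      return nodes.filter((n) => n.stableId === id);
    }
    case "scope": {
      const scope = ref.value;
      return nodes.filter((n) => n.scope === scope);
    }
    case "route": {
      const route = ref.value;
      return nodes.filter((n) => n.route === route);
    }
    case "semantic": {
      // Optional fields narrow the match only when they are present.
      const { role, name, scope } = ref;
      return nodes.filter(
        (n) =>
          n.role === role &&
          (name === undefined || n.name === name) &&
          (scope === undefined || n.scope === scope),
      );
    }
    case "custom":
      return []; // app-defined; this sketch does not interpret custom references
  }
}

const nodes: UINode[] = [
  { stableId: "video.submit", role: "button", name: "Create", scope: "create-video-dialog", route: "/videos/new" },
  { stableId: "video.title", role: "textbox", name: "Title", scope: "create-video-dialog", route: "/videos/new" },
  { stableId: "nav.help", role: "button", name: "Help", scope: "main-toolbar", route: "/videos/new" },
];

const byId = resolveTargets({ by: "stableId", value: "video.submit" }, nodes);
const dialogButtons = resolveTargets(
  { by: "semantic", role: "button", scope: "create-video-dialog" },
  nodes,
);
```

A semantic reference can match several nodes, which is why the sketch returns an array rather than a single node; how an agent handles ambiguity is outside the scope of this example.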

UIRole is the canonical classification of visible or interactive UI objects.

type UIRole =
// App / Structure
| "app"
| "route"
| "region"
| "group"
| "form"
| "dialog"
| "drawer"
| "popover"
| "tabpanel"
// Navigation / Structured Selection
| "link"
| "menu"
| "menuitem"
| "tablist"
| "tab"
| "toolbar"
| "breadcrumb"
| "pagination"
// Collections
| "list"
| "listitem"
| "table"
| "row"
| "cell"
| "grid"
| "tree"
| "treeitem"
// Inputs / Controls
| "button"
| "textbox"
| "textarea"
| "searchbox"
| "combobox"
| "listbox"
| "option"
| "checkbox"
| "radio"
| "switch"
| "slider"
| "spinbutton"
| "datepicker"
| "timepicker"
| "fileinput"
| "label"
// Feedback / Status
| "alert"
| "toast"
| "status"
| "progress"
| "spinner"
// Media / Special Cases
| "image"
| "video"
| "audio"
| "canvas"
// Escape hatch
| "custom";

  • button: activatable command
  • textbox / textarea: editable text input
  • checkbox / radio / switch: discrete state selection
  • combobox / listbox / option: selection from candidates
  • dialog / drawer / popover: temporary UI surfaces
  • toast / alert / status: observable feedback signals
  • custom: only when no core role fits

UIState describes the observable state of an element or scope.

interface UIState {
  visible?: boolean;
  enabled?: boolean;
  focusable?: boolean;
  focused?: boolean;
  editable?: boolean;
  readonly?: boolean;
  required?: boolean;
  busy?: boolean;
  loading?: boolean;
  blocked?: boolean;
  selected?: boolean;
  checked?: boolean | "mixed";
  pressed?: boolean;
  expanded?: boolean;
  open?: boolean;
  modal?: boolean;
  hovered?: boolean;
  dragged?: boolean;
  current?: false | "page" | "step" | "location" | "date" | "time";
  invalid?: false | true | "grammar" | "spelling";
  multiline?: boolean;
  hasPopup?: false | "menu" | "listbox" | "dialog" | "grid" | "tree";
  orientation?: "horizontal" | "vertical";
  textValue?: string;
  numericValue?: number;
  min?: number;
  max?: number;
  step?: number;
  placeholder?: string;
  description?: string;
  sensitive?: boolean;
  obscured?: boolean;
}

A capability document SHOULD additionally declare which state keys are actively emitted.

type UIStateKey = keyof UIState;

  • visible=false does not automatically imply enabled=false.
  • blocked=true means: formally present but currently not usable.
  • sensitive=true marks data or controls that require special policy treatment.
  • obscured=true marks content that is covered or intentionally masked.
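The notes above suggest that state keys are evaluated independently of one another. A minimal sketch of how an agent might interpret them follows; the helper names and the policy that an omitted key counts as "not reported, therefore not a blocker" are assumptions, not part of the model.

```typescript
// Subset of UIState relevant to this sketch.
interface UIState {
  visible?: boolean;
  enabled?: boolean;
  blocked?: boolean;
  sensitive?: boolean;
  textValue?: string;
}

// Assumption: a key that is not emitted is treated as non-blocking.
// Each key is checked on its own, because visible=false does not imply enabled=false.
function isUsable(state: UIState): boolean {
  return state.visible !== false && state.enabled !== false && state.blocked !== true;
}

// Assumption: "special policy treatment" for sensitive values is modeled here
// as withholding the text value entirely.
function readableText(state: UIState): string | undefined {
  return state.sensitive === true ? undefined : state.textValue;
}

const presentButBlocked: UIState = { visible: true, enabled: true, blocked: true };
const hiddenButEnabled: UIState = { visible: false, enabled: true };
const passwordField: UIState = { visible: true, sensitive: true, textValue: "hunter2" };
```

With these helpers, `presentButBlocked` is reported as not usable even though it is visible and enabled, which matches the meaning of blocked=true given above.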

Affordances describe which type of interaction an element fundamentally offers.

type UIAffordance =
| "read"
| "focus"
| "activate"
| "edit"
| "choose"
| "toggle"
| "expand"
| "collapse"
| "open"
| "close"
| "scroll"
| "submit"
| "dismiss"
| "drag"
| "drop"
| "resize"
| "upload"
| "navigate"
| "invoke";

  • activate: generic activation, e.g. button, link, card
  • edit: text or value can be modified
  • choose: selection from options
  • toggle: binary or tri-state switch
  • invoke: domain-level app action beyond a simple click

  • Affordance: what an element can offer in principle
  • Action: a standardized request sent by the agent

A button typically has the activate affordance. The agent then sends e.g. ui.activate.
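The relationship between affordances and actions can be sketched as a simple pre-flight check: before sending an action, the agent compares the action's required affordances with those the target offers. The function name `missingAffordances` and the trimmed-down types are illustrative assumptions.

```typescript
type UIAffordance = "read" | "focus" | "activate" | "edit" | "submit";

// Only the fields needed for this check; a full descriptor has more.
interface ActionRequirement {
  id: string;
  requiredAffordances?: UIAffordance[];
}

// Returns the affordances the action needs but the target does not offer.
function missingAffordances(
  action: ActionRequirement,
  offered: readonly UIAffordance[],
): UIAffordance[] {
  return (action.requiredAffordances ?? []).filter((a) => !offered.includes(a));
}

const buttonAffordances: UIAffordance[] = ["read", "focus", "activate"];

const activate: ActionRequirement = { id: "ui.activate", requiredAffordances: ["activate"] };
const enterText: ActionRequirement = { id: "ui.enterText", requiredAffordances: ["edit"] };

const okForButton = missingAffordances(activate, buttonAffordances); // nothing missing
const notForButton = missingAffordances(enterText, buttonAffordances); // "edit" is missing
```

An empty result means the action is compatible with the target; a non-empty one tells the agent which interaction the element does not support.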


Actions are standardized executable operations.

type PrimitiveActionType =
| "ui.read"
| "ui.focus"
| "ui.highlight"
| "ui.hover"
| "ui.activate"
| "ui.enterText"
| "ui.clearText"
| "ui.setValue"
| "ui.choose"
| "ui.toggle"
| "ui.expand"
| "ui.collapse"
| "ui.open"
| "ui.close"
| "ui.scrollIntoView"
| "ui.scroll"
| "ui.submit"
| "ui.upload"
| "nav.navigate"
| "app.invoke";

type ExecutionMode =
| "appAction" // direct app function / registry
| "semanticUi" // semantic UI target addressing
| "inputSynthesis" // synthetic pointer/keyboard input
| "externalDriver" // external automation driver
| "visionAssist"; // vision-assisted fallback

interface ActionDescriptor {
  id: ActionId; // "ui.activate" or "video.create"
  kind: "primitive" | "domain";
  title?: string;
  description?: string;
  targetKinds: Array<"element" | "scope" | "route" | "entity" | "session" | "none">;
  requiredAffordances?: UIAffordance[];
  executionModes: ExecutionMode[];
  args?: ActionArgDescriptor[];
  idempotency?: "idempotent" | "conditional" | "non_idempotent";
  risk: RiskDescriptor;
  success?: SuccessSignal[];
  metadata?: Record<string, unknown>;
}

interface ActionArgDescriptor {
  name: string;
  type: "string" | "number" | "boolean" | "enum" | "object" | "array";
  required?: boolean;
  enum?: string[];
  description?: string;
}
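As an illustration of how ActionArgDescriptor could be used at runtime, the sketch below validates a bag of arguments against a list of descriptors. The function names are hypothetical, and rejecting undeclared arguments is an assumption of this example rather than a rule stated by the model.

```typescript
interface ActionArgDescriptor {
  name: string;
  type: "string" | "number" | "boolean" | "enum" | "object" | "array";
  required?: boolean;
  enum?: string[];
  description?: string;
}

// Checks a single value against the declared argument type.
function matchesType(d: ActionArgDescriptor, value: unknown): boolean {
  switch (d.type) {
    case "string":
      return typeof value === "string";
    case "number":
      return typeof value === "number" && Number.isFinite(value);
    case "boolean":
      return typeof value === "boolean";
    case "enum":
      return typeof value === "string" && (d.enum ?? []).includes(value);
    case "array":
      return Array.isArray(value);
    case "object":
      return typeof value === "object" && value !== null && !Array.isArray(value);
  }
}

// Returns a list of human-readable problems; an empty list means the args are acceptable.
function validateArgs(
  descriptors: ActionArgDescriptor[],
  args: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const d of descriptors) {
    const value = args[d.name];
    if (value === undefined) {
      if (d.required) errors.push(`missing required argument "${d.name}"`);
      continue;
    }
    if (!matchesType(d, value)) errors.push(`argument "${d.name}" is not of type ${d.type}`);
  }
  // Assumption: arguments that are not declared are reported as errors.
  const known = new Set(descriptors.map((d) => d.name));
  for (const key of Object.keys(args)) {
    if (!known.has(key)) errors.push(`unknown argument "${key}"`);
  }
  return errors;
}

const videoCreateArgs: ActionArgDescriptor[] = [
  { name: "title", type: "string", required: true },
  { name: "useCase", type: "string", required: false },
];

const validCall = validateArgs(videoCreateArgs, { title: "Demo" });
const missingTitle = validateArgs(videoCreateArgs, { useCase: "onboarding" });
```

Returning a list of problems instead of throwing lets the caller report every issue at once, which tends to be more useful when an agent has to repair its own request.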

In addition to primitive actions, an application MAY declare domain-specific actions, e.g.:

  • video.create
  • workspace.setup
  • team.invite
  • billing.openSettings

These SHOULD be stable, dot-segmented, and named with domain semantics.
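A naming check for domain actions could look like the sketch below. The regular expression is an assumption (at least two dot-separated segments, each starting with a lowercase letter), chosen so that the four examples above pass; the model itself only says the names should be stable and dot-segmented.

```typescript
// The primitive action IDs from PrimitiveActionType; domain actions must not reuse them.
const PRIMITIVE_ACTIONS: ReadonlySet<string> = new Set([
  "ui.read", "ui.focus", "ui.highlight", "ui.hover", "ui.activate",
  "ui.enterText", "ui.clearText", "ui.setValue", "ui.choose", "ui.toggle",
  "ui.expand", "ui.collapse", "ui.open", "ui.close", "ui.scrollIntoView",
  "ui.scroll", "ui.submit", "ui.upload", "nav.navigate", "app.invoke",
]);

// Assumption: two or more segments, each beginning with a lowercase letter.
const DOT_SEGMENTED = /^[a-z][A-Za-z0-9]*(\.[a-z][A-Za-z0-9]*)+$/;

// Hypothetical helper: true when the ID is well-formed and not a primitive action.
function isDomainActionId(id: string): boolean {
  return DOT_SEGMENTED.test(id) && !PRIMITIVE_ACTIONS.has(id);
}

const exampleIds = ["video.create", "workspace.setup", "team.invite", "billing.openSettings"];
const allExamplesValid = exampleIds.every(isDomainActionId);
```

The check is purely syntactic; whether a name carries useful domain semantics still needs human review.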

Capability documents MAY additionally specify which entity types a capability or action relates to.

Example optional fields:

  • entityTypes?: string[]
  • targetKinds?: Array<"ui" | "entity">

  1. entityTypes describe the semantic subject of a capability, not its permission.
  2. targetKinds MAY indicate whether an action works only with UI targets, only with entity targets, or with both.
  3. When targetKinds contains entity, the corresponding runtime specification SHOULD define the semantics of an entityRef.

Risk is intentionally split into a level and a set of tags, because people otherwise tend to squeeze five entirely different risks into a single word.

RiskLevel controls executability.

type RiskLevel =
| "safe" // autonomously executable within granted scopes
| "confirm" // explicit user confirmation required
| "blocked"; // not automatically executable

RiskTag describes the domain-level cause or class of the risk.

type RiskTag =
| "sensitive_data"
| "destructive"
| "external_effect"
| "privileged"
| "billing"
| "security"
| "identity"
| "legal"
| "irreversible";

interface RiskDescriptor {
  level: RiskLevel;
  tags?: RiskTag[];
  reason?: string;
}

  • safe: e.g. opening a route, switching a tab, writing a text suggestion into a draft
  • confirm: e.g. creating a record, sending an invitation, saving settings
  • blocked: e.g. changing a password, triggering a payment, deleting a user, irreversible publish action
  • When multiple risk tags apply and imply different levels, the strictest level wins.
  • blocked MUST NOT be executed autonomously.
  • confirm SHOULD require a human confirmation point before execution.
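The rules above can be sketched as a small gate in front of action execution. The names `strictest`, `decide`, and the `Decision` labels are hypothetical, and the choice to refuse blocked actions even after a user confirmation is an assumption of this sketch.

```typescript
type RiskLevel = "safe" | "confirm" | "blocked";

// Ordering used to pick the strictest level when several candidates apply.
const STRICTNESS: Record<RiskLevel, number> = { safe: 0, confirm: 1, blocked: 2 };

// Returns the strictest of the given levels; an empty list yields "safe",
// so callers are expected to pass at least the action's declared level.
function strictest(levels: RiskLevel[]): RiskLevel {
  return levels.reduce<RiskLevel>(
    (acc, level) => (STRICTNESS[level] > STRICTNESS[acc] ? level : acc),
    "safe",
  );
}

type Decision = "execute" | "await_confirmation" | "refuse";

// Gate applied before the agent executes an action.
function decide(level: RiskLevel, userConfirmed: boolean): Decision {
  if (level === "blocked") return "refuse"; // never executed autonomously
  if (level === "confirm" && !userConfirmed) return "await_confirmation";
  return "execute";
}

// Example: one tag suggests "confirm", another suggests "blocked".
const effectiveLevel = strictest(["confirm", "blocked"]);
const decision = decide(effectiveLevel, true);
```

Keeping the level comparison in a lookup table makes the ordering explicit and avoids relying on the order in which the union members happen to be declared.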

Success Signals are observable predicates, not wishes and not prompt-based superstition.

type SuccessSignal =
| { kind: "element.appeared"; target: TargetRef }
| { kind: "element.disappeared"; target: TargetRef }
| { kind: "element.state"; target: TargetRef; state: Partial<UIState> }
| { kind: "value.equals"; target: TargetRef; value: string | number | boolean }
| { kind: "route.changed"; pattern?: string; exact?: string }
| { kind: "dialog.opened"; target?: TargetRef }
| { kind: "dialog.closed"; target?: TargetRef }
| { kind: "toast.contains"; text: string }
| { kind: "status.contains"; text: string }
| { kind: "validation.none"; scope?: ScopeId }
| { kind: "collection.count"; target: TargetRef; op: "eq" | "gte" | "lte"; value: number }
| { kind: "entity.created"; entityType: string; idPath?: string }
| { kind: "entity.updated"; entityType: string; idPath?: string }
| { kind: "entity.deleted"; entityType: string; idPath?: string }
| { kind: "network.response"; urlPattern?: string; status?: number }
| { kind: "custom"; name: string; payload?: Record<string, unknown> };

  • Signals SHOULD be declarative and observable.
  • An action descriptor SHOULD contain at least one expected success signal.
  • custom SHOULD only be used when no core signal fits.
  • For critical domain actions, multiple success signals SHOULD be combined.

Example: video.create

  • route.changed to /videos/:id
  • toast.contains = "erstellt" (German for "created")
  • optionally entity.created = "video"
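To make the video.create example concrete, here is a hedged sketch that evaluates two signal kinds against an observed snapshot. The `Observation` shape, the `:param` matching rule (one non-empty path segment), and the choice to require all signals to hold are assumptions for illustration; the model does not define an evaluation algorithm here.

```typescript
// Two of the core signal kinds, enough for the video.create example.
type Signal =
  | { kind: "route.changed"; pattern?: string; exact?: string }
  | { kind: "toast.contains"; text: string };

// Hypothetical snapshot of what the agent observed after the action.
interface Observation {
  previousRoute: string;
  route: string;
  toasts: string[];
}

// Assumption: a ":name" segment matches exactly one non-empty path segment.
function routeMatches(pattern: string, route: string): boolean {
  const p = pattern.split("/");
  const r = route.split("/");
  return (
    p.length === r.length &&
    p.every((seg, i) => (seg.startsWith(":") ? (r[i] ?? "").length > 0 : seg === r[i]))
  );
}

function signalHolds(signal: Signal, obs: Observation): boolean {
  switch (signal.kind) {
    case "route.changed": {
      if (obs.route === obs.previousRoute) return false;
      if (signal.exact !== undefined) return obs.route === signal.exact;
      if (signal.pattern !== undefined) return routeMatches(signal.pattern, obs.route);
      return true; // any route change satisfies an unconstrained signal
    }
    case "toast.contains": {
      const text = signal.text;
      return obs.toasts.some((t) => t.includes(text));
    }
  }
}

// Assumption: combined signals are treated as a conjunction.
function allSignalsHold(signals: Signal[], obs: Observation): boolean {
  return signals.every((s) => signalHolds(s, obs));
}

const videoCreateSignals: Signal[] = [
  { kind: "route.changed", pattern: "/videos/:id" },
  { kind: "toast.contains", text: "erstellt" },
];

const succeeded = allSignalsHold(videoCreateSignals, {
  previousRoute: "/videos/new",
  route: "/videos/42",
  toasts: ["Video erstellt"],
});
```

Requiring both signals guards against false positives such as a route change caused by unrelated navigation.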

The capability document is the comprehensive description of active capabilities.

interface CapabilityDocument {
  modelVersion: "0.1";
  profile: ProfileId; // e.g. "[email protected]"
  roles: UIRole[];
  stateKeys: UIStateKey[];
  affordances: UIAffordance[];
  actions: ActionDescriptor[];
  riskLevels: RiskLevel[];
  riskTags?: RiskTag[];
  successSignalKinds: string[];
  metadata?: Record<string, unknown>;
}

  • roles, stateKeys, affordances, actions, riskLevels MUST be present.
  • An app MUST declare only values that it actually supports.
  • successSignalKinds SHOULD contain all kind values used in the document.
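A consistency check over a capability document might look like the following sketch. It uses deliberately loose types (`DocLite`, `ActionLite`) that are assumptions of this example, and the cross-checks on affordances and risk levels are inferred from the rules above rather than mandated by them.

```typescript
// Loose, hypothetical views of the document; a real implementation would use the full types.
interface ActionLite {
  id: string;
  requiredAffordances?: string[];
  risk: { level: string };
  success?: { kind: string }[];
}

interface DocLite {
  affordances: string[];
  riskLevels: string[];
  successSignalKinds: string[];
  actions: ActionLite[];
}

// Returns a list of inconsistencies; an empty list means no problem was found.
function checkDocument(doc: DocLite): string[] {
  const problems: string[] = [];
  for (const action of doc.actions) {
    for (const affordance of action.requiredAffordances ?? []) {
      if (!doc.affordances.includes(affordance)) {
        problems.push(`${action.id}: affordance "${affordance}" is not declared`);
      }
    }
    if (!doc.riskLevels.includes(action.risk.level)) {
      problems.push(`${action.id}: risk level "${action.risk.level}" is not declared`);
    }
    for (const signal of action.success ?? []) {
      if (!doc.successSignalKinds.includes(signal.kind)) {
        problems.push(`${action.id}: signal kind "${signal.kind}" is not listed`);
      }
    }
  }
  return problems;
}

const consistentDoc: DocLite = {
  affordances: ["activate", "edit"],
  riskLevels: ["safe", "confirm", "blocked"],
  successSignalKinds: ["route.changed", "toast.contains"],
  actions: [
    { id: "ui.activate", requiredAffordances: ["activate"], risk: { level: "safe" } },
    {
      id: "video.create",
      risk: { level: "confirm" },
      success: [{ kind: "route.changed" }, { kind: "toast.contains" }],
    },
  ],
};

// Same document, but with one used signal kind left out of successSignalKinds.
const inconsistentDoc: DocLite = { ...consistentDoc, successSignalKinds: ["route.changed"] };

const noProblems = checkDocument(consistentDoc);
const oneProblem = checkDocument(inconsistentDoc);
```

Running such a check when the document is published catches drift between the declared vocabulary and the action descriptors before an agent relies on it.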

{
  "modelVersion": "0.1",
  "profile": "[email protected]",
  "roles": [
    "route",
    "dialog",
    "form",
    "button",
    "textbox",
    "textarea",
    "toast",
    "status"
  ],
  "stateKeys": [
    "visible",
    "enabled",
    "focused",
    "required",
    "open",
    "invalid",
    "textValue",
    "sensitive"
  ],
  "affordances": [
    "read",
    "focus",
    "activate",
    "edit",
    "submit",
    "navigate",
    "invoke"
  ],
  "actions": [
    {
      "id": "ui.activate",
      "kind": "primitive",
      "targetKinds": ["element"],
      "requiredAffordances": ["activate"],
      "executionModes": ["appAction", "semanticUi", "inputSynthesis"],
      "risk": { "level": "safe" }
    },
    {
      "id": "ui.enterText",
      "kind": "primitive",
      "targetKinds": ["element"],
      "requiredAffordances": ["edit"],
      "executionModes": ["appAction", "semanticUi", "inputSynthesis"],
      "args": [
        { "name": "text", "type": "string", "required": true }
      ],
      "risk": { "level": "safe" }
    },
    {
      "id": "video.create",
      "kind": "domain",
      "title": "Video erstellen",
      "targetKinds": ["scope"],
      "executionModes": ["appAction", "semanticUi"],
      "args": [
        { "name": "title", "type": "string", "required": true },
        { "name": "useCase", "type": "string", "required": false }
      ],
      "idempotency": "non_idempotent",
      "risk": {
        "level": "confirm",
        "tags": ["external_effect"]
      },
      "success": [
        { "kind": "route.changed", "pattern": "/videos/:id" },
        { "kind": "toast.contains", "text": "erstellt" }
      ]
    }
  ],
  "riskLevels": ["safe", "confirm", "blocked"],
  "riskTags": [
    "sensitive_data",
    "destructive",
    "external_effect",
    "privileged"
  ],
  "successSignalKinds": [
    "route.changed",
    "toast.contains",
    "element.state",
    "validation.none"
  ]
}

  • [UIAP-CORE] UIAP Core v0.1
  • [RFC2119] Key words for use in RFCs to Indicate Requirement Levels, BCP 14

  • Capability documents MAY contain security-relevant information (e.g. available admin actions). Access to capability information SHOULD be restricted to authorized agents.
  • Risk levels MUST be declared correctly; incorrect classification can lead to unintended actions.
  • Sensitive fields SHOULD be marked as such in the capability document.

Version   Date         Changes
0.1       2026-03-27   Initial draft