Agent Integration Guide
Overview
This guide explains how an LLM-based agent uses UIAP to understand and interact with a web application. UIAP provides the agent with structured context instead of raw HTML or screenshots.
What the Agent Receives
1. Capabilities
After session initialization, the agent receives a Capability Document describing:
- Available UI roles and states
- Registered actions with risk levels
- Success signal types
- Supported execution modes
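As an illustration of how an agent might consult this document before planning, here is a minimal TypeScript sketch. The `CapabilityDocument` shape, its field names (`roles`, `states`, `actions`, `signalKinds`, `executionModes`), and the risk level values are assumptions made for this example; this guide does not specify the actual schema.

```typescript
// Hypothetical shape of a Capability Document; all field names are assumptions.
type RiskLevel = "low" | "medium" | "high"; // assumed risk scale

interface RegisteredAction {
  id: string; // e.g. "video.create"
  risk: RiskLevel;
}

interface CapabilityDocument {
  roles: string[];
  states: string[];
  actions: RegisteredAction[];
  signalKinds: string[];
  executionModes: string[];
}

// Look up a registered action; returns undefined if the app did not register it.
function findAction(
  caps: CapabilityDocument,
  actionId: string
): RegisteredAction | undefined {
  return caps.actions.filter((a) => a.id === actionId)[0];
}

// True when the app declares support for the given execution mode.
function supportsMode(caps: CapabilityDocument, mode: string): boolean {
  return caps.executionModes.indexOf(mode) !== -1;
}

// Example document with made-up values.
const caps: CapabilityDocument = {
  roles: ["button", "list"],
  states: ["enabled", "visible"],
  actions: [{ id: "video.create", risk: "low" }],
  signalKinds: ["route.changed", "toast.contains"],
  executionModes: ["appAction", "semanticDom"],
};
```

An agent could call `findAction(caps, "video.create")` before planning, and treat an `undefined` result as "this action is not available in this session".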
2. Page Snapshots
The PageGraph is a structured representation of the current page state:
```json
{
  "route": { "routeId": "videos.list", "url": "/videos" },
  "documents": [
    {
      "scopes": [
        {
          "id": "video.list",
          "kind": "list",
          "elements": [
            {
              "stableId": "video.new",
              "role": "button",
              "name": "New Video",
              "affordances": ["activatable"],
              "state": { "enabled": true, "visible": true },
              "defaultAction": "video.create"
            }
          ]
        }
      ]
    }
  ]
}
```
3. Deltas
After each action or state change, the agent receives incremental updates instead of full snapshots.
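To make the snapshot and delta flow concrete, the following TypeScript sketch types the PageGraph fields shown in the snapshot example and applies one incremental update to it. The type names, the `findElement` and `applyDelta` helpers, and in particular the `ElementDelta` shape are assumptions for illustration; this guide does not define the delta wire format.

```typescript
// Types mirroring the snapshot fields shown in this guide.
interface ElementState {
  enabled: boolean;
  visible: boolean;
}

interface UiElement {
  stableId: string;
  role: string;
  name: string;
  affordances: string[];
  state: ElementState;
  defaultAction?: string;
}

interface Scope {
  id: string;
  kind: string;
  elements: UiElement[];
}

interface PageGraph {
  route: { routeId: string; url: string };
  documents: { scopes: Scope[] }[];
}

// Hypothetical delta shape: a partial state update for a single element.
interface ElementDelta {
  stableId: string;
  state: Partial<ElementState>;
}

// Find an element by stableId anywhere in the graph.
function findElement(graph: PageGraph, stableId: string): UiElement | undefined {
  for (const doc of graph.documents) {
    for (const scope of doc.scopes) {
      for (const el of scope.elements) {
        if (el.stableId === stableId) return el;
      }
    }
  }
  return undefined;
}

// Merge a delta into the local snapshot in place.
// Returns false when the element is unknown to the current snapshot.
function applyDelta(graph: PageGraph, delta: ElementDelta): boolean {
  const el = findElement(graph, delta.stableId);
  if (!el) return false;
  el.state = { ...el.state, ...delta.state };
  return true;
}

// The snapshot from the example above.
const snapshot: PageGraph = {
  route: { routeId: "videos.list", url: "/videos" },
  documents: [{
    scopes: [{
      id: "video.list",
      kind: "list",
      elements: [{
        stableId: "video.new",
        role: "button",
        name: "New Video",
        affordances: ["activatable"],
        state: { enabled: true, visible: true },
        defaultAction: "video.create",
      }],
    }],
  }],
};

// A made-up delta: the "New Video" button becomes disabled.
const applied = applyDelta(snapshot, { stableId: "video.new", state: { enabled: false } });
```

Keeping a local copy of the last snapshot and merging deltas into it lets the agent reason about current state without requesting a full snapshot after every action.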
Agent Loop
A typical agent loop follows this pattern:
1. Receive snapshot/delta
2. Understand current state
3. Plan next action based on goal
4. Send action.request
5. Wait for action.result
6. Verify success via signals
7. Repeat or complete
Execution Modes
UIAP supports multiple execution strategies, in order of preference:
- appAction: The app executes its own business logic directly
- semanticDom: The SDK interacts with DOM elements by semantic identity
- browserInput: Low-level input simulation (click, type)
- webdriver: External browser automation (fallback)
- vision: Screenshot-based interaction (last resort)
The agent should prefer the highest-level mode available and fall back to a lower-level mode only when a higher one cannot perform the action. UIAP’s design principle: DOM-first, vision-second, computer-use-last-resort.
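The preference order can be sketched as a simple selection function. The mode names come from this guide; the `pickMode` helper and the idea that the agent holds a list of modes available for a given action are assumptions for illustration.

```typescript
// Execution modes in the order of preference given in this guide.
const MODE_PREFERENCE = [
  "appAction",
  "semanticDom",
  "browserInput",
  "webdriver",
  "vision",
] as const;

type ExecutionMode = (typeof MODE_PREFERENCE)[number];

// Hypothetical helper: pick the most preferred mode among those available.
// Returns undefined when none of the known modes is available.
function pickMode(available: readonly ExecutionMode[]): ExecutionMode | undefined {
  for (const mode of MODE_PREFERENCE) {
    if (available.indexOf(mode) !== -1) return mode;
  }
  return undefined;
}
```

For example, `pickMode(["vision", "semanticDom"])` selects `semanticDom`, because it appears earlier in the preference order than `vision`.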
Policy Awareness
Before executing any action, the agent must respect the policy response:
- allow: Proceed
- confirm: Request user confirmation first
- deny: Action is not permitted
- handoff: Human must perform this step manually
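One way an agent could act on these four responses is an exhaustive mapping from policy decision to next step. The decision values come from this guide; the `NextStep` names and the `nextStepFor` helper are hypothetical and only illustrate the branching.

```typescript
// Policy responses as listed in this guide.
type PolicyDecision = "allow" | "confirm" | "deny" | "handoff";

// Hypothetical names for what the agent does next.
type NextStep = "execute" | "ask-user" | "abort" | "wait-for-human";

// Map each policy decision to the agent's next step.
// The switch is exhaustive, so adding a new decision is a compile-time error
// until it is handled here.
function nextStepFor(decision: PolicyDecision): NextStep {
  switch (decision) {
    case "allow":
      return "execute";
    case "confirm":
      return "ask-user";
    case "deny":
      return "abort";
    case "handoff":
      return "wait-for-human";
  }
}
```

Only `allow` leads directly to execution; every other response stops the agent from acting on its own.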
Success Verification
After executing an action, the agent verifies success through signals:
```ts
success: [
  { kind: "route.changed", pattern: "/videos/:id" },
  { kind: "toast.contains", text: "created" }
]
```
This makes agent behavior deterministic and verifiable, not just “it looked like it worked.”
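As a rough sketch of how such signals could be checked, the TypeScript below evaluates the two signal kinds from the example against what the agent observed after the action. The `Observation` shape, the helper names, the rule that a `:name` segment matches any single non-empty path segment, and the requirement that all listed signals hold are assumptions; this guide does not state the exact matching semantics.

```typescript
// The two signal kinds used in the example above.
type SuccessSignal =
  | { kind: "route.changed"; pattern: string }
  | { kind: "toast.contains"; text: string };

// Hypothetical summary of what the agent observed after the action.
interface Observation {
  url: string;
  toasts: string[];
}

// Assumed semantics: ":name" segments match any single non-empty path segment.
function matchesRoute(pattern: string, url: string): boolean {
  const p = pattern.split("/");
  const u = url.split("/");
  if (p.length !== u.length) return false;
  return p.every((seg, i) => {
    const actual = u[i] ?? "";
    return seg.charAt(0) === ":" ? actual.length > 0 : seg === actual;
  });
}

// Check a single signal against the observation.
function signalHolds(signal: SuccessSignal, obs: Observation): boolean {
  switch (signal.kind) {
    case "route.changed":
      return matchesRoute(signal.pattern, obs.url);
    case "toast.contains": {
      const text = signal.text;
      return obs.toasts.some((t) => t.indexOf(text) !== -1);
    }
  }
}

// Assumption: the action counts as successful only if every signal holds.
function verifySuccess(signals: SuccessSignal[], obs: Observation): boolean {
  return signals.every((s) => signalHolds(s, obs));
}

const success: SuccessSignal[] = [
  { kind: "route.changed", pattern: "/videos/:id" },
  { kind: "toast.contains", text: "created" },
];
```

With these assumptions, an observation of `{ url: "/videos/42", toasts: ["Video created"] }` satisfies both signals, while the same toast on `/videos` does not, because the route pattern requires an id segment.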