Why UIAP?

The Core Problem

AI agents need more than just knowing “there’s a button”. They need to know:

What the button does in business terms (video.create, not just click)
Whether they’re allowed to click it (risk level, policy)
How to verify the action succeeded (route change, toast, dialog close)
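
To make these three pieces of information concrete, here is a minimal sketch of how a single button could be described. The type names, field names, and the `mayInvoke` helper are hypothetical and chosen for illustration; they are not taken from the UIAP specification.

```typescript
// Hypothetical shapes for illustration only; not the UIAP schema.
type RiskLevel = "low" | "medium" | "high";

type SuccessSignal =
  | { kind: "route"; path: string }
  | { kind: "toast"; text: string }
  | { kind: "dialogClosed"; dialogId: string };

interface ActionDescriptor {
  elementId: string;               // which UI element triggers the action
  action: string;                  // business meaning, e.g. "video.create"
  risk: RiskLevel;                 // input for the agent's policy check
  successSignals: SuccessSignal[]; // what to observe after the action ran
}

const RISK_ORDER: Record<RiskLevel, number> = { low: 0, medium: 1, high: 2 };

// Policy gate: the agent may invoke the action only if its risk does not
// exceed the maximum risk the current policy allows.
function mayInvoke(descriptor: ActionDescriptor, maxRisk: RiskLevel): boolean {
  return RISK_ORDER[descriptor.risk] <= RISK_ORDER[maxRisk];
}

const createVideo: ActionDescriptor = {
  elementId: "btn-create-video",
  action: "video.create",
  risk: "medium",
  successSignals: [
    { kind: "dialogClosed", dialogId: "new-video" },
    { kind: "route", path: "/videos" },
  ],
};
```

With a descriptor like this, an agent running under a low-risk policy would skip the button, while one allowed medium risk could click it and then watch for the listed signals.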

ARIA describes roles, states, and properties — but not business logic. Playwright automates browsers — but doesn’t understand intent. MCP connects tools — but doesn’t model live UI state.

What Existing Protocols Do

MCP (Model Context Protocol)

Standardizes access to tools and knowledge, not direct control of running web UIs.

AG-UI

Event-based protocol for agent-to-app communication. Closest match, but:

Solves the communication problem, not the full UI understanding problem
No canonical catalog for UI elements, roles, risks, or success signals
Doesn’t define how an agent obtains a reliable page model

WebDriver / WebDriver BiDi

Browser remote control — the execution layer beneath, not the semantic contract above.

ARIA / Accessibility

Semantic description of UI elements — essential foundation, but not a complete solution for agentic control.

The Gap

No mature, broadly accepted UI control protocol exists that lets AI agents operate existing web applications and that combines semantics, actions, feedback, and safety rules in one package.

What UIAP Provides

UIAP (UI Agent Protocol) is a standardized agent interface for UIs. Applications use it to tell AI agents in a structured way which elements, actions, states, risks, and success signals exist — so agents don’t have to blindly poke around in the DOM.
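
As a rough sketch of what this could look like from the agent's side, the example below assumes the application publishes a page model listing its actions and their success signals. The agent looks up an action by its business identifier and later checks the observed UI events against the declared signals. All names here (`PageModel`, `findAction`, `succeeded`) are assumptions made for illustration, not UIAP definitions.

```typescript
// Hypothetical page model; the real UIAP structures may differ.
type UiEvent =
  | { kind: "route"; path: string }
  | { kind: "toast"; text: string }
  | { kind: "dialogClosed"; dialogId: string };

interface PageAction {
  id: string;                // business identifier, e.g. "video.create"
  elementId: string;         // element the application exposes for this action
  enabled: boolean;          // current state as reported by the application
  successSignals: UiEvent[]; // events that indicate the action succeeded
}

interface PageModel {
  route: string;
  actions: PageAction[];
}

// Look up an action by business identifier instead of searching the DOM.
function findAction(model: PageModel, id: string): PageAction | undefined {
  return model.actions.find((a) => a.id === id && a.enabled);
}

function sameEvent(a: UiEvent, b: UiEvent): boolean {
  if (a.kind === "route" && b.kind === "route") return a.path === b.path;
  if (a.kind === "toast" && b.kind === "toast") return a.text === b.text;
  if (a.kind === "dialogClosed" && b.kind === "dialogClosed") {
    return a.dialogId === b.dialogId;
  }
  return false;
}

// Design choice in this sketch: every declared signal must be observed,
// and an action without declared signals counts as unverifiable.
function succeeded(action: PageAction, observed: UiEvent[]): boolean {
  if (action.successSignals.length === 0) return false;
  return action.successSignals.every((expected) =>
    observed.some((event) => sameEvent(expected, event)),
  );
}

const model: PageModel = {
  route: "/videos/new",
  actions: [
    {
      id: "video.create",
      elementId: "btn-create-video",
      enabled: true,
      successSignals: [
        { kind: "route", path: "/videos" },
        { kind: "toast", text: "Video created" },
      ],
    },
  ],
};
```

Requiring all signals is the stricter of two plausible choices; a protocol could equally accept any one signal as sufficient.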

UIAP doesn’t compete with MCP or AG-UI. It is designed to connect to them, and to browser automation, through adapters:

MCP Adapter: Expose UIAP capabilities or workflows as MCP tools/resources
AG-UI Adapter: Mirror UIAP events into AG-UI events
WebDriver/BiDi Adapter: For external browser execution
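
The MCP adapter idea can be sketched as a pure mapping from an action description to a tool definition. MCP tool definitions carry a `name`, a `description`, and a JSON Schema `inputSchema`; everything on the UIAP side (`UiapAction`, its `params` field, the `toMcpTool` function) is a hypothetical shape assumed for this example. Replacing dots in the identifier with underscores is a conservative naming choice, not a documented requirement.

```typescript
// Hypothetical UIAP-side action description; not from the UIAP specification.
type ParamType = "string" | "number" | "boolean";

interface UiapAction {
  id: string;                        // e.g. "video.create"
  description: string;
  params: Record<string, ParamType>; // parameter name -> primitive type
}

// Minimal subset of an MCP tool definition used by this sketch.
interface McpToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: ParamType }>;
    required: string[];
  };
}

function toMcpTool(action: UiapAction): McpToolDefinition {
  const properties: Record<string, { type: ParamType }> = {};
  for (const name of Object.keys(action.params)) {
    properties[name] = { type: action.params[name] };
  }
  return {
    // Assumption: avoid dots in tool names by mapping them to underscores.
    name: action.id.replace(/\./g, "_"),
    description: action.description,
    inputSchema: {
      type: "object",
      properties,
      // Assumption: every declared parameter is treated as required.
      required: Object.keys(action.params),
    },
  };
}

const tool = toMcpTool({
  id: "video.create",
  description: "Create a new video from a title",
  params: { title: "string", isPublic: "boolean" },
});
```

An adapter built this way would still need to route the actual tool call back to the application, which this sketch leaves out.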