Why UIAP?

AI agents need more than just knowing “there’s a button”. They need to know:

  • What the button does in business terms (video.create, not just click)
  • Whether they’re allowed to click it (risk level, policy)
  • How to verify the action succeeded (route change, toast, dialog close)
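The three questions above can be sketched as a small data structure plus two checks. This is only an illustration: the type names, field names, risk levels, and the `/videos/new` route are assumptions made for this example, not the UIAP schema.

```typescript
// Hypothetical shapes for illustration only; not taken from the UIAP specification.
type RiskLevel = "low" | "medium" | "high";

type SuccessSignal =
  | { kind: "route"; path: string }
  | { kind: "toast"; text: string }
  | { kind: "dialogClosed"; dialogId: string };

interface ActionDescriptor {
  id: string; // business-level name, e.g. "video.create"
  risk: RiskLevel;
  successSignals: SuccessSignal[];
}

// What the agent observed in the UI after performing the action.
interface Observation {
  route: string;
  toasts: string[];
  openDialogs: string[];
}

const riskOrder: Record<RiskLevel, number> = { low: 0, medium: 1, high: 2 };

// Policy check: the action is allowed only up to a maximum risk level.
function isAllowed(action: ActionDescriptor, maxRisk: RiskLevel): boolean {
  return riskOrder[action.risk] <= riskOrder[maxRisk];
}

// Verification: the action counts as successful if any declared signal was observed.
function succeeded(action: ActionDescriptor, seen: Observation): boolean {
  return action.successSignals.some((signal) => {
    switch (signal.kind) {
      case "route":
        return seen.route === signal.path;
      case "toast":
        return seen.toasts.indexOf(signal.text) !== -1;
      case "dialogClosed":
        return seen.openDialogs.indexOf(signal.dialogId) === -1;
    }
  });
}

const createVideo: ActionDescriptor = {
  id: "video.create",
  risk: "medium",
  successSignals: [
    { kind: "route", path: "/videos/new" },
    { kind: "toast", text: "Video created" },
  ],
};
```

An agent would call `isAllowed(createVideo, "medium")` before acting and `succeeded(createVideo, observation)` afterwards, so that both the permission decision and the success check rest on declared data rather than on guesses about the DOM.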

ARIA describes roles, states, and properties — but not business logic. Playwright automates browsers — but doesn’t understand intent. MCP connects tools — but doesn’t model live UI state.

MCP standardizes access to tools and knowledge, not direct control of running web UIs.

AG-UI is an event-based protocol for agent-to-app communication. It is the closest match, but:

  • Solves the communication problem, not the full UI understanding problem
  • No canonical catalog for UI elements, roles, risks, or success signals
  • Doesn’t define how an agent obtains a reliable page model

Playwright and WebDriver/BiDi provide browser remote control: the execution layer beneath, not the semantic contract above.

ARIA provides a semantic description of UI elements: an essential foundation, but not a complete solution for agentic control.

There is no mature, broadly accepted “UI Control Protocol” standard for AI agents in existing web applications that combines semantics, actions, feedback, and safety rules in one package.

UIAP (UI Agent Protocol) is a standardized agent interface for UIs. Applications use it to tell AI agents in a structured way which elements, actions, states, risks, and success signals exist — so agents don’t have to blindly poke around in the DOM.
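As a rough sketch of what such a structured description could look like from the agent's side, the example below models a page as a list of elements with states and offered actions, then looks up an element by business action instead of by DOM selector. All names here (`PageModel`, `UiElement`, `findActionable`, the element ids) are hypothetical and chosen for illustration; the actual UIAP schema may differ.

```typescript
// Hypothetical page model; field names are assumptions, not the UIAP schema.
interface UiElement {
  id: string;
  role: "button" | "link" | "textbox" | "dialog";
  label: string;
  state: { visible: boolean; enabled: boolean };
  actions: string[]; // business-level action ids offered by this element
}

interface PageModel {
  route: string;
  elements: UiElement[];
}

// Return the elements that can currently perform the given action.
function findActionable(page: PageModel, actionId: string): UiElement[] {
  return page.elements.filter(
    (el) =>
      el.state.visible &&
      el.state.enabled &&
      el.actions.indexOf(actionId) !== -1
  );
}

const page: PageModel = {
  route: "/videos",
  elements: [
    {
      id: "create-video-btn",
      role: "button",
      label: "New video",
      state: { visible: true, enabled: true },
      actions: ["video.create"],
    },
    {
      id: "delete-video-btn",
      role: "button",
      label: "Delete",
      state: { visible: true, enabled: false }, // disabled: nothing is selected
      actions: ["video.delete"],
    },
  ],
};

const candidates = findActionable(page, "video.create");
```

The point of the design is that the agent asks "what can perform `video.create` right now?" and receives an answer that already accounts for visibility and enabled state.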

UIAP doesn’t compete with MCP or AG-UI. It is designed to work alongside them through adapters:

  • MCP Adapter: Expose UIAP capabilities or workflows as MCP tools/resources
  • AG-UI Adapter: Mirror UIAP events into AG-UI events
  • WebDriver/BiDi Adapter: For external browser execution
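To make the MCP adapter idea more concrete, here is a minimal sketch that converts an action description into the general shape of an MCP tool definition (`name`, `description`, and a JSON Schema `inputSchema`). The `UiapAction` type and the `toMcpTool` function are hypothetical names invented for this example. Replacing dots with underscores in the tool name is a cautious assumption, since some clients restrict the characters allowed in tool names.

```typescript
// Hypothetical UIAP-side action description; not the real UIAP schema.
interface UiapAction {
  id: string; // e.g. "video.create"
  description: string;
  risk: "low" | "medium" | "high";
  params: Record<string, { type: "string" | "number" | "boolean"; required: boolean }>;
}

// General shape of an MCP tool definition: name, description, JSON Schema input.
interface McpTool {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required: string[];
  };
}

// Map one action to one tool, carrying the risk level along in the description.
function toMcpTool(action: UiapAction): McpTool {
  const properties: Record<string, { type: string }> = {};
  const required: string[] = [];
  for (const key of Object.keys(action.params)) {
    const param = action.params[key];
    properties[key] = { type: param.type };
    if (param.required) {
      required.push(key);
    }
  }
  return {
    // Assumption: dots are replaced to stay within conservative tool-name rules.
    name: action.id.replace(/\./g, "_"),
    description: `${action.description} (risk: ${action.risk})`,
    inputSchema: { type: "object", properties, required },
  };
}

const tool = toMcpTool({
  id: "video.create",
  description: "Create a new video",
  risk: "medium",
  params: {
    title: { type: "string", required: true },
    draft: { type: "boolean", required: false },
  },
});
```

A real adapter would also need to route tool calls back to the running UI and report success signals; this sketch covers only the static mapping.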