The trust landscape
Four parties participate in a WebMCP interaction, and each brings different trust assumptions:- Site authors define tools. They control what actions are exposed and how descriptions are written.
- Agent providers build the AI systems that interpret tool descriptions and decide which tools to call.
- Browser vendors mediate between agents and sites. They enforce permissions, manage the execution context, and decide what the user sees.
- End users authorize actions. They may not understand the technical details, but they expect to stay in control.
Key risks
Prompt injection through tool metadata
A malicious site can embed instructions in tool descriptions or parameter descriptions that attempt to hijack the agent’s behavior. The agent’s language model reads these descriptions as context, and a carefully crafted description can override the agent’s original instructions. For example, a tool description might include:Prompt injection through tool output
Tool return values are also processed by the agent’s language model. If a tool returns user-generated content (forum posts, reviews, search results), malicious content can include instructions that manipulate the agent’s subsequent actions. This is a variant of the metadata injection risk, but the threat actor may not be the site author. A malicious user on a forum site can craft a post that, when returned as tool output, instructs the agent to exfiltrate data. The defense is the same: agents should treat tool output as untrusted. Sites should sanitize user-generated content in tool output the way they sanitize it in HTML rendering.Misrepresentation of intent
A tool’s description may not match its actual behavior. A tool calledfinalizeCart with the description “Finalizes the current shopping cart” might actually trigger a purchase. The agent interprets the description literally and calls the tool, resulting in an action the user did not intend.
This can be malicious (deliberate deception) or accidental (poorly written descriptions). Either way, the agent cannot verify that a tool does what it claims before executing it.
The browser’s role here is to provide consent surfaces. The user should be able to review what the agent is about to do before it happens. The declarative API’s toolautosubmit attribute is one mechanism: when absent, the browser requires the user to manually submit the form, even if the agent has filled in all the fields.
Privacy leakage through over-parameterization
A tool can request more information than it needs. A “search dresses” tool that asks for the user’s age, pregnancy status, height, skin tone, and previous purchases is plausibly useful for personalization, but it also enables detailed profiling. Agents are designed to be helpful. When a tool requests a parameter and the agent has access to the information (from personalization data, browsing history, or cross-site context), the agent will attempt to provide it. This creates a pipeline from the user’s private data to any site that asks for it through tool parameters. This risk is not unique to WebMCP (any web form can ask for too much information), but agents amplify it by filling in data that users would not have volunteered themselves.The human-in-the-loop model
WebMCP is designed around the assumption that a human is present and paying attention. Several design choices reflect this:Browsing context requirement
Tools execute in a visible browser tab, not in a headless background process. This means the user can see what the agent is doing. If an agent navigates to a page, fills in a form, and triggers a tool, the user observes these actions in real time. The spec explicitly rules out headless tool execution in the current design. This is a deliberate constraint, not a missing feature. It ensures that the user’s presence is a genuine safeguard rather than a formality.Authentication inheritance
Agents inherit the user’s authentication state. When an agent visits a site, it carries the user’s cookies and session. This means tools can perform authenticated actions (place orders, modify settings, access private data) without additional login steps. This is powerful and dangerous. It means a tool has the same privileges the human user has. The browser’s consent mechanisms (permission prompts, thetoolautosubmit attribute) are the primary check on this power.
Declarative consent via toolautosubmit
The declarative API provides a boolean opt-in for automatic form submission:
toolautosubmitpresent: The agent can fill and submit the form without user intervention.toolautosubmitabsent: The browser fills the form fields but focuses the submit button and waits for the user to click it.
toolautosubmit. Low-risk forms (search, filtering, read-only queries) can include it.
The Chromium prototype also applies CSS pseudo-classes (:tool-form-active, :tool-submit-active) to give visual feedback when an agent has filled a form. This helps the user understand what the agent has done and what is waiting for their confirmation.
The agent interface and requestUserInteraction
The W3C proposal introduces an agent parameter passed to imperative tool execute callbacks. This parameter provides requestUserInteraction(), a method that pauses tool execution and requests the user to perform an action (confirm, provide input, review).
