The Problem with Current Approaches
Most solutions for connecting AI to the web fall into two categories: browser automation and remote APIs. Both have significant limitations.
Browser Automation is Fundamentally Inefficient
When current AI tries to interact with a website using browser automation, here’s what actually happens:
1. Take a screenshot - Capture the current page state (or parse the DOM)
2. Ask the model - “Where’s the ‘Add to Cart’ button?”
3. Model responds - Provides coordinates or element selector
4. Click the button - Execute the interaction
5. Wait for update - Page reloads or updates
6. Verify - Take another screenshot and ask: “Did that work?”
7. Repeat - For every single interaction
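The loop above can be sketched as a small simulation to make its cost visible. Everything here is a hypothetical stub: `captureScreenshot`, `askModel`, and `click` stand in for a vision model and a browser driver and are not part of any real automation library. Under these assumptions, each interaction costs two screenshots and two model round trips.

```typescript
// Hypothetical stubs for a vision model and a browser driver.
// None of these names come from a real library.
interface Counters {
  screenshots: number;
  modelCalls: number;
  clicks: number;
  trace: string[]; // human-readable record of every step taken
}

function captureScreenshot(c: Counters): string {
  c.screenshots += 1;
  return `screenshot-${c.screenshots}`;
}

function askModel(c: Counters, prompt: string, screenshot: string): string {
  c.modelCalls += 1;
  c.trace.push(`model(${screenshot}): ${prompt}`);
  // A real model would inspect the screenshot; the stub returns a canned answer.
  return prompt.startsWith("Where") ? "#add-to-cart" : "yes";
}

function click(c: Counters, selector: string): void {
  c.clicks += 1;
  c.trace.push(`click(${selector})`);
}

// One UI interaction: capture, ask, click, capture again, verify.
function interact(c: Counters, label: string): boolean {
  const before = captureScreenshot(c);
  const selector = askModel(c, `Where's the '${label}' button?`, before);
  click(c, selector);
  const after = captureScreenshot(c); // taken once the page has updated
  return askModel(c, "Did that work?", after) === "yes";
}

const counters: Counters = { screenshots: 0, modelCalls: 0, clicks: 0, trace: [] };
const labels = ["Add to Cart", "View Cart", "Checkout"];
const allSucceeded = labels.every((label) => interact(counters, label));
// Three interactions cost six screenshots and six model calls.
```

In this simulation the stub always answers "yes"; a real run would also need retry handling whenever verification fails, which adds further round trips.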
Remote MCP Doesn’t Solve the Same Problem
Remote MCP servers are designed for cloud-based, multi-tenant scenarios. While powerful, they come with significant challenges:
Authentication Complexity
Remote MCP requires OAuth2.1, which is currently only usable by local clients like Claude Desktop. Re-inventing auth for agents that act on behalf of users requires a complete reimagining of authorization systems.
Multi-Tenant Security Risks
Data leakage in multi-tenant apps with MCP servers is not a solved problem. Most remote MCPs that operate on user data are read-only for this reason.
No Human in the Loop
Remote MCPs are built for autonomous agents, but current models aren’t reliable enough for fully autonomous work. For important tasks, humans need to be in the loop.
Premature Optimization
We’re building infrastructure for autonomous cloud agents when what we actually need is human-supervised browser automation.
Remote MCP is great for server-to-server communication and future autonomous agents. But for human-supervised web interactions happening right now, the browser is the right place.
The WebMCP Approach
Instead of teaching AI to use websites like a human, WebMCP lets websites expose their functionality directly as tools.
Function Calls > UI Navigation
Compare these three approaches:
- Computer Use
- Playwright MCP
- WebMCP
Issues with Computer Use:
- Non-deterministic
- Requires vision model
- Breaks when UI changes
- Slow (multiple round trips)
Comparison Matrix
| Approach | Determinism | Speed | UI Resilience | Auth Model | Human Oversight |
|---|---|---|---|---|---|
| Computer Use | Low | Slow | Breaks easily | Complex | Minimal |
| Playwright MCP | Medium | Medium | Breaks on changes | Complex | Minimal |
| BrowserMCP | Medium | Medium | Breaks on changes | Complex | Minimal |
| Remote MCP | High | Fast | N/A (no UI) | OAuth2.1 | Optional |
| WebMCP | High | Fast | UI-independent | Inherited | Built-in |
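The Determinism column can be made concrete with a minimal sketch of a tool-style function. It rests on assumptions: the in-memory `shop` object, its catalog, and the two error classes are hypothetical illustrations, not part of WebMCP. The point is only that a function call either returns a result or throws a named error the caller can branch on.

```typescript
// Hypothetical error types: each failure mode has its own name.
class UnknownProductError extends Error {
  constructor(id: string) {
    super(`Unknown product: ${id}`);
    this.name = "UnknownProductError";
  }
}

class InvalidQuantityError extends Error {
  constructor(quantity: number) {
    super(`Quantity must be a positive integer, got ${quantity}`);
    this.name = "InvalidQuantityError";
  }
}

interface CartLine {
  id: string;
  quantity: number;
}

// Hypothetical in-memory shop, standing in for a site's client-side cart logic.
const shop = {
  catalog: new Set<string>(["abc123"]),
  cart: [] as CartLine[],
  addToCart(line: CartLine): CartLine[] {
    if (!Number.isInteger(line.quantity) || line.quantity < 1) {
      throw new InvalidQuantityError(line.quantity);
    }
    if (!this.catalog.has(line.id)) {
      throw new UnknownProductError(line.id);
    }
    this.cart.push(line);
    return this.cart;
  },
};

// Success path: the call returns the updated cart.
const cart = shop.addToCart({ id: "abc123", quantity: 2 });

// Failure path: the caller learns exactly what went wrong.
let failure = "";
try {
  shop.addToCart({ id: "missing", quantity: 1 });
} catch (err) {
  failure = err instanceof Error ? err.name : "unknown";
}
// failure === "UnknownProductError"
```

A model receiving `UnknownProductError` can correct the product id and retry, whereas a failed click gives it nothing more specific than a screenshot that looks unchanged.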
The key insight: When you call shop.addToCart({id: "abc123", quantity: 2}), it either works or throws a specific error. When you try to click a button, you’re hoping the UI hasn’t changed, the element loaded, the viewport is right, and a dozen other things outside your control.
Good Websites Are Context Engineering
One of the biggest challenges in AI is context engineering - making sure the model only has context relevant to its task.
WebMCP as a UI for LLMs
Just as websites don’t put all content on one page, you can scope tools to different pages in your app. Instead of overwhelming the model with all possible tools, you can:
- Limit tools based on current URL
- Show different tools based on user role
- Expose tools progressively as tasks advance
- Remove tools when components unmount
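One way to picture this scoping is a small registry that filters tools by the current path and user role, and hands back a removal function to call when a component unmounts. This is a sketch under assumptions: `ToolRegistry`, `ToolScope`, and the tool names are hypothetical and do not reflect the actual WebMCP API.

```typescript
type Role = "guest" | "customer" | "admin";

interface ToolScope {
  pathPrefix: string; // tool is offered only on URLs under this path
  roles: Role[];      // roles allowed to see the tool
}

interface Tool {
  name: string;
  description: string;
  scope: ToolScope;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  // Registers a tool and returns a function that removes it again,
  // for example when the owning component unmounts.
  register(tool: Tool): () => void {
    this.tools.set(tool.name, tool);
    return () => {
      this.tools.delete(tool.name);
    };
  }

  // Lists only the tools relevant to the current page and user.
  visibleTools(path: string, role: Role): string[] {
    const names: string[] = [];
    this.tools.forEach((tool) => {
      if (path.startsWith(tool.scope.pathPrefix) && tool.scope.roles.includes(role)) {
        names.push(tool.name);
      }
    });
    return names.sort();
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "searchProducts",
  description: "Search the catalog",
  scope: { pathPrefix: "/", roles: ["guest", "customer", "admin"] },
});
const removeSubmitOrder = registry.register({
  name: "submitOrder",
  description: "Place the current order",
  scope: { pathPrefix: "/checkout", roles: ["customer"] },
});
registry.register({
  name: "refundOrder",
  description: "Refund an order",
  scope: { pathPrefix: "/admin", roles: ["admin"] },
});

const onHome = registry.visibleTools("/", "customer");
// ["searchProducts"]
const onCheckout = registry.visibleTools("/checkout/payment", "customer");
// ["searchProducts", "submitOrder"]
removeSubmitOrder(); // checkout component unmounts
const afterUnmount = registry.visibleTools("/checkout/payment", "customer");
// ["searchProducts"]
```

Progressive exposure falls out of the same mechanism: a tool for a later step is simply registered once the earlier step completes.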
Why the Browser is the Right Place
The browser provides several unique advantages for human-supervised AI:
1. Authentication is Already Solved
WebMCP
Tools inherit the user’s existing session cookies, auth headers, and permissions. No additional auth needed.
Remote MCP
Requires OAuth2.1 implementation, token management, and complex multi-tenant authorization.
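Because a tool runs in the page, its handler can call the site's existing API the same way the site's own code does. The sketch below assumes a hypothetical `/api/cart` endpoint and cookie-based sessions. `credentials: "same-origin"` is the standard Fetch option under which the browser attaches the user's cookies to same-origin requests. The fetch function is injected, and the types are minimal local stand-ins, so the example also runs outside a browser.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface RequestOptions {
  method: string;
  credentials: "same-origin" | "include" | "omit";
  headers: Record<string, string>;
  body: string;
}

interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (url: string, init: RequestOptions) => Promise<ResponseLike>;

// Builds the request a tool handler would send. No token or OAuth
// handshake appears here: the browser attaches the session cookie itself.
function buildAddToCartRequest(
  id: string,
  quantity: number,
): { url: string; init: RequestOptions } {
  return {
    url: "/api/cart", // hypothetical endpoint
    init: {
      method: "POST",
      credentials: "same-origin",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ id, quantity }),
    },
  };
}

// Tool handler. In a browser, fetchImpl could be (url, init) => fetch(url, init).
async function addToCartTool(
  fetchImpl: FetchLike,
  id: string,
  quantity: number,
): Promise<unknown> {
  const { url, init } = buildAddToCartRequest(id, quantity);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`Cart request failed with status ${response.status}`);
  }
  return response.json();
}

const request = buildAddToCartRequest("abc123", 2);
// request.init.credentials === "same-origin"
```

If the user's session has expired, the server's usual 401 response surfaces as a thrown error, so the tool fails the same way the site's own UI would.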
2. User-Scoped by Default
Client-side APIs in multi-tenant apps are already scoped to the current user. There’s no risk of data leakage across tenants because the tools run with the same permissions as the user’s browser session.
3. Human Visibility
The browser serves as both:
- UI for the human - See exactly what’s happening
- UI for the LLM - Structured tools and clear context
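One way to keep the human informed is to wrap each tool handler so that every call and its outcome are recorded for display. This is a minimal sketch, assuming a hypothetical `withActivityLog` helper and an in-memory log that a page could render; WebMCP itself may surface actions differently.

```typescript
interface ActivityEntry {
  tool: string;
  args: string; // JSON-encoded arguments, ready to show to the user
  outcome: "ok" | "error";
  detail: string;
}

type Handler<A, R> = (args: A) => R;

// Wraps a handler so each invocation appends an entry the page can render.
function withActivityLog<A, R>(
  tool: string,
  handler: Handler<A, R>,
  log: ActivityEntry[],
): Handler<A, R> {
  return (args: A): R => {
    try {
      const result = handler(args);
      log.push({
        tool,
        args: JSON.stringify(args),
        outcome: "ok",
        detail: JSON.stringify(result),
      });
      return result;
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      log.push({ tool, args: JSON.stringify(args), outcome: "error", detail });
      throw err; // the caller still sees the original failure
    }
  };
}

const activity: ActivityEntry[] = [];
const setQuantity = withActivityLog(
  "setQuantity", // hypothetical tool name
  (args: { id: string; quantity: number }) => {
    if (args.quantity < 0) throw new Error("Quantity cannot be negative");
    return { id: args.id, quantity: args.quantity };
  },
  activity,
);

setQuantity({ id: "abc123", quantity: 3 });
try {
  setQuantity({ id: "abc123", quantity: -1 });
} catch {
  // The failure is already recorded in the activity log.
}
// activity now holds one "ok" entry followed by one "error" entry.
```

Rendering `activity` in a side panel would give the user the same structured record of each action that the model works from.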
4. Damage Limitation
Websites only expose tools they’d already expose as buttons or forms. If a website wants to expose a “delete all user data” tool, that’s their choice - no different than putting a big red delete button on the page.
WebMCP also limits the damage a tool can do by:
- Running in the user’s browser context only
- Respecting same-origin policy
- Requiring explicit tool registration
- Making all actions visible to the user
5. Zero Backend Changes
Add AI capabilities to your website without:
- Deploying new backend services
- Implementing OAuth flows
- Managing API credentials
- Setting up multi-tenant isolation
An Admission, Not a Prediction
WebMCP is an admission that AGI is not happening tomorrow. If we’re serious about automating parts of white-collar work, we need to build infrastructure for the models we have, not the models we wish we had.
Building for today’s models means accepting that:
- Current models work best with text and function calls
- Humans need to be in the loop for important work
- Determinism and reliability matter more than autonomy
- Web APIs are more robust than pixel parsing
WebMCP vs Alternatives: Quick Reference
vs. Computer Use / Anthropic API
When to use Computer Use:
- Apps without APIs or structured interfaces
- One-off automation tasks
- Exploring unfamiliar interfaces
When to use WebMCP:
- Websites you control or want to enhance
- Repeated, reliable operations
- When users need visibility into actions
- Performance-critical operations
vs. Playwright MCP / Selenium
When to use browser automation:
- Testing your website
- Scraping data from sites you don’t control
- Temporary automation scripts
When to use WebMCP:
- Building features into your website
- Long-term, maintainable integrations
- When you control the website
- When determinism matters
vs. Remote MCP Servers
When to use Remote MCP:
- Server-to-server communication
- Backend data access without UI
- Future fully-autonomous agents
- Multi-step cloud workflows
When to use WebMCP:
- Human-in-the-loop workflows
- Browser-based user actions
- Leveraging existing web sessions
- When users need to see results
vs. BrowserMCP
BrowserMCP is another browser automation approach similar to Playwright MCP.
WebMCP difference:
- Direct function calls vs. DOM manipulation
- Deterministic vs. heuristic
- Website-defined tools vs. autonomous navigation
- Fast execution vs. multi-step verification
