The browser interface AI agents deserve.
AWI is an MCP server that gives AI agents a compact, semantic view of any web page — instead of 40,000 bytes of DOM they can't reason over.
- Stable element IDs
- Token-efficient snapshots
- No raw DOM noise
- Open-source
- MIT-licensed
- Works with Claude, GPT-4o, Gemini, Ollama
- Requires Node.js + Chrome
The Problem
Browser automation is easy for scripts.
Hard for agents.
Current tools dump thousands of tokens of noise into the agent's context. Tasks fail mid-flow. Selectors break on the next deploy. And there's no stable way to reference an element across calls.
Live Example
See what your agent sees.
Every MCP tool call returns a structured semantic snapshot — not raw DOM.
The agent references btn-checkout by eid across every call that follows — no selector hunting, no re-finding elements after navigation.
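As a rough illustration of the agent's side of this, the sketch below parses a made-up snapshot and looks up btn-checkout by its eid. The XML shape here (tag names, the eid attribute, the lnk-continue link) is an assumption for illustration only, not AWI's documented output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot: tag and attribute names are illustrative assumptions,
# not AWI's actual schema.
SNAPSHOT = """
<page title="Cart">
  <region name="main">
    <heading level="1">Your cart</heading>
    <link eid="lnk-continue" href="/shop">Continue shopping</link>
    <button eid="btn-checkout">Checkout</button>
  </region>
</page>
""".strip()


def find_by_eid(snapshot_xml: str, eid: str):
    """Return the element carrying the given eid, or None if it is absent."""
    root = ET.fromstring(snapshot_xml)
    for element in root.iter():
        if element.get("eid") == eid:
            return element
    return None


button = find_by_eid(SNAPSHOT, "btn-checkout")
print(button.tag, button.text)
```

The lookup uses only the eid, so nothing in it depends on CSS classes or on the element's position in the page.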
How It Works
Five steps. One install command.
Agent calls a browser tool via MCP
navigate, click, find, type, screenshot…
AWI receives the call over MCP
A local server launched with npx — no daemon to manage
Puppeteer drives Chrome
Local Chrome via CDP — real rendering, full JS
Page reduced to semantic XML
Headings, buttons, links, forms — no raw markup
Agent receives stable eids
Reference the same button across 10 tool calls. No re-finding elements.
AWI runs locally and needs Node.js and a Chrome install. For serverless or shared deployments, see AWI Cloud.
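The first step above can be sketched on the wire. MCP tool invocations are JSON-RPC 2.0 requests using the tools/call method; the tool names navigate and click come from the list above, while the argument keys (url, eid) are assumptions about what AWI's tools might accept.

```python
import json
from itertools import count

# Monotonic JSON-RPC request ids for this session.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Argument keys below are hypothetical; check the tool schemas the server
# actually advertises before relying on them.
first = tool_call("navigate", {"url": "https://example.com/cart"})
second = tool_call("click", {"eid": "btn-checkout"})
print(first)
print(second)
```

In practice an MCP client library builds these messages for you; the sketch only shows that the second call can name the element by eid alone.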
Features
What agents get that Playwright doesn't provide.
Semantic Snapshots
Regions, headings, links, buttons — not 40,000 bytes of DOM. Agents reason over structure, not markup.
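To make the idea concrete, here is a minimal sketch of reducing markup to structure using only Python's standard html.parser. It is not AWI's reducer: the set of kept tags and the sample markup are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Tags treated as semantically interesting in this sketch (an assumption).
KEEP = {"h1", "h2", "h3", "a", "button"}


class SemanticReducer(HTMLParser):
    """Collect (tag, text) pairs for kept tags; drop wrappers and classes."""

    def __init__(self):
        super().__init__()
        self.items = []     # reduced output, in document order
        self._open = None   # kept tag currently being read, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in KEEP:
            self._open, self._text = tag, []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open:
            text = " ".join("".join(self._text).split())
            self.items.append((tag, text))
            self._open = None


RAW = (
    '<div class="x9f2 kq"><h1 class="t">Your cart</h1>'
    '<div><span>promo</span></div>'
    '<button class="btn btn-p3">Checkout</button>'
    '<a href="/shop" class="l">Continue shopping</a></div>'
)

reducer = SemanticReducer()
reducer.feed(RAW)
print(reducer.items)
```

The wrapper divs, the span, and every class name disappear; what remains is the heading, the button, and the link an agent would act on.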
Stable Element IDs
Every interactive element gets a stable eid. Reference it across 10 tool calls: CSS classes change on the next deploy, but the eid doesn't.
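How AWI computes eids is not described here. As one possible scheme, purely for illustration, an ID can be derived from an element's role and visible name so that styling changes never affect it. The make_eid function and the record fields below are hypothetical.

```python
import hashlib


def make_eid(role: str, name: str) -> str:
    """Derive an ID from role and visible name only; CSS never enters the hash."""
    normalized = " ".join(name.lower().split())
    digest = hashlib.sha256(f"{role}:{normalized}".encode("utf-8")).hexdigest()
    return f"{role}-{digest[:8]}"


# The same button before and after a deploy that renamed its CSS classes.
before_deploy = {"role": "button", "name": "Checkout", "css_class": "btn btn-p3"}
after_deploy = {"role": "button", "name": "  checkout ", "css_class": "Button_root__a81x"}

eid_before = make_eid(before_deploy["role"], before_deploy["name"])
eid_after = make_eid(after_deploy["role"], after_deploy["name"])
print(eid_before == eid_after)
```

A scheme like this trades one fragility for another: it survives class renames but changes if the label text changes, and it needs a tiebreaker when two elements share a role and name.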
Token-Efficient
Snapshots return only the structure an agent needs to act — not the full page on every call. Longer task horizons. Lower token spend.
Network Inspection
See every request that followed an action — verify form submissions, trace auth flows, and debug redirects without a devtools tab.
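A sketch of how an agent might use such a request log to verify a form submission. The Request record, its fields, and the sample log are assumptions; the actual shape of AWI's network data may differ.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one observed network request."""
    method: str
    url: str
    status: int
    started_ms: int


def requests_after(log, action_ms):
    """Keep only requests that started at or after the action timestamp."""
    return [r for r in log if r.started_ms >= action_ms]


def form_submitted(log, action_ms, path):
    """True if a non-error POST to `path` followed the action."""
    return any(
        r.method == "POST" and r.url.endswith(path) and 200 <= r.status < 400
        for r in requests_after(log, action_ms)
    )


LOG = [
    Request("GET", "https://example.com/cart", 200, 1_000),
    Request("POST", "https://example.com/api/checkout", 302, 2_500),
    Request("GET", "https://example.com/thanks", 200, 2_700),
]

# The click on the checkout button happened at t = 2000 ms.
print(form_submitted(LOG, 2_000, "/api/checkout"))
```

Filtering by timestamp first keeps requests from earlier page loads out of the check.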
Canvas Inspection
When the page is a canvas, chart, or image, capture a screenshot or read canvas data directly. Semantic snapshots where they work; pixels where they don't.
Model-Agnostic
Works with any MCP-compatible agent runtime: Claude, GPT-4o, Gemini, local models. No vendor lock-in.
What People Build
Agents that act on the live web.
Web Research Agents
Navigate, read, and extract from live pages — pricing monitors, news aggregators, and competitor trackers that work on real rendered content, not stale APIs.
QA & Flow Automation
Walk through sign-up, checkout, and onboarding flows, fill forms, and verify responses — without the fragile CSS selectors that break Playwright suites.
Data Entry & Form Filling
Log in, find fields by their label, and submit — even on pages where the DOM shifts between visits and class names change on every deploy.
Managed Cloud
Don't want to manage Chrome?
AWI Cloud runs headless Chrome for you. Connect via API key in seconds, share sessions across your team, and pay only for the browser time you use.
- No local Chrome or Node.js to install — just an API key
- Share browser sessions and credentials across your team
- Pay by the minute — no seat fees, no commitments
- Start free, upgrade when you scale
Get Started
One command. Agents browsing.
Open-source · MIT-licensed · Works with Claude, GPT-4o, Gemini, and any MCP-compatible agent.