
Playwright MCP

A working reference for Microsoft's Playwright MCP server (@playwright/mcp). Covers installation across clients (Claude Code, Cursor, Claude Desktop, Copilot), the core tool surface, capability flags, CLI+SKILLs as an alternative for coding agents, and the patterns most teams reach for in production. As of May 2026, the latest version is v0.0.70 (released April 2026). The syntax shown here should be stable, but check the official docs at playwright.dev/mcp for breaking changes.

Installation & first run

Install with npx

Run the MCP server on demand without global install.

npx @playwright/mcp@latest

First run downloads the package and a browser; subsequent runs are fast.

Claude Code setup

Add the MCP server to Claude Code's per-project config.

claude mcp add playwright npx @playwright/mcp@latest

Settings persist to ~/.claude.json under the current project. Use --scope user to make the server available across all your projects.

Claude Desktop setup

Edit claude_desktop_config.json to register the server.

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

File path is ~/Library/Application Support/Claude/claude_desktop_config.json on macOS; restart Claude Desktop for changes to take effect.
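
If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config has already been loaded into a dict; the function name and the "filesystem" entry are hypothetical, and only the playwright entry comes from the snippet above.

```python
import json


def add_playwright_server(config: dict) -> dict:
    """Return a copy of the config with a 'playwright' entry under mcpServers."""
    merged = dict(config)
    # Copy the nested dict so the caller's config is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["playwright"] = {"command": "npx", "args": ["@playwright/mcp@latest"]}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already registered.
existing = {"mcpServers": {"filesystem": {"command": "npx", "args": ["some-other-server"]}}}
updated = add_playwright_server(existing)
print(json.dumps(updated, indent=2))
```

Writing the result back with json.dump, then restarting the client, completes the change.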

Cursor setup

Add via Cursor Settings → MCP → Add new MCP Server.

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Cursor exposes the config UI but the underlying file is the same shape as Claude Desktop's.

Verify installation

Confirm the server loaded and view available tools.

/mcp

Run this slash command inside your MCP client (Claude Code, Cursor, etc.). It lists the connected servers and their exposed tools.

Core browser tools

These are always available — the core capability cannot be disabled.

browser_navigate

Navigate to a URL.

{
  "tool": "browser_navigate",
  "arguments": { "url": "https://example.com" }
}

browser_navigate_back

Go back in browser history.

{ "tool": "browser_navigate_back" }

browser_snapshot

Capture the page's accessibility tree as a structured snapshot. Preferred over screenshots because it's token-efficient and deterministic.

{ "tool": "browser_snapshot" }

Returns ARIA structure (roles, names, refs) plus the URL and title. Each visible element gets a ref you can use in subsequent browser_click / browser_type calls.
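
To make the ref lookup concrete, here is a rough sketch of indexing refs by role and accessible name. The sample snapshot text and its line format are assumptions made for illustration, not the server's guaranteed output, and index_refs is a hypothetical helper.

```python
import re

# ASSUMPTION: illustrative snapshot text; the real output format may differ.
SAMPLE_SNAPSHOT = """\
- heading "Sign in" [level=1] [ref=e2]
- textbox "Email" [ref=e12]
- button "Login" [ref=e5]
"""

# Captures the role, the quoted accessible name, and the ref on one line.
LINE = re.compile(r'-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)".*\[ref=(?P<ref>\w+)\]')


def index_refs(snapshot: str) -> dict:
    """Map (role, accessible name) to the ref string found on that line."""
    refs = {}
    for line in snapshot.splitlines():
        match = LINE.search(line)
        if match:
            refs[(match["role"], match["name"])] = match["ref"]
    return refs


refs = index_refs(SAMPLE_SNAPSHOT)
print(refs[("button", "Login")])
```

Refs are only meaningful for the snapshot they came from, so an index like this should be rebuilt after each new snapshot.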

browser_click

Click an element by its accessibility ref or selector.

{
  "tool": "browser_click",
  "arguments": { "element": "Login button", "target": "e5" }
}

Pass target as the ref from the latest snapshot and element as a human-readable description. The description is what the agent shows the user when asking permission to act.

browser_type

Type text into an input identified by ref.

{
  "tool": "browser_type",
  "arguments": { "element": "Email input", "target": "e12", "text": "user@example.com" }
}

Set submit: true to press Enter after typing, or slowly: true to type one character at a time so the page receives individual key events (for inputs that react to each keystroke rather than to the final value).
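
The navigate, snapshot, and type calls above usually run as a sequence. A small sketch of composing that sequence in the payload shape used on this page; tool_call is a hypothetical helper, and the refs and password value are placeholders rather than real values.

```python
import json


def tool_call(tool: str, **arguments) -> dict:
    """Build a payload in the {"tool": ..., "arguments": ...} shape used on this page."""
    call = {"tool": tool}
    if arguments:
        call["arguments"] = arguments
    return call


# Refs e12 and e13 are placeholders; real refs come from the latest snapshot.
login_flow = [
    tool_call("browser_navigate", url="https://example.com/login"),
    tool_call("browser_snapshot"),
    tool_call("browser_type", element="Email input", target="e12", text="user@example.com"),
    tool_call("browser_type", element="Password input", target="e13",
              text="placeholder-password", submit=True),
]

for call in login_flow:
    print(json.dumps(call))
```

Putting submit on the last field saves a separate browser_press_key call for Enter.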

browser_press_key

Send a keyboard key by name.

{
  "tool": "browser_press_key",
  "arguments": { "key": "Escape" }
}

Use physical key names (Escape, Enter, ArrowDown), not characters.

browser_wait_for

Wait for text to appear, disappear, or a duration to pass.

{
  "tool": "browser_wait_for",
  "arguments": { "text": "Order confirmed" }
}

text waits for visible text; textGone waits for it to disappear; time is a fixed delay in seconds, not a timeout for the text conditions. Use time sparingly and prefer text signals.

browser_take_screenshot

Capture a screenshot — full page, viewport, or a single element.

{
  "tool": "browser_take_screenshot",
  "arguments": { "filename": "checkout-confirmed.png", "fullPage": true }
}

With --output-dir set on the server, the file is saved to disk and the response includes the path. Without it, the screenshot returns as base64 in the response.

browser_evaluate

Run a JavaScript function in the page context.

{
  "tool": "browser_evaluate",
  "arguments": { "function": "() => document.title" }
}

The function must be a string of valid JavaScript — typically an arrow function. The return value is serialised back to the agent.
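
Because the function travels as a string, any value embedded in it needs escaping. A minimal sketch, assuming the agent harness is written in Python: json.dumps yields a valid JavaScript string literal, which avoids broken quoting. The helper name and the selector are hypothetical.

```python
import json


def text_content_call(css_selector: str) -> dict:
    """Build a browser_evaluate payload that reads one element's text content."""
    # json.dumps produces a valid JavaScript string literal, including escapes.
    selector_literal = json.dumps(css_selector)
    function = f"() => document.querySelector({selector_literal})?.textContent ?? null"
    return {"tool": "browser_evaluate", "arguments": {"function": function}}


call = text_content_call('h1[data-test="title"]')
print(call["arguments"]["function"])
```

Returning null when nothing matches keeps the result serialisable instead of raising inside the page.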

browser_close

Close the browser and end the session.

{ "tool": "browser_close" }

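Whether state survives a close depends on the profile mode: a persistent profile keeps logins on disk, while an isolated session (--isolated) discards them. To carry session state across isolated sessions, use the storage capability and call browser_storage_state before closing.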

Capabilities

The MCP server's tool surface scopes to enabled capabilities. By default, only core is enabled — opt in to more via --caps= to expose additional tools.

--caps overview

Capabilities scope which tools the LLM sees. Fewer capabilities = fewer tools = clearer agent reasoning.

npx @playwright/mcp@latest --caps=storage,testing

Available capability names: core (always on), network, storage, testing, vision, pdf, devtools, config.
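
A launcher script can catch a misspelled capability before the server starts. A sketch that validates names against the list above and assembles the npx command; server_command is a hypothetical helper, and silently dropping "core" is a design choice of this sketch, not server behaviour.

```python
# Optional capability names as listed on this page; "core" is always on.
OPTIONAL_CAPS = ("network", "storage", "testing", "vision", "pdf", "devtools", "config")


def server_command(caps=()) -> list:
    """Return the argv list for launching the server with the given capabilities."""
    requested = [cap for cap in caps if cap != "core"]
    unknown = [cap for cap in requested if cap not in OPTIONAL_CAPS]
    if unknown:
        raise ValueError(f"unknown capabilities: {', '.join(unknown)}")
    command = ["npx", "@playwright/mcp@latest"]
    if requested:
        command.append("--caps=" + ",".join(requested))
    return command


print(" ".join(server_command(["storage", "testing"])))
```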

--caps=network

Adds network mocking and state control. Useful for offline testing, request interception, and fixture seeding.

Tools added: browser_route, browser_route_list, browser_unroute, browser_network_state_set.

--caps=storage

Cookie, localStorage, sessionStorage management. Essential for authenticated session reuse.

npx @playwright/mcp@latest --caps=storage

Adds storage tools for cookies, localStorage, sessionStorage, plus browser_storage_state (save) and browser_set_storage_state (restore). The save/restore pair is the workhorse for session reuse — see Common patterns.

--caps=testing

Adds assertion tools and a locator-generation tool.

Tools added: browser_verify_element_visible, browser_verify_text_visible, browser_verify_value, browser_verify_list_visible, browser_generate_locator. Use these to add assertion-style checks inside agent runs and to scaffold locator code for converting agent flows into standard Playwright tests.

--caps=vision

Coordinate-based mouse tools for screenshot-driven workflows. Requires a vision-capable LLM.

Tools added: browser_mouse_click_xy, browser_mouse_move_xy, browser_mouse_drag_xy, plus low-level browser_mouse_down, browser_mouse_up, browser_mouse_wheel. Coordinates are absolute pixels. Useful for canvas elements, image-rendered UIs, and anti-bot defences that obscure the DOM.

--caps=pdf

PDF generation from the current page.

Tools added: browser_pdf_save.

--caps=devtools

Tracing, video recording, and test debugging tools.

Tools added: browser_start_tracing / browser_stop_tracing, browser_start_video / browser_stop_video / browser_video_chapter, plus interactive helpers browser_highlight, browser_pick_locator, and browser_resume for step-by-step debugging. Use traces to diagnose why an agent run failed — they capture every tool call and the resulting page state.

--caps=config

Adds tools for inspecting the merged runtime configuration.

Tools added: browser_get_config.

Useful when an agent's behaviour seems wrong and you need to verify what config flags actually took effect after merging CLI args, env vars, and config files.

CLI flags & environment variables

--headless

Run without a visible browser window.

npx @playwright/mcp@latest --headless

--browser

Choose Chromium, Firefox, or WebKit.

npx @playwright/mcp@latest --browser firefox

--viewport-size and --device

Set the browser viewport. --device uses one of 143+ Playwright device presets (sets user-agent, touch, scale factor).

npx @playwright/mcp@latest --viewport-size 1280x720
npx @playwright/mcp@latest --device "iPhone 13"

--output-dir

Where the server saves screenshots, traces, and other artefacts.

npx @playwright/mcp@latest --output-dir ./test-output/

--ignore-https-errors

Bypass invalid certificate warnings.

npx @playwright/mcp@latest --ignore-https-errors

For local dev with self-signed certs only — don't ship this to a CI config that hits external sites.

Environment variables

Configuration without command-line flags.

PLAYWRIGHT_MCP_CAPS=storage,testing
PLAYWRIGHT_MCP_ALLOWED_ORIGINS="example.com;api.example.com"
PLAYWRIGHT_MCP_BLOCKED_ORIGINS="ads.example.com"
PLAYWRIGHT_MCP_ALLOW_UNRESTRICTED_FILE_ACCESS=1

ALLOWED_ORIGINS and BLOCKED_ORIGINS are convenience filters, not security boundaries. Don't rely on them for trust.
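
When a harness spawns the server itself, these variables can be assembled in code. A sketch assuming comma-separated capabilities and semicolon-separated origins, as in the examples above; mcp_env and its parameters are hypothetical names.

```python
import os


def mcp_env(caps=(), allowed_origins=(), blocked_origins=(), base=None) -> dict:
    """Return an environment dict with the Playwright MCP variables filled in."""
    env = dict(os.environ if base is None else base)
    if caps:
        env["PLAYWRIGHT_MCP_CAPS"] = ",".join(caps)
    if allowed_origins:
        env["PLAYWRIGHT_MCP_ALLOWED_ORIGINS"] = ";".join(allowed_origins)
    if blocked_origins:
        env["PLAYWRIGHT_MCP_BLOCKED_ORIGINS"] = ";".join(blocked_origins)
    return env


env = mcp_env(
    caps=["storage", "testing"],
    allowed_origins=["example.com", "api.example.com"],
    blocked_origins=["ads.example.com"],
    base={},  # start from an empty environment so the result is easy to inspect
)
print(env)
```

The resulting dict can be passed as env= to subprocess.Popen when launching npx.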

CLI+SKILLs alternative

When to use CLI instead of MCP

For coding agents (Claude Code, Cursor, Copilot CLI) that need browser automation as one tool among many, CLI is roughly 4× more token-efficient than MCP per task.

MCP is right when the agent is iteratively reasoning about page state and needs full snapshot context. CLI is right when the agent is executing pre-planned steps with browser as a side tool. See MCP vs CLI+SKILLs: when each pattern wins for the full decision framework.

Install the CLI

Microsoft's @playwright/cli package, distributed separately from MCP.

npm install -g @playwright/cli@latest
playwright-cli --help
playwright-cli install --skills

The install --skills step installs reference guides that Claude Code, GitHub Copilot, and similar coding agents will read automatically. Without it, point your agent at playwright-cli --help and it'll figure things out from there.

Common CLI commands

Same Playwright capabilities, concise commands instead of MCP tool calls.

playwright-cli open https://example.com
playwright-cli snapshot
playwright-cli click e15
playwright-cli fill e12 "user@example.com"
playwright-cli fill e12 "user@example.com" --submit
playwright-cli press Enter
playwright-cli screenshot

The CLI uses fill (not type) for setting input values — same naming convention as Playwright itself. --submit on fill presses Enter after typing. Refs (e15, e12) come from a prior snapshot call, same as MCP. The CLI saves snapshots and screenshots to disk by default, returning a file path — the agent reads them on demand instead of streaming them into context.
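
A script that drives the CLI can build each command as an argument list and hand it to a process runner. A sketch using only the commands and the --submit flag shown above; cli_args is a hypothetical helper, and nothing is executed here.

```python
import shlex


def cli_args(command: str, *operands: str, submit: bool = False) -> list:
    """Return the argv list for one playwright-cli invocation."""
    args = ["playwright-cli", command, *operands]
    if submit:
        # On this page, --submit is only documented for fill.
        if command != "fill":
            raise ValueError("--submit is only used with fill")
        args.append("--submit")
    return args


steps = [
    cli_args("open", "https://example.com"),
    cli_args("snapshot"),
    cli_args("fill", "e12", "user@example.com", submit=True),
]

for step in steps:
    print(shlex.join(step))
```

Passing each list to subprocess.run avoids shell quoting problems when the typed text contains spaces or quotes.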

Common patterns

Persistent authenticated sessions

Save storage state once, reuse across runs to skip login flows.

  1. Start with --caps=storage
  2. Agent logs in via the UI
  3. Agent calls browser_storage_state with filename "auth.json"
  4. Future runs: browser_set_storage_state with the same filename

Storage state is account-scoped — keep separate files per test user.
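
The steps above can be sketched as a small planner that picks a per-user file and decides whether to restore or save. The helper names and the file naming scheme are hypothetical; the filename argument follows the steps above, and the exact tool schema should be checked against the server.

```python
import re


def auth_file_for(user: str) -> str:
    """Derive a per-user storage state filename, e.g. auth-admin-example-com.json."""
    slug = re.sub(r"[^a-z0-9]+", "-", user.lower()).strip("-")
    return f"auth-{slug}.json"


def session_calls(user: str, have_saved_state: bool) -> list:
    """Return the storage tool calls for this run."""
    filename = auth_file_for(user)
    if have_saved_state:
        # Later runs: restore the saved state and skip the login flow.
        return [{"tool": "browser_set_storage_state", "arguments": {"filename": filename}}]
    # First run: the UI login steps run before this call, then state is saved.
    return [{"tool": "browser_storage_state", "arguments": {"filename": filename}}]


print(session_calls("Admin@Example.com", have_saved_state=False))
```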

Capability scoping for token cost

The fewer tools the LLM sees, the cleaner its reasoning and the lower the per-task cost.

npx @playwright/mcp@latest --caps=storage

A reasonable production default is --caps=storage only. Add testing if you're scaffolding test code; add devtools for failure diagnosis. Avoid leaving everything enabled in CI — it costs tokens and confuses the agent.

Screenshot diff workflow

Use the agent for visual regression by comparing screenshots across runs.

  1. browser_resize to a fixed viewport (consistent baseline)
  2. browser_take_screenshot to capture current state
  3. Compare against baseline (pixelmatch, ImageMagick, or a service)
  4. If diff exceeds threshold, the agent flags or auto-fixes

Works well for incremental design changes; less reliable for comparing across browsers or significantly different content.
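
The threshold check in step 4 can be illustrated without an imaging library. This sketch assumes both screenshots are already decoded into equal-length lists of RGB tuples; in practice pixelmatch or ImageMagick would do the decoding and comparison, and the default threshold and tolerance here are arbitrary.

```python
def diff_ratio(baseline, current, tolerance=0) -> float:
    """Fraction of pixels whose largest per-channel difference exceeds tolerance."""
    if len(baseline) != len(current):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    changed = sum(
        1
        for a, b in zip(baseline, current)
        if max(abs(x - y) for x, y in zip(a, b)) > tolerance
    )
    return changed / len(baseline)


def exceeds_threshold(baseline, current, threshold=0.01, tolerance=8) -> bool:
    """True when more than `threshold` of the pixels changed noticeably."""
    return diff_ratio(baseline, current, tolerance) > threshold


# Tiny 2x2 "images" as flat lists of RGB tuples.
baseline = [(255, 255, 255)] * 4
current = [(255, 255, 255), (255, 255, 255), (250, 255, 255), (0, 0, 0)]
print(diff_ratio(baseline, current), diff_ratio(baseline, current, tolerance=8))
```

A small per-channel tolerance absorbs anti-aliasing noise, which is one reason to fix the viewport before capturing.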

Vision mode for canvas and anti-bot pages

When the DOM is unreliable (canvas-rendered UIs, image-heavy pages, anti-bot defences that obscure structure), switch to vision mode and its coordinate-based clicks.

npx @playwright/mcp@latest --caps=vision

The agent then uses browser_mouse_click_xy with coordinates taken from the screenshot. This is slower, more expensive, and less reliable than DOM-driven interaction on common tasks, so use it as a fallback, not a primary mode.
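
A rough sketch of turning a bounding box read off a screenshot into a click at its centre. The argument names element, x, and y are assumptions about the tool schema, and click_center is a hypothetical helper; verify the schema against the server before relying on it.

```python
def click_center(description: str, left: int, top: int, width: int, height: int) -> dict:
    """Build a coordinate click aimed at the centre of a bounding box (absolute pixels)."""
    return {
        "tool": "browser_mouse_click_xy",
        "arguments": {
            "element": description,  # ASSUMPTION: description field, as in browser_click
            "x": left + width // 2,  # ASSUMPTION: argument names x and y
            "y": top + height // 2,
        },
    }


print(click_center("Canvas play button", left=100, top=200, width=80, height=40))
```

Aiming at the centre leaves the most margin for error when the model's box estimate is slightly off.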