Kapture is a Chrome DevTools Extension that enables browser automation through the Model Context Protocol (MCP). It allows AI applications like Claude to control web browsers via a three-layer architecture.
✨ Key Feature: Multiple AI clients can connect to the same server! Claude Desktop, Cline, and other MCP clients can all control browser tabs through a single server instance.
Kapture bridges AI assistants with web browsers through:
- MCP Server: Handles MCP protocol communication
- Chrome Extension: DevTools panel for browser automation
- WebSocket Bridge: Real-time communication between server and extensions
- Multi-Client Support: Multiple AI clients can connect simultaneously via WebSocket
# Server
cd server
npm install
npm run build
# Test App (optional)
cd test-app
npm install
- Visit the Kapture Chrome Web Store page
- Click "Add to Chrome"
- Confirm the installation
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked"
- Select the extension folder
Configure your AI client and open it. It will start the MCP server automatically.
OR
Run the Test App:
cd test-app
npm run dev
The server starts on port 61822.
# Server
cd server
npm start
# Test App
cd test-app
npm start
- Open any website in Chrome
- Open Chrome/Brave Developer Tools (F12 on Windows/Linux, Cmd+Option+I on macOS)
- Navigate to "Kapture" panel
- The extension will automatically connect to the server on port 61822
- If more than one server is running, select one from the dropdown to connect to it
Add to your Claude Desktop config:
This single command starts the server and handles stdio-to-WebSocket translation:
{
"mcpServers": {
"kapture": {
"command": "npx",
"args": ["-y", "kapture-mcp", "bridge"]
}
}
}
For advanced use cases where you need manual server control:
- Start the server manually:
npx kapture-mcp
- Configure Claude Desktop to connect via WebSocket:
{
"mcpServers": {
"kapture": {
"transport": "websocket",
"url": "ws://localhost:61822/mcp"
}
}
}
Note: This approach requires manually managing the server lifecycle. Use Option 1 (bridge command) for most use cases.
Kapture supports multiple MCP clients connecting to the same server! You can run Claude Desktop, Cline, and other MCP clients simultaneously through a single server instance.
- All MCP clients connect via WebSocket to ws://localhost:61822/mcp
- All MCP clients share access to the same browser tabs
- Notifications are broadcast to all connected clients
When running npx kapture-mcp, the command automatically detects whether a server is already running:
- No existing server: Starts a new server on port 61822
- Server already running: Shows connection information and exits gracefully
This prevents errors and confusion when multiple clients try to start servers.
Each client should use the same bridge command configuration:
Claude Desktop:
{
"mcpServers": {
"kapture": {
"command": "npx",
"args": ["-y", "kapture-mcp", "bridge"]
}
}
}
Cline/VS Code:
{
"cline.mcpServers": {
"kapture": {
"command": "npx",
"args": ["-y", "kapture-mcp", "bridge"]
}
}
}
Other MCP Clients:
Use the same configuration pattern with "command": "npx" and "args": ["-y", "kapture-mcp", "bridge"].
See the complete multi-assistant guide →
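The Claude Desktop and Cline entries above differ only in their top-level key, so they can be generated from one template. The following is a minimal sketch, not part of Kapture itself: `buildBridgeConfig`, `CONFIG_KEYS`, and the client labels are hypothetical names, and only the two top-level keys shown above are assumed to exist.

```typescript
// Hypothetical helper: reproduces the bridge configuration shown above.
type BridgeEntry = { command: string; args: string[] };
type ClientConfig = Record<string, { kapture: BridgeEntry }>;

// Top-level keys copied from the examples above; other clients may use a different key.
const CONFIG_KEYS = {
  claudeDesktop: "mcpServers",
  cline: "cline.mcpServers",
} as const;

type KnownClient = keyof typeof CONFIG_KEYS;

function buildBridgeConfig(client: KnownClient): ClientConfig {
  return {
    [CONFIG_KEYS[client]]: {
      kapture: { command: "npx", args: ["-y", "kapture-mcp", "bridge"] },
    },
  };
}

// Pretty-printed JSON ready to paste into the Cline/VS Code settings.
const clineConfig = JSON.stringify(buildBridgeConfig("cline"), null, 2);
```

Keeping the `kapture` entry identical across clients is what lets every client reach the same server instance.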
- Parallel Workflows: Have Claude Desktop research while Cline develops code
- Specialized Tasks: Use different AI clients for different types of automation
- Team Collaboration: Multiple team members can use their preferred AI tools simultaneously
- Testing & Development: Test automation scripts with one AI while developing with another
Then ask Claude to interact with web pages:
- "Navigate to example.com and take a screenshot"
- "Click the search button"
- "Fill in the email field with [email protected]"
- navigate - Navigate to URL
- back - Browser back button
- forward - Browser forward button
- reload - Reload the current page (similar to pressing F5)
- click - Click elements (uses first matching element, returns unique selector)
- hover - Hover over elements (uses first matching element, returns unique selector)
- fill - Fill input fields (uses first matching element, returns unique selector)
- select - Select dropdown options (HTML <select> only, uses first matching element, returns unique selector)
- keypress - Send keyboard events to the page or specific elements (supports modifier keys)
- elements - Query all elements matching a CSS selector or XPath with optional visibility filtering
Note on Selectors: Tools that accept a selector parameter (click, hover, fill, select, keypress, screenshot, dom) will only operate on the first element that matches the CSS selector. The tool response includes the unique selector of the actual element that was used, which may include an auto-generated ID if the element didn't have one.
XPath Support: All tools that accept a selector parameter also accept an xpath parameter as an alternative. This is particularly useful for:
- Finding elements by text content: xpath: "//button[contains(text(), 'Submit')]"
- Complex element relationships: xpath: "//div[@class='container']//span[2]"
- When CSS selectors are insufficient
Use either selector OR xpath, not both. If both are provided, selector takes precedence.
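The precedence rule above can be sketched as a small resolver. This is an illustration under stated assumptions, not Kapture's actual code: `resolveTarget` and its types are hypothetical names, and treating an empty string as "not provided" is an assumption.

```typescript
// Hypothetical parameter shape: tools accept selector OR xpath.
type TargetParams = { selector?: string; xpath?: string };

type ResolvedTarget =
  | { kind: "selector"; value: string }
  | { kind: "xpath"; value: string }
  | { kind: "none" };

function resolveTarget(params: TargetParams): ResolvedTarget {
  // selector is checked first, so it wins when both are supplied.
  if (params.selector !== undefined && params.selector !== "") {
    return { kind: "selector", value: params.selector };
  }
  if (params.xpath !== undefined && params.xpath !== "") {
    return { kind: "xpath", value: params.xpath };
  }
  return { kind: "none" };
}

// Both given: the CSS selector is used and the XPath is ignored.
const target = resolveTarget({
  selector: "#submit",
  xpath: "//button[contains(text(), 'Submit')]",
});
// target is { kind: "selector", value: "#submit" }
```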
The keypress tool simulates keyboard events. It accepts:
- key (required): The key combination to press. Can be:
  - Single key: "a", "Enter", "Tab", "Escape", " " (space), "Shift", "Control"
  - With modifiers: "Control+a", "Shift+Tab", "Alt+F4", "Meta+Shift+p"
  - Modifier names: Control (or Ctrl), Shift, Alt, Meta (or Cmd/Command)
  - Note: When sending just a modifier key (e.g., "Shift"), it's treated as pressing that key alone
  - Duplicate modifiers are ignored (e.g., "Shift+Shift+a" is the same as "Shift+a")
- selector or xpath (optional): Target a specific element. If not provided, the event is sent to document.body
Examples:
// Press Enter
{ "key": "Enter", "selector": "#login-form" }
// Select all text (Ctrl+A)
{ "key": "Control+a", "selector": "#username" }
// Zoom in (Ctrl+Plus)
{ "key": "Control++", "selector": "body" }
// Zoom out (Ctrl+Minus)
{ "key": "Control+-", "selector": "body" }
// New tab (Ctrl+T)
{ "key": "Control+t" }
// Close tab (Ctrl+W or Cmd+W on Mac)
{ "key": "Meta+w" }
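To make the key format described above concrete, here is one way such a string could be parsed. It is a sketch only and not taken from the Kapture source: `parseKeyCombo`, `ModifierKey`, and `ParsedCombo` are hypothetical names, and matching modifier names case-insensitively is an assumption.

```typescript
type ModifierKey = "Control" | "Shift" | "Alt" | "Meta";

interface ParsedCombo {
  key: string;
  modifiers: ModifierKey[];
}

// Aliases from the list above, stored in lower case for lookup.
const MODIFIER_ALIASES = new Map<string, ModifierKey>([
  ["control", "Control"], ["ctrl", "Control"],
  ["shift", "Shift"], ["alt", "Alt"],
  ["meta", "Meta"], ["cmd", "Meta"], ["command", "Meta"],
]);

function parseKeyCombo(combo: string): ParsedCombo {
  // A trailing "+" is read as the plus key itself ("+" or "Control++").
  const plusIsKey = combo === "+" || combo.endsWith("++");
  const splitAt = plusIsKey ? combo.length - 2 : combo.lastIndexOf("+");
  const key = combo.slice(splitAt + 1);
  const prefix = splitAt > 0 ? combo.slice(0, splitAt) : "";
  if (key === "") throw new Error(`No key in combination: "${combo}"`);

  const modifiers: ModifierKey[] = [];
  for (const name of prefix === "" ? [] : prefix.split("+")) {
    const modifier = MODIFIER_ALIASES.get(name.toLowerCase());
    if (modifier === undefined) throw new Error(`Unknown modifier: "${name}"`);
    // Duplicate modifiers are ignored.
    if (modifiers.indexOf(modifier) === -1) modifiers.push(modifier);
  }
  return { key, modifiers };
}

const zoomIn = parseKeyCombo("Control++");      // { key: "+", modifiers: ["Control"] }
const deduped = parseKeyCombo("Shift+Shift+a"); // { key: "a", modifiers: ["Shift"] }
const shiftOnly = parseKeyCombo("Shift");       // { key: "Shift", modifiers: [] }
```

Splitting on the last "+" rather than every "+" is what lets "Control++" and "Control+-" parse without special escaping.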
- kapture://tabs - List all connected browser tabs
- kapture://tab/{tabId} - Get detailed information about a specific tab
- kapture://tab/{tabId}/console - Get console logs from a specific tab (with pagination support)
- kapture://tab/{tabId}/screenshot - Capture screenshots of a tab or element
- kapture://tab/{tabId}/dom - Get HTML content of a tab or element
- kapture://tab/{tabId}/elementsFromPoint - Get elements at specific coordinates
- kapture://tab/{tabId}/elements?selector={selector}&visible={true|false|all} - Query all elements matching a CSS selector or XPath with optional visibility filtering
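The resource URIs above follow a simple template, which a client could fill in with helpers like the ones below. This is a sketch rather than an official client API: `tabUri` and `elementsUri` are hypothetical names, the tab ID "123" is made up, and percent-encoding the selector is an assumption based on ordinary query-string rules.

```typescript
type VisibleFilter = "true" | "false" | "all";
type TabResource = "console" | "screenshot" | "dom" | "elementsFromPoint";

// Builds kapture://tab/{tabId} or kapture://tab/{tabId}/{resource}.
function tabUri(tabId: string, resource?: TabResource): string {
  const base = `kapture://tab/${encodeURIComponent(tabId)}`;
  return resource === undefined ? base : `${base}/${resource}`;
}

// Builds the elements query URI; `visible` is left out when not supplied.
function elementsUri(tabId: string, selector: string, visible?: VisibleFilter): string {
  const query = [`selector=${encodeURIComponent(selector)}`];
  if (visible !== undefined) query.push(`visible=${visible}`);
  return `${tabUri(tabId)}/elements?${query.join("&")}`;
}

const consoleUri = tabUri("123", "console");
// "kapture://tab/123/console"
const visibleLinks = elementsUri("123", "div.item > a", "true");
// "kapture://tab/123/elements?selector=div.item%20%3E%20a&visible=true"
```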
cd server
npm run dev # Development with hot-reload
cd test-app
npm run dev # Run Electron test app
After making changes:
- Go to chrome://extensions/
- Click refresh on the Kapture extension
Server (/server):
- mcp-handler.ts - MCP protocol implementation
- websocket-manager.ts - WebSocket server
- tab-registry.ts - Tab tracking
- tools/*.ts - MCP tool implementations
Extension (/extension):
- panel/command-executor.js - Command execution
- panel/command-queue.js - Sequential execution
- background.js - Screenshot service worker
- Automatic Connection - Connects to server on port 61822
- Server Selection - Dropdown to choose between multiple running servers
- Connection Status - Real-time server connection indicator
- Tab Info - Current tab ID and URL display
- Command Testing - Manual command execution interface
- Console Viewer - Live console log capture
- History - Command history
- Dark Theme - Toggle between light/dark modes
- The extension will automatically connect to the server on port 61822
- If no servers are found, verify the server is running
- Check the server dropdown to see which servers were discovered
- Check browser console for errors
- Check server logs in the terminal
- Ensure extension is loaded and enabled
- Close and reopen DevTools
- Reload the extension in chrome://extensions/
- The default timeout is 5 seconds
- Some commands accept a custom timeout parameter
- Check that element selectors are correct
Important: The click and hover tools may experience performance issues when the Kapture DevTools panel is not the active/selected tab in Chrome DevTools. For optimal performance:
- Keep the Kapture panel selected during automation
- If experiencing delays with click/hover operations, switch to the Kapture panel
- This is due to Chrome's optimization of inactive DevTools panels
- Commands execute within Chrome's DevTools sandbox
- Each tab has a unique ID, preventing cross-tab interference
- No direct file system access from extension
- Tab registry enforces command isolation
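As an illustration of the isolation idea in the list above, a registry keyed by tab ID can route each command to exactly one tab. The sketch below is hypothetical and is not the contents of tab-registry.ts: the class shape, `TabCommand`, and `TabHandler` are assumptions made for the example.

```typescript
// Hypothetical command shape; the real message format may differ.
interface TabCommand {
  tool: string;
  params: Record<string, unknown>;
}

type TabHandler = (command: TabCommand) => void;

class TabRegistry {
  private readonly tabs = new Map<string, TabHandler>();

  // Each tab ID may be registered once, so IDs stay unique.
  register(tabId: string, handler: TabHandler): void {
    if (this.tabs.has(tabId)) throw new Error(`Tab already registered: ${tabId}`);
    this.tabs.set(tabId, handler);
  }

  // Returns true if the tab was known and has been removed.
  unregister(tabId: string): boolean {
    return this.tabs.delete(tabId);
  }

  // A command reaches only the handler registered under its tab ID.
  dispatch(tabId: string, command: TabCommand): void {
    const handler = this.tabs.get(tabId);
    if (handler === undefined) throw new Error(`Unknown tab: ${tabId}`);
    handler(command);
  }
}
```

Because dispatch looks up a single handler by ID and never iterates over all tabs, a command addressed to one tab cannot reach another.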
MIT