pi-browser-cdp-extension
Pi coding-agent extension exposing BrowserCode CDP browser_execute
Package details
Install pi-browser-cdp-extension from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-browser-cdp-extension
- Package: pi-browser-cdp-extension
- Version: 1.1.0
- Published: May 13, 2026
- Downloads: 275/mo · 16/wk
- Author: ego_agent_lab
- License: MIT
- Types: extension
- Size: 2.5 MB
- Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions"
  ],
  "image": "https://raw.githubusercontent.com/citrolabs/pi-browser-cdp-extension/main/logo.png"
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
A CDP-powered browser execution extension for Pi. It adds a BrowserCode-style browser_execute tool to pi-coding-agent, allowing Pi to connect to Chromium/Chrome through the DevTools Protocol, run JavaScript, drive pages, inspect the DOM, capture screenshots, and return them as image results.
The motivation is simple: pi-coding-agent is excellent for code work, but it does not provide built-in web search or browser access. This project gives Pi a small, explicit bridge to a user-authorized browser, so an agent can work with live web pages when the task requires it.
This is not a standalone browser testing framework and does not host a daemon. It is a Pi extension that reuses a persistent CDP session inside the Pi process.
Chinese documentation: README.zh-CN.md
Quick Start
1. Install the extension
pi install git:github.com/citrolabs/pi-browser-cdp-extension
For local development:
pi install .
After installation, talk to Pi normally and ask it to use the browser. Pi can call the extension's browser_execute tool when it needs to operate a real page.
Example:
Open https://example.com in the browser, tell me the page title, and return a screenshot.
Pi will connect to an authorized Chromium browser, drive the page, inspect the result, and attach the screenshot.
What it gives Pi
- browser_execute: Pi-callable tool name.
- session: persistent CDP session; multiple calls in the same Pi session reuse browser state.
- console: captures log, error, warn, info, and debug output and streams it back in the tool result.
- Screenshot collection: successful Page.captureScreenshot calls are automatically converted into Pi image content.
- Workspace support: reusable scripts can live in .pi/browser-execute-workspace and be loaded from snippets with await import(...).
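The screenshot collection step can be pictured roughly as follows. This is a minimal sketch, not the extension's actual code: the CdpCallRecord and ImageContent shapes and the collectScreenshots name are assumptions made up for illustration. The only facts taken from the DevTools Protocol are that Page.captureScreenshot returns base64 image data in a `data` field and that its optional `format` parameter defaults to png.

```typescript
// Sketch only: these record and content shapes are illustrative assumptions,
// not the extension's real types.
type ScreenshotFormat = "png" | "jpeg" | "webp";

interface CdpCallRecord {
  method: string;                          // e.g. "Page.captureScreenshot"
  params?: { format?: ScreenshotFormat };  // optional CDP format parameter
  result?: { data?: string };              // CDP returns the image as base64 in `data`
  error?: string;                          // set when the call failed
}

interface ImageContent {
  type: "image";
  data: string;      // base64 payload, passed through unchanged
  mimeType: string;  // derived from the requested format
}

// Keep only successful Page.captureScreenshot calls and convert each one
// into an image content item.
function collectScreenshots(calls: CdpCallRecord[]): ImageContent[] {
  const images: ImageContent[] = [];
  for (const call of calls) {
    if (call.method !== "Page.captureScreenshot") continue;
    if (call.error || !call.result?.data) continue; // skip failed or empty calls
    const format = call.params?.format ?? "png";    // CDP defaults to png
    images.push({
      type: "image",
      data: call.result.data,
      mimeType: `image/${format}`,
    });
  }
  return images;
}
```

Filtering on the method name and on the presence of `data` means failed captures never produce an empty image in the tool result.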
Why not just web search?
Web-search tools help Pi find and summarize information. pi-browser-cdp-extension gives Pi hands-on control of a real Chromium browser, so it can complete tasks that search/fetch tools cannot represent as plain text.
| Capability | pi-web-access / @ollama/pi-web-search | pi-browser-cdp-extension |
|---|---|---|
| Search the public web | Strong fit | Not the primary goal |
| Fetch and summarize static pages | Strong fit | Possible, but usually overkill |
| Click buttons, type into forms, and follow UI flows | Limited or unavailable | Native browser automation through CDP |
| Use authenticated sessions | Usually requires API-level access or copied cookies | Reuses the user's authorized browser profile/session |
| Work with browser extensions and real browser behavior | No | Yes, because Pi drives the actual browser |
| Inspect dynamic DOM state after JavaScript runs | Limited to fetched HTML or rendered text | Direct live DOM and DevTools Protocol access |
| Verify what the user would actually see | Text-first | Screenshots returned as Pi image results |
| Keep state across multiple agent steps | Tool/backend dependent | Persistent CDP session inside the Pi process |
Use web-search packages when the task is "find information." Use this extension when the task is "operate the website."
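Keeping state across agent steps comes down to opening one session per Pi process and handing the same one to every tool call. The sketch below shows one way to do that with a lazily created, cached promise. The createSessionCache helper and its `open` callback are hypothetical names for illustration, not part of the extension's API.

```typescript
// Sketch only: a generic lazy cache for a single long-lived session.
// `open` stands in for whatever actually connects to the browser.
function createSessionCache<S>(open: () => Promise<S>) {
  let pending: Promise<S> | undefined;

  return {
    // Return the cached session, opening it on first use.
    get(): Promise<S> {
      if (!pending) {
        pending = open().catch((err) => {
          pending = undefined; // allow a retry after a failed connect
          throw err;
        });
      }
      return pending;
    },
    // Drop the cached session so the next get() opens a fresh one.
    reset(): void {
      pending = undefined;
    },
  };
}
```

Caching the promise rather than the resolved session means two tool calls that start at the same time still share a single connection attempt.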
Who should use this
Use this when you need:
- Pi to operate a real Chrome page instead of only reading HTML.
- Login state, browser extensions, real browser behavior, or direct DevTools Protocol access.
- A coding agent to reuse one browser session across multiple tool calls.
Do not use this for:
- Pure unit testing; Playwright or Vitest is more direct.
- Untrusted pages or untrusted CDP endpoints. CDP can control the connected browser, so only connect to browsers you authorize.
Configuration
Environment variables:
- BU_CDP_WS / BU_CDP_URL: default browser WebSocket endpoint used by session.connect().
- BCODE_SCREENSHOT_DIR: optional directory where screenshots are also dumped locally.
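As an illustration of how a default endpoint might be picked from these variables, here is a minimal sketch. The resolveCdpEndpoint name is hypothetical, and the precedence (BU_CDP_WS first, then BU_CDP_URL) is an assumption; the README does not state which variable wins when both are set.

```typescript
// Sketch only: the function name and the variable precedence are assumptions.
type Env = Record<string, string | undefined>;

// Return the first non-empty endpoint variable, or undefined if none is set.
function resolveCdpEndpoint(env: Env): string | undefined {
  for (const name of ["BU_CDP_WS", "BU_CDP_URL"]) {
    const value = env[name]?.trim();
    if (value) return value; // blank or whitespace-only values are ignored
  }
  return undefined;
}
```

In a Node process this would be called as resolveCdpEndpoint(process.env). Taking the environment as a parameter keeps the function easy to test.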
One-off extension load:
pi -e ./extensions/browser-execute.ts
Validation
The repository's test suite covers core execution, CDP session helpers, and the Pi extension adapter. Run the type check and the tests with:
npm run typecheck
npm test
Current tests cover session reuse/isolation, workspace imports, console streaming, timeout handling, screenshot collection, CDP target filtering, active sessionId routing, and Pi image content conversion.
Acknowledgements
The shape of this project was inspired by the following work:
