@dreki-gg/pi-browser-tools
Browser automation and web research tools for pi — search, visit, screenshot, interact, and inspect console output
Package details
Install @dreki-gg/pi-browser-tools from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@dreki-gg/pi-browser-tools
- Package: @dreki-gg/pi-browser-tools
- Version: 0.4.4
- Published: Jun 13, 2026
- Downloads: 604/mo · 370/wk
- Author: jalbarrang
- License: MIT
- Types: extension
- Size: 85.5 KB
- Dependencies: 5 dependencies · 3 peers
Pi manifest JSON
{
"extensions": [
"./extensions/browser-tools"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
@dreki-gg/pi-browser-tools
Browser automation and web research tools for pi.
It adds:
- web_search for search-engine-backed web discovery
- web_visit for readable markdown extraction via fetch or the selected browser backend
- web_screenshot for browser screenshots at desktop or mobile sizes
- web_interact for click/type/select/scroll/hover/wait actions on the open page
- web_console for captured browser logs, warnings, and uncaught page errors
- /browser for a quick browser status check
Install
pi install npm:@dreki-gg/pi-browser-tools
Browser backend: agent-browser (see Browser backend).
agent-browser setup (optional; only the browser-backed tools need it):
# Homebrew
brew install agent-browser && agent-browser install
# or npm
npm install -g agent-browser && agent-browser install
Tools
| Tool | Description |
|---|---|
| web_search | Search the web and return up to 10 filtered results |
| web_visit | Fetch a URL and convert it to readable markdown, with optional browser rendering |
| web_screenshot | Take a screenshot of the current page or navigate to a URL first |
| web_interact | Interact with the current browser page and return a fresh screenshot |
| web_console | Read captured browser console output, warnings, errors, and uncaught page errors |
Search providers
Default provider: DuckDuckGo HTML.
Optional env vars:
# Select provider: duckduckgo | google | brave
export WEB_SEARCH_PROVIDER=duckduckgo
# Google Custom Search
export GOOGLE_CSE_API_KEY=...
export GOOGLE_CSE_ID=...
# Brave Search
export BRAVE_SEARCH_API_KEY=...
If WEB_SEARCH_PROVIDER=google, both GOOGLE_CSE_API_KEY and GOOGLE_CSE_ID are required.
If WEB_SEARCH_PROVIDER=brave, BRAVE_SEARCH_API_KEY is required.
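The provider rules above can be sketched as a small validation step. This is an illustrative sketch only, not the package's actual code: `resolveSearchProvider`, `isSearchProvider`, and `REQUIRED_VARS` are hypothetical names, and rejecting unknown provider values is an assumption (the README does not say what happens in that case).

```typescript
// Hypothetical helper for illustration; not part of the package's API.
type SearchProvider = "duckduckgo" | "google" | "brave";

// Env vars each provider requires, per the rules above.
const REQUIRED_VARS: Record<SearchProvider, string[]> = {
  duckduckgo: [],
  google: ["GOOGLE_CSE_API_KEY", "GOOGLE_CSE_ID"],
  brave: ["BRAVE_SEARCH_API_KEY"],
};

function isSearchProvider(value: string): value is SearchProvider {
  return Object.prototype.hasOwnProperty.call(REQUIRED_VARS, value);
}

function resolveSearchProvider(
  env: Record<string, string | undefined>,
): SearchProvider {
  // Unset or empty falls back to the documented default.
  const raw = (env["WEB_SEARCH_PROVIDER"] ?? "").trim() || "duckduckgo";
  // Assumption: unknown values are rejected rather than silently ignored.
  if (!isSearchProvider(raw)) {
    throw new Error(`Unknown WEB_SEARCH_PROVIDER: ${raw}`);
  }
  // Collect every required variable that is unset or empty.
  const missing = REQUIRED_VARS[raw].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`${raw} search requires ${missing.join(" and ")}`);
  }
  return raw;
}
```

In a Node-based runtime the helper could be called as `resolveSearchProvider(process.env)`; taking the environment as a parameter keeps the sketch easy to test.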
Browser backend
Browser-backed tools use agent-browser as the only runtime. Install it with:
# macOS
brew install agent-browser && agent-browser install
# or any platform
npm install -g agent-browser && agent-browser install
If agent-browser is unavailable, browser-backed tools fail with install guidance.
Screenshot analysis (vision model)
web_screenshot and web_interact can optionally hand the screenshot to a vision
model (e.g. Gemini Flash) and return a text description instead of the image —
useful for letting a multimodal model recognize forms, shapes, and layout.
- Per-call opt-in: pass analyze: true (and optionally analyze_prompt: "...").
- Global default: set WEB_SCREENSHOT_ANALYZE=1 (truthy: 1, true, yes, on). An explicit analyze param always overrides the env default.
- Model selection via WEB_SCREENSHOT_MODEL as provider:modelId (default google:gemini-2.5-flash; a bare value is treated as a google model id).
export WEB_SCREENSHOT_ANALYZE=1
export WEB_SCREENSHOT_MODEL=google:gemini-2.5-flash
The chosen model must have auth configured in pi (API key / OAuth) like any other model.
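How the per-call flag, the env default, and the model setting combine can be sketched as follows. This is a minimal sketch under stated assumptions, not the extension's real implementation: `shouldAnalyze`, `parseScreenshotModel`, and `ModelRef` are hypothetical names, case-insensitive matching of the truthy values is an assumption, and so is splitting the model value at the first colon.

```typescript
// Hypothetical helpers for illustration; not part of the package's API.
const TRUTHY_VALUES = ["1", "true", "yes", "on"];

function shouldAnalyze(
  analyzeParam: boolean | undefined,
  env: Record<string, string | undefined>,
): boolean {
  // An explicit per-call `analyze` value always overrides the env default.
  if (analyzeParam !== undefined) return analyzeParam;
  // Assumption: truthy values are compared case-insensitively.
  const raw = (env["WEB_SCREENSHOT_ANALYZE"] ?? "").trim().toLowerCase();
  return TRUTHY_VALUES.indexOf(raw) !== -1;
}

interface ModelRef {
  provider: string;
  modelId: string;
}

function parseScreenshotModel(
  env: Record<string, string | undefined>,
): ModelRef {
  // Unset or empty falls back to the documented default.
  const raw =
    (env["WEB_SCREENSHOT_MODEL"] ?? "").trim() || "google:gemini-2.5-flash";
  const sep = raw.indexOf(":");
  // A bare value (no colon) is treated as a google model id.
  if (sep === -1) return { provider: "google", modelId: raw };
  // Assumption: split at the first colon only.
  return { provider: raw.slice(0, sep), modelId: raw.slice(sep + 1) };
}
```

Under these assumptions, `shouldAnalyze(false, { WEB_SCREENSHOT_ANALYZE: "1" })` is `false` because the explicit parameter wins, while `shouldAnalyze(undefined, { WEB_SCREENSHOT_ANALYZE: "1" })` is `true`.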
Notes
- web_visit uses plain fetch by default and falls back to the selected browser backend when the fetched markdown is too thin.
- web_interact and web_console require an open browser session. Open one first with web_screenshot or web_visit using render: true.
- web_interact.text resolves against the accessibility snapshot using tiered matching: exact accessible name, exact visible text, then case-insensitive substring on each. A tier is only accepted when it yields a single match; ambiguous text throws and asks for a selector. Prefer selector when you need deterministic targeting.
- web_console on agent-browser merges console messages and page errors, so ordering and level attribution are best-effort.
- web_visit.details.method is either fetch or agent-browser.
- A browser session stays open and is reused across tool calls until the pi session ends (or close() is called explicitly). There is no idle auto-close, so web_interact/web_console keep working after long gaps following web_screenshot.
- See docs/agent-browser-compatibility.md for known gaps and backend-specific notes.
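The tiered text matching described for web_interact.text can be illustrated with a short sketch. It is not the extension's actual code: `SnapshotNode` and `resolveByText` are hypothetical, the node shape is a simplified stand-in for a real accessibility snapshot, and throwing as soon as a tier yields more than one match (rather than trying later tiers) is one reading of the note above.

```typescript
// Hypothetical, simplified node of an accessibility snapshot.
interface SnapshotNode {
  ref: string;  // opaque element handle
  name: string; // accessible name
  text: string; // visible text
}

function resolveByText(nodes: SnapshotNode[], query: string): SnapshotNode {
  const q = query.toLowerCase();
  // Tiers are tried in order, from strictest to loosest.
  const tiers: Array<(n: SnapshotNode) => boolean> = [
    (n) => n.name === query,                       // exact accessible name
    (n) => n.text === query,                       // exact visible text
    (n) => n.name.toLowerCase().indexOf(q) !== -1, // substring on name
    (n) => n.text.toLowerCase().indexOf(q) !== -1, // substring on text
  ];
  for (const tier of tiers) {
    const matches = nodes.filter(tier);
    // A tier is accepted only when it yields exactly one match.
    if (matches.length === 1) return matches[0];
    // Assumption: an ambiguous tier throws instead of falling through.
    if (matches.length > 1) {
      throw new Error(
        `Text "${query}" matches ${matches.length} elements; pass a selector instead`,
      );
    }
  }
  throw new Error(`No element matches text "${query}"`);
}
```

With nodes named "Submit", "Submit order", and "Cancel", the query "Submit" resolves on the first tier, "cancel" resolves on the case-insensitive substring tier, and "submit" throws because two names contain it. That last case is where a selector gives deterministic targeting.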