@arpagon/pi-web-providers
Configurable web access extension for pi with per-tool provider routing for search, contents, answers, and research across Claude, Cloudflare, Codex, Custom CLI, Exa, Gemini, Perplexity, Parallel, and Valyu.
Package details
Install @arpagon/pi-web-providers from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@arpagon/pi-web-providers- Package
@arpagon/pi-web-providers- Version
1.2.0- Published
- Mar 19, 2026
- Downloads
- 38/mo Β· 11/wk
- Author
- arpagon
- License
- MIT
- Types
- extension
- Size
- 299.8 KB
- Dependencies
- 8 dependencies Β· 0 peers
Pi manifest JSON
{
"extensions": [
"./dist/index.js"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
π pi-web-providers
A meta web extension for pi that routes search, content extraction, answers, and research through configurable per-tool providers.
Why?
Most web extensions hard-wire a single backend. pi-web-providers lets you
mix and match providers per tool instead, so web_search, web_contents,
web_answer, and web_research can each use a different backend or be turned
off entirely.
β¨ Features
- Multiple providers β Claude, Cloudflare, Codex, Custom CLI, Exa, Gemini, Perplexity, Parallel, Valyu
- Batched search and answers β run several related queries in a single
web_searchorweb_answercall and get grouped results back in one response - Async contents prefetch β optionally start background
web_contentsextraction fromweb_searchresults and reuse the cached pages later
π¦ Install
pi install npm:pi-web-providers
βοΈ Configure
Run:
/web-providers
This edits the global config file ~/.pi/agent/web-providers.json. The
settings UI mirrors the three sections below: tools, providers, and generic
settings.
Each tool can be routed to any compatible provider:
| Provider | search | contents | answer | research | Auth |
|---|---|---|---|---|---|
| Claude | β | β | Local Claude Code auth | ||
| Cloudflare | β | CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID |
|||
| Codex | β | Local Codex CLI auth | |||
| Exa | β | β | β | β | EXA_API_KEY |
| Gemini | β | β | β | GOOGLE_API_KEY |
|
| Perplexity | β | β | β | PERPLEXITY_API_KEY |
|
| Parallel | β | β | PARALLEL_API_KEY |
||
| Valyu | β | β | β | β | VALYU_API_KEY |
Advanced option: custom-cli is a configurable adapter provider that can route
any managed tool through a local wrapper command using a JSON stdin/stdout
contract.
See example-config.json for a full default
configuration.
Tools
Each managed tool maps to one provider id or null for off under the top-level
tools key. A tool is only exposed when it is mapped to a compatible provider
and that provider is currently available. Tool-specific settings live under
toolSettings; today this covers toolSettings.search.prefetch.
web_search
Search the public web for up to 10 queries in one call. It returns grouped titles, URLs, and snippets for each query.
| Parameter | Type | Default | Description |
|---|---|---|---|
queries |
string[] | required | One or more search queries to run (max 10) |
maxResults |
integer | 5 |
Result count per query, clamped to 1β20 |
options |
object | β | Provider-specific search options and local prefetch settings |
web_search.options.prefetch is local-only and not forwarded into the provider
SDK. It accepts provider, maxUrls, ttlMs, and contentsOptions, and
starts a background page-extraction workflow only when prefetch.provider is
set. /web-providers can also persist default search prefetch settings under
toolSettings.search.prefetch.
web_contents
Read the main text from one or more web pages. It reuses cached pages when they match and fetches only missing or stale URLs.
| Parameter | Type | Default | Description |
|---|---|---|---|
urls |
string[] | required | One or more URLs to extract |
options |
object | β | Provider-specific extraction options |
web_contents reuses any matching cached pages already present in the local
content storeβwhether they came from prefetch or an earlier readβand only
fetches missing or stale URLs.
web_answer
Answer one or more questions using web-grounded evidence. When you ask more than one question, the response is grouped into per-question sections.
| Parameter | Type | Default | Description |
|---|---|---|---|
queries |
string[] | required | One or more questions to answer in one call (max 10) |
options |
object | β | Provider-specific options |
Responses are grouped into per-question sections when more than one question is provided.
web_research
Investigate a topic across web sources and produce a longer report. The
provider-specific options stay native to each SDK, and runtime options
override provider configuration when both are set.
| Parameter | Type | Default | Description |
|---|---|---|---|
input |
string | required | Research brief or question |
options |
object | β | Provider-specific options |
options are provider-native and provider-specific. Equivalent concepts can use
different field names across SDKsβfor example Perplexity uses country, Exa
uses userLocation, and Valyu uses countryCode. Runtime options override
provider-native config, but managed tool inputs and tool wiring stay fixed.
The extension accepts local control fields for robustness: requestTimeoutMs,
retryCount, and retryDelayMs on request/response tools, plus
pollIntervalMs, timeoutMs, maxConsecutivePollErrors, and resumeId on
web_research for lifecycle-based research providers. These fields are handled
by the extension and are not forwarded into the provider SDK call.
- Exa and Valyu research support polling, overall deadlines, and resume IDs
but reject
requestTimeoutMsand do not retry non-idempotent job creation. - Perplexity research runs in streaming foreground mode and only supports
requestTimeoutMs,retryCount, andretryDelayMs.
Providers deliver results in one of three modes:
- Silent foreground β no intermediate output; result returned when done.
- Streaming foreground β progress updates while running, but the result is still only usable after the tool finishes.
- Background research β the provider runs in the background; if
interrupted, the run can be resumed later via
resumeId.
Providers
The built-in providers below are thin adapters around official SDKs.
- SDK:
@anthropic-ai/claude-agent-sdk - Uses Claude Code's built-in
WebSearchandWebFetchtools behind a structured JSON adapter - Runs in silent foreground mode
- Supports request-shaping
optionssuch asmodel,thinking,effort, andmaxTurns - Great for search plus grounded answers if you already use Claude Code locally
- API: Cloudflare Browser Rendering REST API
- Uses the
/markdownendpoint to render pages in headless Chrome on Cloudflare's edge network and return clean Markdown - Runs in silent foreground mode
- No external SDK dependency β uses
fetchagainst the REST API - Handles JavaScript-rendered pages, dynamic content, and complex layouts
- Reports per-URL success/failure with structured content entries
- Requires a Cloudflare API token with Browser Rendering permissions and an account ID
- Free tier: 10 min/day browser time, 6 req/min; Workers Paid ($5/mo): 10 hrs/month, 600 req/min
- SDK:
@openai/codex-sdk - Runs in read-only mode with web search enabled
- Runs in silent foreground mode
- Supports request-shaping
web_search.optionssuch asmodel,modelReasoningEffort, andwebSearchMode - Best if you already use the local Codex CLI and auth flow
- SDK:
exa-js - Search, contents, and answer run in silent foreground mode
- Research runs in background research mode and supports
resumeId - Neural, keyword, hybrid, and deep-research search modes
- Inline text-content extraction on search results
- SDK:
@google/genai - Search and answer run in silent foreground mode
- Research runs in background research mode and supports
resumeId - Google Search grounding for answers
- Deep-research agents via Google's Gemini API
- Supports provider-native request options such as
model,config,generation_config, andagent_configdepending on the tool
- SDK:
@perplexity-ai/perplexity_ai web_searchandweb_answerrun in silent foreground modeweb_researchruns in streaming foreground mode (noresumeIdsupport)- Uses Perplexity Search for
web_search - Uses Sonar for
web_answerandsonar-deep-researchforweb_research - Supports provider-specific
web_search.optionssuch ascountry,search_mode,search_domain_filter, andsearch_recency_filter
- SDK:
parallel-web - Runs in silent foreground mode
- Agentic and one-shot search modes
- Page content extraction with excerpt and full-content toggles
- Supports provider-native search and extraction options from the Parallel SDK
- SDK:
valyu-js - Search, contents, and answer run in silent foreground mode
- Research runs in background research mode and supports
resumeId - Web, proprietary, and news search types
- Supports provider-native options such as
countryCode,responseLength, and search/source filters - Configurable response length for answers and research
Custom CLI provider
The custom-cli provider lets you bring your own wrapper command for any
managed tool. Each capability can point at a different local command under
providers["custom-cli"].native.
The repo includes actual wrapper examples under
examples/custom-cli/wrappers/. They are
small bash scripts that use jq for JSON handling. Each one uses a different
backend pattern:
codex --search execforweb_search- Gemini API via
curlforweb_contents claude -pforweb_answer- Perplexity API via
curlforweb_research
Copy the example wrappers into a local ./wrappers/ directory, then configure:
{
"tools": {
"search": "custom-cli",
"contents": "custom-cli",
"answer": "custom-cli",
"research": "custom-cli"
},
"providers": {
"custom-cli": {
"enabled": true,
"native": {
"search": {
"argv": ["bash", "./wrappers/codex-search.sh"]
},
"contents": {
"argv": ["bash", "./wrappers/gemini-contents.sh"]
},
"answer": {
"argv": ["bash", "./wrappers/claude-answer.sh"]
},
"research": {
"argv": ["bash", "./wrappers/perplexity-research.sh"]
}
}
}
}
}
Those example wrappers deliberately use different local CLIs and APIs so you can see several wrapper styles in one setup without extra glue code.
Each capability can also set an optional cwd and env block. Use cwd when
one wrapper must run from a specific directory. Use env for per-command
variables; each value can be a literal string, an environment variable name, or
!command.
web_research runs as a foreground wrapper command, so local polling controls
(pollIntervalMs, timeoutMs, maxConsecutivePollErrors) and resumeId do
not apply to custom-cli.
Wrapper contract:
stdin: one JSON request object withcapabilityplus the per-call managed inputs (query,urls,input,maxResults,options,cwd)stdout: one JSON response objectsearch:{ "results": [{ "title", "url", "snippet" }] }contents/answer/research:{ "text": "...", "summary"?: "...", "itemCount"?: 1, "metadata"?: {} }
stderr: optional progress lines- exit code
0: success - non-zero exit code: failure
See examples/custom-cli/README.md for a
copy-and-pasteable setup, and see
examples/custom-cli/wrappers/ for the actual
wrapper files.
Generic settings
The genericSettings block sets shared execution defaults that apply to all
providers unless overridden in a provider's policy block:
| Field | Default | Description |
|---|---|---|
requestTimeoutMs |
30000 |
Maximum time for a single provider request |
retryCount |
3 |
Retries for transient failures |
retryDelayMs |
2000 |
Initial delay before retrying |
researchPollIntervalMs |
3000 |
How often to poll long-running research jobs |
researchTimeoutMs |
21600000 |
Overall deadline for research before returning |
researchMaxConsecutivePollErrors |
3 |
Consecutive poll failures before stopping |