@leonardorick/pi-web-search

Web search tool for pi — Exa MCP search with DuckDuckGo fallback via wreq-js.

Package details

Install @leonardorick/pi-web-search from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@leonardorick/pi-web-search
Package: @leonardorick/pi-web-search
Version: 0.2.2
Published: May 2, 2026
Downloads: 433/mo · 433/wk
Author: leonardorick
License: MIT
Types: extension
Size: 48.9 KB
Dependencies: 1 dependency · 1 peer
Pi manifest JSON
{
  "extensions": [
    "./index.js"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

@leonardorick/pi-web-search

Web search tool for pi — Exa hosted MCP search with DuckDuckGo HTML fallback via wreq-js browser TLS fingerprinting. Companion to pi-smart-fetch.

Why

Pi's built-in web search relies on Anthropic's first-party web_search_20250305 tool, which is only available when running Claude models. This package fills the gap for any other model with a keyless web_search tool.

Search tries Exa's hosted MCP endpoint first (web_search_exa over JSON-RPC). If Exa MCP fails, rate-limits, or returns no results, the tool falls back to DuckDuckGo HTML search.
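
The fallback order described above can be sketched roughly as follows. This is an illustration, not the package's actual code: `buildExaToolCall` and `searchWithFallback` are hypothetical names, and the provider shape (an async function that throws on failure and returns an array of results) is an assumption. The request body follows the generic MCP `tools/call` JSON-RPC shape with the `web_search_exa` tool name mentioned above.

```javascript
// Hypothetical helper: JSON-RPC 2.0 body for an MCP "tools/call" request.
function buildExaToolCall(id, query, numResults) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "web_search_exa", arguments: { query, numResults } },
  };
}

// Hypothetical orchestration: try providers in order, move on when one
// throws (network error, rate limit, ...) or returns an empty array.
async function searchWithFallback(query, providers) {
  const errors = [];
  for (const { name, search } of providers) {
    try {
      const results = await search(query);
      if (Array.isArray(results) && results.length > 0) {
        return { provider: name, results };
      }
      errors.push(`${name}: no results`);
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`all providers failed (${errors.join("; ")})`);
}
```

Passing providers in as plain functions keeps the ordering logic independent of how each provider talks to the network, so Exa and DuckDuckGo can be swapped for stubs in tests.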

Bare Node fetch() (even with a spoofed User-Agent header) gets served DDG's anti-bot anomaly page (cc=botnet) because the TLS/HTTP2 fingerprint leaks "this is Node, not Chrome." wreq-js wraps native Rust bindings that emulate real browser TLS/HTTP2 fingerprints — same primitive pi-smart-fetch uses for web_fetch.

Install

// ~/.pi/agent/settings.json
{
  "packages": ["npm:@leonardorick/pi-web-search"]
}

Tool

Registers web_search:

| param | type | required | description |
| --- | --- | --- | --- |
| query | string | yes | Search query (min length 2) |
| allowed_domains | string[] | no | Only return results from these domains |
| blocked_domains | string[] | no | Exclude results from these domains |
| numResults | integer | no | Result cap. Default 8, min 1, max 20. |
| freshness | string | no | Time filter. day / week / month / year, or custom range YYYY-MM-DDtoYYYY-MM-DD. |
| country | string | no | ISO 3166-1 alpha-2 country code for region-localised results (e.g. US, DE, BR). |
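
As a rough illustration of the constraints in the table, the checks could look like the sketch below. `normalizeParams` is a hypothetical name, and trimming the query, rejecting (rather than clamping) out-of-range values, and upper-casing the country code are assumptions, not documented behavior.

```javascript
const FRESHNESS_PRESETS = new Set(["day", "week", "month", "year"]);
const FRESHNESS_RANGE = /^\d{4}-\d{2}-\d{2}to\d{4}-\d{2}-\d{2}$/;

// Hypothetical validator mirroring the parameter table.
function normalizeParams(input) {
  const query = typeof input.query === "string" ? input.query.trim() : "";
  if (query.length < 2) throw new Error("query must be at least 2 characters");

  const numResults = input.numResults ?? 8; // default from the table
  if (!Number.isInteger(numResults) || numResults < 1 || numResults > 20) {
    throw new Error("numResults must be an integer between 1 and 20");
  }

  const { freshness, country } = input;
  if (
    freshness !== undefined &&
    !FRESHNESS_PRESETS.has(freshness) &&
    !FRESHNESS_RANGE.test(freshness)
  ) {
    throw new Error("freshness must be day, week, month, year, or YYYY-MM-DDtoYYYY-MM-DD");
  }
  if (country !== undefined && !/^[A-Za-z]{2}$/.test(country)) {
    throw new Error("country must be a two-letter ISO 3166-1 alpha-2 code");
  }

  return { query, numResults, freshness, country: country?.toUpperCase() };
}
```

For example, `normalizeParams({ query: "pi agent", country: "de" })` fills in `numResults: 8` and normalises the country to `DE`.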

allowed_domains and blocked_domains are mutually exclusive. The tool translates filters into site: / -site: query terms for both providers and also enforces them client-side. It over-fetches when filters are active so post-filter trimming can still meet the requested numResults cap.
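
A minimal sketch of the filter handling described above. All function names are hypothetical, and several details are assumptions: joining multiple allowed domains with `OR`, treating subdomains as matches, and the 2x over-fetch factor.

```javascript
// True when host equals the domain or is a subdomain of it (assumed rule).
function hostMatches(host, domain) {
  const d = domain.toLowerCase();
  return host === d || host.endsWith(`.${d}`);
}

// Append site: / -site: terms to the query string.
function applyDomainFilters(query, { allowed_domains = [], blocked_domains = [] } = {}) {
  if (allowed_domains.length > 0 && blocked_domains.length > 0) {
    throw new Error("allowed_domains and blocked_domains are mutually exclusive");
  }
  if (allowed_domains.length > 0) {
    const sites = allowed_domains.map((d) => `site:${d}`);
    return `${query} ${sites.length > 1 ? `(${sites.join(" OR ")})` : sites[0]}`;
  }
  if (blocked_domains.length > 0) {
    return `${query} ${blocked_domains.map((d) => `-site:${d}`).join(" ")}`;
  }
  return query;
}

// Ask the provider for more results when filters will discard some.
// The factor of 2 is an arbitrary illustrative choice.
function fetchCount(numResults, filtersActive) {
  return filtersActive ? numResults * 2 : numResults;
}

// Client-side enforcement, then trim to the requested cap.
function filterResults(results, { allowed_domains = [], blocked_domains = [] }, numResults) {
  const kept = results.filter(({ url }) => {
    let host;
    try {
      host = new URL(url).hostname.toLowerCase();
    } catch {
      return false; // drop results whose URL cannot be parsed
    }
    if (allowed_domains.length > 0) {
      return allowed_domains.some((d) => hostMatches(host, d));
    }
    return !blocked_domains.some((d) => hostMatches(host, d));
  });
  return kept.slice(0, numResults);
}
```

Enforcing the filter again on the client matters because a search engine may treat `site:` terms as hints rather than hard constraints.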

Provider support matrix

| filter | Exa MCP | DuckDuckGo (fallback) |
| --- | --- | --- |
| allowed_domains | yes (via site:) | yes (via site:) |
| blocked_domains | yes (via -site:) | yes (via -site:) |
| numResults | yes (native) | yes (post-filter) |
| freshness | best-effort (may be ignored)¹ | yes (df= param) |
| country | best-effort (may be ignored)¹ | yes (kl= param) |

¹ Exa MCP's documented schema only exposes query and numResults. Extra fields are sent best-effort: harmless if ignored, honored if Exa supports them in the future. The DuckDuckGo fallback enforces them reliably.
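
To make the df= and kl= mapping concrete, here is a hedged sketch of building the DuckDuckGo fallback URL. `buildDdgUrl` is a hypothetical name. The endpoint, the single-letter df values, the `..` range separator, and the region codes are assumptions based on DuckDuckGo's publicly observable (undocumented) URL parameters, and the region table is a small illustrative subset.

```javascript
// Assumed DuckDuckGo values for the preset freshness filters.
const DDG_FRESHNESS = { day: "d", week: "w", month: "m", year: "y" };

// Illustrative subset only: kl expects a region-language pair, so a
// country code alone needs a lookup table like this one.
const DDG_REGION = { US: "us-en", DE: "de-de", BR: "br-pt" };

function buildDdgUrl(query, { freshness, country } = {}) {
  const params = new URLSearchParams({ q: query });
  if (freshness) {
    // Custom ranges arrive as YYYY-MM-DDtoYYYY-MM-DD; assume DDG wants "..".
    params.set("df", DDG_FRESHNESS[freshness] ?? freshness.replace("to", ".."));
  }
  const region = country && DDG_REGION[country.toUpperCase()];
  if (region) params.set("kl", region);
  return `https://html.duckduckgo.com/html/?${params}`;
}
```

Countries missing from the lookup table simply produce a URL without kl=, which leaves the search unlocalised instead of failing.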

Output mirrors Claude Code's Web search results for query: "..." block, including the Links: [...] JSON line and the "REMINDER: include sources" suffix — the model is expected to append a Sources: markdown list to its reply.
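
The output layout described above might be assembled along these lines. This is only a sketch: `formatResults` is a hypothetical name, the `{ title, url }` shape of the Links entries and the snippet lines are assumptions, and the reminder wording is passed in as a parameter because the exact text is not reproduced here.

```javascript
// Hypothetical formatter: header line, Links JSON line, snippets, reminder.
function formatResults(query, results, reminder) {
  const links = results.map(({ title, url }) => ({ title, url }));
  const snippets = results
    .filter((r) => r.snippet)
    .map((r) => `- ${r.title}: ${r.snippet}`);
  return [
    `Web search results for query: "${query}"`,
    "",
    `Links: ${JSON.stringify(links)}`,
    "",
    ...snippets,
    "",
    reminder,
  ].join("\n");
}
```

Keeping the Links array on a single JSON line makes it easy for the model, or for a test, to recover the URLs it should cite in its Sources: list.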

Design notes

No prompt argument and no summarization step. Claude Code's WebSearchTool calls Haiku with the user-provided prompt to compress results before returning. Pi's built-in tools (read, grep, etc.) follow a "raw output, model digests it" pattern — this extension matches that convention.

Anonymous Exa MCP access is rate-limited by Exa (currently documented as 2 QPS and 50 tool calls/day per IP). The tool treats Exa as a best-effort first provider and keeps DDG as fallback.

TODO: Add an optional prompt parameter paired with configurable model/provider settings so results can be summarized before returning — matching Claude Code's behavior. The summarizer should use the active pi provider rather than hardcoding a specific model.

Limitations

  • Anonymous Exa MCP quota is shared by client IP; heavy use can hit 429 and fall back to DDG.
  • DDG HTML scraping is best-effort. If DDG changes its result markup, the snippet regex is the most likely break point; titles and URLs degrade more gracefully.
  • freshness and country are only guaranteed on the DDG fallback. On Exa MCP they ride along as extra fields and may be silently ignored unless Exa expands the schema.
  • This package only returns search results and snippets. Use web_fetch from pi-smart-fetch for page content extraction.
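
The scraping limitation can be illustrated with a small regex-based parser. This is a sketch, not the package's parser: `parseDdgHtml` is a hypothetical name, and the `result__a` / `result__snippet` class names are assumptions about DDG's HTML markup that may change at any time.

```javascript
// Assumed markup: <a class="result__a" href="...">title</a> followed,
// somewhere before the next title, by <a class="result__snippet">...</a>.
const TITLE_RE = /<a[^>]*class="result__a"[^>]*href="([^"]+)"[^>]*>([\s\S]*?)<\/a>/g;
const SNIPPET_RE = /<a[^>]*class="result__snippet"[^>]*>([\s\S]*?)<\/a>/;

// Remove tags and collapse whitespace.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
}

function parseDdgHtml(html) {
  const titles = [...html.matchAll(TITLE_RE)];
  return titles.map((match, i) => {
    // Only look for the snippet between this title and the next one,
    // so a missing snippet cannot shift later results out of alignment.
    const start = match.index + match[0].length;
    const end = i + 1 < titles.length ? titles[i + 1].index : html.length;
    const snippet = html.slice(start, end).match(SNIPPET_RE);
    return {
      title: stripTags(match[2]),
      url: match[1],
      snippet: snippet ? stripTags(snippet[1]) : "",
    };
  });
}
```

If the snippet markup changes, this version still returns titles and URLs with an empty snippet, which is the graceful degradation the limitation describes.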

License

MIT