@bacnh85/pi-web

Pi extension for web search, page extraction, and Firecrawl scraping/crawling.

Install @bacnh85/pi-web from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@bacnh85/pi-web

Package: @bacnh85/pi-web
Version: 0.1.2
Published: Jun 26, 2026
Downloads: not available
Author: bacnh85
License: MIT
Types: extension
Size: 35.4 KB
Dependencies: 4 dependencies · 2 peers

Pi manifest JSON

{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-web

Pi extension for web search, readable page extraction, and Firecrawl scraping/crawling.

Install

Install the published package from npm:

pi install npm:@bacnh85/pi-web

From a checkout of this repository, install only this extension package:

cd extensions/pi-web
npm install
cd ../..

pi install ./extensions/pi-web
# or test directly
pi -e ./extensions/pi-web

The package manifest points Pi directly at ./index.ts, so published npm installs and local installs load the same extension entrypoint.

If you manually copy this directory instead of using pi install, run npm install --omit=dev in the copied pi-web directory so readable-content dependencies such as @mozilla/readability are present.

Configuration

Environment lookup order:

1. Process environment
2. Current working directory .env.local
3. Current working directory .env
4. Pi global config .env.local ($PI_CODING_AGENT_DIR/.env.local when set; otherwise ~/.pi/agent/.env.local, then ~/.pi/agents/.env.local)
5. Pi global config .env ($PI_CODING_AGENT_DIR/.env when set; otherwise ~/.pi/agent/.env, then ~/.pi/agents/.env)
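
The lookup order above can be pictured as a scan over an ordered list of sources. The following is a minimal sketch, not the extension's actual code: the names EnvSource, resolveVar, and parseDotEnv are hypothetical, and it assumes that the first source with a non-empty value wins and that .env files use plain KEY=VALUE lines.

```typescript
// Hypothetical types and helpers for illustration; not taken from the pi-web source.
type EnvSource = { name: string; values: Record<string, string | undefined> };

interface Resolved {
  value: string;
  source: string;
}

// Return the first non-empty value for `key`, scanning sources in priority order.
// Assumption: empty strings are treated as "not set".
function resolveVar(key: string, sources: EnvSource[]): Resolved | undefined {
  for (const source of sources) {
    const value = source.values[key];
    if (value !== undefined && value !== "") {
      return { value, source: source.name };
    }
  }
  return undefined;
}

// Minimal .env parser: KEY=VALUE lines, `#` comment lines, optional surrounding quotes.
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines without a key
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    out[key] = value;
  }
  return out;
}
```

A caller would build the sources array in the documented order (process environment first, Pi global .env last) and keep the returned source name for status reporting.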

Variables:

BRAVE_API_KEY — required for Brave Search.
SEARXNG_BASE_URL — optional; defaults to http://172.30.55.22:8888.
FIRECRAWL_API_URL — optional; defaults to https://api.firecrawl.dev/v2.
FIRECRAWL_API_KEY — required for hosted Firecrawl, optional for self-hosted instances without auth.
FIRECRAWL_TIMEOUT_MS — optional request timeout, default 60000.

Secrets are never printed; status reports show only whether each secret is set and which source it came from.
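
As an illustration of how the defaults and the no-secrets rule might fit together, here is a minimal sketch. The default values are copied from the variable list above; the names Env, WebConfig, loadConfig, parseTimeoutMs, and describeSecret are hypothetical, and falling back to the default on an invalid timeout is an assumption rather than documented behavior.

```typescript
// Defaults are copied from the variable list above; every function and type
// name here is hypothetical and not taken from the pi-web source.
type Env = Record<string, string | undefined>;

const DEFAULT_SEARXNG_BASE_URL = "http://172.30.55.22:8888";
const DEFAULT_FIRECRAWL_API_URL = "https://api.firecrawl.dev/v2";
const DEFAULT_FIRECRAWL_TIMEOUT_MS = 60000;

// Parse the timeout; assumption: missing, non-numeric, or non-positive input
// falls back to the default.
function parseTimeoutMs(raw: string | undefined): number {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : DEFAULT_FIRECRAWL_TIMEOUT_MS;
}

interface WebConfig {
  searxngBaseUrl: string;
  firecrawlApiUrl: string;
  firecrawlTimeoutMs: number;
  braveApiKey: string | undefined;
  firecrawlApiKey: string | undefined;
}

// Build a config object from already-resolved environment values.
function loadConfig(env: Env): WebConfig {
  return {
    searxngBaseUrl: env.SEARXNG_BASE_URL || DEFAULT_SEARXNG_BASE_URL,
    firecrawlApiUrl: env.FIRECRAWL_API_URL || DEFAULT_FIRECRAWL_API_URL,
    firecrawlTimeoutMs: parseTimeoutMs(env.FIRECRAWL_TIMEOUT_MS),
    braveApiKey: env.BRAVE_API_KEY || undefined,
    firecrawlApiKey: env.FIRECRAWL_API_KEY || undefined,
  };
}

// Report only whether a secret is present, never its value.
function describeSecret(name: string, value: string | undefined): string {
  return `${name}: ${value ? "set" : "missing"}`;
}
```

Keeping the secret value out of describeSecret's return string means a status line can be logged or shown to the agent without leaking credentials.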

Tools

Brave:

brave_search — search web results, optionally fetch readable result content.
brave_content — fetch a URL and extract readable markdown.

SearXNG:

searxng_search — search web results through a configured self-hosted SearXNG metasearch instance.

Firecrawl:

firecrawl_search — search web/news/images, optionally scrape result markdown.
firecrawl_scrape — scrape a URL as markdown/html/links/summary/json.
firecrawl_map — discover URLs for a site.
firecrawl_crawl — start a conservative crawl, optionally polling for results.

Utility:

web_status — show configured provider status without secrets.
/web-status — command version of the status check.

Practical Guidance

Use searxng_search first for general web search, docs lookup, current facts, and source discovery; it is fast, self-hosted, and avoids hosted API costs/rate limits.
Use brave_search as a fast hosted fallback when SearXNG results are weak/unavailable, or when an independent search index is useful. Use include_content sparingly.
Use brave_content for fast known-URL article/docs extraction when simple readable markdown is enough.
Use firecrawl_search when SearXNG/Brave are unavailable, when Firecrawl search is preferred, or when search results should include scraped markdown.
Use firecrawl_scrape for known URLs when extraction quality matters, pages are dynamic, links are needed, or structured JSON extraction is requested.
Use firecrawl_map before crawling to discover candidate URLs and keep crawls targeted.
Use firecrawl_crawl only when multiple pages are required; keep limits conservative and use include/exclude paths.
When answering from web content, cite source URLs from result links or metadata.
Use web_status when provider configuration is uncertain or a web tool fails due to credentials/config.
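
The guidance above can be condensed into a small decision function. This is only a rough sketch: the Task type and pickTool are hypothetical and not part of the extension, the tool names are the ones listed under Tools, and the sketch models provider availability but not softer criteria such as weak results.

```typescript
// Hypothetical task model for illustration; the returned strings are the tool
// names listed in the Tools section.
type Task =
  | { kind: "search"; searxngAvailable: boolean; braveAvailable: boolean; wantScrapedMarkdown?: boolean }
  | { kind: "read-url"; needsHighQuality?: boolean }
  | { kind: "multi-page"; urlsKnown: boolean };

function pickTool(task: Task): string {
  switch (task.kind) {
    case "search":
      // Scraped markdown in search results is a Firecrawl feature.
      if (task.wantScrapedMarkdown) return "firecrawl_search";
      // Prefer the self-hosted SearXNG instance, then Brave, then Firecrawl.
      if (task.searxngAvailable) return "searxng_search";
      if (task.braveAvailable) return "brave_search";
      return "firecrawl_search";
    case "read-url":
      // Dynamic pages, links, or structured JSON call for firecrawl_scrape.
      return task.needsHighQuality ? "firecrawl_scrape" : "brave_content";
    case "multi-page":
      // Map first to discover candidate URLs, then crawl a targeted set.
      return task.urlsKnown ? "firecrawl_crawl" : "firecrawl_map";
  }
}
```

For example, a general search with SearXNG available resolves to searxng_search, while a multi-page task with no known URLs resolves to firecrawl_map so the later crawl can stay targeted.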