@bacnh85/pi-web
Pi extension for web search, page extraction, and Firecrawl scraping/crawling.
Package details
Install @bacnh85/pi-web from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@bacnh85/pi-web
- Package: @bacnh85/pi-web
- Version: 0.1.2
- Published: Jun 26, 2026
- Downloads: not available
- Author: bacnh85
- License: MIT
- Types: extension
- Size: 35.4 KB
- Dependencies: 4 dependencies · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./index.ts"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-web
Pi extension for web search, readable page extraction, and Firecrawl scraping/crawling.
Install
Install the published package from npm:
pi install npm:@bacnh85/pi-web
From this repository checkout, install only this extension package:
cd extensions/pi-web
npm install
cd ../..
pi install ./extensions/pi-web
# or test directly
pi -e ./extensions/pi-web
The package manifest points Pi directly at ./index.ts, so published npm installs and local installs load the same extension entrypoint.
If you manually copy this directory instead of using pi install, run npm install --omit=dev in the copied pi-web directory so readable-content dependencies such as @mozilla/readability are present.
Configuration
Environment lookup order:
- Process environment
- Current working directory .env.local
- Current working directory .env
- Pi global config .env.local ($PI_CODING_AGENT_DIR/.env.local when set; otherwise ~/.pi/agent/.env.local, then ~/.pi/agents/.env.local)
- Pi global config .env ($PI_CODING_AGENT_DIR/.env when set; otherwise ~/.pi/agent/.env, then ~/.pi/agents/.env)
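The lookup order above amounts to a first-match search over prioritized sources. The following is a minimal sketch of that idea, not the extension's actual code: the names `EnvSource` and `resolveEnv` are hypothetical, the sources are in-memory maps rather than parsed files, and treating an empty string as unset is an assumption.

```typescript
// Hypothetical sketch: illustrates first-match lookup, not the extension's implementation.
type EnvSource = { label: string; values: Record<string, string | undefined> };

interface Resolved {
  value: string;
  source: string;
}

// Walk the sources in priority order and return the first non-empty value,
// together with the label of the source it came from.
function resolveEnv(name: string, sources: EnvSource[]): Resolved | undefined {
  for (const { label, values } of sources) {
    const value = values[name];
    // Assumption: an empty string counts as "not set".
    if (value !== undefined && value !== "") {
      return { value, source: label };
    }
  }
  return undefined;
}

// Example sources, highest priority first (sample data, not real keys).
const sources: EnvSource[] = [
  { label: "process environment", values: {} },
  { label: "cwd .env.local", values: { BRAVE_API_KEY: "local-key" } },
  { label: "cwd .env", values: { BRAVE_API_KEY: "shared-key", FIRECRAWL_TIMEOUT_MS: "30000" } },
];

const brave = resolveEnv("BRAVE_API_KEY", sources);
const timeout = resolveEnv("FIRECRAWL_TIMEOUT_MS", sources);
```

In this sample, `BRAVE_API_KEY` resolves from the working-directory .env.local because it comes before .env, while `FIRECRAWL_TIMEOUT_MS` falls through to .env. Returning the source label alongside the value is what lets a status report name where a setting came from without printing it.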
Variables:
- BRAVE_API_KEY: required for Brave Search.
- SEARXNG_BASE_URL: optional; defaults to http://172.30.55.22:8888.
- FIRECRAWL_API_URL: optional; defaults to https://api.firecrawl.dev/v2.
- FIRECRAWL_API_KEY: required for hosted Firecrawl, optional for self-hosted instances without auth.
- FIRECRAWL_TIMEOUT_MS: optional request timeout in milliseconds; default 60000.
Secrets are never printed; status reports only show presence/source.
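As an illustration of how these defaults and the presence-only status could fit together, here is a small sketch. It is not taken from the extension source: `WebConfig`, `loadConfig`, and `secretStatus` are hypothetical names, and falling back to the default when the timeout is missing or not a positive number is an assumption.

```typescript
// Hypothetical sketch: applies the documented defaults to a plain env map.
interface WebConfig {
  braveApiKey: string | undefined;
  searxngBaseUrl: string;
  firecrawlApiUrl: string;
  firecrawlApiKey: string | undefined;
  firecrawlTimeoutMs: number;
}

function loadConfig(env: Record<string, string | undefined>): WebConfig {
  const parsedTimeout = Number(env.FIRECRAWL_TIMEOUT_MS);
  return {
    braveApiKey: env.BRAVE_API_KEY || undefined,
    searxngBaseUrl: env.SEARXNG_BASE_URL || "http://172.30.55.22:8888",
    firecrawlApiUrl: env.FIRECRAWL_API_URL || "https://api.firecrawl.dev/v2",
    firecrawlApiKey: env.FIRECRAWL_API_KEY || undefined,
    // Assumption: missing, non-numeric, or non-positive values use the default.
    firecrawlTimeoutMs:
      Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 60000,
  };
}

// Report only whether each secret is present, never its value.
function secretStatus(config: WebConfig): Record<string, "set" | "missing"> {
  return {
    BRAVE_API_KEY: config.braveApiKey ? "set" : "missing",
    FIRECRAWL_API_KEY: config.firecrawlApiKey ? "set" : "missing",
  };
}

// Sample input (placeholder values, not real credentials).
const config = loadConfig({ BRAVE_API_KEY: "placeholder", FIRECRAWL_TIMEOUT_MS: "15000" });
const status = secretStatus(config);
```

With the sample input, the SearXNG and Firecrawl URLs take their documented defaults, the timeout is overridden to 15000, and the status object contains only "set" or "missing" markers.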
Tools
Brave:
- brave_search: search web results, optionally fetching readable result content.
- brave_content: fetch a URL and extract readable markdown.
SearXNG:
- searxng_search: search web results through a configured self-hosted SearXNG metasearch instance.
Firecrawl:
- firecrawl_search: search web/news/images, optionally scraping result markdown.
- firecrawl_scrape: scrape a URL as markdown/html/links/summary/json.
- firecrawl_map: discover URLs for a site.
- firecrawl_crawl: start a conservative crawl, optionally polling for results.
Utility:
- web_status: show configured provider status without secrets.
- /web-status: command version of the status check.
Practical Guidance
- Use searxng_search first for general web search, docs lookup, current facts, and source discovery; it is fast, self-hosted, and avoids hosted API costs and rate limits.
- Use brave_search as a fast hosted fallback when SearXNG results are weak or unavailable, or when an independent search index is useful. Use include_content sparingly.
- Use brave_content for fast known-URL article/docs extraction when simple readable markdown is enough.
- Use firecrawl_search when SearXNG and Brave are unavailable, when Firecrawl search is preferred, or when search results should include scraped markdown.
- Use firecrawl_scrape for known URLs when extraction quality matters, pages are dynamic, links are needed, or structured JSON extraction is requested.
- Use firecrawl_map before crawling to discover candidate URLs and keep crawls targeted.
- Use firecrawl_crawl only when multiple pages are required; keep limits conservative and use include/exclude paths.
- When answering from web content, cite source URLs from result links or metadata.
- Use web_status when provider configuration is uncertain or a web tool fails due to credentials or configuration.
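The search-tool preference in the guidance above can be read as a simple priority rule. The sketch below encodes one possible reading of it; `Availability` and `pickSearchTool` are hypothetical names, and the extension itself does not necessarily contain such a function, since in practice the agent makes this choice.

```typescript
// Hypothetical sketch: one way to express the search-tool preference as code.
type SearchTool = "searxng_search" | "brave_search" | "firecrawl_search";

interface Availability {
  searxng: boolean;
  brave: boolean;
  firecrawl: boolean;
}

function pickSearchTool(
  available: Availability,
  wantScrapedMarkdown = false,
): SearchTool | undefined {
  // Firecrawl is preferred when results should include scraped markdown.
  if (wantScrapedMarkdown && available.firecrawl) return "firecrawl_search";
  // Otherwise prefer self-hosted SearXNG, then Brave, then Firecrawl.
  if (available.searxng) return "searxng_search";
  if (available.brave) return "brave_search";
  if (available.firecrawl) return "firecrawl_search";
  return undefined; // No search provider is configured.
}

const everything: Availability = { searxng: true, brave: true, firecrawl: true };
const defaultChoice = pickSearchTool(everything);
const scrapeChoice = pickSearchTool(everything, true);
const fallbackChoice = pickSearchTool({ searxng: false, brave: true, firecrawl: true });
```

The sketch covers availability only; the guidance also allows switching to brave_search when SearXNG results are weak, which is a judgment about result quality that a static rule like this does not capture.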