sift-web-tools
Pi agent web search, fetch, and save tools powered by the local sift CLI.
Package details
Install sift-web-tools from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:sift-web-tools
- Package: sift-web-tools
- Version: 0.1.3
- Published: May 5, 2026
- Downloads: not available
- Author: anoopkcn
- License: MIT
- Types: extension
- Size: 37.4 KB
- Dependencies: 0 dependencies · 4 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions/sift-web-tools"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
sift-web-tools
Adds LLM-callable tools (web_search, web_fetch, web_save, web_artifacts, web_clean) that give pi local-first web access via the sift CLI.
Install
pi install npm:sift-web-tools
For local testing before publishing:
pi install /Users/akc/develop/sift-web-tools
# or for one run only:
pi -e /Users/akc/develop/sift-web-tools
Requires the sift CLI to be installed and available on $PATH; see Prerequisites.
Tools
- web_search(query, max_results?): Runs sift search <query> --json (DuckDuckGo by default; SearXNG if configured) and renders the top results as a markdown list with titles, URLs, and snippets.
- web_fetch(url, max_chars?): Runs sift fetch <url> --json and returns the page's primary content as clean markdown, plus title/final_url/status/kind in the result details.
- web_save(url, mode?, filename?, force?): Runs sift fetch <url> --out /tmp/sift-web-tools/... and returns the saved local path instead of loading the content into context. Use it for large pages, PDFs, images, media, or files the agent should inspect later with read, grep, or bash. mode is rendered by default; raw saves original response bytes.
- web_artifacts(limit?): Lists files saved under /tmp/sift-web-tools/, newest first, with paths, sizes, kinds, and modification times. Also available as /web_artifacts [limit] (and typo-compatible /web_artifats [limit]).
- web_clean(older_than_minutes?, all?, dry_run?): Deletes saved artifacts. By default deletes files older than 1440 minutes; set all: true to delete everything or dry_run: true to preview matches. Also available as /web_clean [older_than_minutes|all] [dry-run].
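The web_clean selection rule described above can be sketched as a pure function. This is a minimal illustration, not the extension's actual code: the Artifact and CleanOptions shapes and the selectForCleanup name are hypothetical, and only the 1440-minute default, the all flag, and the dry-run behavior come from the description.

```typescript
// Hypothetical shape for one saved artifact; not the extension's real type.
interface Artifact {
  path: string;
  mtimeMs: number; // last modification time, in milliseconds since the epoch
}

// Hypothetical options object mirroring the documented tool parameters.
interface CleanOptions {
  olderThanMinutes?: number; // documented default: 1440 (24 hours)
  all?: boolean;             // match every artifact regardless of age
  dryRun?: boolean;          // preview matches without deleting
}

// Returns the artifacts that match the cleanup rule. A caller would delete
// them only when wouldDelete is true; this sketch never touches the filesystem.
function selectForCleanup(
  artifacts: Artifact[],
  nowMs: number,
  opts: CleanOptions = {},
): { matches: Artifact[]; wouldDelete: boolean } {
  const olderThanMinutes = opts.olderThanMinutes ?? 1440;
  const cutoffMs = nowMs - olderThanMinutes * 60000;
  const matches = opts.all
    ? [...artifacts]
    : artifacts.filter((a) => a.mtimeMs < cutoffMs);
  return { matches, wouldDelete: !opts.dryRun };
}
```

Keeping the selection separate from the deletion makes the dry-run path trivial: the same match list is reported either way, and only the delete step is skipped.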
To fetch multiple URLs, the agent issues parallel web_fetch or web_save tool calls in a single turn — sift instances run concurrently (one child process per URL). Artifact listing is read-only; cleanup runs sequentially.
The tools are local: queries and URLs are not forwarded to any third-party API. The agent talks to a child sift process on your machine, which in turn uses curl for the actual HTTP request.
Prerequisites
- sift CLI installed and available in the system's $PATH.
- curl, used by sift for transport.
- pdftotext (optional), only required if you want web_fetch to handle PDFs.
Get pre-built binaries
- Get the latest release
- Put the sift binary somewhere in your $PATH (e.g. ~/.local/bin/ or /usr/local/bin/).
Install from source
- git clone https://github.com/anoopkcn/sift
- zig build -Doptimize=ReleaseSafe
- and copy zig-out/bin/sift to ~/.local/bin/ or /usr/local/bin/.
Configuration
To override the binary location, set SIFT_BIN to a full path:
export SIFT_BIN="$HOME/.local/bin/sift" # or wherever you put it
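One way the override could be resolved is sketched below. The resolveSiftBin helper is hypothetical and the extension's real lookup may differ; the only behavior taken from the text is that SIFT_BIN, when set, names the binary, and that sift is otherwise found on $PATH.

```typescript
// Hypothetical helper: picks the sift executable from an environment map.
function resolveSiftBin(env: Record<string, string | undefined>): string {
  const override = env["SIFT_BIN"]?.trim();
  // An unset or blank variable falls back to the bare name, which the
  // operating system then resolves through $PATH when the process is spawned.
  return override ? override : "sift";
}
```

In a Node-based extension this would typically be called as resolveSiftBin(process.env).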
(Optional) To use SearXNG instead of DuckDuckGo for search, set sift's native env var:
export SIFT_SEARXNG_URL="https://your-searxng.example/search" # Replace the URL with your SearXNG instance's search endpoint
No extension change is needed; sift reads the variable directly.
Limits
- web_search truncates the rendered list to roughly max_results × 1600 chars (hard ceiling 30k) to keep the agent's context tidy.
- web_fetch truncates to max_chars (default 20000, max 100000) and appends [truncated, full length=N] when cut.
- web_save stores artifacts under /tmp/sift-web-tools/ and returns only path/size/mode hints to keep context small.
- web_save filenames are sanitized, path components are stripped, and an 8-char URL hash is appended to reduce collisions.
- web_artifacts and web_clean operate only on regular files directly inside /tmp/sift-web-tools/; they do not recurse into subdirectories.
- web_fetch and web_save reject non-http(s) schemes (file://, data:, etc.) before spawning sift.
- A 30-second timeout is passed to sift via --timeout.
- Execution uses pi's pi.exec() with the agent abort signal and an outer timeout; cancellation/timeout terminates the child process promptly.
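Two of the limits above, the web_fetch truncation and the scheme check, can be sketched as follows. This is an illustration under stated assumptions rather than the extension's code: the function and constant names are hypothetical, placing the marker on its own line is a guess, and the scheme check here is a simple prefix test.

```typescript
const DEFAULT_MAX_CHARS = 20000; // documented default for web_fetch
const HARD_MAX_CHARS = 100000;   // documented upper bound for max_chars

// Clamps the requested limit to [1, HARD_MAX_CHARS], then cuts the text and
// appends the truncation marker only when something was actually removed.
// Assumption: the marker goes on a new line after the kept text.
function truncateContent(text: string, maxChars?: number): string {
  const requested = maxChars ?? DEFAULT_MAX_CHARS;
  const limit = Math.min(Math.max(1, Math.floor(requested)), HARD_MAX_CHARS);
  if (text.length <= limit) return text;
  return `${text.slice(0, limit)}\n[truncated, full length=${text.length}]`;
}

// Accepts only URLs that start with http:// or https:// (case-insensitive).
// Anything else, such as file:// or data:, is rejected before any child
// process would be spawned.
function isHttpUrl(raw: string): boolean {
  return /^https?:\/\//i.test(raw.trim());
}
```

Running the scheme check before spawning sift means a rejected URL costs nothing more than a string test, and the child process never sees local or inline resources.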
Failure modes
Errors are thrown from the tool execution so pi marks the tool result as failed, with sift's exit code context included:
- "transport error: ...": exit 3 from sift (curl failed, HTTP 4xx/5xx, response > 50 MB).
- "page requires JavaScript (SPA) — sift cannot render it": exit 4. sift has no JS engine; report and move on rather than retrying.
- "output file exists: ...": exit 5 from sift if an output path collision still occurs.
- "unsupported content type: ...": exit 6 (e.g. PDF without pdftotext installed).
- "sift returned invalid JSON ...": sift emitted non-JSON in --json mode; the message includes a sample of the actual output for debugging.
- "sift binary not found ...": install sift or set SIFT_BIN.
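The exit-code mapping listed above might look roughly like this. The describeSiftFailure name, the detail parameter, and the fallback branch for unlisted codes are assumptions for illustration; only exit codes 3 to 6 and their message prefixes come from the list.

```typescript
// Maps a sift exit code to the error message prefix listed above.
// "detail" stands in for whatever context sift reported (hypothetical input).
function describeSiftFailure(exitCode: number, detail: string): string {
  switch (exitCode) {
    case 3:
      return `transport error: ${detail}`;
    case 4:
      // No detail is appended here; the listed message is fixed text.
      return "page requires JavaScript (SPA) — sift cannot render it";
    case 5:
      return `output file exists: ${detail}`;
    case 6:
      return `unsupported content type: ${detail}`;
    default:
      // Assumption: codes not listed above get a generic message.
      return `sift exited with code ${exitCode}: ${detail}`;
  }
}
```

A tool wrapper would throw an Error carrying this message so that the tool result is marked as failed.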