@pi-stef/web
Pi extension for no-key web search, URL fetch, rendered scraping, browser sessions, login flows, and CloakBrowser-backed page access.
Package details
Install @pi-stef/web from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@pi-stef/web- Package
@pi-stef/web- Version
0.3.1- Published
- Jun 12, 2026
- Downloads
- not available
- Author
- sfiorini
- License
- MIT
- Types
- extension
- Size
- 100.7 KB
- Dependencies
- 14 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./extensions"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
SF Web Access
@pi-stef/web provides no-key web search, URL fetch, browser automation, login sessions, and CloakBrowser-backed page rendering as a Pi extension.
Installation
pi install git:github.com/sfiorini/pi-stef#packages/web
For project-local install:
pi install -l git:github.com/sfiorini/pi-stef#packages/web
After installation, verify runtime dependencies:
pnpm --filter web check-runtime
Install the CloakBrowser browser binary when needed:
pnpm --filter web install-browser
Contents
- Natural Language Usage
- Slash Commands
- Tools
- Configuration
- No-Key Search Setup
- Browser Identity
- Security Notes
- Browser Tests
- License
Natural Language Usage
The agent understands natural-language requests and routes them to the correct tool. Examples:
Search the web:
"Search for current browser automation options and summarize the top sources."
"What's the latest news on DFW weather?"
"Find recent articles about Rust web frameworks."
The agent calls sf_web_search.
Fetch a specific URL:
"Fetch this URL as markdown: https://example.com/docs"
"Read the content of https://example.com and summarize it."
"Use the browser to fetch https://example.com (it needs JavaScript)."
The agent calls sf_web_fetch. Asking to "use the browser" forces mode='browser'.
Run a browser flow:
"Use the browser to go to google.com, search for 'espresso machines', and extract the results."
"Open walmart.com, search for 'laptop deals', and click the first result."
"Navigate to example.com/login, fill in the username and password, and submit."
The agent calls sf_web_flow.
Create a login session:
"Log into https://example.com using my credentials."
"Create a browser login session for the staging site."
The agent calls sf_web_login.
Manage sessions:
"List my browser sessions."
"Clear the 'staging' browser session profile."
The agent calls sf_web_session.
Slash Commands
Slash commands inject a prompt into the agent conversation. The agent then calls the corresponding tool.
| Command | Args | Example |
|---|---|---|
/sf-web-search |
<query> |
/sf-web-search DFW weather forecast |
/sf-web-fetch |
<url> |
/sf-web-fetch https://example.com/docs |
/sf-web-flow |
<instruction> |
/sf-web-flow go to google.com and search for Rust |
/sf-web-login |
<url> |
/sf-web-login https://example.com/login |
/sf-web-session |
[action] [name] |
/sf-web-session list |
/sf-web |
status |
/sf-web status |
If /sf-web-search is already registered by another package, the extension falls back to /sf-search.
Tools
All tools use the sf_web_ prefix to avoid collisions with other Pi extensions.
sf_web_search
Search the web through a no-key provider cascade.
| Parameter | Type | Required | Description |
|---|---|---|---|
query |
string | yes | Search query |
maxResults |
integer (1-20) | no | Maximum results to return (default: 10) |
providers |
string[] | no | Ordered provider list: searxng, searxng-html, duckduckgo, google, bing |
searxngUrl |
string | no | SearXNG instance URL override |
headless |
boolean | no | Headless browser mode (default: true) |
profile |
string | no | Browser profile name for browser-backed providers |
sf_web_fetch
Fetch a specific URL. Defaults to fast HTTP; falls back to CloakBrowser for JS-heavy pages.
| Parameter | Type | Required | Description |
|---|---|---|---|
url |
string | yes | URL to fetch |
format |
string | no | Output format: markdown, text, html, json, raw (default: markdown) |
mode |
string | no | Fetch mode: auto, fast, browser (default: auto) |
headless |
boolean | no | Headless browser mode (default: true) |
profile |
string | no | Browser profile name for rendered fetches |
screenshot |
boolean | no | Capture a screenshot when browser mode is used |
selector |
string | no | CSS selector to extract from HTML pages |
mode values:
auto— tries fast HTTP first, falls back to browser if JS-heavy or blockedfast— HTTP only, no browser fallbackbrowser— always uses CloakBrowser rendering
sf_web_flow
Automate multi-step browser interactions in CloakBrowser. Accepts natural-language instructions or structured step arrays.
| Parameter | Type | Required | Description |
|---|---|---|---|
instruction |
string | no | Natural-language flow instruction |
steps |
array | no | Structured step array (see below) |
headless |
boolean | no | Headless browser mode (default: true) |
profile |
string | no | Browser profile name |
Either instruction or steps must be provided.
Natural-language instruction examples:
"go to google.com and search for espresso machines"
"open example.com then click 'About' then wait 2s"
"navigate to https://example.com/login; type 'user@example.com' in input[name='email']; press Enter"
Structured step actions:
| Action | Fields | Description |
|---|---|---|
goto / navigate / open |
url |
Navigate to URL |
click |
selector, or role + name, or text |
Click an element |
type / fill |
text, optional selector |
Type text into an input |
press / keypress / key |
key, optional selector |
Press a keyboard key |
wait |
ms |
Wait in milliseconds (0-120000) |
screenshot |
path |
Capture screenshot to file |
extract |
selector, optional count |
Extract text content from matching elements |
sf_web_login
Create or refresh a named CloakBrowser login profile. Credentials come from environment variables.
| Parameter | Type | Required | Description |
|---|---|---|---|
url |
string | yes | Login page URL |
profile |
string | no | Browser profile name (default: default) |
interactive |
boolean | no | Open a headed browser for manual login |
interactiveWaitMs |
integer (1000-600000) | no | Wait time for interactive login |
usernameEnv |
string | no | Env var name containing username (default: SF_WEB_USERNAME) |
passwordEnv |
string | no | Env var name containing password (default: SF_WEB_PASSWORD) |
headless |
boolean | no | Headless browser mode (default: true) |
sf_web_session
List, inspect, locate, or clear CloakBrowser session profiles.
| Parameter | Type | Required | Description |
|---|---|---|---|
action |
string | no | list, inspect, locate, or clear (default: list) |
profile |
string | no | Profile name (default: default) |
yes |
boolean | no | Confirm destructive clear action |
Configuration
Configuration is loaded from four layers, merged with later layers overriding earlier:
- Hardcoded defaults
- Config file:
~/.pi/web/config.json(or$SF_WEB_CONFIG) - Environment variables:
SF_WEB_* - Runtime parameters: passed directly to tool calls
Config File
Location: ~/.pi/web/config.json (override with SF_WEB_CONFIG).
{
"maxResults": 10,
"searchProviders": ["searxng", "duckduckgo", "google", "bing"],
"searxngUrl": "https://search.example.com",
"allowPrivateNetworks": false,
"fetchMaxBytes": 2097152,
"fetchTimeoutMs": 15000,
"maxBytes": 51200,
"maxLines": 2000,
"outputDir": "/tmp/sf-web",
"profilesDir": "~/.pi/web/profiles",
"userAgent": "Mozilla/5.0 ...",
"browserFingerprintSeed": "42",
"browserHumanPreset": "default",
"browserLocale": "en-US",
"browserTimezone": "America/New_York",
"browserProxy": "socks5://proxy:1080",
"browserGeoip": false
}
Environment Variables
| Env Var | Config Key | Type | Default | Description |
|---|---|---|---|---|
SF_WEB_CONFIG |
— | string | ~/.pi/web/config.json |
Config file path |
SF_WEB_SEARXNG_URL |
searxngUrl |
string | — | SearXNG instance URL |
SF_WEB_SEARCH_PROVIDERS |
searchProviders |
CSV | searxng,duckduckgo,google,bing |
Ordered search providers |
SF_WEB_MAX_RESULTS |
maxResults |
integer | 10 | Default max search results |
SF_WEB_MAX_BYTES |
maxBytes |
integer | 51200 | Max tool output bytes |
SF_WEB_MAX_LINES |
maxLines |
integer | 2000 | Max tool output lines |
SF_WEB_FETCH_MAX_BYTES |
fetchMaxBytes |
integer | 2097152 | Max HTTP body bytes |
SF_WEB_FETCH_TIMEOUT_MS |
fetchTimeoutMs |
integer | 15000 | HTTP fetch timeout (ms) |
SF_WEB_ALLOW_PRIVATE_NETWORKS |
allowPrivateNetworks |
boolean | false | Allow private/loopback IPs |
SF_WEB_OUTPUT_DIR |
outputDir |
string | $TMPDIR/sf-web |
Output file directory |
SF_WEB_PROFILES_DIR |
profilesDir |
string | ~/.pi/web/profiles |
Browser profiles directory |
SF_WEB_USER_AGENT |
userAgent |
string | Chrome 124 UA | HTTP user-agent (fast fetch only) |
SF_WEB_BROWSER_FINGERPRINT_SEED |
browserFingerprintSeed |
string | auto | Fingerprint seed (positive integer) |
SF_WEB_BROWSER_HUMAN_PRESET |
browserHumanPreset |
string | — | default or careful |
SF_WEB_BROWSER_LOCALE |
browserLocale |
string | — | Browser locale (e.g., en-US) |
SF_WEB_BROWSER_TIMEZONE |
browserTimezone |
string | — | Browser timezone (e.g., America/New_York) |
SF_WEB_BROWSER_PROXY |
browserProxy |
string | — | Proxy URL (http:, https:, or socks5:) |
SF_WEB_BROWSER_GEOIP |
browserGeoip |
boolean | — | Enable GeoIP (requires CloakBrowser support) |
SF_WEB_USERNAME |
— | string | — | Default username for sf_web_login |
SF_WEB_PASSWORD |
— | string | — | Default password for sf_web_login |
SF_WEB_SENSITIVE_QUERY_KEYS |
sensitiveQueryKeys |
CSV | built-in list | Additional URL params to redact from errors |
No-Key Search Setup
Search works without API keys by default. Configure a SearXNG instance for best results:
export SF_WEB_SEARXNG_URL="https://search.example.com"
Provider cascade:
SearXNG JSON → SearXNG HTML → DuckDuckGo HTML/Lite → Google (browser) → Bing (browser)
Each provider is tried in order. The first to return results wins. Google and Bing require CloakBrowser and are last-resort fallbacks.
Browser Identity
Browser mode uses CloakBrowser for fingerprint management. Named profiles get a stable derived fingerprint seed.
For first-run challenges, use headed mode once, complete the challenge manually, then reuse the profile in headless mode.
Browser hardening is config-only (not tool parameters):
| Setting | Config Key | Notes |
|---|---|---|
| Fingerprint seed | browserFingerprintSeed |
Positive integer; auto-derived from profile name if omitted |
| Human preset | browserHumanPreset |
default or careful |
| Locale | browserLocale |
e.g., en-US |
| Timezone | browserTimezone |
e.g., America/New_York |
| Proxy | browserProxy |
http:, https:, or socks5: URL |
| GeoIP | browserGeoip |
Requires CloakBrowser's optional GeoIP support |
Security Notes
- Only
http:andhttps:URLs are allowed. - Private, loopback, link-local, multicast, and reserved IP ranges are blocked by default.
- Fast fetch re-validates redirects before reading the body.
- Browser profiles contain cookies and session tokens. Clear them when no longer needed.
sf_web_logindoes not accept raw passwords. Use interactive login or environment variable names.
Browser Tests
Real browser tests are skipped by default. Run them when intentionally testing CloakBrowser:
SF_WEB_RUN_BROWSER_TESTS=1 pnpm exec vitest run packages/web/tests/browserSmoke.test.ts
SF_WEB_RUN_BROWSER_TESTS=1 pnpm exec vitest run packages/web/tests/tools.e2e.test.ts
License
Runtime dependencies:
| Package | Purpose | License |
|---|---|---|
cloakbrowser |
CloakHQ browser launcher/runtime | MIT |
playwright-core |
Browser automation API | Apache-2.0 |
defuddle |
Clean article extraction | MIT |
@mozilla/readability |
Readability fallback extraction | Apache-2.0 |
turndown |
HTML to Markdown conversion | MIT |
jsdom |
Server-side DOM for extraction | MIT |
@sinclair/typebox |
JSON schema type builder | MIT |