pi-web-access-lean
Lean web search, URL fetching, code search, GitHub repo cloning, and PDF extraction for Pi coding agent
Package details
Install pi-web-access-lean from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-web-access-lean
- Package: pi-web-access-lean
- Version: 0.1.1
- Published: May 29, 2026
- Downloads: not available
- Author: nabsku_
- License: MIT
- Types: extension
- Size: 141.5 KB
- Dependencies: 5 dependencies · 0 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
Pi Web Access Lean
Web search, code search, URL fetching, GitHub repository cloning, and PDF extraction for Pi Coding Agent.
Fork of nicobailon/pi-web-access. Full credit for the original extension, feature design, and implementation goes to the upstream project and its author.
This fork keeps the web-access tools that are commonly useful in coding-agent sessions and keeps the tool surface small.
Upstream credit
This repository is a lean fork of the original pi-web-access extension by Nico Bailon.
The original extension includes the broader feature set: web search, content extraction, curator workflow, Gemini/Web fallback paths, YouTube/video understanding, and related tooling. This fork intentionally removes parts of that surface for a smaller Pi Coding Agent footprint; it is not a replacement for the full upstream package.
Features
- web_search: web search through Exa or Perplexity
- code_search: code, documentation, and API search through Exa MCP code context, with fallback search
- fetch_content: readable markdown extraction for URLs
- GitHub repository handling: clone repositories locally instead of scraping rendered HTML
- PDF extraction: extract text PDFs and save markdown output
- HTML extraction: Readability, Next.js RSC parsing, and Jina Reader fallback
- Activity widget for request/response visibility
Design
Pi loads extension tool schemas into the agent context. Large schemas and rarely used workflows add input tokens to every prompt, including prompts that never use web access.
This package is built around a smaller default surface:
- three tools: search, code search, fetch content
- no interactive search-review workflow
- no browser-cookie handling
- no video processing pipeline
- no provider paths that require unrelated credentials
- concise tool descriptions
The result is a smaller extension footprint while preserving the main web and code-research paths used by a coding agent.
Benchmarks
Measurement command shape:
PI_CODING_AGENT_DIR="$TMP" \
pi -p --no-session \
--no-skills --no-context-files --no-prompt-templates --no-themes \
--mode json \
--model openai-codex/gpt-5.5 \
--thinking minimal \
"Reply exactly OK"
Measured extension input-token footprint:
- Original pi-web-access: +1,180 input tokens
- pi-web-access-lean: +302 input tokens
- Reduction: 878 input tokens
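The reduction can also be read as a share of the original footprint. The short calculation below uses only the two measured token counts listed above; the variable names are illustrative.

```typescript
// Token counts copied from the measurements above.
const originalTokens = 1180;
const leanTokens = 302;

// Absolute and relative reduction in per-prompt input tokens.
const reduction = originalTokens - leanTokens;
const reductionPercent = (reduction / originalTokens) * 100;

console.log(`${reduction} tokens saved (${reductionPercent.toFixed(1)}%)`);
// prints "878 tokens saved (74.4%)"
```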
Install
From npm:
pi install npm:pi-web-access-lean
Or add it to Pi settings:
{
"packages": ["npm:pi-web-access-lean"]
}
From GitHub:
pi install git:github.com/Nabsku/pi-web-access-lean
From a local checkout, useful while developing:
git clone https://github.com/Nabsku/pi-web-access-lean.git
pi install /path/to/pi-web-access-lean
Requires Pi Coding Agent with extension support.
Configuration
Configuration is read from ~/.pi/web-search.json. All fields are optional.
{
"exaApiKey": "exa-...",
"perplexityApiKey": "pplx-...",
"provider": "auto",
"githubClone": {
"enabled": true,
"maxRepoSizeMB": 350,
"cloneTimeoutSeconds": 30,
"clonePath": "/tmp/pi-github-repos"
},
"shortcuts": {
"activity": "ctrl+shift+w"
}
}
Provider selection:
- auto: Exa first, then Perplexity when configured
- exa: Exa only
- perplexity: Perplexity only
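As a rough illustration of the selection rules above, the sketch below maps a provider setting to the order in which backends would be tried. The function name `providerOrder` and the `hasPerplexityKey` flag are hypothetical and are not taken from the extension's source.

```typescript
type Provider = "auto" | "exa" | "perplexity";
type Backend = "exa" | "perplexity";

// Hypothetical helper: returns the backends to try, in order.
function providerOrder(provider: Provider, hasPerplexityKey: boolean): Backend[] {
  if (provider === "exa") return ["exa"];
  if (provider === "perplexity") return ["perplexity"];
  // "auto": Exa first, then Perplexity only when a key is configured.
  return hasPerplexityKey ? ["exa", "perplexity"] : ["exa"];
}

console.log(providerOrder("auto", true)); // [ 'exa', 'perplexity' ]
```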
Environment variables take precedence where supported:
- EXA_API_KEY
- PERPLEXITY_API_KEY
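One way to picture this precedence is a small merge step where an environment value, when present, wins over the value from ~/.pi/web-search.json. This is a minimal sketch, not the extension's actual loader: `resolveKeys` is a hypothetical name, and treating an empty environment variable as unset is an assumption.

```typescript
interface WebSearchKeys {
  exaApiKey?: string;
  perplexityApiKey?: string;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper: environment variables override config-file values.
// Assumption: an empty environment variable counts as "not set".
function resolveKeys(config: WebSearchKeys, env: Env): WebSearchKeys {
  return {
    exaApiKey: env["EXA_API_KEY"] || config.exaApiKey,
    perplexityApiKey: env["PERPLEXITY_API_KEY"] || config.perplexityApiKey,
  };
}

const keys = resolveKeys(
  { exaApiKey: "exa-from-file", perplexityApiKey: "pplx-from-file" },
  { EXA_API_KEY: "exa-from-env" },
);
console.log(keys.exaApiKey, keys.perplexityApiKey);
// prints "exa-from-env pplx-from-file"
```

Passing the environment as an argument, instead of reading process.env inside the function, keeps the sketch easy to test.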
Tools
web_search
Search the web and return an answer with sources.
web_search({ query: "TypeScript best practices 2026" })
web_search({ queries: ["query 1", "query 2"] })
web_search({ query: "AI agent observability", recencyFilter: "week" })
web_search({ query: "React Server Components", domainFilter: ["react.dev"] })
web_search({ query: "Pi Coding Agent extensions", provider: "exa" })
web_search({ query: "benchmark result", includeContent: true })
Parameters:
- query/queries: single query or batch of queries
- numResults: results per query, default 5, max 20
- recencyFilter: day, week, month, or year
- domainFilter: domains to include; prefix with - to exclude
- provider: auto, exa, or perplexity
- includeContent: fetch page content for results in the background
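The domainFilter convention (a leading - marks an exclusion) can be sketched as a simple split into include and exclude lists. The helper name `splitDomainFilter` is hypothetical, and how the extension passes these lists to each provider is not shown here.

```typescript
interface DomainFilter {
  include: string[];
  exclude: string[];
}

// Hypothetical helper: entries starting with "-" are exclusions,
// everything else is an inclusion. Blank entries are skipped.
function splitDomainFilter(domainFilter: string[]): DomainFilter {
  const include: string[] = [];
  const exclude: string[] = [];
  for (const entry of domainFilter) {
    const domain = entry.trim();
    if (domain === "" || domain === "-") continue;
    if (domain.startsWith("-")) exclude.push(domain.slice(1));
    else include.push(domain);
  }
  return { include, exclude };
}

console.log(splitDomainFilter(["react.dev", "-example.com"]));
// { include: [ 'react.dev' ], exclude: [ 'example.com' ] }
```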
code_search
Search for code examples, documentation, APIs, and debugging references.
Uses Exa MCP code-context when available. Falls back to code-focused web search.
code_search({ query: "React useEffect cleanup pattern" })
code_search({ query: "Express middleware error handling", maxTokens: 10000 })
Parameters:
- query: programming question, API, library, or debugging topic
- maxTokens: context budget, default 5000, max 50000
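The default and maximum for maxTokens could be applied as below. This is only a sketch: whether the extension clamps an oversized value or rejects it is not documented here, so clamping is an assumption, as are the lower bound of 1 and the name `resolveMaxTokens`.

```typescript
const DEFAULT_MAX_TOKENS = 5000;
const MAX_TOKENS_LIMIT = 50000;

// Hypothetical helper: fall back to the default when no value is given,
// then clamp into [1, MAX_TOKENS_LIMIT]. Clamping is an assumption.
function resolveMaxTokens(requested?: number): number {
  const value = requested ?? DEFAULT_MAX_TOKENS;
  return Math.min(Math.max(Math.floor(value), 1), MAX_TOKENS_LIMIT);
}

console.log(resolveMaxTokens());       // 5000
console.log(resolveMaxTokens(10000));  // 10000
console.log(resolveMaxTokens(999999)); // 50000
```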
fetch_content
Fetch URLs and extract readable content as markdown.
fetch_content({ url: "https://example.com/article" })
fetch_content({ urls: ["https://a.example", "https://b.example"] })
fetch_content({ url: "https://github.com/owner/repo" })
fetch_content({ url: "https://example.com/report.pdf" })
Parameters:
- url/urls: single URL/path or multiple URLs
- forceClone: clone GitHub repositories that exceed the size threshold
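The interaction between forceClone and the maxRepoSizeMB setting can be pictured as a small decision function. This is a sketch under assumptions: the name `chooseGithubStrategy` is hypothetical, the githubClone.enabled switch is left out, and the real extension may weigh other factors.

```typescript
type GithubStrategy = "clone" | "api-fallback";

// Hypothetical helper: clone when the repository fits under the size
// threshold or when the caller explicitly forces a clone; otherwise use
// the GitHub API fallback.
function chooseGithubStrategy(
  repoSizeMB: number,
  maxRepoSizeMB: number,
  forceClone = false,
): GithubStrategy {
  if (forceClone || repoSizeMB <= maxRepoSizeMB) return "clone";
  return "api-fallback";
}

console.log(chooseGithubStrategy(120, 350));       // clone
console.log(chooseGithubStrategy(900, 350));       // api-fallback
console.log(chooseGithubStrategy(900, 350, true)); // clone
```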
Extraction flow
web_search(query)
→ Exa direct API or MCP
→ Perplexity, when configured
fetch_content(url)
→ GitHub URL? clone repository or use GitHub API fallback
→ HTTP fetch
→ PDF? extract text, save markdown to ~/Downloads/
→ HTML? Readability → RSC parser → Jina Reader fallback
→ text/json/markdown? return directly
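The branch on response type in the fetch_content flow above might look like the following. This is a minimal sketch that routes on the Content-Type header alone; the extension may also inspect URLs or response bytes, and the names `Route` and `routeByContentType` are hypothetical.

```typescript
type Route = "pdf" | "html" | "text" | "unsupported";

// Hypothetical helper: pick an extraction path from a Content-Type value.
function routeByContentType(contentType: string): Route {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const [raw = ""] = contentType.split(";");
  const mime = raw.trim().toLowerCase();

  if (mime === "application/pdf") return "pdf";
  if (mime === "text/html" || mime === "application/xhtml+xml") return "html";
  // Plain text, markdown, and JSON are returned without extraction.
  if (mime.startsWith("text/") || mime === "application/json") return "text";
  return "unsupported";
}

console.log(routeByContentType("text/html; charset=utf-8")); // html
console.log(routeByContentType("application/pdf"));          // pdf
console.log(routeByContentType("text/markdown"));            // text
```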
Commands
Activity monitor
Toggle with Ctrl+Shift+W to see live request/response activity:
─── Web Search Activity ────────────────────────────────────
API "typescript best practices" 200 2.1s ✓
GET docs.example.com/article 200 0.8s ✓
GET blog.example.com/post 404 0.3s ✗
────────────────────────────────────────────────────────────
Development
npm install
npm test
Files
- index.ts: extension entry, tools, activity widget
- search.ts: search routing for Exa and Perplexity
- code-search.ts: code/docs search via Exa MCP
- extract.ts: URL/path routing, HTTP extraction, fallback orchestration
- github-extract.ts: GitHub URL parsing, clone cache, content generation
- github-api.ts: GitHub API fallback for large repositories and commit SHAs
- exa.ts: Exa search provider, direct API and MCP proxy
- perplexity.ts: Perplexity API client with rate limiting
- pdf-extract.ts: PDF text extraction, saves markdown output
- rsc-extract.ts: RSC flight data parser for Next.js pages
- utils.ts: shared formatting and error helpers
- activity.ts: activity tracking widget