pikit-web-access
Web search and content extraction for pi
Package details
Install pikit-web-access from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pikit-web-access- Package
pikit-web-access- Version
1.0.1- Published
- Jul 5, 2026
- Downloads
- not available
- Author
- adrianapan
- License
- MIT
- Types
- extension
- Size
- 21.6 KB
- Dependencies
- 5 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./src/index.ts"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pikit-web-access — web search & content extraction for pi.dev
Web search and content extraction for pi. Search the web via Gemini AI, fetch and read web pages, and extract text from PDFs — all from within the agent.
Install
pi install npm:pikit-web-access
[!TIP] Or grab the entire pikit setup, an opinionated pi.dev configuration that includes this extension.
Features
- Web search: Queries Gemini AI with Google Search grounding — returns a synthesized answer with source citations. Uses
gemini-2.5-flash-lite(hardcoded insrc/search.ts), the cheapest model in the 2.5 family that supports thegoogle_searchgrounding tool. - Page fetching: Fetches any URL and extracts clean readable markdown via Readability + Turndown
- PDF extraction: Detects PDFs by URL or content-type and extracts their text — no API key required
- Multi-URL support: Fetch several URLs in parallel in a single call
- Result storage: Large responses are stored and retrievable in full via
get_search_content - Session persistence: Stored results survive
/reloadand are restored on session start
Structure
web-access/
├── package.json
├── README.md
└── src/
├── index.ts — tool registration and session_start restore
├── types.ts — shared interfaces (SearchResult, ExtractedContent, StoredData)
├── config.ts — reads GEMINI_API_KEY from process.env, clear error if missing
├── storage.ts — in-memory result store with session persistence via appendEntry
├── search.ts — web_search via Gemini API with google_search grounding (model: gemini-2.5-flash-lite)
├── extract.ts — fetch pipeline: Readability → Turndown, PDF detection and routing
├── pdf.ts — PDF text extraction via unpdf
└── utils.ts — shared helpers (truncate, errorMessage, abort check, PDF detection)
Configuration
web_search requires a Gemini API key. fetch_content (including PDF) works without one.
Option 1 — reuse your existing pi Gemini key
If you already have GEMINI_API_KEY set in your environment for Gemini models, the extension picks it up automatically — no extra config needed.
Option 2 — add to your shell profile (e.g. ~/.zshrc)
export GEMINI_API_KEY="AIza...your-key-here"
Security note: the key will be in the shell environment, visible by the
bashtool. Avoid runningenvor commands that print the full environment when a model is watching.
Get a free key at: https://aistudio.google.com/apikey
If GEMINI_API_KEY is not set and web_search is called, the tool returns a clear error message with setup instructions.
Tools
web_search
Search the web via Gemini AI with Google Search grounding. Returns a synthesized answer with source citations.
web_search({ query: "TypeScript 5.5 new features" })
web_search({ queries: ["React 19 changes", "Next.js 15 release notes"] })
| Parameter | Description |
|---|---|
query |
Single search query |
queries |
Multiple queries run in parallel |
fetch_content
Fetch one or more URLs and extract readable content as markdown. Automatically handles PDFs.
fetch_content({ url: "https://example.com/article" })
fetch_content({ urls: ["https://url1.com", "https://url2.com"] })
fetch_content({ url: "https://example.com/doc.pdf" })
| Parameter | Description |
|---|---|
url |
Single URL to fetch |
urls |
Multiple URLs fetched in parallel (3 at a time) |
get_search_content
Retrieve full stored content from a previous web_search or fetch_content call. Use when the original response was truncated.
get_search_content({ responseId: "abc123" })
get_search_content({ responseId: "abc123", queryIndex: 1 })
get_search_content({ responseId: "abc123", url: "https://example.com" })
get_search_content({ responseId: "abc123", urlIndex: 2 })
| Parameter | Description |
|---|---|
responseId |
The responseId from a previous web_search or fetch_content call |
queryIndex |
Query to retrieve by index (web_search, default: 0) |
urlIndex |
URL to retrieve by index (fetch_content, default: 0) |
url |
Retrieve content for a specific URL (fetch_content) |
Stored results expire after 1 hour or when a new session starts.