@curio-data/pi-intelli-search
Intelligent web research for Pi: search, extract, collate, and cache grounded web context in one tool call.
Package details
Install @curio-data/pi-intelli-search from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@curio-data/pi-intelli-search
- Package: @curio-data/pi-intelli-search
- Version: 0.4.0
- Published: May 5, 2026
- Downloads: 256/mo · 256/wk
- Author: miah0x41
- License: Apache-2.0
- Types: extension, skill
- Size: 245 KB
- Dependencies: 3 dependencies · 3 peers
Pi manifest JSON
{
"extensions": [
"./src/index.ts"
],
"skills": [
"./skills"
],
"image": "https://raw.githubusercontent.com/Curio-Data/pi-intelli-search/main/docs/images/02.png"
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-intelli-search
Intelligent web research for Pi: search, extract, collate, and cache grounded web context in one tool call.
A Pi extension that adds a 5-stage research pipeline (Search, Fetch, Extract, Collate, and Cache Suggest) designed for technical task completion. Per-page LLM extraction compresses raw pages to query-relevant content, which is then deduplicated across sources into a concise summary backed by a persistent `.search/` cache.
Features:
- Search: Perplexity Sonar via OpenRouter (one API key, no $50 minimum).
- Extract: Per-page LLM extraction compresses ≈50K to ≈3-5K chars.
- Collate: Cross-source deduplication into a focused ≈5K summary.
- Cache: Persistent `.search/` cache for offline reuse and follow-up.
- Configurable: Swap any pipeline stage to any model Pi supports.
- Low cost: Approximately $0.05 per research session with default settings.
Why intelli-search?
Most coding agents handle web research with a simple two-step pattern: fetch URL, then dump raw content into context. Claude Code's WebFetch tool, revealed in its open-sourced CLI, follows exactly this approach. It fetches a page, converts HTML to Markdown (via the Jina Reader API), and hands the full result to the model.
The problem is that a cleaned documentation page is still ≈50K characters. For the default 8 sources, that is ≈400K chars dumped into the agent's context window. The model must simultaneously hold your task, the codebase, and a wall of raw web content. Signal-to-noise drops fast.
intelli-search takes a different approach: extract before you collate.
Each page is compressed by a dedicated extraction model before entering the agent's context. A collation model then deduplicates across extractions. The agent receives a focused ≈5K summary instead of ≈400K of raw HTML.
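The extract-before-collate pattern can be sketched as a pair of stages with a pluggable extractor. This is a simplified illustration, not the extension's actual code: the real `intelli_extract` and `intelli_collate` stages call LLMs, while the dedupe step here is a naive line-based stand-in.

```typescript
// Hypothetical sketch: each page is compressed independently, then the
// extractions are deduplicated across sources before reaching the agent.
type Extractor = (page: string, focus: string) => string;

function collate(pages: string[], focus: string, extract: Extractor): string {
  // Stage 1: per-page extraction (done by an LLM in the real pipeline).
  const extractions = pages.map((p) => extract(p, focus));

  // Stage 2: cross-source dedupe (an LLM synthesis in the real pipeline;
  // here, a naive "drop repeated lines" stand-in).
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const e of extractions) {
    for (const line of e.split("\n")) {
      if (!seen.has(line)) {
        seen.add(line);
        lines.push(line);
      }
    }
  }
  return lines.join("\n");
}
```

The key property is that raw page size never reaches the agent: only the already-compressed extractions are combined.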
| | Fetch-and-dump | intelli-search pipeline |
|---|---|---|
| Context cost | ≈400K chars raw | ≈5K chars focused |
| Noise | Nav, ads, sidebars included | Stripped by extraction |
| Deduplication | None. Overlapping sources waste tokens | Cross-source dedupe via collation |
| Cost per session | N/A (no processing) | ≈$0.05 |
| Offline reuse | No | Cached in `.search/` |
Install
From npm (recommended):
pi install npm:@curio-data/pi-intelli-search
From GitHub:
pi install git:github.com/Curio-Data/pi-intelli-search
Local development:
pi install /path/to/pi-intelli-search
On first load, the extension adds Perplexity Sonar models to ~/.pi/agent/models.json under the openrouter provider. This patch approach lets Pi discover Sonar through OpenRouter. No separate Perplexity API account is needed.
Tools
| Tool | Description |
|---|---|
| `intelli_search` | Search via Perplexity Sonar. Returns summary with source URLs. |
| `intelli_extract` | Per-page LLM extraction. Reduces ≈50K chars to ≈3-5K of relevant content. |
| `intelli_collate` | Deduplicate and synthesise extractions into a summary. Writes cache. |
| `intelli_research` | Full pipeline: search, fetch, extract, collate, cache. One call. |
Quick Start
Quick Search
intelli_search(query="TypeScript 5.8 release date")
Deep Research
Always provide a focusPrompt. The extraction LLM works best with specific guidance.
intelli_research(
query="Svelte 5 runes tutorial examples",
focusPrompt="Extract the core rune concepts ($state, $derived, $effect), their syntax, and how they replace the old reactive declarations. Include migration patterns from Svelte 4."
)
Targeted Research With Domain Restriction
intelli_research(
query="Cloudflare Workers KV write timeout limits",
focusPrompt="Extract KV write limits, timeout thresholds, storage limits, and any workarounds for bulk writes. Focus on hard numbers and error messages.",
maxUrls=3,
domains=["developers.cloudflare.com"]
)
Comparing Options
intelli_research(
query="Tailwind CSS vs Vanilla Extract comparison 2026",
focusPrompt="Extract pros/cons, bundle size benchmarks, DX tradeoffs, and migration costs. Note which claims come from official sources vs blog opinions."
)
Model Configuration
All three pipeline stages use independently configurable models. Defaults are chosen for cost-efficiency, but any model Pi can access works. This includes built-in providers, OpenRouter models, or models from other extensions.
| Stage | Default | Config key |
|---|---|---|
| Search | `openrouter/perplexity/sonar` | `intelliSearchModel` |
| Extract | `minimax/MiniMax-M2.7` | `intelliExtractModel` |
| Collate | `minimax/MiniMax-M2.7` | `intelliCollateModel` |
Why OpenRouter For Sonar?
Perplexity Sonar is an excellent search-grounded model, but it is not in Pi's built-in model list. Rather than requiring a separate Perplexity API account (which requires a $50 minimum credit top-up), the extension routes Sonar through OpenRouter. OpenRouter is a unified pay-as-you-go API with no minimum spend. One API key gives you Sonar alongside thousands of other models. On first load, the extension patches ~/.pi/agent/models.json to add Sonar under the openrouter provider so Pi can discover it. This approach has several benefits:
- No minimum spend: avoids the Perplexity API's $50 minimum credit top-up; OpenRouter is pay-as-you-go.
- One account, many models: the same OpenRouter key covers Sonar and any other models you might want for extract or collate.
- Non-destructive: the patch merges new models by ID and never replaces existing OpenRouter models.
- Idempotent: safe across extension reloads and updates.
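A merge with those two properties can be sketched as follows. The `ModelEntry` shape is hypothetical; the real `models.json` schema may differ.

```typescript
// Hypothetical sketch of a non-destructive, idempotent merge-by-ID,
// as described for the models.json patch. Entry shape is illustrative.
interface ModelEntry {
  id: string;
  name?: string;
}

function mergeModels(existing: ModelEntry[], additions: ModelEntry[]): ModelEntry[] {
  const byId = new Map<string, ModelEntry>(existing.map((m) => [m.id, m]));
  for (const add of additions) {
    // Only add entries whose ID is absent; never replace an existing one.
    if (!byId.has(add.id)) byId.set(add.id, add);
  }
  return [...byId.values()];
}
```

Because existing IDs always win and additions are keyed by ID, applying the same patch twice yields the same file, which is what makes reloads and updates safe.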
Swapping The Extract And Collate Model
MiniMax M2.7 is the default because it is cheap and effective for extraction and collation. However, you can use any model Pi supports. Override in ~/.pi/agent/settings.json or .pi/settings.json:
Option A: Use A Pi Built-In Provider (auth via /login):
{
"intelliExtractModel": { "provider": "openai", "model": "gpt-4o-mini" },
"intelliCollateModel": { "provider": "openai", "model": "gpt-4o-mini" }
}
Option B: Use Another OpenRouter Model (same key, no extra setup):
{
"intelliExtractModel": { "provider": "openrouter", "model": "google/gemini-2.0-flash-001" },
"intelliCollateModel": { "provider": "openrouter", "model": "google/gemini-2.0-flash-001" }
}
Option C: Use A Model Provided By Another Extension (for example, Z.Ai or local models):
{
"intelliExtractModel": { "provider": "zai", "model": "glm-5.1" },
"intelliCollateModel": { "provider": "zai", "model": "glm-5.1" }
}
The only requirement is that the model is registered in Pi's model registry and has auth configured. Run /login to set up built-in providers, or follow the extension's own setup for extension-provided models.
Model Selection Guidance
For extraction and collation, the ideal model has:
- Low cost per token: 8 extractions, 1 collation, and 1 cache suggest per default session.
- Good instruction following: Must adhere to extraction prompts precisely.
- Sufficient context: Cleaned pages can be ≈50K chars (truncated to `intelliExtractMaxChars`).
Models known to work well for extraction and collation: MiniMax M2.7 (default), Qwen3.5-Flash (≈1M context, ≈$0.26/M output), DeepSeek V4 Flash (≈1M context, ≈$0.28/M output), Gemini 2.0 Flash Lite (≈1M context, ≈$0.30/M output), GPT-4.1 Nano (≈1M context, ≈$0.40/M output).
Required API Keys
With default settings, you need two keys in ~/.pi/agent/auth.json:
{
"openrouter": { "type": "api_key", "key": "sk-or-v1-..." },
"minimax": { "type": "api_key", "key": "sk-api-..." }
}
- OpenRouter: Used by `intelli_search` (Perplexity Sonar) and available as an extract or collate alternative.
- MiniMax: Used by `intelli_extract` and `intelli_collate` (MiniMax M2.7). Only needed if you keep the defaults. Override `intelliExtractModel` or `intelliCollateModel` to use a different provider.
Run /login in Pi to set up keys interactively, or edit the file directly.
Pipeline
intelli_research(query)
├── Stage 1: Search -> Perplexity Sonar (via OpenRouter, pi native auth)
├── Stage 2: Fetch -> wreq-js + Defuddle, compared against raw Markdown
├── Stage 3: Extract -> configurable model, default: MiniMax M2.7 (parallel)
├── Stage 4: Collate -> configurable model, default: MiniMax M2.7 (dedupe + cache)
└── Stage 5: Cache suggest -> LLM judge finds related previous searches (additive)
All model assignments are configurable. See Model Configuration.
Each page is dual-fetched (HTML to Defuddle versus Markdown endpoint) and scored for quality. Per-page extraction compresses ≈50K chars to ≈3-5K of query-relevant content before collation, keeping the total context manageable (≈24-40K for 8 pages).
For sites with llms-full.txt (Cloudflare, Next.js, Vite), the raw file is downloaded to the cache for offline grep. No LLM processing is needed.
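Resolving which sites get the `llms-full.txt` shortcut could look like the helper below. This is a hypothetical sketch: the function name and the assumption that the file lives at `<base>/llms-full.txt` are illustrative, with the domain-to-base-URL map matching the `intelliLlmsFullSites` setting described later.

```typescript
// Hypothetical helper: given a page URL and the intelliLlmsFullSites map
// (domain -> base URL), return the llms-full.txt URL to download, or null
// if the site is not in the map. Assumes the conventional /llms-full.txt path.
function llmsFullUrl(pageUrl: string, sites: Record<string, string>): string | null {
  const host = new URL(pageUrl).hostname;
  const base = sites[host];
  return base ? `${base}/llms-full.txt` : null;
}
```

Keying on the hostname means every page from a mapped site resolves to the same single raw file, which is why it can be cached once and grepped offline.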
See docs/ARCHITECTURE.md for detailed design decisions.
Cost
Per research session with the default 8 pages: ≈$0.05
| Step | Calls | Cost |
|---|---|---|
| Search (Sonar) | 1 | ≈$0.02 |
| Fetch (Defuddle + Markdown) | 8 parallel pairs | $0.00 |
| Extract (M2.7) | 8 parallel | ≈$0.03 |
| Collate (M2.7) | 1 | ≈$0.005 |
| Cache suggest (M2.7) | 1 | ≈$0.0002 |
Costs scale with your chosen extract or collate model. MiniMax M2.7 is the default specifically for its low cost.
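A back-of-envelope check of the table above, treating the extract cost as linear in page count (the per-step figures are the approximate defaults from the table, not exact prices):

```typescript
// Approximate per-step costs from the cost table (default models).
const costs = {
  search: 0.02, // 1 Sonar call
  extractPerPage: 0.03 / 8, // table gives ~$0.03 for 8 parallel extractions
  collate: 0.005, // 1 collation call
  cacheSuggest: 0.0002, // 1 cache-suggest call
};

// Estimated session cost as a function of page count (fetch is free).
function sessionCost(pages: number): number {
  return costs.search + pages * costs.extractPerPage + costs.collate + costs.cacheSuggest;
}
```

With the default 8 pages this lands at roughly $0.055, consistent with the ≈$0.05 headline; dropping `maxUrls` mainly saves on the extraction term.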
Settings
Override defaults in ~/.pi/agent/settings.json or .pi/settings.json:
{
// Model assignments: see "Model Configuration" section for swap guidance
"intelliSearchModel": {
"provider": "openrouter",
"model": "perplexity/sonar"
},
"intelliExtractModel": { "provider": "minimax", "model": "MiniMax-M2.7" },
"intelliCollateModel": { "provider": "minimax", "model": "MiniMax-M2.7" },
// Pipeline tuning
"intelliMaxUrls": 8,
"intelliCacheDir": ".search",
"intelliExtractMaxChars": 150000,
"intelliExtractionMaxTokens": 3000,
"intelliCollationMaxTokens": 4000,
"intelliFetchTimeoutMs": 20000,
"intelliFetchConcurrency": 4,
"intelliBrowserFingerprint": "chrome_145",
"intelliLlmsFullSites": {}
}
intelliBrowserFingerprint controls the TLS fingerprint used by wreq-js when fetching pages (defaults to Chrome 145). intelliLlmsFullSites is a map of domain to base URL for sites that provide llms-full.txt files (for example, {"developers.cloudflare.com": "https://developers.cloudflare.com"}). These files are downloaded raw to the cache without LLM processing.
Cache Structure
.search/
├── 2026-04-19-d1-worker-api/
│   ├── report.md                # Collated summary + source index
│   ├── query.txt                # Original search query
│   ├── extractions/             # Per-page LLM extractions (≈3-5K each)
│   │   ├── 01-developers-cloudflare-com.md
│   │   └── 02-developers-cloudflare-com.md
│   └── sources/                 # Full page content
│       ├── 01-developers-cloudflare-com.md
│       ├── 02-developers-cloudflare-com.md
│       └── llms-full-developers-cloudflare-com.md
└── .index.json                  # Index of all cached searches
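Directory names like `2026-04-19-d1-worker-api` combine the search date with a slug of the query. How the extension actually builds them is not documented here, but one plausible sketch (function name and truncation length are hypothetical):

```typescript
// Hypothetical sketch: derive a cache directory name ("YYYY-MM-DD-<slug>")
// from the search date and query. The 40-char cap is an illustrative choice.
function cacheDirName(date: Date, query: string): string {
  const iso = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const slug = query
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, "") // trim leading/trailing dashes
    .slice(0, 40);
  return `${iso}-${slug}`;
}
```

Date-prefixed names keep the cache listing chronologically sorted, and the slug makes `.search/` entries greppable without opening `.index.json`.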
Compatibility
- Pi >= 0.69.0: Core functionality (TypeBox 1.x, tools, model registration, settings, working indicator, `after_provider_response` monitoring).
- Gracefully degrades on older versions; optional features are skipped.
Development
npm install
npm run build # TypeScript -> dist/
npm test # Unit tests (104 tests)
npm run test:smoke # Smoke test
# Test in pi
pi -e ./dist/index.js
# Install as package
pi install /path/to/pi-intelli-search
Documentation
- Changelog: Release history.
- Architecture: Detailed design decisions and pipeline internals.
- Components: Third-party dependencies and license attribution.
- Skill guide: Agent-facing usage instructions.
- Contributor guide: Coding conventions and project structure.
Sponsor
This project recognises the support and resources provided by Curio Data Pro Ltd, a data consultancy serving engineering sectors including Rail, Naval Design, Aviation, and Offshore Energy. Curio Data Pro combines 20+ years of Chartered Engineer experience across Aerospace, Defence, Rail, and Offshore Energy with data science and DevOps capabilities.
License
Copyright 2026 Ashraf Miah, Curio Data Pro Ltd.
Licensed under the Apache License, Version 2.0.
Use of Text Generators
Text Generators (for example, Large Language Models or so-called "Artificial Intelligence" tools) have been used extensively in the development of this project.
- Pi agent (primary development environment).
- GLM 5.1: Primary model for code generation and architecture.
- Qwen 3.6 Plus: Secondary model for review and documentation.

