@curio-data/pi-intelli-search
Intelligent web research for Pi: search, extract, collate, and cache grounded web context in one tool call.
Package details
Install @curio-data/pi-intelli-search from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@curio-data/pi-intelli-search
- Package: @curio-data/pi-intelli-search
- Version: 0.4.0
- Published: May 5, 2026
- Downloads: 256/mo · 256/wk
- Author: miah0x41
- License: Apache-2.0
- Types: extension, skill
- Size: 245 KB
- Dependencies: 3 dependencies · 3 peers
Pi manifest JSON
{
"extensions": [
"./src/index.ts"
],
"skills": [
"./skills"
],
"image": "https://raw.githubusercontent.com/Curio-Data/pi-intelli-search/main/docs/images/02.png"
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-intelli-search
Intelligent web research for Pi: search, extract, collate, and cache grounded web context in one tool call.
A Pi extension that adds a 5-stage research pipeline (Search, Fetch, Extract, Collate, and Cache Suggest) designed for technical task completion. Per-page LLM extraction compresses raw pages to query-relevant content, which is then deduplicated across sources into a concise summary backed by a persistent `.search/` cache.
Features:
- Search: Perplexity Sonar via OpenRouter (one API key, no $50 minimum).
- Extract: Per-page LLM extraction compresses ≈50K to ≈3-5K chars.
- Collate: Cross-source deduplication into a focused ≈5K summary.
- Cache: Persistent `.search/` cache for offline reuse and follow-up.
- Configurable: Swap any pipeline stage to any model Pi supports.
- Low cost: Approximately $0.05 per research session with default settings.
Why intelli-search?
Most coding agents handle web research with a simple two-step pattern: fetch URL, then dump raw content into context. Claude Code's WebFetch tool, revealed in its open-sourced CLI, follows exactly this approach. It fetches a page, converts HTML to Markdown (via the Jina Reader API), and hands the full result to the model.
The problem is that a cleaned documentation page is still ≈50K characters. For the default 8 sources, that is ≈400K chars dumped into the agent's context window. The model must simultaneously hold your task, the codebase, and a wall of raw web content. Signal-to-noise drops fast.
intelli-search takes a different approach: extract before you collate.
Each page is compressed by a dedicated extraction model before entering the agent's context. A collation model then deduplicates across extractions. The agent receives a focused ≈5K summary instead of ≈400K of raw HTML.
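The extract-before-collate pattern can be sketched as a pair of stages with a pluggable extractor. This is a simplified illustration, not the extension's actual code: the real `intelli_extract` and `intelli_collate` stages call LLMs, while the dedupe step here is a naive line-based stand-in.

```typescript
// Hypothetical sketch: each page is compressed independently, then the
// extractions are deduplicated across sources before reaching the agent.
type Extractor = (page: string, focus: string) => string;

function collate(pages: string[], focus: string, extract: Extractor): string {
  // Stage 1: per-page extraction (done by an LLM in the real pipeline).
  const extractions = pages.map((p) => extract(p, focus));

  // Stage 2: cross-source dedupe (an LLM synthesis in the real pipeline;
  // here, a naive "drop repeated lines" stand-in).
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const e of extractions) {
    for (const line of e.split("\n")) {
      if (!seen.has(line)) {
        seen.add(line);
        lines.push(line);
      }
    }
  }
  return lines.join("\n");
}
```

The key property is that raw page size never reaches the agent: only the already-compressed extractions are combined.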
| | Fetch-and-dump | intelli-search pipeline |
|---|---|---|
| Context cost | ≈400K chars raw | ≈5K chars focused |
| Noise | Nav, ads, sidebars included | Stripped by extraction |
| Deduplication | None. Overlapping sources waste tokens | Cross-source dedupe via collation |
| Cost per session | N/A (no processing) | ≈$0.05 |
| Offline reuse | No | Cached in `.search/` |
Install
From npm (recommended):
pi install npm:@curio-data/pi-intelli-search
From GitHub:
pi install git:github.com/Curio-Data/pi-intelli-search
Local development:
pi install /path/to/pi-intelli-search
On first load, the extension adds Perplexity Sonar models to ~/.pi/agent/models.json under the openrouter provider. This patch approach lets Pi discover Sonar through OpenRouter. No separate Perplexity API account is needed.
Tools
| Tool | Description |
|---|---|
| `intelli_search` | Search via Perplexity Sonar. Returns summary with source URLs. |
| `intelli_extract` | Per-page LLM extraction. Reduces ≈50K chars to ≈3-5K of relevant content. |
| `intelli_collate` | Deduplicate and synthesise extractions into a summary. Writes cache. |
| `intelli_research` | Full pipeline: search, fetch, extract, collate, cache. One call. |
Quick Start
Quick Search
intelli_search(query="TypeScript 5.8 release date")
Deep Research
Always provide a focusPrompt. The extraction LLM works best with specific guidance.
intelli_research(
query="Svelte 5 runes tutorial examples",
focusPrompt="Extract the core rune concepts ($state, $derived, $effect), their syntax, and how they replace the old reactive declarations. Include migration patterns from Svelte 4."
)
Targeted Research With Domain Restriction
intelli_research(
query="Cloudflare Workers KV write timeout limits",
focusPrompt="Extract KV write limits, timeout thresholds, storage limits, and any workarounds for bulk writes. Focus on hard numbers and error messages.",
maxUrls=3,
domains=["developers.cloudflare.com"]
)
Comparing Options
intelli_research(
query="Tailwind CSS vs Vanilla Extract comparison 2026",
focusPrompt="Extract pros/cons, bundle size benchmarks, DX tradeoffs, and migration costs. Note which claims come from official sources vs blog opinions."
)
Model Configuration
All three pipeline stages use independently configurable models. Defaults are chosen for cost-efficiency, but any model Pi can access works. This includes built-in providers, OpenRouter models, or models from other extensions.
| Stage | Default | Config key |
|---|---|---|
| Search | `openrouter/perplexity/sonar` | `intelliSearchModel` |
| Extract | `minimax/MiniMax-M2.7` | `intelliExtractModel` |
| Collate | `minimax/MiniMax-M2.7` | `intelliCollateModel` |
Why OpenRouter For Sonar?
Perplexity Sonar is an excellent search-grounded model, but it is not in Pi's built-in model list. Rather than requiring a separate Perplexity API account (which requires a $50 minimum credit top-up), the extension routes Sonar through OpenRouter. OpenRouter is a unified pay-as-you-go API with no minimum spend. One API key gives you Sonar alongside thousands of other models. On first load, the extension patches ~/.pi/agent/models.json to add Sonar under the openrouter provider so Pi can discover it. This approach has several benefits:
- No minimum spend: avoids the Perplexity API's $50 minimum credit top-up; OpenRouter is pay-as-you-go.
- One account, many models: the same OpenRouter key covers Sonar and any other models you might want for extract or collate.
- Non-destructive: the patch merges new models by ID and never replaces existing OpenRouter models.
- Idempotent: safe across extension reloads and updates.
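A merge with those two properties can be sketched as follows. The `ModelEntry` shape is hypothetical; the real `models.json` schema may differ.

```typescript
// Hypothetical sketch of a non-destructive, idempotent merge-by-ID,
// as described for the models.json patch. Entry shape is illustrative.
interface ModelEntry {
  id: string;
  name?: string;
}

function mergeModels(existing: ModelEntry[], additions: ModelEntry[]): ModelEntry[] {
  const byId = new Map<string, ModelEntry>(existing.map((m) => [m.id, m]));
  for (const add of additions) {
    // Only add entries whose ID is absent; never replace an existing one.
    if (!byId.has(add.id)) byId.set(add.id, add);
  }
  return [...byId.values()];
}
```

Because existing IDs always win and additions are keyed by ID, applying the same patch twice yields the same file, which is what makes reloads and updates safe.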
Swapping The Extract And Collate Model
MiniMax M2.7 is the default because it is cheap and effective for extraction and collation. However, you can use any model Pi supports. Override in ~/.pi/agent/settings.json or .pi/settings.json:
Option A: Use A Pi Built-In Provider (auth via /login):
{
"intelliExtractModel": { "provider": "openai", "model": "gpt-4o-mini" },
"intelliCollateModel": { "provider": "openai", "model": "gpt-4o-mini" }
}
Option B: Use Another OpenRouter Model (same key, no extra setup):
{
"intelliExtractModel": { "provider": "openrouter", "model": "google/gemini-2.0-flash-001" },
"intelliCollateModel": { "provider": "openrouter", "model": "google/gemini-2.0-flash-001" }
}
Option C: Use A Model Provided By Another Extension (for example, Z.Ai or local models):
{
"intelliExtractModel": { "provider": "zai", "model": "glm-5.1" },
"intelliCollateModel": { "provider": "zai", "model": "glm-5.1" }
}
The only requirement is that the model is registered in Pi's model registry and has auth configured. Run /login to set up built-in providers, or follow the extension's own setup for extension-provided models.
Model Selection Guidance
For extraction and collation, the ideal model has:
- Low cost per token: 8 extractions, 1 collation, and 1 cache suggest per default session.
- Good instruction following: Must adhere to extraction prompts precisely.
- Sufficient context: Cleaned pages can be ≈50K chars (truncated to `intelliExtractMaxChars`).
Models known to work well for extraction and collation: MiniMax M2.7 (default), Qwen3.5-Flash (≈1M context, ≈$0.26/M output), DeepSeek V4 Flash (≈1M context, ≈$0.28/M output), Gemini 2.0 Flash Lite (≈1M context, ≈$0.30/M output), GPT-4.1 Nano (≈1M context, ≈$0.40/M output).
Required API Keys
With default settings, you need two keys in ~/.pi/agent/auth.json:
{
"openrouter": { "type": "api_key", "key": "sk-or-v1-..." },
"minimax": { "type": "api_key", "key": "sk-api-..." }
}
- OpenRouter: Used by `intelli_search` (Perplexity Sonar) and available as an extract or collate alternative.
- MiniMax: Used by `intelli_extract` and `intelli_collate` (MiniMax M2.7). Only needed if you keep the defaults. Override `intelliExtractModel` or `intelliCollateModel` to use a different provider.
Run /login in Pi to set up keys interactively, or edit the file directly.
Pipeline
intelli_research(query)
├── Stage 1: Search -> Perplexity Sonar (via OpenRouter, pi native auth)
├── Stage 2: Fetch -> wreq-js + Defuddle, compared against raw Markdown
├── Stage 3: Extract -> configurable model, default: MiniMax M2.7 (parallel)
├── Stage 4: Collate -> configurable model, default: MiniMax M2.7 (dedupe + cache)
└── Stage 5: Cache suggest -> LLM judge finds related previous searches (additive)
All model assignments are configurable. See Model Configuration.
Each page is dual-fetched (HTML to Defuddle versus Markdown endpoint) and scored for quality. Per-page extraction compresses ≈50K chars to ≈3-5K of query-relevant content before collation, keeping the total context manageable (≈24-40K for 8 pages).
For sites with llms-full.txt (Cloudflare, Next.js, Vite), the raw file is downloaded to the cache for offline grep. No LLM processing is needed.
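Resolving which sites get the `llms-full.txt` shortcut could look like the helper below. This is a hypothetical sketch: the function name and the assumption that the file lives at `<base>/llms-full.txt` are illustrative, with the domain-to-base-URL map matching the `intelliLlmsFullSites` setting described later.

```typescript
// Hypothetical helper: given a page URL and the intelliLlmsFullSites map
// (domain -> base URL), return the llms-full.txt URL to download, or null
// if the site is not in the map. Assumes the conventional /llms-full.txt path.
function llmsFullUrl(pageUrl: string, sites: Record<string, string>): string | null {
  const host = new URL(pageUrl).hostname;
  const base = sites[host];
  return base ? `${base}/llms-full.txt` : null;
}
```

Keying on the hostname means every page from a mapped site resolves to the same single raw file, which is why it can be cached once and grepped offline.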
See docs/ARCHITECTURE.md for detailed design decisions.
Cost
Per research session with the default 8 pages: ≈$0.05
| Step | Calls | Cost |
|---|---|---|
| Search (Sonar) | 1 | ≈$0.02 |
| Fetch (Defuddle + Markdown) | 8 parallel pairs | $0.00 |
| Extract (M2.7) | 8 parallel | ≈$0.03 |
| Collate (M2.7) | 1 | ≈$0.005 |
| Cache suggest (M2.7) | 1 | ≈$0.0002 |
Costs scale with your chosen extract or collate model. MiniMax M2.7 is the default specifically for its low cost.
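A back-of-envelope check of the table above, treating the extract cost as linear in page count (the per-step figures are the approximate defaults from the table, not exact prices):

```typescript
// Approximate per-step costs from the cost table (default models).
const costs = {
  search: 0.02, // 1 Sonar call
  extractPerPage: 0.03 / 8, // table gives ~$0.03 for 8 parallel extractions
  collate: 0.005, // 1 collation call
  cacheSuggest: 0.0002, // 1 cache-suggest call
};

// Estimated session cost as a function of page count (fetch is free).
function sessionCost(pages: number): number {
  return costs.search + pages * costs.extractPerPage + costs.collate + costs.cacheSuggest;
}
```

With the default 8 pages this lands at roughly $0.055, consistent with the ≈$0.05 headline; dropping `maxUrls` mainly saves on the extraction term.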
Settings
Override defaults in ~/.pi/agent/settings.json or .pi/settings.json:
{
// Model assignments: see "Model Configuration" section for swap guidance
"intelliSearchModel": {
"provider": "openrouter",
"model": "perplexity/sonar"
},
"intelliExtractModel": { "provider": "minimax", "model": "MiniMax-M2.7" },
"intelliCollateModel": { "provider": "minimax", "model": "MiniMax-M2.7" },
// Pipeline tuning
"intelliMaxUrls": 8,
"intelliCacheDir": ".search",
"intelliExtractMaxChars": 150000,
"intelliExtractionMaxTokens": 3000,
"intelliCollationMaxTokens": 4000,
"intelliFetchTimeoutMs": 20000,
"intelliFetchConcurrency": 4,
"intelliBrowserFingerprint": "chrome_145",
"intelliLlmsFullSites": {}
}
intelliBrowserFingerprint controls the TLS fingerprint used by wreq-js when fetching pages (defaults to Chrome 145). intelliLlmsFullSites is a map of domain to base URL for sites that provide llms-full.txt files (for example, {"developers.cloudflare.com": "https://developers.cloudflare.com"}). These files are downloaded raw to the cache without LLM processing.
Cache Structure
.search/
├── 2026-04-19-d1-worker-api/
│   ├── report.md                # Collated summary + source index
│   ├── query.txt                # Original search query
│   ├── extractions/             # Per-page LLM extractions (≈3-5K each)
│   │   ├── 01-developers-cloudflare-com.md
│   │   └── 02-developers-cloudflare-com.md
│   └── sources/                 # Full page content
│       ├── 01-developers-cloudflare-com.md
│       ├── 02-developers-cloudflare-com.md
│       └── llms-full-developers-cloudflare-com.md
└── .index.json                  # Index of all cached searches
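Directory names like `2026-04-19-d1-worker-api` combine the search date with a slug of the query. How the extension actually builds them is not documented here, but one plausible sketch (function name and truncation length are hypothetical):

```typescript
// Hypothetical sketch: derive a cache directory name ("YYYY-MM-DD-<slug>")
// from the search date and query. The 40-char cap is an illustrative choice.
function cacheDirName(date: Date, query: string): string {
  const iso = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const slug = query
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, "") // trim leading/trailing dashes
    .slice(0, 40);
  return `${iso}-${slug}`;
}
```

Date-prefixed names keep the cache listing chronologically sorted, and the slug makes `.search/` entries greppable without opening `.index.json`.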
Compatibility
- Pi >= 0.69.0: Core functionality (TypeBox 1.x, tools, model registration, settings, working indicator, `after_provider_response` monitoring).
- Gracefully degrades on older versions; optional features are skipped.
Development
npm install
npm run build # TypeScript -> dist/
npm test # Unit tests (104 tests)
npm run test:smoke # Smoke test
# Test in pi
pi -e ./dist/index.js
# Install as package
pi install /path/to/pi-intelli-search
Documentation
- Changelog: Release history.
- Architecture: Detailed design decisions and pipeline internals.
- Components: Third-party dependencies and license attribution.
- Skill guide: Agent-facing usage instructions.
- Contributor guide: Coding conventions and project structure.
Sponsor
This project recognises the support and resources provided by Curio Data Pro Ltd, a data consultancy serving engineering sectors including Rail, Naval Design, Aviation, and Offshore Energy. Curio Data Pro combines 20+ years of Chartered Engineer experience across Aerospace, Defence, Rail, and Offshore Energy with data science and DevOps capabilities.
License
Copyright 2026 Ashraf Miah, Curio Data Pro Ltd.
Licensed under the Apache License, Version 2.0.
Use of Text Generators
Text Generators (for example, Large Language Models or so-called "Artificial Intelligence" tools) have been used extensively in the development of this project.
- Pi agent (primary development environment).
- GLM 5.1: Primary model for code generation and architecture.
- Qwen 3.6 Plus: Secondary model for review and documentation.

