pi-knowledge
Local-first RAG knowledge base for Pi agents with hybrid code/doc search, reranking, diagnostics, and large-project indexing
Package details
Install pi-knowledge from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-knowledge- Package
pi-knowledge- Version
0.4.2- Published
- Jun 18, 2026
- Downloads
- 1,752/mo · 1,752/wk
- Author
- nczz
- License
- MIT
- Types
- extension
- Size
- 177.3 KB
- Dependencies
- 11 dependencies · 0 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-knowledge
Local-first RAG knowledge base for Pi agents.
Index your codebase, docs, PDFs, URLs, and notes into persistent knowledge bases that Pi can search across sessions. pi-knowledge combines semantic embeddings, BM25 keyword search, code-aware chunking, reranking, diagnostics, and stable large-project indexing so agents can answer from your actual project knowledge instead of guessing.
Built as a native Pi extension, deeply integrated with the agent lifecycle. Designed for local-first project memory, agentic code search, and retrieval-augmented development workflows.
Why It Matters
Agents lose project context between sessions and cannot fit large repositories into one prompt. pi-knowledge gives Pi durable, searchable project memory:
- Search code and documentation by meaning, exact symbols, or both.
- Keep private source data local by default.
- Re-index changed files incrementally.
- Diagnose stale, stuck, or low-quality indexes.
- Handle large repositories with persisted progress instead of silent hangs.
Unlike pi-memory (which manages the agent's own notes), pi-knowledge indexes your existing files and makes them semantically searchable by the agent.
Highlights
- Local-first project memory: stores indexes under
~/.pi/knowledge/; no project files are modified. - Hybrid retrieval: semantic vectors + BM25 keyword search + normalized weighted score fusion.
- Code-aware indexing: AST-aware chunking for TypeScript/JavaScript, Python, Go, Rust, and Java.
- Better agent answers: adaptive context windows, diversity reranking, optional cross-encoder reranking, and diagnostics.
- Large-project stability: persisted indexing progress, capped batches, streaming vector scans, and stuck-job detection.
- Private by default: local embeddings run without an API key.
Feature Comparison
| Feature | pi-knowledge | kiro-cli knowledge | pi-memory |
|---|---|---|---|
| Index arbitrary files/dirs | ✅ | ✅ | ❌ |
| Multiple named knowledge bases | ✅ | ✅ | ❌ |
| Semantic (vector) search | ✅ | ✅ | ✅ (via qmd) |
| BM25 keyword search | ✅ | ✅ | ✅ (via qmd) |
| Hybrid search + weighted score fusion | ✅ | ❌ | partial |
| Cross-encoder reranking | ✅ | ❌ | ❌ |
| Adaptive contextual search | ✅ | ❌ | ❌ |
| Diversity reranking | ✅ | ❌ | ❌ |
| Incremental re-indexing | ✅ | ❌ | ❌ |
| File watcher (auto-update) | ✅ | ❌ | ❌ |
| Code-aware chunking | ✅ (TS/JS/Py/Go/Rust/Java) | ❌ | ❌ |
| Local embeddings (zero API) | ✅ | ❌ | ✅ (qmd) |
| Index quality diagnostics | ✅ | ❌ | ❌ |
| Metadata filters in search | ✅ | ❌ | ❌ |
| Progress reporting + stuck indexing diagnostics | ✅ | partial | ❌ |
| Cross-session persistence | ✅ | ✅ | ✅ |
| Pi extension native | ✅ | N/A | ✅ |
| Context injection per turn | ✅ | ❌ | ✅ |
| Search result TUI rendering | ✅ | N/A | ❌ |
| Full tool TUI rendering | partial | N/A | ❌ |
| RPC mode support | ✅ | N/A | N/A |
TUI rendering currently uses Pi's stable default tool rendering plus targeted result formatting. A fuller custom renderer is intentionally not enabled yet because custom renderers must be width-safe across terminal sizes and Pi TUI modes.
Research-Backed Retrieval
pi-knowledge turns retrieval research into product behavior that agents can actually use:
- RAG-native project memory: follows the Retrieval-Augmented Generation pattern from Lewis et al. 2020: keep source truth outside the model, retrieve it at answer time, and inject only relevant context.
- Dense semantic recall: uses multilingual dense embeddings in the spirit of Dense Passage Retrieval (Karpukhin et al. 2020), so conceptual queries can find code/docs even when wording differs.
- Contextual Retrieval without remote chunk rewriting: applies Anthropic's Contextual Retrieval insight locally by embedding file path, file type, Markdown breadcrumbs, and code symbols with each chunk. This improves standalone chunk meaning without sending private source chunks to an LLM for context generation.
- Hybrid retrieval with diagnosable scores: combines BM25 and vectors with normalized weighted score fusion. RRF (Cormack et al. 2009) remains the baseline reference, but weighted fusion is used by default because project dogfood showed RRF compressed scores too much for ranking diagnostics.
- MMR-style diversity: uses Maximal Marginal Relevance ideas (Goldstein and Carbonell 1998), file interleaving, vector redundancy checks, and adaptive-window overlap collapse so repeated README or same-file chunks do not dominate top results.
- Intent-aware and self-correcting agent UX: mode selection (
auto,fast,semantic,hybrid,adaptive,deep), ranking diagnostics, andknowledge_doctorturn retrieval failures into concrete next actions instead of silent bad answers. - Confidence gating: low-evidence hybrid matches can return zero results instead of unrelated chunks, reducing false confidence when the KB does not contain the answer.
This is intentionally not a heavy ColBERT-style late-interaction index yet (Khattab and Zaharia 2020). The current product chooses lightweight local embeddings, BM25, query-aware ranking, optional cross-encoder reranking, streamed vector scans, and health diagnostics for commercial usefulness with low setup cost.
In project-level dogfood, these changes improved a real codebase evaluation from early 3.x/5 quality to above 4.5/5 after rebuilds, with fixes for score compression, README repetition, garbage-query false positives, small-module discoverability, source-vs-test ranking, indexing stability, and auto-mode false positives. Existing KBs should be rebuilt or updated after upgrades that change indexing text.
Quick Start
# Install
pi install npm:pi-knowledge
# Or from source
pi install ./pi-knowledge
# Index a directory
# (agent will call knowledge_add automatically, or you can ask it)
> Index my project docs at ./docs as "Project Docs"
# Search
> Search my knowledge base for "authentication flow"
# The agent also auto-searches relevant knowledge before answering domain questions
Tools
| Tool | Description |
|---|---|
knowledge_plan |
Inspect an indexing source before writing a KB; reports scannable files, suggested exclusions, and technical skips |
knowledge_add |
Index files, directories, URLs, PDFs, DOCX, or inline text |
knowledge_search |
Hybrid, deep, or adaptive search across one or all knowledge bases |
knowledge_remove |
Remove a knowledge base by name or ID |
knowledge_update |
Incrementally re-index changed files in a knowledge base |
knowledge_show |
List all knowledge bases with stats |
knowledge_status |
Show engine status with health diagnostics (stale, orphans, coverage) |
knowledge_doctor |
Diagnose health score, skipped files, stuck jobs, stale data, and recommended fixes |
knowledge_clear |
Remove all knowledge bases |
knowledge_export |
Export a KB to shareable JSONL file |
knowledge_import |
Import a KB from JSONL (re-embeds content) |
Search Modes
fast: BM25 keyword search for exact symbols, commands, and identifiers.semantic: vector search for conceptual matches.hybrid: BM25 + vector search with normalized weighted score fusion.deep: hybrid retrieval followed by cross-encoder reranking.adaptive: hybrid retrieval followed by query-time contextual window expansion around seed chunks. It keeps the matched seed, prefers nearby/query-relevant neighboring chunks, and collapses overlapping windows from the same file.auto: selects a primary mode from the query shape and retries alternate modes when results are empty or weak.
Mode selection contract:
- Start with
hybridfor most project questions. - Use
fastfor exact symbols, filenames, commands, error codes, API names, config keys, or quoted strings. - Use
semanticwhen the query is conceptual and exact terms may differ from the indexed wording. - Use
adaptivewhen the answer needs nearby code, neighboring documentation sections, or enough context to make a safe edit. - Use
deepfor high-stakes answers, ambiguous top results, or final verification when slower reranking is acceptable. - If results are empty or weak but the KB should contain the answer, retry once with a different mode before concluding no answer exists.
Search results use balanced diversity reranking by default so near-duplicate chunks from the same file do not dominate the top results. Diversity scoring considers lexical overlap, same-file line proximity, overlapping adaptive windows, available embedding-vector similarity, and file-level interleaving. Use diversity: "off" only when raw ranking order is needed for diagnostics. Agents can request search diagnostics to inspect mode fallback, ranking coverage, path/source/test boosts, and adjusted scores.
For best search quality, rebuild or update existing knowledge bases after upgrading. New indexes use contextual retrieval units: embeddings and FTS include file path, file type, Markdown heading breadcrumbs, and code symbol names while returned results keep the original chunk text readable. This improves queries that mention project structure, filenames, sections, or functions, and reduces duplicate-looking chunk hits.
Large Project Indexing
Indexing is designed as a stable long-running operation, not a quick background trick. knowledge_add, knowledge_update, and knowledge_import scan directories incrementally, embed and store chunks in hard-capped batches, stream vector files to disk, and report progress with file/chunk counts, chunks/sec, skipped file counts, elapsed time, and file ETA where available. Directory indexing starts with a metadata-only planning scan so large repositories can show total scannable files and skipped counts before expensive embedding starts.
Directory indexing separates technical skips from user-confirmable suggestions. knowledge_plan inspects a source without writing a KB, so agents can show scannable files, suggested exclusions, and technical skips before asking the user to confirm scope. Unsupported binary/non-text files, oversized files, unreadable files, inaccessible paths, and documents that cannot be extracted are skipped for stability. Text files that may be private or low-signal, such as .env, credential-named files, generated reports, lockfiles, vendor text, build output text, and runtime/cache text, are suggested exclusions by default rather than permanent blocks. Agents should explain the privacy and search-precision tradeoff, ask the user when the choice is ambiguous, and then use include_suggested_text or focused include_paths when the user confirms those text files belong in the KB. Ordinary project configuration files such as settings.json or appsettings.json remain indexable because they often describe real system behavior.
Indexing progress is persisted in SQLite, not only printed as transient tool updates. knowledge_status shows the current or last indexing operation, phase, last progress message, last progress age, processed file/chunk counts, skipped count, and add/remove/unchanged counts. This makes long indexing runs distinguishable from stuck jobs even if the user checks status from a later prompt.
Update and diagnostics paths are also streaming-oriented: changed chunks are embedded in batches, newly produced vectors are written to temporary vector files, deleted rows are removed in batches, and final vector rebuilds iterate SQLite rows instead of loading the whole KB. Search also avoids loading a full KB vector file or all chunk IDs into memory. Semantic and hybrid modes scan vectors from disk and retain only the top candidate vectors needed for ranking/diversity. knowledge_status reports stale files, orphaned chunks, coverage, skipped files, and indexing jobs that appear stuck after an interrupted or crashed Pi process. knowledge_doctor summarizes the same signals as a health score with concrete actions. KBs still marked indexing or error are visible in status but skipped by search, so interrupted work is not treated as a healthy searchable KB. A stuck indexing KB should be removed and rebuilt after confirming no active Pi process is still building it.
Architecture
See DESIGN.md for the full technical design document.
Data Storage
All data is stored globally at ~/.pi/knowledge/ (never in your project directory):
~/.pi/knowledge/
├── knowledge.db ← SQLite (metadata + chunks + FTS5 index)
├── vectors/ ← Embedding vectors per KB (binary)
└── models/ ← Downloaded ONNX models (~32MB, cached)
- Backup: copy
~/.pi/knowledge/directory - Reset: delete
~/.pi/knowledge/to start fresh - Project safety: pi-knowledge is read-only on indexed directories — no files are created or modified in your project
- Updates: extension updates do not affect existing indexed data. Schema migrations run automatically if needed.
Development
npm install
npm test # Unit tests
npm run test:e2e # Smoke integration tests; PDF/DOCX cases skip unless fixture env vars are set
PI_KNOWLEDGE_E2E_PDF=/path/to/file.pdf PI_KNOWLEDGE_E2E_DOCX=/path/to/file.docx npm run test:e2e
npm run bench # Indexing/search benchmarks
PDF/DOCX fixtures should be real local files outside the repository. Do not commit private fixture files, extracted fixture text, snapshots, or machine-specific fixture paths. A release-grade e2e pass requires both fixture env vars; a run with skipped PDF/DOCX cases is only a smoke pass.
Release
Before publishing, update package.json, package-lock.json, and CHANGELOG.md, then run:
npm run check
npm test
npm run test:e2e
PI_KNOWLEDGE_E2E_PDF=/path/to/file.pdf PI_KNOWLEDGE_E2E_DOCX=/path/to/file.docx npm run test:e2e
node --experimental-strip-types -e "import('./index.ts')"
npm pack --dry-run
pi -e ./index.ts
git push origin main
gh release create vX.Y.Z --title "vX.Y.Z" --notes-file /path/to/release-notes.md
npm publish
Report any skipped or unverified gate explicitly. Do not describe smoke e2e as complete release-grade coverage.
License
MIT