pi-blackhole

Unified compaction + observational memory extension for Pi — compresses conversation context while preserving durable observations and reflections

Install pi-blackhole from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-blackhole

Package: pi-blackhole
Version: 0.3.7
Published: Jun 12, 2026
Downloads: 1,907/mo · 198/wk
Author: k0valik
License: MIT
Types: extension
Size: 412.6 KB
Dependencies: 0 dependencies · 4 peers

Pi manifest JSON

{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-blackhole

[!IMPORTANT] Blackhole is the default compaction engine (compactionEngine: "blackhole", compaction: "auto"). Auto-compaction fires at the configured threshold using blackhole's pipeline, and Pi's /compact command uses the same pipeline. No additional setup is needed. To work automatically, blackhole registers a hook that overrides /compact with the blackhole compaction engine. To opt out, or to use blackhole with manual compaction only, change the settings as shown below:

Setting	Auto-trigger after threshold	`/compact` (Pi built-in)
`"auto"` + `"blackhole"` (default)	blackhole handles ✓	blackhole handles
`"auto"` + `"pi-default"`	Pi handles	Pi handles
`"manual"` + any	skipped	Pi handles ✓
`"off"` + any	skipped (Pi handles)	Pi handles ✓

The /blackhole command always uses blackhole's pipeline regardless of settings.
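The routing rules above can be read as a small decision function. The following is a minimal sketch, not blackhole's actual code: the type names, the `Routing` shape, and `resolveRouting` are hypothetical, and only the setting values come from this README.

```typescript
// Hypothetical names for illustration; only the setting values are from the README.
type Compaction = "auto" | "manual" | "off";
type Engine = "blackhole" | "pi-default";
type Handler = "blackhole" | "pi" | "skipped";

interface Routing {
  autoTrigger: Handler;    // what blackhole's hook does when the threshold is crossed
  compact: Handler;        // who serves Pi's built-in /compact
  slashBlackhole: Handler; // who serves /blackhole
}

function resolveRouting(compaction: Compaction, engine: Engine): Routing {
  if (compaction === "auto") {
    // In auto mode the engine setting decides both paths.
    const handler: Handler = engine === "blackhole" ? "blackhole" : "pi";
    return { autoTrigger: handler, compact: handler, slashBlackhole: "blackhole" };
  }
  // "manual" and "off": blackhole skips its auto-trigger and Pi keeps /compact.
  // With "off", Pi's own auto-compaction takes over from there.
  return { autoTrigger: "skipped", compact: "pi", slashBlackhole: "blackhole" };
}
```

The engine argument is ignored outside auto mode, which mirrors the "+ any" rows of the table.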

Upgrading from an older version? This version replaces legacy keys (passive, noAutoCompact, overrideDefaultCompaction) with compaction, compactionEngine, and tailBehavior. Automatic migration runs at startup — old configs continue to work. Best effort was made to preserve existing behavior, but review MIGRATION-GUIDE.md if something behaves differently.

See CONFIG.md for the full reference.

Algorithmic compaction + session-aware observational memory for Pi — in one unified extension.

Blackhole merges the best ideas from pi-vcc and pi-observational-memory into something that's become its own beast entirely. Deterministic compaction that costs nothing. A memory layer that survives compactions. Per-worker model fallback chains with persisted cooldowns. Manual flush mode. All configured from one JSON file.

Why this exists: I liked both extensions but they fought each other — OM hooked into Pi's default compaction and blocked vcc from working. So I merged them, made them share a single hook and output, and added everything both were missing: fallback chains, cooldowns, a memory toggle, and a manual mode for people who want to control when context gets compressed.

The codebase has since diverged heavily from both upstreams, but tries to keep up-to-date with any fixes from them.

📖 See CHANGELOG.md for release history.
⚙️ See CONFIG.md for the full configuration reference.
🔄 See MIGRATION-GUIDE.md if upgrading from an older version.
📜 See OLD_CONFIG.md for the legacy config documentation.

Quick start

# Install from npm (recommended)
pi install npm:pi-blackhole

# Or directly from GitHub
pi install git:github.com/k0valik/pi-blackhole

If you have standalone pi-vcc or pi-observational-memory installed, remove them first: they conflict and will prevent blackhole from working. You don't lose any features from either extension:

# Uninstall with the same source you installed from (for npm installs, pass the npm: source instead)
pi uninstall git:https://github.com/sting8k/pi-vcc
pi uninstall git:https://github.com/elpapi42/pi-observational-memory

Then /reload or restart Pi.

Automated setup

Pass llms.txt to your agent and it will walk you through configuration step by step — no need to read all the docs.

Lockstep with upstreams

pi-blackhole tracks both upstream repositories via a lockstep audit system. Every new commit from pi-vcc and pi-observational-memory is classified as safe-to-port, modified (needs review), rewritten (skip), or orphan (needs mapping). Bugfixes and compatible improvements get ported; intentional divergences stay. Nothing is blindly merged — every ported change is reviewed per-commit with human approval. See .pi/skills/lockstep/ for the full workflow.

Demo

/blackhole collapses ~143k tokens of conversation into a ~6.3k structured summary (YMMV based on your settings). /blackhole-memory shows pipeline status. /blackhole-recall searches previous conversation history; the agent can reach the same history through its recall tool and search it incrementally.

https://github.com/user-attachments/assets/a7dd804d-6aca-4bdb-8b6e-0dd779363a43

The problem it solves

Long engineering sessions degrade. Pi's native compaction calls an LLM to write free-form prose summaries — then compacts those summaries, then compacts those summaries. After enough cycles, load-bearing details vanish: why a decision was made, what approaches were already rejected, what the user clarified earlier.

The session is still alive. The agent is no longer carrying the real context.

The two upstream projects each solve one half:

pi-vcc replaces Pi's LLM-based compaction with a deterministic, zero-cost algorithmic summary. Fast, reproducible, no hallucination risk. But repeated compactions still erode detail — it's still a summary.
pi-observational-memory captures timestamped observations and durable reflections in a session ledger that survives across compactions. But its compaction path still calls an LLM — costing money and risking drift on every compact.

pi-blackhole puts vcc in the compaction slot and OM in the memory layer, where each does what it's designed for.

How it works

When you run /blackhole (or when auto-compaction fires), two things happen in one shot:

The vcc pipeline analyzes the transcript tail and produces a structured summary: session goal, file changes, commits, outstanding blockers, user preferences, and a rolling brief transcript.
Observational memory injection renders accumulated observations and reflections from the session ledger and appends them below the summary.

The agent receives a deterministic recap of recent work plus durable facts from the full session history — in a single replacement block. No LLM was called for the compaction itself.

The three memory workers

Three background workers (separate LLM calls) run automatically during the session (when memory: true, which is the default):

Observer — reads conversation since the last observation marker and extracts timestamped facts: events, decisions, preferences. Input is capped to observerChunkMaxTokens newest-first to prevent context blowup on long sessions. Runs most frequently.
Reflector — distills new observations into durable reflections: stable facts, patterns, and constraints that survive future compactions. Runs less often.
Dropper — prunes low-value observations from active memory when the pool exceeds observationsPoolMaxTokens, while keeping reflections and other long-term elements safely in the session ledger.
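The newest-first cap on observer input can be pictured as a budgeted walk from the end of the transcript. This is a minimal sketch under stated assumptions: the `Entry` shape and `capNewestFirst` are hypothetical, token counts are assumed to be precomputed, and the real observer may chunk differently.

```typescript
// Hypothetical entry shape: token counts are assumed to be known up front.
interface Entry {
  id: number;
  tokens: number;
}

// Walk from newest to oldest, keep entries while the budget allows,
// then return the kept entries in chronological order.
function capNewestFirst(entries: Entry[], maxTokens: number): Entry[] {
  const kept: Entry[] = [];
  let used = 0;
  for (const entry of [...entries].reverse()) {
    if (used + entry.tokens > maxTokens) break; // budget spent: older entries are dropped
    kept.push(entry);
    used += entry.tokens;
  }
  return kept.reverse();
}
```

With `observerChunkMaxTokens` as the budget, the oldest part of a long unobserved stretch is what gets cut, which matches the "newest-first" wording above.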

[Conversation turn] ──> (accumulated tokens >= observeAfterTokens)
                            │
                            v
                    1. OBSERVER
                       (extracts timestamped observations via agent loop)
                            │
                            v
                    2. REFLECTOR
                       (synthesizes durable reflections via agent loop)
                            │
                            v
                    3. DROPPER
                       (prunes low-value observations, keeps reflections)

Each worker uses an agentLoop with tool-calling capabilities — they don't just make a single LLM call. The observer, for example, can call record_observations multiple times per run to work through a chunk incrementally.
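As a rough illustration of the thresholds in the diagram, the sketch below decides which workers are due for a given token count. It assumes a single accumulated-token counter and uses the documented defaults; `dueWorkers` and the type names are hypothetical, and how the real runtime counts or resets tokens is not shown in this README.

```typescript
type WorkerName = "observer" | "reflector" | "dropper";

interface WorkerThresholds {
  observeAfterTokens: number;
  reflectAfterTokens: number;
}

// Defaults taken from the settings table in this README.
const DEFAULT_THRESHOLDS: WorkerThresholds = {
  observeAfterTokens: 15000,
  reflectAfterTokens: 25000,
};

// Assumption: one accumulated counter drives both thresholds.
function dueWorkers(accumulatedTokens: number, thresholds: WorkerThresholds = DEFAULT_THRESHOLDS): WorkerName[] {
  const due: WorkerName[] = [];
  if (accumulatedTokens >= thresholds.observeAfterTokens) due.push("observer");
  // Reflector and dropper share one threshold and run after the observer.
  if (accumulatedTokens >= thresholds.reflectAfterTokens) due.push("reflector", "dropper");
  return due;
}
```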

Graceful degradation

If any stage fails (model error, rate limit, timeout), remaining stages are skipped and the full pipeline retries on the next agent_start or turn_end. A 30-second retry gate prevents hammering failing APIs. Within each stage, the runtime tries all configured fallback models before giving up — each failed model is cooled down and skipped in subsequent attempts.
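The "skip remaining stages on failure" behaviour can be sketched as a sequential runner that stops at the first failing stage. This is a simplified, synchronous illustration: `Stage`, `PipelineResult`, and `runPipeline` are hypothetical names, and the real workers are asynchronous agent loops.

```typescript
type StageName = "observer" | "reflector" | "dropper";

interface Stage {
  name: StageName;
  run: () => boolean; // returns true on success; may also throw
}

interface PipelineResult {
  completed: StageName[];
  failed: StageName | null;
  skipped: StageName[];
}

// Run stages in order; the first failure ends the run and skips the rest.
function runPipeline(stages: Stage[]): PipelineResult {
  const completed: StageName[] = [];
  for (let i = 0; i < stages.length; i++) {
    const stage = stages[i];
    if (stage === undefined) continue;
    let ok = false;
    try {
      ok = stage.run();
    } catch {
      ok = false; // a thrown error counts as a failed stage
    }
    if (!ok) {
      return { completed, failed: stage.name, skipped: stages.slice(i + 1).map((s) => s.name) };
    }
    completed.push(stage.name);
  }
  return { completed, failed: null, skipped: [] };
}
```

A caller would retry the whole pipeline on the next trigger event rather than resuming from the failed stage.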

What the agent sees after compaction

After compaction, the agent sees something like this (sections appear only when relevant — a session with no git commits won't show [Commits]):

[Session Goal]
- Fix the authentication bug in login flow
- [Scope change]
- Also update the session token refresh logic

[Files And Changes]
- Modified: src/auth/session.ts
- Created: tests/auth-refresh.test.ts

[Commits]
- a1b2c3d: fix(auth): refresh token after password reset

[Outstanding Context]
- lint check still failing on line 42

[User Preferences]
- Prefer Vietnamese responses
- Always run tests before committing

[user]
Fix the auth bug...

[assistant]
Root cause is a missing token refresh...
...transcript continues...

---

---
The conversation before this point has been compacted into the summary above.
Details not captured here — exact code, error messages, file paths — are only recoverable via `recall`.
Use `recall` to search the session history. Do not redo work already completed.

## Reflections
[c3d4e5f6a1b2] User is building Acme Dashboard on Next.js 15 with Supabase auth.

## Observations
[a1b2c3d4e5f6] 2026-05-23 [high] User decided to switch from REST to GraphQL; motivation was reducing over-fetching.
[b2c3d4e5f6a1] 2026-05-23 [medium] GraphQL migration completed; user confirmed working.

----
Bracketed ids in reflections and observations connect to their source session entries. These are condensed memories from earlier in this session.
When entries conflict, the most recent observation reflects the latest known state.
Use `recall` with an id to retrieve original context, or `#N:path` drill-down to explore file content from referenced entries.
When exact source context is needed for precision or traceability, use the `recall` tool with the relevant observation or reflection id. This is especially useful when a reflection materially affects a decision or is too compressed to continue confidently.
----

Note: The OM injection format uses ## Reflections and ## Observations Markdown headers followed by a brief footer. Each observation and reflection has a 12-char hex identifier. Pass it to the recall tool to recover the source evidence; the agent can also search by these ids to pull relevant context back. When no observations or reflections exist, only a short recall-guidance footer is appended.
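If you want to post-process the injected block yourself, the line format shown in the example is regular enough to parse. The sketch below is inferred only from the sample output above (bracketed 12-char hex id, then an optional date and priority for observations); `parseMemoryLine` and `MemoryLine` are hypothetical and not part of blackhole's API.

```typescript
interface MemoryLine {
  id: string;              // 12-char hex identifier
  date: string | null;     // present on observations, absent on reflections
  priority: string | null; // e.g. "high" or "medium" in the sample output
  text: string;
}

// Pattern inferred from the sample block; the real format may carry more fields.
const MEMORY_LINE = /^\[([0-9a-f]{12})\]\s+(?:(\d{4}-\d{2}-\d{2})\s+\[(\w+)\]\s+)?(.+)$/;

function parseMemoryLine(line: string): MemoryLine | null {
  const match = MEMORY_LINE.exec(line.trim());
  if (!match) return null;
  const id = match[1];
  const text = match[4];
  if (id === undefined || text === undefined) return null;
  return { id, date: match[2] ?? null, priority: match[3] ?? null, text };
}
```

Headers such as `## Observations` and the footer lines do not match the pattern and come back as `null`.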

Compaction modes

Two modes, one shared goal: keep your agent's context sharp without manual housekeeping.

Auto mode (default): install and forget. Workers run, observations are appended as invisible conversation markers, compaction fires automatically when tokens exceed threshold.
Manual mode (compaction: "manual" — the maintainer's daily driver): same workers, same pipeline. But observations go to per-session disk buffers and compaction only happens when you run /blackhole. Cleaner conversation, manual schedule.

The tradeoff is simplicity vs cleanliness:

	Auto (default)	Manual (`compaction: "manual"`)
Workers run?	Yes	Yes
Observations go to	Conversation markers (invisible in TUI)	Disk (`<sessionId>-pending.json`)
Observations accumulate across runs	Branch markers (replaced each cycle)	Pending batches accumulated — `/blackhole-memory` shows pending counts
Auto-compact on `agent_end`	Yes	No
`/blackhole`	Optional — use it whenever you want	Required to flush + compact
Conversation history	OM marker entries between turns (they exist but don't clutter the display)	Clean — nothing between turns
Use case	"I don't want to think about it"	"I want to control when context gets compressed"

Does /blackhole work like a single /compact that Just Works?

Yes, that's exactly the idea, especially in manual mode. When you feel context is getting full or accuracy is slipping, type /blackhole. It flushes any accumulated observations from disk, runs algorithmic vcc compaction (zero LLM cost), and injects your durable reflections into the replacement block. One command, everything gets compressed while keeping your session memory alive.

The difference from Pi's built-in /compact:

/compact calls an LLM to write a free-form summary — costly, lossy, no memory layer.
/blackhole uses algorithmic section extraction (goals, files, commits, preferences...) plus injects observations/reflections from the session ledger. No LLM involved in the compaction itself. Fast, deterministic, memory-preserving.

Fully disabled

Set compaction: "off" and memory: false (or the environment variable PI_BLACKHOLE_PASSIVE=true which sets both) to completely disable all background workers and blackhole's auto-compaction trigger. Pi handles auto-compaction normally. Explicit /blackhole still uses blackhole's pipeline. This is useful for debugging or if you want manual-only blackhole involvement.

Without observational memory (vcc-only)

Set memory: false or run /blackhole om-off for pure vcc compaction — no background workers, no memory injection. The compaction still uses the algorithmic vcc pipeline (not Pi's LLM-based compaction). Re-enable with /blackhole om-on or setting memory: true.

This is a lighter alternative to compaction: "off": workers are off but blackhole's compaction engine still handles compaction.

Commands

Command	What it does
`/blackhole`	Compact the conversation. Subcommands: `configure` (settings overlay), `om-off` / `om-on` toggle observational memory.
`/blackhole-memory` (or `status`)	Pipeline status: token progress, observation/reflection counts, pending data, last errors
`/blackhole-memory view`	Show visible observations and reflections (after compaction trimming), copied to clipboard
`/blackhole-memory full`	Show ALL recorded memory (including dropped observations), copied to clipboard
`/blackhole-recall <query>`	Search session history. Supports `page:N`, `scope:all`, `mode:file`, `mode:touched`

Tools

The agent gets a unified recall tool that handles three types of input:

Input	What it does
`[12-char hex]`	Recover source evidence for a specific observation or reflection ID from the session ledger
`#N`	Expand a session entry by index (show full content)
`#N:path`	Drill-down into file content from a tool call (e.g. `#42:auth.ts` shows first 30 lines; `#42:auth.ts:30` shows next 30; `#42:auth.ts:full` shows everything)
Free text	BM25-ranked OR search across transcript and/or file content. Rare terms weighted higher.
`mode:file`	Search only write/edit file content
`mode:touched`	Aggregate all files written/edited, grouped by path with entry indices
Regex	Pattern search (e.g. `fork.*pi-vcc`, `hook\|inject`)
`scope:all`	Search across all session lineages, not just the active one
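To make the input types in the table concrete, here is a minimal sketch of how a caller might classify a recall input before dispatching it. The `RecallInput` union and `classifyRecallInput` are hypothetical; the real tool's parsing rules (including modifiers such as `mode:` and `scope:`) are not shown in this README and are left out here.

```typescript
type RecallInput =
  | { kind: "memory-id"; id: string }
  | { kind: "entry"; index: number }
  | { kind: "file"; index: number; path: string }
  | { kind: "search"; query: string };

function classifyRecallInput(raw: string): RecallInput {
  const input = raw.trim();
  // 12-char hex id, with or without the surrounding brackets.
  if (/^\[?[0-9a-f]{12}\]?$/i.test(input)) {
    return { kind: "memory-id", id: input.replace(/[\[\]]/g, "").toLowerCase() };
  }
  // "#N" expands an entry; "#N:path" drills into file content.
  const entry = /^#(\d+)(?::(.+))?$/.exec(input);
  if (entry) {
    const index = Number(entry[1]);
    const path = entry[2];
    return path ? { kind: "file", index, path } : { kind: "entry", index };
  }
  // Everything else is treated as a free-text or regex search.
  return { kind: "search", query: input };
}
```

In this sketch a paging suffix such as `:30` or `:full` stays attached to `path`; splitting it off is a separate step.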

Configuration

All settings in a single JSON file: ~/.pi/agent/pi-blackhole/pi-blackhole-config.json — auto-created with defaults on first startup. See CONFIG.md for the full reference with detailed explanations for every knob. An annotated example config is at example-config.json.

Quick start — just set custom models (if you want):

{
  "observerModel":  { "provider": "openrouter", "id": "qwen/qwen3-next-80b-a3b-instruct:free" },
  "reflectorModel": { "provider": "cerebras",   "id": "gpt-oss-120b" },
  "dropperModel":   { "provider": "cerebras",   "id": "gpt-oss-120b" }
}

Everything else has sensible defaults.

Settings at a glance

Setting	Default	What it controls
`compaction`	`"auto"`	When compaction triggers: `"auto"` (blackhole auto-fires), `"manual"` (only `/blackhole`), `"off"` (Pi handles auto + `/compact`, `/blackhole` still works)
`compactionEngine`	`"blackhole"`	Which engine handles auto-compaction: `"blackhole"` or `"pi-default"`. Only meaningful when `compaction: "auto"` — for `"manual"`/`"off"` the hook lets Pi handle everything except `/blackhole`
`tailBehavior`	`"minimal"`	How much stays visible after compaction: `"minimal"` (last user message only, default) or `"pi-default"` (gentle, ~20k tokens). Both `/blackhole` and auto-triggered compaction default to `"minimal"`; set `"pi-default"` explicitly to opt into the gentler cut
`memory`	`true`	`false` = OM workers off + no memory injection (compaction still runs)
`model`	—	Base fallback model for all workers (last resort before session model)
`observerModel` / `observerFallbackModels`	— / `[]`	Primary + fallback models for observer (extracts facts)
`reflectorModel` / `reflectorFallbackModels`	— / `[]`	Primary + fallback models for reflector (synthesizes reflections)
`dropperModel` / `dropperFallbackModels`	— / `[]`	Primary + fallback models for dropper (prunes observations)
`sessionFallback`	`true`	When false, skip session model fallback when all OM model candidates are exhausted. Default true for backward compatibility.
(per model) `thinking`	`"low"`	Thinking/reasoning level: `off`, `minimal`, `low`, `medium`, `high`, `xhigh`
(per model) `cooldownHours`	`1`	How long to skip this model after a retryable error
(per model) `contextWindow`	(inherited from Pi)	Override context window for this model. If unset, inherits from Pi's model registry. When set, the OM pipeline checks if the estimated input fits before calling the model — if not, the next fallback is tried.
`observeAfterTokens`	`15000`	Min accumulated tokens before observer runs
`reflectAfterTokens`	`25000`	Min accumulated tokens before reflector + dropper run
`compactAfterTokens`	`81000`	Auto-compaction threshold (when `compaction: "auto"`)
`observerChunkMaxTokens`	`40000`	Max observer input per run (newest-first)
`observerPreambleMaxTokens`	`0` (auto)	Preamble cap for observer in `compaction: "manual"` mode (auto = 30% of chunk)
`observationsPoolMaxTokens`	`20000`	Max active observation pool before dropper prunes
`observationsPoolTargetTokens`	`10000`	Target size dropper aims for after pruning (derived: half of pool max)
`reflectorInputMaxTokens`	`80000`	Max reflector input budget
`dropperInputMaxTokens`	`80000`	Max dropper input budget
`agentMaxTurns`	`16`	Max agent-loop turns per worker per run
`debug`	`false`	Pre-compaction snapshot to `/tmp/pi-blackhole-debug.json`
`debugLog`	`false`	Continuous JSONL debug log to `~/.pi/agent/pi-blackhole/debug.ndjson`

Environment override: PI_BLACKHOLE_PASSIVE=true sets compaction: "off" + memory: false without touching the config file. Also accepts legacy PI_VCC_OM_PASSIVE / PI_OBSERVATIONAL_MEMORY_PASSIVE.
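The override can be thought of as a pure function from settings plus environment to effective settings. This is a sketch only: `applyPassiveOverride` and `CoreSettings` are hypothetical names, and treating only the literal string `"true"` as enabled is an assumption.

```typescript
interface CoreSettings {
  compaction: "auto" | "manual" | "off";
  memory: boolean;
}

// Variable names are from the README; the legacy names are accepted too.
const PASSIVE_VARS = ["PI_BLACKHOLE_PASSIVE", "PI_VCC_OM_PASSIVE", "PI_OBSERVATIONAL_MEMORY_PASSIVE"];

// Returns the effective settings without modifying the config object it was given.
function applyPassiveOverride(settings: CoreSettings, env: Record<string, string | undefined>): CoreSettings {
  // Assumption: only the exact value "true" switches passive mode on.
  const passive = PASSIVE_VARS.some((name) => env[name] === "true");
  return passive ? { ...settings, compaction: "off", memory: false } : settings;
}
```

In a Node-based runtime, `process.env` would be the natural argument for `env`.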

Configuration presets

The defaults above target a medium-context setup (~128k context window, e.g. GPT-4o, Claude Sonnet). Paste the appropriate block into your config to match your main session model's context size.

Low context (~32k-64k — older models, fast budget models)

{
  "observeAfterTokens": 5000,
  "reflectAfterTokens": 10000,
  "compactAfterTokens": 30000,
  "observerChunkMaxTokens": 15000,
  "observerPreambleMaxTokens": 0,
  "observationsPoolMaxTokens": 8000,
  "reflectorInputMaxTokens": 30000,
  "dropperInputMaxTokens": 30000
}

Medium context (~128k — GPT-4o, Claude Sonnet, Gemini Pro; this is the default)

These are the built-in defaults. If you reset your config, these are what you get:

{
  "observeAfterTokens": 15000,
  "reflectAfterTokens": 25000,
  "compactAfterTokens": 81000,
  "observerChunkMaxTokens": 40000,
  "observerPreambleMaxTokens": 0,
  "observationsPoolMaxTokens": 20000,
  "reflectorInputMaxTokens": 80000,
  "dropperInputMaxTokens": 80000
}

High context (~200k+ — Claude Opus, Gemini Ultra, large local models)

{
  "observeAfterTokens": 20000,
  "reflectAfterTokens": 40000,
  "compactAfterTokens": 180000,
  "observerChunkMaxTokens": 80000,
  "observerPreambleMaxTokens": 0,
  "observationsPoolMaxTokens": 40000,
  "reflectorInputMaxTokens": 160000,
  "dropperInputMaxTokens": 160000
}

What to tune first: compactAfterTokens should be significantly below your model's total context window — aim for ~60-70%. If the agent loses context before compaction fires, lower it. If compaction fires too often and breaks flow, raise it. The other thresholds scale proportionally.
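The 60-70% rule of thumb is simple arithmetic. The helper below is a hypothetical convenience (`suggestCompactAfter` is not part of blackhole) that turns a context window size into a starting value for `compactAfterTokens`.

```typescript
// Suggest a compactAfterTokens value as a percentage of the context window.
// The default of 65 is an assumption: the middle of the 60-70% range.
function suggestCompactAfter(contextWindow: number, percent: number = 65): number {
  if (!Number.isFinite(contextWindow) || contextWindow <= 0) {
    throw new RangeError("contextWindow must be a positive number");
  }
  if (percent <= 0 || percent >= 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return Math.floor((contextWindow * percent) / 100);
}
```

For comparison, the built-in default of 81000 is roughly 63% of a 128k window.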

Tip: comments in config

The config preserves unknown keys, so you can add _comment or _notes fields to document your choices inline. They're ignored by the parser.

{
  "_comment": "Tuned for my Cerebras + OpenRouter free model setup",
  "observerModel": { "provider": "openrouter", "id": "qwen/qwen3-next-80b-a3b-instruct:free", "thinking": "low" }
}

Model fallback chains

Each worker has a primary model and an ordered fallback list. On a retryable error (rate limit, timeout, API failure, 5xx), the failed model is cooled down and the next candidate is tried. The session model is always the last resort and is never cooled down. If every candidate fails, the pipeline aborts and retries on the next trigger event.

[Worker fails: 429 / timeout / 5xx / connection error]
         │
         v
  Add model to cooldown list
  (persisted to pi-blackhole-cooldown.json)
         │
         v
  Try next fallback candidate
         │
         v
  [All candidates exhausted?]
         │              │
        yes             no ──> try next
         │
         v
  Fall back to session model (never cooled down)

Cooldowns survive Pi restarts — they're persisted to ~/.pi/agent/pi-blackhole/pi-blackhole-cooldown.json. Each entry records the model identifier, the triggering error, which stage failed, and the expiry timestamp.
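A cooldown entry as described above could look like the following. The field names, the `stage` union, and both helpers are hypothetical; the actual schema of pi-blackhole-cooldown.json is not documented in this README.

```typescript
// Hypothetical shape; the real file may use different field names.
interface CooldownEntry {
  model: string;                              // e.g. "cerebras/gpt-oss-120b"
  error: string;                              // the triggering error message
  stage: "observer" | "reflector" | "dropper"; // which stage failed
  expiresAt: number;                          // epoch milliseconds
}

const HOUR_MS = 60 * 60 * 1000;

// Add or refresh a cooldown for a model, returning a new list.
function coolDown(
  entries: CooldownEntry[],
  model: string,
  stage: CooldownEntry["stage"],
  error: string,
  cooldownHours: number,
  now: number,
): CooldownEntry[] {
  const others = entries.filter((e) => e.model !== model);
  return [...others, { model, error, stage, expiresAt: now + cooldownHours * HOUR_MS }];
}

// Entries whose expiry has passed no longer block the model.
function activeCooldowns(entries: CooldownEntry[], now: number): CooldownEntry[] {
  return entries.filter((e) => e.expiresAt > now);
}
```

Because entries carry an absolute expiry timestamp, serializing the list with `JSON.stringify` is enough for it to survive a restart.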

Resolution order

For each stage, the runtime builds a candidate list from:

Primary stage model (observerModel, reflectorModel, dropperModel)
Stage fallback models (observerFallbackModels, etc.) — tried in order
Base model (model — shared across all workers)
Session model (the model used for your main conversation — always the last resort)

Models with active cooldowns are transparently skipped. The runtime tries up to 10 model resolutions per stage before giving up entirely.
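The resolution order can be sketched as list building plus filtering. Everything named here (`ModelRef`, `StageModels`, `resolveCandidates`) is hypothetical, and two details are assumptions: that the cap of 10 includes the session model, and that cooled-down models are identified by a `provider/id` key.

```typescript
interface ModelRef {
  provider: string;
  id: string;
}

interface StageModels {
  primary?: ModelRef;    // observerModel, reflectorModel, or dropperModel
  fallbacks: ModelRef[]; // the stage's *FallbackModels list, in order
  base?: ModelRef;       // the shared "model" setting
  session: ModelRef;     // the main conversation model
}

const MAX_RESOLUTIONS = 10;

function modelKey(model: ModelRef): string {
  return `${model.provider}/${model.id}`;
}

function resolveCandidates(models: StageModels, cooledDown: Set<string>, sessionFallback: boolean = true): ModelRef[] {
  const configured: ModelRef[] = [];
  if (models.primary) configured.push(models.primary);
  configured.push(...models.fallbacks);
  if (models.base) configured.push(models.base);

  const seen = new Set<string>();
  const candidates: ModelRef[] = [];
  for (const model of configured) {
    const key = modelKey(model);
    if (cooledDown.has(key) || seen.has(key)) continue; // skip cooled-down models and duplicates
    seen.add(key);
    candidates.push(model);
  }
  // Assumption: reserve one slot so the session model always fits under the cap.
  const limited = candidates.slice(0, MAX_RESOLUTIONS - (sessionFallback ? 1 : 0));
  if (sessionFallback) limited.push(models.session); // never cooled down, always last
  return limited;
}
```

Passing `sessionFallback = false` mirrors the `sessionFallback` setting: the chain then ends at the base model.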

Example: full resolution chain

With a fully configured setup:

Observer:  qwen3-next-80b (openrouter) → gemma4:31b-cloud (ollama) → gemma-4-31b-it:free (openrouter) → base model → session model
Reflector: gpt-oss-120b (cerebras) → glm-4.7 (z.ai) → gpt-oss-120b:free (openrouter) → base model → session model
Dropper:   gpt-oss-120b (cerebras) → glm-4.7 (z.ai) → gpt-oss-120b:free (openrouter) → base model → session model

Per-model thinking levels

Each model config supports a thinking field that controls reasoning effort for that specific model. In the example below, cooldownHours additionally sets a custom cooldown duration:

{
  "observerModel": {
    "provider": "openrouter",
    "id": "qwen/qwen3-next-80b-a3b-instruct:free",
    "thinking": "low",
    "cooldownHours": 12
  }
}

Valid values: off, minimal, low, medium, high, xhigh. Not all models support every level.

Retryable error detection

The runtime uses a regex to detect retryable errors — it looks for patterns like rate limit, 429, 5xx, timeout, service unavailable, connection error, websocket closed, etc. Non-retryable errors (auth failures, invalid model IDs) immediately skip that candidate and move to the next.
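For illustration, a retryable-error check along these lines might look like the sketch below. The pattern is an assumption assembled from the phrases listed above, not blackhole's actual regex, and `isRetryable` is a hypothetical name.

```typescript
// Illustrative pattern only; the runtime's real regex may differ.
const RETRYABLE = /rate.?limit|\b429\b|\b5\d{2}\b|timed?[\s-]?out|service unavailable|connection error|websocket closed/i;

function isRetryable(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return RETRYABLE.test(message);
}
```

A message-based check like this is deliberately loose: any standalone three-digit number starting with 5 counts as a server error, so stray numbers in a message can produce false positives.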

30-second retry gate

After any stage fails completely, the pipeline waits 30 seconds before attempting another consolidation run. This prevents rapid retry loops that would waste API calls on the same failing models.
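A gate like this needs only a timestamp of the last failure. The class below is a minimal sketch with a hypothetical name and API; the clock is passed in explicitly so the behaviour is easy to test.

```typescript
// Hypothetical helper: blocks new runs for waitMs after a failure.
class RetryGate {
  private lastFailureAt: number | null = null;
  private readonly waitMs: number;

  constructor(waitMs: number = 30000) {
    this.waitMs = waitMs;
  }

  recordFailure(now: number): void {
    this.lastFailureAt = now;
  }

  recordSuccess(): void {
    this.lastFailureAt = null; // a successful run reopens the gate immediately
  }

  canRun(now: number): boolean {
    return this.lastFailureAt === null || now - this.lastFailureAt >= this.waitMs;
  }
}
```

Callers would check `canRun(Date.now())` before each consolidation attempt and call `recordFailure(Date.now())` when a stage exhausts its candidates.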

Recall

Pi's default compaction discards old messages permanently — after compaction, the agent only sees the summary. Blackhole preserves searchable history through two surfaces.

`recall` tool (agent-facing)

The agent gets one unified tool that searches session history, expands entries, drills into file content, and looks up observational memory. Searches read the raw session file directly, bypassing compaction.

Input	What it does
`[12-char hex]`	Recover source evidence for an observation or reflection ID from the session ledger
`#N`	Expand a session entry by index (show full content, not truncated)
`#N:path`	Drill-down into file content from a tool call (e.g. `#42:auth.ts` shows first 30 lines; `#42:auth.ts:30` shows next 30; `#42:auth.ts:full` shows everything)
Free text	BM25-ranked OR search across transcript + file indicators. Rare terms weighted higher.
`mode:file`	Search only write/edit file content
`mode:touched`	Aggregate all files written/edited across the session, grouped by path with entry indices
Regex	Pattern search (e.g. `fork.*pi-vcc`, `hook\|inject`)
`scope:all`	Search across all session lineages (default: active lineage only)
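"BM25-ranked OR search" with rare terms weighted higher can be illustrated with a compact scorer. This is a textbook-style sketch, not the recall engine itself: the tokenizer, the parameters `k1 = 1.2` and `b = 0.75` (common defaults), and the function names are all assumptions.

```typescript
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

interface Ranked {
  index: number; // position of the document in the input array
  score: number;
}

function bm25Rank(docs: string[], query: string, k1: number = 1.2, b: number = 0.75): Ranked[] {
  const tokenized = docs.map(tokenize);
  const totalLength = tokenized.reduce((sum, doc) => sum + doc.length, 0);
  const avgLength = totalLength / Math.max(tokenized.length, 1);
  const terms = Array.from(new Set(tokenize(query)));

  const ranked = tokenized.map((doc, index) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((token) => token === term).length;
      if (tf === 0) continue; // OR semantics: a missing term just adds nothing
      const docFreq = tokenized.filter((d) => d.includes(term)).length;
      // Rare terms (low docFreq) get a larger inverse document frequency.
      const idf = Math.log(1 + (docs.length - docFreq + 0.5) / (docFreq + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLength));
    }
    return { index, score };
  });

  return ranked.filter((r) => r.score > 0).sort((x, y) => y.score - x.score);
}
```

For the query `auth token`, a document containing the rarer word `token` outranks documents that only mention `auth`, while all documents matching either term are still returned.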

OM coupling: When expanding session entries (#N), the tool automatically looks up related observations and reflections from the session ledger. If any of your expanded entries are referenced as source evidence by an observation, those observations are shown alongside the expanded content.

`/blackhole-recall` command (user-facing)

Results are shown as a collapsible message and auto-fed to the agent as context. Same engine as the recall tool.

/blackhole-recall auth token                        # active-lineage search, ranked
/blackhole-recall auth token page:2                 # paginated (5 results/page)
/blackhole-recall hook|inject                       # regex
/blackhole-recall fail.*build scope:all             # regex across all lineages
/blackhole-recall mode:file                         # search only write/edit file content
/blackhole-recall mode:touched                      # aggregate view of all files touched
/blackhole-recall                                   # recent 25 entries

Details

File drill-down reads the raw session JSONL to extract file content from tool call operations. Supports offset/limit paging so you can browse long files. Note: edit diffs are not indexed for text search — drill-down reads them from the raw session as original full-file writes.
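The drill-down paging syntax (`#42:auth.ts`, `#42:auth.ts:30`, `#42:auth.ts:full`) can be sketched as a small parser plus a slice. The names `DrillDown`, `parseDrillDown`, and `pageLines` are hypothetical, and the page size of 30 is taken from the examples in the table above.

```typescript
const PAGE_SIZE = 30; // from the "#42:auth.ts shows first 30 lines" example

interface DrillDown {
  index: number;  // session entry index (the N in #N)
  path: string;   // file path as written in the request
  offset: number; // first line to show, 0-based
  full: boolean;  // true for the ":full" suffix
}

function parseDrillDown(spec: string): DrillDown | null {
  // "#N:path", optionally followed by ":<offset>" or ":full".
  const match = /^#(\d+):(.+?)(?::(\d+|full))?$/.exec(spec.trim());
  if (!match) return null;
  const path = match[2];
  if (path === undefined) return null;
  const suffix = match[3];
  const full = suffix === "full";
  const offset = suffix !== undefined && !full ? Number(suffix) : 0;
  return { index: Number(match[1]), path, offset, full };
}

function pageLines(lines: string[], request: DrillDown): string[] {
  return request.full ? lines : lines.slice(request.offset, request.offset + PAGE_SIZE);
}
```

In this sketch the numeric suffix is read as a line offset, so `:30` yields lines 31 to 60 of the file.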

Touched mode (mode:touched) aggregates all files written, edited, or read across the session, grouped by path. Each entry shows which tool operation touched the file and the line count. Useful for getting a lay of the land after a long session.

Feature comparison

	pi-blackhole	pi-vcc	pi-obs-memory	Pi default
Algorithmic compaction (no LLM cost)	✓	✓	—	—
Deterministic output	✓	✓	—	—
Structured summary sections	✓	✓	—	—
Observations + reflections	✓	—	✓	—
Context survives across compactions	✓	—	✓	—
Background memory workers	✓	—	✓	—
Searchable history after compaction	✓	✓	partial	—
Per-worker model config	✓	—	—	—
Fallback model chains + persisted cooldowns	✓	—	—	—
Manual flush mode (`compaction: "manual"`)	✓	—	—	—
Memory toggle (`/blackhole om-off`)	✓	—	—	—
Unified single-file config	✓	—	—	—
Per-session pending state	✓	—	—	—

Uninstall

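# Uninstall with the same source you installed from
pi uninstall npm:pi-blackhole
# or, if installed from GitHub:
pi uninstall git:github.com/k0valik/pi-blackhole

# Remove config, cooldown, and debug files
rm -rf ~/.pi/agent/pi-blackhole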

Credits

pi-blackhole started as a merge of two upstream projects, but has since diverged significantly. The codebase still carries DNA from both:

pi-vcc by @sting8k — algorithmic conversation compaction (the compile() pipeline, section extraction, recall core)
pi-observational-memory by @elpapi42 — session-ledger-based observation/reflection capture, memory agents, ledger folding

What blackhole adds and reworks on top:

Unified configuration — one JSON file, not two
Per-worker model fallback chains with persisted cooldowns that survive Pi restarts
Manual flush mode — compaction: "manual" saves observations to per-session disk buffers
Conflict resolution — OM hooks into vcc's compaction, not Pi's default
Memory toggle (/blackhole om-off / /blackhole om-on) — disable the memory layer without uninstalling
Per-session pending state — isolated per-session JSON files, no cross-session contamination
Custom provider bridge — consolidation agents loaded via jiti can still use provider stream functions registered by other extensions
Retryable error detection with per-model cooldowns — models that fail get cooled down, fallbacks tried automatically, 30-second retry gate prevents spam
Improved observer/reflector/dropper prompts — each heavily customized with detailed extraction rules, relevance guidance, and error handling
OM-recall coupling — when expanding session entries via recall, related observations and reflections are automatically shown
Thinking level support — per-model thinking field for reasoning effort control

License

MIT
