pi-agentic-compaction

Pi extension for agentic conversation compaction using a virtual filesystem and tool-driven exploration

Package details

← Back

extension

Install pi-agentic-compaction from npm and Pi will load the resources declared by the package manifest.

npm repo home report

$ pi install npm:pi-agentic-compaction

Package: pi-agentic-compaction
Version: 0.3.1
Published: Mar 22, 2026
Downloads: 131/mo · 28/wk
Author: laulauland
License: MIT
Types: extension
Size: 29.4 KB
Dependencies: 1 dependency · 3 peers

Pi manifest JSON

{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-agentic-compaction

pi-agentic-compaction is a pi package that replaces pi's default compaction pass with a more agentic one.

Instead of sending the entire conversation to a model in one shot, it mounts the conversation into an in-memory virtual filesystem and lets a summarizer model inspect it with shell tools like jq, grep, head, and tail before producing the final compacted summary.

Why this exists

pi's built-in compaction is simple and effective, but it is still a single-pass summarization step. For long sessions, that means:

the model has to ingest a lot of tokens up front
important details can get buried in the middle of the transcript
you pay for processing context that may not actually matter

This extension takes a different approach:

expose the conversation as /conversation.json in a virtual filesystem
give the summarizer lightweight shell tools
let it inspect only the parts it needs
return the final summary back to pi via session_before_compact

How it works

When pi triggers compaction, this extension:

Reads the messages pi is about to compact
Converts them into LLM-format JSON
Mounts that JSON at /conversation.json using just-bash
Runs a summarizer model with bash/zsh tools over that virtual filesystem
Lets the model explore the conversation before writing the final summary
Returns the summary to pi as a custom compaction result

The extension also adds some deterministic guardrails:

it extracts verified modified files from successful write and edit tool results
it detects no-op edits and excludes them from the modified-files narrative
it supports /compact ... notes and forwards that intent to the summarizer
it can fall back to the currently selected pi model if preferred compaction models are unavailable

Model selection

The agentic compaction loop is a good fit for small, fast models. The task is structured: navigate a JSON file, run a few shell commands, and emit a summary in a defined format. That plays to the strengths of instruction-following models like gpt-5.4-mini — models that are reliable on well-specified tasks, respond quickly, and are cheap enough that multiple tool-call steps do not become a bottleneck.

By default it tries these models, in order:

const COMPACTION_MODELS = [
  { provider: "cerebras", id: "zai-glm-4.7" },
  { provider: "openai", id: "gpt-5.4-mini" },
];

If none are available, it falls back to the current session model.

Steerable compaction

Because the summarizer runs as a separate model in its own agentic loop, its behavior is directly steerable. You can pass guidance via /compact notes:

/compact focus on the authentication changes and unresolved bugs

The note is forwarded into the summarizer's system prompt, biasing both its exploration strategy and its output. Small, instruction-following models tend to respect this kind of explicit steering reliably, which makes the behavior predictable without requiring prompt engineering on the user's part.

Installation

From npm

pi install npm:pi-agentic-compaction

Or add it to ~/.pi/agent/settings.json:

{
  "packages": ["npm:pi-agentic-compaction"]
}

From a local checkout

{
  "packages": ["/path/to/pi-agentic-compaction"]
}

Then reload pi:

/reload

Usage

You generally do not invoke the extension directly.

It runs whenever pi compacts context:

automatically when pi approaches the context limit
manually when you run /compact

You can provide extra guidance to the summarizer:

/compact focus on the authentication changes and unresolved bugs

Configuration

Configuration currently lives in index.ts near the top of the file.

Useful constants include:

const COMPACTION_MODELS = [
  { provider: "cerebras", id: "zai-glm-4.7" },
  { provider: "openai", id: "gpt-5.4-mini" },
];

const DEBUG_COMPACTIONS = false;
const TOOL_RESULT_MAX_CHARS = 50000;
const TOOL_CALL_CONCURRENCY = 6;

Safety and privacy notes

A few relevant details if you plan to use or modify this package:

Conversation data is mounted into an in-memory virtual filesystem for summarization.
The summarizer is explicitly instructed to treat /conversation.json as untrusted input.
Debug logging is off by default.
If you enable DEBUG_COMPACTIONS, compaction inputs, trajectories, and outputs are written to ~/.pi/agent/compactions/, which may include sensitive conversation content.

Trade-offs

The agentic approach has different characteristics from a single-pass summarization, and those trade-offs interact with model size in specific ways.

Pros:

cheaper per compaction for long conversations, since the model reads only what it queries
more targeted inspection of the transcript rather than ingesting everything at once
better file-change awareness than a pure freeform summary
steerable: /compact notes let you direct what the summarizer pays attention to
the structured, tool-use format suits small instruction-following models well

Cons:

may miss details a full-pass summarizer would catch, since exploration is model-driven
requires multiple model/tool steps instead of one call
a smaller model is less likely to self-correct if its exploration strategy is suboptimal
for sessions with subtle cross-cutting context, a larger model may produce more coherent summaries

Accuracy considerations

The trade-offs above are worth thinking through when choosing a compaction model.

The agentic loop structure partially compensates for the limitations of smaller models: the model can re-query sections of the transcript it is uncertain about rather than relying on a single pass over everything. And because the format is well-specified — run some shell commands, write a summary — the task plays to the strengths of models that follow instructions precisely.

That said, small models can struggle when sessions involve nuanced reasoning, implicit dependencies, or ambiguous cause-and-effect chains. If the exploration strategy in the prompt does not surface the right parts of the transcript, a smaller model is less likely to recover from that.

If summary quality on complex sessions matters more than speed or cost, consider updating COMPACTION_MODELS in index.ts to use a larger model. The rest of the extension is model-agnostic.

Package contents

This public repo intentionally keeps the package small:

index.ts — the pi extension
README.md — docs
LICENSE — license text

The published npm package is also restricted to those files via the files field in package.json.

License

MIT