pi-model-aware-compaction
Per-model context-usage thresholds for Pi's built-in auto-compaction, so models with different context windows and performance profiles compact at the right time
Package details
Install pi-model-aware-compaction from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-model-aware-compaction
- Package: pi-model-aware-compaction
- Version: 0.1.4
- Published: Apr 5, 2026
- Downloads: 398/mo · 31/wk
- Author: w-winter
- License: MIT
- Types: extension
- Size: 19.5 KB
- Dependencies: 0 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"extensions/model-aware-compaction/index.ts"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
Model-Aware Compaction for Pi (pi-model-aware-compaction)
Per-model context-usage thresholds for Pi's compaction pipeline, because models differ both in context-window size and in how well they perform as they approach that limit.
This extension triggers Pi's native compaction pipeline at configurable percent-used thresholds, preserving the full built-in UX (loader, queued-message flush, and whichever compaction-summary implementation ultimately handles session_before_compact).
Install
From npm:
pi install npm:pi-model-aware-compaction
From the dot314 git bundle (filtered install):
{
"packages": [
{
"source": "git:github.com/w-winter/dot314",
"extensions": ["extensions/model-aware-compaction/index.ts"],
"skills": [],
"themes": [],
"prompts": []
}
]
}
Requirements
Pi auto-compaction must be enabled in ~/.pi/agent/settings.json:
{ "compaction": { "enabled": true } }
Compatible with compaction-summary extensions that hook session_before_compact, since it triggers Pi's normal compaction pipeline rather than calling ctx.compact() directly. Said differently, this package decides when compaction starts; stock Pi or your summary extension decides what summary gets written.
Configuration
Copy config.json.example to config.json in the extension's directory and edit:
{
"global": 70,
"models": {
"claude-opus-4-6": 85,
"gpt-5.2*": 75
}
}
| Key | Purpose |
|---|---|
| global | Default threshold (percent used) for models without a specific override |
| models | Per-model overrides keyed by model ID; supports * wildcards |
Compaction triggers when used% >= threshold.
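The lookup described above can be sketched as follows. This is an illustrative stand-in, not the extension's actual code: the config field names (global, models) come from the README, but the matching helper, its precedence rule (exact override before wildcard before global), and all function names here are assumptions.

```typescript
interface ThresholdConfig {
  global: number;                  // default percent-used threshold
  models: Record<string, number>;  // per-model overrides, "*" wildcards allowed
}

// Convert a pattern like "gpt-5.2*" into an anchored RegExp,
// escaping everything except the "*" wildcard.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map(part => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Exact match wins; then the first wildcard pattern that matches;
// otherwise fall back to the global default.
function resolveThreshold(modelId: string, cfg: ThresholdConfig): number {
  if (modelId in cfg.models) return cfg.models[modelId];
  for (const [pattern, threshold] of Object.entries(cfg.models)) {
    if (pattern.includes("*") && patternToRegExp(pattern).test(modelId)) {
      return threshold;
    }
  }
  return cfg.global;
}

const cfg: ThresholdConfig = {
  global: 70,
  models: { "claude-opus-4-6": 85, "gpt-5.2*": 75 },
};

resolveThreshold("claude-opus-4-6", cfg); // 85 (exact override)
resolveThreshold("gpt-5.2-mini", cfg);    // 75 (wildcard match)
resolveThreshold("gemini-3-pro", cfg);    // 70 (global default)
```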
Tuning reserveTokens
Pi's own auto-compaction triggers when usedTokens > contextWindow - reserveTokens. If that fires before your model-aware threshold, Pi compacts first. To let model-aware thresholds take priority, lower reserveTokens:
{
"compaction": {
"enabled": true,
"reserveTokens": 9000,
"keepRecentTokens": 15000
}
}
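To see why lowering reserveTokens matters, it helps to express Pi's built-in trigger as a percent-used figure and compare it to your model-aware threshold. The numbers below (a 200,000-token window, a 25,000-token reserve) are purely illustrative assumptions; only the reserveTokens value of 9,000 appears in the README.

```typescript
// Pi compacts when usedTokens > contextWindow - reserveTokens.
// Expressed as percent-used, that trigger point is:
function builtinTriggerPercent(contextWindow: number, reserveTokens: number): number {
  return ((contextWindow - reserveTokens) / contextWindow) * 100;
}

const contextWindow = 200_000; // illustrative window size

// With a generous reserve (25,000), the built-in trigger sits at
// ~87.5% used, so a model-aware threshold of, say, 90 would never win:
builtinTriggerPercent(contextWindow, 25_000); // ≈ 87.5

// Lowering reserveTokens to 9,000 pushes the built-in trigger out to
// ~95.5% used, leaving thresholds below that in control of when
// compaction starts:
builtinTriggerPercent(contextWindow, 9_000); // ≈ 95.5
```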
How it works
After each agent run, the extension checks context usage against the model-specific threshold. When exceeded, it inflates the last assistant message's usage.totalTokens past the context window size, causing Pi's _checkCompaction() to fire its normal pipeline. The inflated value is ephemeral — compaction rebuilds messages from the session file.
That normal pipeline still prepares compaction the usual way, then either stock Pi or any installed session_before_compact override produces the actual summary entry.
This approach preserves the full native compaction UX (loader, summary, queued-message flush) that would be lost by calling ctx.compact() directly.
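The trigger mechanism can be sketched like this. The inflation trick itself mirrors the README's description (bump usage.totalTokens past the window so _checkCompaction() fires), but the types and the function below are hypothetical stand-ins, not Pi's real internal API.

```typescript
interface Usage { totalTokens: number; }
interface AssistantMessage { role: "assistant"; usage: Usage; }

// After each agent run: if percent-used crosses the model's threshold,
// inflate the last assistant message's reported usage past the context
// window so Pi's own compaction check fires its normal pipeline.
function maybeTriggerCompaction(
  last: AssistantMessage,
  contextWindow: number,
  thresholdPercent: number,
): boolean {
  const usedPercent = (last.usage.totalTokens / contextWindow) * 100;
  if (usedPercent < thresholdPercent) return false;
  // Ephemeral mutation: compaction rebuilds messages from the
  // session file, so the inflated value is never persisted.
  last.usage.totalTokens = contextWindow + 1;
  return true;
}

const msg: AssistantMessage = { role: "assistant", usage: { totalTokens: 170_000 } };
maybeTriggerCompaction(msg, 200_000, 85); // true: 85% used meets the 85 threshold
msg.usage.totalTokens;                    // now 200_001, past the window
```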