pi-model-aware-compaction

Per-model context-usage thresholds for Pi's built-in auto-compaction, so models with different context windows and performance profiles compact at the right time

Package details

extension

Install pi-model-aware-compaction from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-model-aware-compaction
Package
pi-model-aware-compaction
Version
0.1.4
Published
Apr 5, 2026
Downloads
398/mo · 31/wk
Author
w-winter
License
MIT
Types
extension
Size
19.5 KB
Dependencies
0 dependencies · 1 peer
Pi manifest JSON
{
  "extensions": [
    "extensions/model-aware-compaction/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

Model-Aware Compaction for Pi (pi-model-aware-compaction)

Per-model context-usage thresholds for Pi's compaction pipeline, because different models have different context windows and different performance profiles near their context window limits.

This extension nudges Pi's native compaction pipeline at configurable percent-used thresholds, preserving the full built-in UX (loader, queued-message flush, and whichever compaction summary implementation ultimately handles session_before_compact).

Install

From npm:

pi install npm:pi-model-aware-compaction

From the dot314 git bundle (filtered install):

{
  "packages": [
    {
      "source": "git:github.com/w-winter/dot314",
      "extensions": ["extensions/model-aware-compaction/index.ts"],
      "skills": [],
      "themes": [],
      "prompts": []
    }
  ]
}

Requirements

Pi auto-compaction must be enabled in ~/.pi/agent/settings.json:

{ "compaction": { "enabled": true } }

Compatible with compaction-summary extensions that hook session_before_compact, since it triggers Pi's normal compaction pipeline rather than calling ctx.compact() directly. Said differently, this package decides when compaction starts; stock Pi or your summary extension decides what summary gets written.

Configuration

Copy config.json.example to config.json in the extension's directory and edit:

{
  "global": 70,
  "models": {
    "claude-opus-4-6": 85,
    "gpt-5.2*": 75
  }
}

Key      Purpose
global   Default threshold (percent used) for models without a specific override
models   Per-model overrides keyed by model ID; supports * wildcards

Compaction triggers when used% >= threshold.
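
The threshold-resolution rules above can be sketched as follows. This is an illustrative sketch only — the function and type names (`thresholdFor`, `shouldCompact`, `CompactionConfig`) are assumptions for exposition, not the extension's actual internals:

```typescript
// Illustrative sketch of per-model threshold resolution.
// Names here are assumptions, not the extension's real API.
interface CompactionConfig {
  global: number;                  // default percent-used threshold
  models: Record<string, number>;  // per-model overrides; keys may contain *
}

// Convert a "*"-wildcard pattern into an anchored RegExp.
function patternToRegex(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Pick the threshold for a model ID: an exact key wins, then the first
// matching wildcard pattern, then the global default.
function thresholdFor(config: CompactionConfig, modelId: string): number {
  if (modelId in config.models) return config.models[modelId];
  for (const [pattern, value] of Object.entries(config.models)) {
    if (pattern.includes("*") && patternToRegex(pattern).test(modelId)) {
      return value;
    }
  }
  return config.global;
}

// Compaction triggers when used% >= threshold.
function shouldCompact(config: CompactionConfig, modelId: string,
                       usedTokens: number, contextWindow: number): boolean {
  return (usedTokens / contextWindow) * 100 >= thresholdFor(config, modelId);
}
```

With the example config above, `thresholdFor` returns 85 for `claude-opus-4-6`, 75 for any ID starting with `gpt-5.2`, and 70 for everything else.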

Tuning reserveTokens

Pi's own auto-compaction triggers when usedTokens > contextWindow - reserveTokens. If that fires before your model-aware threshold, Pi compacts first. To let model-aware thresholds take priority, lower reserveTokens:

{
  "compaction": {
    "enabled": true,
    "reserveTokens": 9000,
    "keepRecentTokens": 15000
  }
}
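
To see the interplay concretely, compare the two trigger points (numbers illustrative; the context window varies by model):

```typescript
// Compare Pi's native reserve-based trigger with a model-aware percent
// threshold. All numbers are illustrative examples.
const contextWindow = 200_000;   // e.g. a 200k-token model
const reserveTokens = 9_000;     // from the settings.json above
const threshold = 85;            // model-aware percent-used threshold

// Native trigger: usedTokens > contextWindow - reserveTokens
const nativeTriggerAt = contextWindow - reserveTokens;          // 191,000 tokens
// Model-aware trigger: used% >= threshold
const modelAwareTriggerAt = (threshold / 100) * contextWindow;  // 170,000 tokens

// The model-aware threshold takes priority only when its trigger
// point is lower than the native one.
console.log(modelAwareTriggerAt < nativeTriggerAt);  // true
```

Here the model-aware trigger fires at 170,000 tokens, well before the native trigger at 191,000, so the percent threshold governs. With a large `reserveTokens` (say 60,000), the native trigger would fire first at 140,000 tokens instead.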

How it works

After each agent run, the extension checks context usage against the model-specific threshold. When exceeded, it inflates the last assistant message's usage.totalTokens past the context window size, causing Pi's _checkCompaction() to fire its normal pipeline. The inflated value is ephemeral — compaction rebuilds messages from the session file.

That normal pipeline still prepares compaction the usual way, then either stock Pi or any installed session_before_compact override produces the actual summary entry.

This approach preserves the full native compaction UX (loader, summary, queued-message flush) that would be lost by calling ctx.compact() directly.
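
The trigger mechanism described above can be sketched roughly as follows. The type shapes and field names here are assumptions about Pi's internals made for illustration, not a verified API:

```typescript
// Rough sketch of the usage-inflation trigger. Field and type names
// are assumptions about Pi's internals, not verified API.
interface AssistantMessage {
  usage: { totalTokens: number };
}

interface SessionContext {
  contextWindow: number;
  lastAssistantMessage: AssistantMessage;
}

// After each agent run: if the model-aware threshold is exceeded,
// inflate the last assistant message's reported usage past the context
// window so Pi's own compaction check fires its normal pipeline. The
// inflated value is ephemeral — compaction rebuilds messages from the
// session file, so nothing incorrect is persisted.
function maybeTriggerCompaction(ctx: SessionContext,
                                thresholdPercent: number): void {
  const used = ctx.lastAssistantMessage.usage.totalTokens;
  const usedPercent = (used / ctx.contextWindow) * 100;
  if (usedPercent >= thresholdPercent) {
    ctx.lastAssistantMessage.usage.totalTokens = ctx.contextWindow + 1;
  }
}
```

The key design point is that the extension never calls `ctx.compact()` itself; it only makes the usage number exceed the window so Pi's own check concludes compaction is due.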