@diegopetrucci/pi-context-cap

A pi extension that caps effective model context windows at 200k tokens for earlier auto-compaction.

Package details

Install @diegopetrucci/pi-context-cap from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@diegopetrucci/pi-context-cap

Package: @diegopetrucci/pi-context-cap
Version: 0.1.8
Published: Jul 25, 2026
Downloads: 582/mo · 300/wk
Author: diegopetrucci
License: MIT
Types: extension
Size: 7.9 KB
Dependencies: 0 dependencies · 2 peers

Pi manifest JSON

{
  "extensions": [
    "index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

context-cap

A pi extension that treats large-context models as having an effective 200k-token context window, so that pi's built-in auto-compaction starts earlier and the session stays out of the "dumb zone" of very long contexts.

By default, pi auto-compacts when:

contextTokens > model.contextWindow - reserveTokens

This extension changes the active model's in-memory contextWindow to:

min(originalContextWindow, 200000)

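With pi's default reserveTokens of 16,384, a model whose context window is larger than 200k tokens now auto-compacts once its context exceeds 183,616 tokens (200,000 - 16,384). Models with a context window of 200k tokens or less are unaffected.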
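
The arithmetic above can be checked with a minimal sketch. The function and constant names here are hypothetical and chosen for illustration only; this is not the extension's source, and it only restates the two formulas given in this README.

```typescript
// Hypothetical names for illustration; not the extension's actual source.
const CAP_TOKENS = 200_000;
const DEFAULT_RESERVE_TOKENS = 16_384; // pi's default reserveTokens, per this README

// Effective window after the cap: min(originalContextWindow, 200000).
function cappedContextWindow(originalContextWindow: number, cap: number = CAP_TOKENS): number {
  return Math.min(originalContextWindow, cap);
}

// Token count above which auto-compaction triggers.
function compactionThreshold(contextWindow: number, reserveTokens: number = DEFAULT_RESERVE_TOKENS): number {
  return contextWindow - reserveTokens;
}

// README rule: contextTokens > model.contextWindow - reserveTokens.
function shouldCompact(
  contextTokens: number,
  contextWindow: number,
  reserveTokens: number = DEFAULT_RESERVE_TOKENS,
): boolean {
  return contextTokens > compactionThreshold(contextWindow, reserveTokens);
}

const oneMillion = 1_000_000;
console.log(compactionThreshold(oneMillion)); // 983616 (uncapped)
console.log(compactionThreshold(cappedContextWindow(oneMillion))); // 183616 (capped)
console.log(cappedContextWindow(128_000)); // 128000 (smaller models are untouched)
```

Because the comparison is a strict `>`, a context of exactly 183,616 tokens does not yet trigger compaction under the capped window; 183,617 does.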

Commands

/context-cap status
/context-cap off
/context-cap on
/context-cap toggle

The extension starts enabled by default. Disabling it lasts only for the current extension runtime or session: after /reload, /new, /resume, or /fork, the extension starts enabled again.
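
The command behavior described above can be modeled as a small in-memory state holder. This is a sketch under assumptions: `ContextCapState`, `ModelInfo`, and `handle` are hypothetical names, the status string is invented for illustration, and pi's real extension API for registering commands is not shown.

```typescript
// Hypothetical model of the /context-cap subcommands; not pi's extension API.
type CapCommand = "status" | "off" | "on" | "toggle";

interface ModelInfo {
  contextWindow: number;
}

class ContextCapState {
  // Starts enabled; a fresh instance models a fresh runtime
  // (after /reload, /new, /resume, or /fork).
  private enabled: boolean = true;
  private readonly originalContextWindow: number;
  private readonly cap: number;

  constructor(originalContextWindow: number, cap: number = 200_000) {
    this.originalContextWindow = originalContextWindow;
    this.cap = cap;
  }

  // Applies a subcommand, then rewrites the in-memory contextWindow to match.
  handle(command: CapCommand, model: ModelInfo): string {
    if (command === "on") this.enabled = true;
    else if (command === "off") this.enabled = false;
    else if (command === "toggle") this.enabled = !this.enabled;
    // "status" leaves the flag unchanged.

    model.contextWindow = this.enabled
      ? Math.min(this.originalContextWindow, this.cap)
      : this.originalContextWindow;
    return `context-cap ${this.enabled ? "on" : "off"}: contextWindow=${model.contextWindow}`;
  }
}

const model: ModelInfo = { contextWindow: 1_000_000 };
const state = new ContextCapState(model.contextWindow);
console.log(state.handle("status", model)); // context-cap on: contextWindow=200000
console.log(state.handle("off", model));    // context-cap off: contextWindow=1000000
console.log(state.handle("toggle", model)); // context-cap on: contextWindow=200000
```

Keeping the original window alongside the flag is what makes `off` reversible: the cap is always recomputed from the original value and never from the already-capped one.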

Install

Standalone npm package

pi install npm:@diegopetrucci/pi-context-cap

Collection package

pi install npm:@diegopetrucci/pi-extensions

GitHub package

pi install git:github.com/diegopetrucci/pi-extensions

Then reload pi:

/reload

Notes

This extension mutates pi's in-memory model metadata only. It does not edit models.json.
The cap affects any pi logic that reads model.contextWindow, including the auto-compaction threshold and the context-window display in the UI.
Because pi also uses model.contextWindow for some overflow detection, a request that succeeds above 200k tokens on a larger model may be treated as overflow and retried after compaction. Use /context-cap off if you need the full model window temporarily.