@diegopetrucci/pi-context-cap

A pi extension that caps effective model context windows at 200k tokens for earlier auto-compaction.

Install @diegopetrucci/pi-context-cap from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@diegopetrucci/pi-context-cap

Package: @diegopetrucci/pi-context-cap
Version: 0.1.2
Published: Jun 1, 2026
Downloads: 479/mo · 312/wk
Author: diegopetrucci
License: MIT
Types: extension
Size: 7.8 KB
Dependencies: 0 dependencies · 2 peers

Pi manifest JSON
{
  "extensions": [
    "index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

context-cap

A pi extension that treats large-context models as having an effective 200k-token context window, so pi's built-in auto-compaction starts earlier and the session stays out of the "dumb zone" of very long contexts.

By default, pi auto-compacts when:

contextTokens > model.contextWindow - reserveTokens

This extension changes the active model's in-memory contextWindow to:

min(originalContextWindow, 200000)

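With pi's default reserveTokens of 16,384, models whose context window is larger than 200k tokens will compact proactively once the context exceeds 183,616 tokens (200,000 - 16,384). Models with a context window of 200k tokens or less are unaffected.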
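
The arithmetic above can be sketched in a few lines of TypeScript. This is only an illustration of the two formulas in this README, not the extension's actual source; the function and constant names are hypothetical.

```typescript
// Illustrative sketch only: names are hypothetical, formulas come from this README.
const CONTEXT_CAP = 200_000;           // cap applied by the extension
const DEFAULT_RESERVE_TOKENS = 16_384; // pi's default reserveTokens

// min(originalContextWindow, 200000)
function effectiveContextWindow(originalContextWindow: number, cap: number = CONTEXT_CAP): number {
  return Math.min(originalContextWindow, cap);
}

// Threshold from: contextTokens > model.contextWindow - reserveTokens
function compactionThreshold(contextWindow: number, reserveTokens: number = DEFAULT_RESERVE_TOKENS): number {
  return contextWindow - reserveTokens;
}

function shouldCompact(
  contextTokens: number,
  contextWindow: number,
  reserveTokens: number = DEFAULT_RESERVE_TOKENS,
): boolean {
  return contextTokens > compactionThreshold(contextWindow, reserveTokens);
}

// A 1M-token model is capped to 200k, so compaction starts above 183,616 tokens.
console.log(compactionThreshold(effectiveContextWindow(1_000_000))); // 183616
// A 128k-token model is below the cap and keeps its own threshold.
console.log(compactionThreshold(effectiveContextWindow(128_000))); // 111616
```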

Commands

/context-cap status
/context-cap off
/context-cap on
/context-cap toggle

The extension starts enabled by default. Disabling it lasts only for the current extension runtime/session; after /reload, /new, /resume, or /fork, the extension starts enabled again.
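
The command behavior described above amounts to a small piece of in-memory state. The sketch below models it in isolation, assuming a hypothetical CapState class; it does not use pi's extension API and is not the extension's actual code.

```typescript
// Illustrative sketch only: CapState and CapCommand are hypothetical names.
type CapCommand = "status" | "off" | "on" | "toggle";

class CapState {
  // Starts enabled; nothing is persisted, so a fresh instance is enabled again.
  private enabled = true;

  handle(command: CapCommand): string {
    switch (command) {
      case "on":
        this.enabled = true;
        break;
      case "off":
        this.enabled = false;
        break;
      case "toggle":
        this.enabled = !this.enabled;
        break;
      case "status":
        break; // report only, no state change
    }
    return this.enabled ? "enabled" : "disabled";
  }
}

const state = new CapState();
console.log(state.handle("off"));    // disabled
console.log(state.handle("toggle")); // enabled

// A new runtime (for example after /reload) corresponds to a new instance.
console.log(new CapState().handle("status")); // enabled
```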

Install

Standalone npm package

pi install npm:@diegopetrucci/pi-context-cap

Collection package

pi install npm:@diegopetrucci/pi-extensions

GitHub package

pi install git:github.com/diegopetrucci/pi-extensions

Then reload pi:

/reload

Notes

  • This extension mutates pi's in-memory model metadata only. It does not edit models.json.
  • The cap affects pi logic that reads model.contextWindow, including auto-compaction thresholding and UI context-window display.
  • Because pi also uses model.contextWindow for some overflow detection, a request above 200k tokens that the larger model would accept may still be treated as overflow and retried after compaction. Use /context-cap off if you need the full model window temporarily.
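
As a rough illustration of the first note, capping "in memory only" can be pictured as overwriting one field on a model object while remembering the original value so it can be restored later. The ModelLike shape and the helper names below are assumptions made for this sketch; they are not pi's real types, and nothing here touches models.json.

```typescript
// Illustrative sketch only: ModelLike, applyCap and restoreWindow are hypothetical.
interface ModelLike {
  id: string;
  contextWindow: number;
}

// Original windows are remembered per model id so that turning the cap off can undo it.
const originalWindows = new Map<string, number>();

function applyCap(model: ModelLike, cap: number = 200_000): void {
  // Use the remembered original if the model was already capped once.
  const original = originalWindows.get(model.id) ?? model.contextWindow;
  originalWindows.set(model.id, original);
  model.contextWindow = Math.min(original, cap);
}

function restoreWindow(model: ModelLike): void {
  const original = originalWindows.get(model.id);
  if (original !== undefined) {
    model.contextWindow = original;
  }
}

const model: ModelLike = { id: "example-1m", contextWindow: 1_000_000 };
applyCap(model);
console.log(model.contextWindow); // 200000
restoreWindow(model);
console.log(model.contextWindow); // 1000000
```

Keeping the original value outside the model object makes applying the cap repeatedly safe in this sketch: a second applyCap call still computes the minimum against the true original window.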