pi-ultra-compact

Advanced compaction extension and skill for Pi with automatic threshold-based compaction and critical context preservation

Install pi-ultra-compact from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-ultra-compact

Package: pi-ultra-compact
Version: 0.8.0
Published: Jun 17, 2026
Downloads: 2,428/mo · 2,428/wk
Author: realvendex
License: MIT
Types: extension, skill
Size: 87.3 KB
Dependencies: 0 dependencies · 3 peers

Pi manifest JSON
{
  "extensions": [
    "./extensions"
  ],
  "skills": [
    "./skills"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-ultra-compact

Advanced compaction extension and skill for Pi with automatic threshold-based compaction and support for 200+ models.

Features

  • /ultracompact command for manual compaction
  • Auto-adapts threshold to model's context window (60-80% of max)
  • 200+ models supported: OpenAI, Anthropic, Google, DeepSeek, Meta, Mistral, Qwen, and more
  • Graduated Eviction (4 levels): strips reasoning, bulk outputs, artifacts, then messages
  • Generational Compaction: micro (fast, no LLM) at 60-90%, full at 90%+
  • Preemptive Trigger: fires before next turn, never pays latency during user turns
  • Cache-Aware Compaction: immutable summary blocks keep prompt cache warm
  • Circuit Breaker: 3 strikes → lossy truncation fallback, session never dies
  • Hierarchical summarization with entropy-based information extraction
  • Critical context preservation: goals, decisions, errors, file paths
  • Extension + Skill: works as both a Pi extension and a skill
  • Smart model switching: remembers per-model thresholds and preserves custom settings
  • Conversation structure detection: identifies turns, phases, and progress
  • Multi-pass summarization: progressive compression with quality scoring
  • LLM-based summarization: optional AI-powered compression (useLLM config)
  • Content-aware token counting: dynamic ratios for code, prose, and whitespace
  • Compact section templates: shorter headers, condensed formatting, saves 10-15% more tokens
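
To make "content-aware token counting" concrete, here is a minimal sketch of the idea: classify each line as code or prose and apply a different characters-per-token ratio. The ratios, the looksLikeCode heuristic, and the function names are all assumptions for illustration; the README does not state the values or logic the package uses.

```typescript
// Hypothetical ratios (characters per token); not taken from the package.
const CHARS_PER_TOKEN = { code: 3, prose: 4 } as const;

// Crude heuristic (an assumption): a line counts as code if it is indented
// by two or more spaces or contains punctuation common in source code.
function looksLikeCode(line: string): boolean {
  return /^\s{2,}\S/.test(line) || /[{};=()<>]/.test(line);
}

// Estimate tokens line by line; blank lines are skipped in this sketch.
function estimateTokens(text: string): number {
  let tokens = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;
    const ratio = looksLikeCode(line)
      ? CHARS_PER_TOKEN.code
      : CHARS_PER_TOKEN.prose;
    tokens += Math.ceil(trimmed.length / ratio);
  }
  return tokens;
}
```

With these assumed ratios, a 12-character line of code is estimated at 4 tokens and a 12-character line of prose at 3.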

Installation

pi install npm:pi-ultra-compact

Quick Start

After installing the package and restarting Pi, run:

/ultracompact

This triggers a manual compaction.

Auto-compaction runs when context usage exceeds 80% of your model's context window.

Supported Models

Provider | Models | Context Window
OpenAI | GPT-5/5.1/5.2, GPT-4.1, GPT-4o, O3, O4-mini | 8K - 1M tokens
Anthropic | Claude 4.5/4.0/3.7/3.5/3 | 200K tokens
Google | Gemini 2.5/2.0/1.5, Gemma 3/2 | 32K - 2M tokens
DeepSeek | V4 Pro, V3, V2.5, R1 | 64K - 1M tokens
Meta | Llama 4, 3.3, 3.1, 3, 2 | 4K - 1M tokens
Mistral | Medium 3.5, Large 3, Small 4, Codestral | 32K - 256K tokens
Qwen | Qwen3, Qwen2.5, Qwen2 | 32K - 128K tokens
Microsoft | Phi-4, Phi-3, Phi-2 | 2K - 32K tokens
xAI | Grok 3, Grok 2 | 8K - 131K tokens
Cohere | Command R+ | 128K tokens
Yi | Yi-1.5, Yi-34B | 4K - 200K tokens

How It Works

Three-Tier System

  1. Preemptive check (every turn): Projects the next turn's token usage. If the projection exceeds 60% of the context window, it triggers micro-compaction.
  2. Micro-compaction (60-90% usage): Strips reasoning blocks and bulk tool outputs. No LLM call. Runs in microseconds.
  3. Full compaction (90%+ usage): Graduated eviction preconditions the input, then structured summarization produces the final compacted context.
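
The tier boundaries above can be sketched as a small selection function. This is only an illustration of the 60% and 90% bands described in the list; the names selectTier, MICRO_START, and FULL_START are hypothetical, and the README does not say how the next-turn projection is computed, so the caller supplies it here.

```typescript
type Tier = "none" | "micro" | "full";

// Band edges from the list above: micro above 60% usage, full at 90%+.
const MICRO_START = 0.6;
const FULL_START = 0.9;

// projectedTokens is the estimated usage for the next turn (assumed input).
function selectTier(projectedTokens: number, contextWindow: number): Tier {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  const usage = projectedTokens / contextWindow;
  if (usage >= FULL_START) return "full";
  if (usage > MICRO_START) return "micro";
  return "none";
}

// Example with a 200K-token window.
console.log(selectTier(100_000, 200_000)); // 50% usage
console.log(selectTier(140_000, 200_000)); // 70% usage
console.log(selectTier(190_000, 200_000)); // 95% usage
```

Checking the full tier first keeps the bands mutually exclusive, so a single call never selects both tiers.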

Eviction Levels

Level | What it strips | When
1 | Assistant thinking/reasoning blocks | Always (harmless removal)
2 | Bulk tool outputs (>100 lines, >5K chars) | Most sessions
3 | All non-error tool results | Heavy sessions
4 | Oldest non-protected messages | Only when necessary
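
As a rough illustration of how the four levels could stack, the sketch below applies levels 1 through maxLevel in order. The Message shape, the placeholder text, and the evict function are hypothetical. It also assumes that "bulk" means more than 100 lines or more than 5,000 characters, and that system and user messages are the protected ones; the README only confirms that user messages are never stripped.

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  reasoning?: string; // assistant thinking block, if present
  isError?: boolean;  // marks a failed tool result
}

const PLACEHOLDER = "[tool output removed]";

// Assumed reading of the level 2 rule: either limit makes an output "bulk".
function isBulk(content: string): boolean {
  return content.split("\n").length > 100 || content.length > 5000;
}

// Applies eviction levels 1..maxLevel to a copy of the messages.
// dropOldest: how many unprotected messages level 4 may remove.
function evict(
  messages: Message[],
  maxLevel: 1 | 2 | 3 | 4,
  dropOldest = 0,
): Message[] {
  let out: Message[] = messages.map((m) => ({ ...m }));
  // Level 1: strip reasoning blocks.
  for (const m of out) delete m.reasoning;
  if (maxLevel >= 2) {
    // Level 2: replace bulk tool outputs.
    for (const m of out) {
      if (m.role === "tool" && isBulk(m.content)) m.content = PLACEHOLDER;
    }
  }
  if (maxLevel >= 3) {
    // Level 3: replace every tool result that is not an error.
    for (const m of out) {
      if (m.role === "tool" && !m.isError) m.content = PLACEHOLDER;
    }
  }
  if (maxLevel >= 4) {
    // Level 4: drop the oldest unprotected messages first.
    let remaining = dropOldest;
    out = out.filter((m) => {
      const isProtected = m.role === "system" || m.role === "user";
      if (isProtected || remaining <= 0) return true;
      remaining -= 1;
      return false;
    });
  }
  return out;
}
```

Each level only adds to what the previous one removed, so a caller can stop at the lowest level that frees enough tokens.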

Safety Systems

  • Snapshot-rollback: Messages are deep-copied before compaction. If anything fails, the original is preserved.
  • Circuit breaker: After 3 consecutive failures, falls back to lossy truncation (keep system + last 10 turns).
  • User messages inviolable: Never stripped regardless of token pressure.
  • Cache-aware mode: Previous summaries stay immutable — only new content pays prefill cost.
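
The snapshot-rollback and circuit-breaker behavior can be sketched as a wrapper around any compaction function. This is a sketch under stated assumptions, not the package's implementation: makeSafeCompactor and lossyTruncate are hypothetical names, a "turn" is approximated as one non-system message, and the breaker here stays open until a compaction succeeds.

```typescript
interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

type Compactor = (messages: Msg[]) => Msg[];

// Lossy fallback: keep system messages plus the last keepLast others.
function lossyTruncate(messages: Msg[], keepLast: number): Msg[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  return [...system, ...(keepLast > 0 ? rest.slice(-keepLast) : [])];
}

function makeSafeCompactor(
  compact: Compactor,
  maxFailures = 3,
  keepLast = 10,
): Compactor {
  let consecutiveFailures = 0;
  return (messages: Msg[]): Msg[] => {
    // Snapshot: the compactor works on a deep copy (JSON round trip is
    // enough for plain data), so the original survives any failure.
    const workingCopy = JSON.parse(JSON.stringify(messages)) as Msg[];
    try {
      const result = compact(workingCopy);
      consecutiveFailures = 0; // a success closes the breaker
      return result;
    } catch {
      consecutiveFailures += 1;
      if (consecutiveFailures >= maxFailures) {
        // Circuit breaker: give up on compaction and truncate instead.
        return lossyTruncate(messages, keepLast);
      }
      return messages; // rollback: hand back the untouched original
    }
  };
}
```

In this sketch the first two failures return the original messages unchanged, and the third returns the truncated history, so the session always has a usable context.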

Configuration

Default settings work out of the box. The extension auto-detects your model and sets appropriate thresholds.

Default Settings

Setting | Default | Description
thresholdTokens | Auto (60-80% of context) | When to trigger compaction
keepPercentage | 30% | Percentage of context to keep
maxKeepTokens | 30,000 | Maximum tokens to keep
autoCompact | true | Enable automatic compaction
cacheAware | false | Immutable summary blocks (saves API costs)
maxEvictionLevel | FULL_REMOVAL | Max eviction aggressiveness
outputHeadroom | 4,096 | Tokens reserved for LLM response
circuitBreakerMaxFailures | 3 | Failures before lossy truncation
preemptiveWatermark | 0.70 | Preemptive trigger level
hardWatermark | 0.95 | Reactive fallback level
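
For reference, the defaults above can be restated as a typed object. The interface name, the "auto" literal, and the keepBudget helper are hypothetical; the README does not show where these settings are supplied or how keepPercentage and maxKeepTokens interact. The helper assumes the percentage applies to the model's context window and that maxKeepTokens caps the result.

```typescript
// Typed restatement of the settings table; the shape is an assumption.
interface UltraCompactConfig {
  thresholdTokens: number | "auto";
  keepPercentage: number; // as a fraction: 0.3 means 30%
  maxKeepTokens: number;
  autoCompact: boolean;
  cacheAware: boolean;
  maxEvictionLevel: string;
  outputHeadroom: number;
  circuitBreakerMaxFailures: number;
  preemptiveWatermark: number;
  hardWatermark: number;
}

const DEFAULTS: UltraCompactConfig = {
  thresholdTokens: "auto",
  keepPercentage: 0.3,
  maxKeepTokens: 30_000,
  autoCompact: true,
  cacheAware: false,
  maxEvictionLevel: "FULL_REMOVAL",
  outputHeadroom: 4_096,
  circuitBreakerMaxFailures: 3,
  preemptiveWatermark: 0.7,
  hardWatermark: 0.95,
};

// Assumed interaction: keep a percentage of the window, capped by maxKeepTokens.
function keepBudget(
  contextWindow: number,
  cfg: UltraCompactConfig = DEFAULTS,
): number {
  const byPercentage = Math.round(contextWindow * cfg.keepPercentage);
  return Math.min(byPercentage, cfg.maxKeepTokens);
}
```

Under that assumption, a 200K-token window keeps at most 30,000 tokens (the cap applies), while a 32K-token window keeps 9,600.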

Commands

Command | Description
/ultracompact | Trigger manual ultra-compact compaction

Model Examples

# Works with any model - threshold auto-adapts
# Claude Opus: 160,000 tokens (80% of 200K)
# GPT-5: 320,000 tokens (80% of 400K)
# Gemini 2.5 Pro: 800,000 tokens (80% of 1M)
# DeepSeek V4 Pro: 800,000 tokens (80% of 1M)
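
The arithmetic behind these examples is a single multiplication. The helper below is a hypothetical sketch (thresholdFor is not a documented function of the package), with the fraction left as a parameter because the README quotes a 60-80% range.

```typescript
// Fraction of the context window used in the examples above.
const THRESHOLD_FRACTION = 0.8;

function thresholdFor(
  contextWindow: number,
  fraction: number = THRESHOLD_FRACTION,
): number {
  if (!Number.isFinite(contextWindow) || contextWindow <= 0) {
    throw new RangeError("contextWindow must be a positive number");
  }
  return Math.round(contextWindow * fraction);
}

// Context window sizes as listed in the examples above.
const windows: Array<[string, number]> = [
  ["Claude Opus", 200_000],
  ["GPT-5", 400_000],
  ["Gemini 2.5 Pro", 1_000_000],
];
for (const [model, size] of windows) {
  console.log(`${model}: ${thresholdFor(size)} tokens`);
}
```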

Compatibility

  • Works with any Pi-compatible model
  • Compatible with gentle-engram (Engram memory backup)
  • Compatible with gentle-pi (SDD/OpenSpec)
  • No conflicts with Pi's default compaction

Changelog

See CHANGELOG.md for full version history.

v0.8.0 - Generational Compaction + Safety Systems

  • Graduated Eviction — 4-level content stripping (reasoning → bulk → artifacts → full)
  • Generational Compaction — micro (60-90%, no LLM) + full (90%+) tiers
  • Preemptive Trigger — fires at 70% watermark by projecting next turn
  • Cache-Aware Mode — immutable summary blocks preserve prompt cache
  • Snapshot-Rollback + Circuit Breaker — session never dies from bad compaction
  • 66 tests, 100% pass rate — zero regressions

v0.7.0 - Compact Templates & LLM Summarization

  • Compact section templates - shorter headers save 10-15% tokens across all conversations
  • LLM-based summarization - optional LLM-powered semantic compression
  • Content-aware token estimation - dynamic ratios for code/prose/whitespace
  • 66 tests, 100% pass rate - including 13 new effectiveness benchmarks
  • generateSummary is now async - supports LLM callback integration

v0.6.0 - Algorithm Enhancement Release

Major improvements to compaction quality and performance:

  • Smart model switching - per-model threshold memory, preserves custom settings
  • Conversation structure detection - identifies turns, phases, progress
  • Enhanced critical extraction - progress indicators, questions, user preferences
  • Multi-pass summarization - 3-pass compression with quality scoring
  • Token estimation cache - LRU cache for 3x faster performance
  • 100% test pass rate - 43 unit tests + 17 performance benchmarks

v0.5.0 - Audit & Stability Release

This release fixes 18 issues found in a comprehensive 5-agent audit:

  • 3 Critical regex bugs fixed - \b word boundaries on all patterns, no more false matches
  • Startup model detection fixed - correct threshold from boot
  • Custom thresholds preserved - across model switches
  • Null safety - guards on all message-consuming methods
  • 53-test Jest suite - comprehensive coverage
  • Dead code removed - 329-line .disabled file deleted, unused typebox dep removed

Troubleshooting

Extension not loading

  • Restart Pi after installation
  • Check that pi install npm:pi-ultra-compact completed successfully

Wrong threshold detected

  • The extension auto-detects your model from Pi config
  • Ensure your model is in the supported list (200+ models)
  • Run /ultracompact manually to see detected model and threshold in the logs

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'feat: add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Pi - The AI coding agent