pi-ultra-compact
Advanced compaction extension and skill for Pi with automatic threshold-based compaction and critical context preservation
Package details
Install pi-ultra-compact from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-ultra-compact
- Package: pi-ultra-compact
- Version: 0.8.0
- Published: Jun 17, 2026
- Downloads: 2,428/mo · 2,428/wk
- Author: realvendex
- License: MIT
- Types: extension, skill
- Size: 87.3 KB
- Dependencies: 0 dependencies · 3 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions"
  ],
  "skills": [
    "./skills"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-ultra-compact
Advanced compaction extension and skill for Pi with automatic threshold-based compaction and support for 200+ models.
Features
- `/ultracompact` command for manual compaction
- Auto-adapts threshold to model's context window (60-80% of max)
- 200+ models supported - OpenAI, Anthropic, Google, DeepSeek, Meta, Mistral, Qwen, and more
- Graduated Eviction (4 levels) — strips reasoning, bulk outputs, artifacts, then messages
- Generational Compaction — micro (fast, no LLM) at 60-90%, full at 90%+
- Preemptive Trigger — fires before next turn, never pays latency during user turns
- Cache-Aware Compaction — immutable summary blocks keep prompt cache warm
- Circuit Breaker — 3 strikes → lossy truncation fallback, session never dies
- Hierarchical summarization with entropy-based information extraction
- Critical context preservation - goals, decisions, errors, file paths
- Extension + Skill - works as both a Pi extension and a skill
- Smart model switching - remembers per-model thresholds and preserves custom settings
- Conversation structure detection - identifies turns, phases, and progress
- Multi-pass summarization — progressive compression with quality scoring
- LLM-based summarization — optional AI-powered compression (useLLM config)
- Content-aware token counting — dynamic ratios for code, prose, and whitespace
- Compact section templates — shorter headers, condensed formatting, saves 10-15% more tokens
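To make the "content-aware token counting" feature concrete, here is a minimal sketch of a per-line estimator that uses a different characters-per-token ratio for code and for prose and ignores blank lines. The ratios, the `looksLikeCode` heuristic, and the function names are all assumptions for illustration; this README does not document the values the extension actually uses.

```typescript
// Illustrative ratios only (characters per token). These are assumptions,
// not the extension's real values.
const CHARS_PER_TOKEN = { code: 3, prose: 4 } as const;

/** Crude heuristic: a line looks like code if it has typical code punctuation or is indented. */
function looksLikeCode(line: string): boolean {
  return /[{}();=<>]|^\s{2,}\S/.test(line);
}

/** Estimate tokens line by line, choosing the ratio from the line's content type. */
function estimateTokens(text: string): number {
  let tokens = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue; // whitespace-only lines are treated as free
    const ratio = looksLikeCode(line) ? CHARS_PER_TOKEN.code : CHARS_PER_TOKEN.prose;
    tokens += Math.ceil(trimmed.length / ratio);
  }
  return tokens;
}
```

With these assumed ratios, a 12-character line of code is estimated at more tokens than a 12-character line of prose, which is the general effect a content-aware counter is after.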
Installation
pi install npm:pi-ultra-compact
Quick Start
After installing the package and restarting Pi, run:
/ultracompact
This triggers a manual compaction.
Auto-compaction runs when context usage exceeds 80% of your model's context window.
Supported Models
| Provider | Models | Context Window |
|---|---|---|
| OpenAI | GPT-5/5.1/5.2, GPT-4.1, GPT-4o, O3, O4-mini | 8K - 1M tokens |
| Anthropic | Claude 4.5/4.0/3.7/3.5/3 | 200K tokens |
| Google | Gemini 2.5/2.0/1.5, Gemma 3/2 | 32K - 2M tokens |
| DeepSeek | V4 Pro, V3, V2.5, R1 | 64K - 1M tokens |
| Meta | Llama 4, 3.3, 3.1, 3, 2 | 4K - 1M tokens |
| Mistral | Medium 3.5, Large 3, Small 4, Codestral | 32K - 256K tokens |
| Qwen | Qwen3, Qwen2.5, Qwen2 | 32K - 128K tokens |
| Microsoft | Phi-4, Phi-3, Phi-2 | 2K - 32K tokens |
| xAI | Grok 3, Grok 2 | 8K - 131K tokens |
| Cohere | Command R+ | 128K tokens |
| Yi | Yi-1.5, Yi-34B | 4K - 200K tokens |
How It Works
Three-Tier System
- Preemptive check (every turn): Projects next turn's token usage. If projected > 60% of context, triggers micro-compaction.
- Micro-compaction (60-90% usage): Strips reasoning blocks + bulk tool outputs. No LLM call. Runs in microseconds.
- Full compaction (90%+ usage): Graduated eviction preconditions the input, then structured summarization produces the final compacted context.
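The tier decision described above can be sketched as a small function. This is a sketch under stated assumptions, not the extension's code: `selectTier`, `Tier`, and the parameter names are hypothetical. The 60% and 90% boundaries come from this section; the settings table lists a `preemptiveWatermark` of 0.70, so the lower boundary is left as a parameter.

```typescript
// Hypothetical names; only the 60% and 90% boundaries come from the README.
type Tier = "none" | "micro" | "full";

/**
 * Pick a compaction tier from the projected usage of the next turn.
 * projectedTokens = tokens already in context plus an estimate for the next turn.
 */
function selectTier(
  projectedTokens: number,
  contextWindow: number,
  microWatermark = 0.6, // micro-compaction starts above this fraction
  fullWatermark = 0.9,  // full compaction starts at this fraction
): Tier {
  if (contextWindow <= 0) throw new RangeError("contextWindow must be positive");
  const usage = projectedTokens / contextWindow;
  if (usage >= fullWatermark) return "full";
  if (usage > microWatermark) return "micro";
  return "none";
}
```

Running the check on the projected usage rather than the current usage is what lets compaction happen before the next turn instead of during it.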
Eviction Levels
| Level | What it strips | When |
|---|---|---|
| 1 | Assistant thinking/reasoning blocks | Always (harmless removal) |
| 2 | Bulk tool outputs (>100 lines, >5K chars) | Most sessions |
| 3 | All non-error tool results | Heavy sessions |
| 4 | Oldest non-protected messages | Only when necessary |
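The table above can be read as a loop that applies one level at a time and stops once the context fits a token budget. The sketch below illustrates that reading. The `Message` shape, the function names, and the choice to treat user and system messages as the "protected" set are assumptions; only the four levels and the level 2 limits (more than 100 lines or more than 5K characters) come from the table.

```typescript
// Hypothetical message shape; the extension's real types are not shown in this README.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  kind: "text" | "reasoning" | "tool_result";
  content: string;
  isError?: boolean;
}

const BULK_LINES = 100;   // level 2: more than 100 lines
const BULK_CHARS = 5_000; // level 2: more than 5K characters

function isProtected(m: Message): boolean {
  return m.role === "user" || m.role === "system"; // assumption: the protected set
}

function isBulk(m: Message): boolean {
  return m.content.split("\n").length > BULK_LINES || m.content.length > BULK_CHARS;
}

/** True when levels 1 to 3 would strip this message at the given level. */
function shouldStrip(m: Message, level: number): boolean {
  if (isProtected(m)) return false;
  if (level >= 1 && m.kind === "reasoning") return true;
  if (m.kind !== "tool_result") return false;
  if (level >= 2 && isBulk(m)) return true;
  if (level >= 3 && !m.isError) return true;
  return false;
}

/** Apply levels 1..maxLevel in order, stopping as soon as the budget is met. */
function evict(
  messages: Message[],
  maxLevel: number,
  budget: number,
  countTokens: (ms: Message[]) => number,
): Message[] {
  let current = messages;
  for (let level = 1; level <= Math.min(maxLevel, 3); level++) {
    current = current.filter((m) => !shouldStrip(m, level));
    if (countTokens(current) <= budget) return current;
  }
  if (maxLevel >= 4) {
    // Level 4: drop the oldest non-protected messages until the budget is met.
    current = [...current];
    let i = 0;
    while (i < current.length && countTokens(current) > budget) {
      if (isProtected(current[i])) i++;
      else current.splice(i, 1);
    }
  }
  return current;
}
```

Level 1 always runs in this sketch, matching the "Always" entry in the table; higher levels run only while the result is still over budget.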
Safety Systems
- Snapshot-rollback: Messages are deep-copied before compaction. If anything fails, the original is preserved.
- Circuit breaker: After 3 consecutive failures, falls back to lossy truncation (keep system + last 10 turns).
- User messages inviolable: Never stripped regardless of token pressure.
- Cache-aware mode: Previous summaries stay immutable — only new content pays prefill cost.
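A minimal sketch of how snapshot-rollback and the circuit breaker could fit together is shown below. The class name, the `Message` shape, and the definition of a "turn" as starting at a user message are assumptions; the behaviour taken from the list above is the deep copy before compaction, the 3 consecutive failures, and the fallback that keeps the system prompt plus the last 10 turns.

```typescript
// Hypothetical shapes and names, for illustration only.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const MAX_FAILURES = 3;    // strikes before the breaker opens
const FALLBACK_TURNS = 10; // turns kept by lossy truncation

/** Lossy fallback: keep system messages plus everything from the 10th-last user message on. */
function lossyTruncate(messages: Message[]): Message[] {
  const userIdx: number[] = [];
  messages.forEach((m, i) => {
    if (m.role === "user") userIdx.push(i);
  });
  const start = userIdx.length > FALLBACK_TURNS ? userIdx[userIdx.length - FALLBACK_TURNS] : 0;
  return messages.filter((m, i) => m.role === "system" || i >= start);
}

class SafeCompactor {
  private failures = 0;
  private readonly compact: (ms: Message[]) => Message[];

  constructor(compact: (ms: Message[]) => Message[]) {
    this.compact = compact;
  }

  run(messages: Message[]): Message[] {
    // Breaker already open: skip compaction and truncate.
    if (this.failures >= MAX_FAILURES) return lossyTruncate(messages);
    // Snapshot: compaction works on a deep copy, so the original is never mutated.
    const snapshot: Message[] = JSON.parse(JSON.stringify(messages));
    try {
      const result = this.compact(snapshot);
      this.failures = 0; // only consecutive failures count
      return result;
    } catch {
      this.failures += 1;
      // Roll back to the untouched original, or truncate on the third strike.
      return this.failures >= MAX_FAILURES ? lossyTruncate(messages) : messages;
    }
  }
}
```

Because the compactor only ever sees the copy, a failure halfway through leaves the caller's messages exactly as they were, which is what makes the rollback free.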
Configuration
Default settings work out of the box. The extension auto-detects your model and sets appropriate thresholds.
Default Settings
| Setting | Default | Description |
|---|---|---|
| `thresholdTokens` | Auto (60-80% of context) | When to trigger compaction |
| `keepPercentage` | 30% | Percentage of context to keep |
| `maxKeepTokens` | 30,000 | Maximum tokens to keep |
| `autoCompact` | true | Enable automatic compaction |
| `cacheAware` | false | Immutable summary blocks (saves API costs) |
| `maxEvictionLevel` | FULL_REMOVAL | Max eviction aggressiveness |
| `outputHeadroom` | 4,096 | Tokens reserved for LLM response |
| `circuitBreakerMaxFailures` | 3 | Failures before lossy truncation |
| `preemptiveWatermark` | 0.70 | Preemptive trigger level |
| `hardWatermark` | 0.95 | Reactive fallback level |
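For readers who prefer to see the defaults as a typed object, the sketch below mirrors the settings table. The interface name, the way options reach the extension, and the `keepBudget` helper are assumptions; in particular, combining `keepPercentage` and `maxKeepTokens` by taking the smaller of the two is one plausible reading of the table, not documented behaviour.

```typescript
// Field names and defaults mirror the settings table; everything else is hypothetical.
interface UltraCompactConfig {
  thresholdTokens: number | "auto"; // "auto" = 60-80% of the context window
  keepPercentage: number;           // fraction of the context window (0.30 = 30%)
  maxKeepTokens: number;
  autoCompact: boolean;
  cacheAware: boolean;
  maxEvictionLevel: string;         // only FULL_REMOVAL is named in the README
  outputHeadroom: number;
  circuitBreakerMaxFailures: number;
  preemptiveWatermark: number;
  hardWatermark: number;
}

const DEFAULTS: UltraCompactConfig = {
  thresholdTokens: "auto",
  keepPercentage: 0.3,
  maxKeepTokens: 30_000,
  autoCompact: true,
  cacheAware: false,
  maxEvictionLevel: "FULL_REMOVAL",
  outputHeadroom: 4_096,
  circuitBreakerMaxFailures: 3,
  preemptiveWatermark: 0.7,
  hardWatermark: 0.95,
};

/** Assumed rule: keep keepPercentage of the window, capped at maxKeepTokens. */
function keepBudget(contextWindow: number, cfg: UltraCompactConfig = DEFAULTS): number {
  return Math.min(Math.round(contextWindow * cfg.keepPercentage), cfg.maxKeepTokens);
}
```

Under that assumed rule, the 30,000-token cap is the binding limit for any model with a context window above 100K tokens, and the 30% figure only matters for smaller windows.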
Commands
| Command | Description |
|---|---|
| `/ultracompact` | Trigger manual ultra-compact compaction |
Model Examples
# Works with any model - threshold auto-adapts
# Claude Opus: 160,000 tokens (80% of 200K)
# GPT-5: 320,000 tokens (80% of 400K)
# Gemini 2.5 Pro: 800,000 tokens (80% of 1M)
# DeepSeek V4 Pro: 800,000 tokens (80% of 1M)
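The arithmetic behind these examples is a single multiplication. The sketch below reproduces it; `autoThreshold` is a hypothetical helper name, and the context window sizes are the ones quoted in the examples above.

```typescript
/** Threshold in tokens for a given context window; ratio must stay within the documented 60-80% range. */
function autoThreshold(contextWindow: number, ratio = 0.8): number {
  if (ratio < 0.6 || ratio > 0.8) {
    throw new RangeError("ratio must be between 0.6 and 0.8");
  }
  return Math.round(contextWindow * ratio);
}

// Context windows as quoted in the examples above.
const claudeOpus = autoThreshold(200_000);     // 160,000
const gpt5 = autoThreshold(400_000);           // 320,000
const gemini25Pro = autoThreshold(1_000_000);  // 800,000
```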
Compatibility
- Works with any Pi-compatible model
- Compatible with gentle-engram (Engram memory backup)
- Compatible with gentle-pi (SDD/OpenSpec)
- No conflicts with Pi's default compaction
Changelog
See CHANGELOG.md for full version history.
v0.8.0 - Generational Compaction + Safety Systems
- Graduated Eviction — 4-level content stripping (reasoning → bulk → artifacts → full)
- Generational Compaction — micro (60-90%, no LLM) + full (90%+) tiers
- Preemptive Trigger — fires at 70% watermark by projecting next turn
- Cache-Aware Mode — immutable summary blocks preserve prompt cache
- Snapshot-Rollback + Circuit Breaker — session never dies from bad compaction
- 66 tests, 100% pass rate — zero regressions
v0.7.0 - Compact Templates & LLM Summarization
- Compact section templates - shorter headers save 10-15% tokens across all conversations
- LLM-based summarization - optional LLM-powered semantic compression
- Content-aware token estimation - dynamic ratios for code/prose/whitespace
- 66 tests, 100% pass rate - including 13 new effectiveness benchmarks
- generateSummary is now async - supports LLM callback integration
v0.6.0 - Algorithm Enhancement Release
Major improvements to compaction quality and performance:
- Smart model switching - per-model threshold memory, preserves custom settings
- Conversation structure detection - identifies turns, phases, progress
- Enhanced critical extraction - progress indicators, questions, user preferences
- Multi-pass summarization - 3-pass compression with quality scoring
- Token estimation cache - LRU cache for 3x faster performance
- 100% test pass rate - 43 unit tests + 17 performance benchmarks
v0.5.0 - Audit & Stability Release
This release fixes 18 issues found via a comprehensive 5-agent audit:
- 3 critical regex bugs fixed - `\b` word boundaries on all patterns, no more false matches
- Startup model detection fixed - correct threshold from boot
- Custom thresholds preserved - across model switches
- Null safety - guards on all message-consuming methods
- 53-test Jest suite - comprehensive coverage
- Dead code removed - 329-line `.disabled` file deleted, unused `typebox` dep removed
Troubleshooting
Extension not loading
- Restart Pi after installation
- Check that `pi install npm:pi-ultra-compact` completed successfully
Wrong threshold detected
- The extension auto-detects your model from Pi config
- Ensure your model is in the supported list (200+ models)
- Run `/ultracompact` manually to see the detected model and threshold in the logs
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Pi - The AI coding agent