pi-airgun
Pi extensions for LLM context compression and Anthropic prompt caching. Zero LLM inference cost.
Package details
Install pi-airgun from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-airgun

- Package: pi-airgun
- Version: 0.1.0
- Published: Mar 23, 2026
- Downloads: 49/mo · 15/wk
- Author: blai
- License: MIT
- Type: extension
- Size: 75.6 KB
- Dependencies: 2 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./extensions"
]
}

Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-airgun
Two pi extensions for LLM context compression and Anthropic prompt caching. Zero LLM inference cost. No build step.
Install
# From GitHub
pi install github:blai/pi-airgun
# From npm
pi install npm:pi-airgun
Extensions
compress – Context Compressor
Intercepts every tool result and the LLM context window to remove token waste through a two-pipeline architecture.
Footer widget: ~1,234 tok saved (28%)
Command: /compress-stats → per-session breakdown (tiktoken-exact for immediate, ÷3.5 estimate for deferred); both counters are sketched below.
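A minimal sketch of those two counters, assuming js-tiktoken's getEncoding API; the function names are illustrative:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base");

// Exact BPE count, used for the immediate pipeline where savings are
// reported precisely.
export function exactTokens(text: string): number {
  return enc.encode(text).length;
}

// Fast chars ÷ 3.5 estimate, used for the deferred pipeline, which
// re-runs over the whole context every turn.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);
}
```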
Two-pipeline architecture
The split exists because some compressions are safe to persist (they improve human readability), while others would make the TUI and session files confusing to read.
| Pipeline | Hook | Stages | Persisted to session? | Token tracking |
|---|---|---|---|---|
| Immediate | tool_result | ansi → whitespace | ✅ yes | tiktoken exact |
| Deferred | context (deep copy) | sep-norm → dedup → paths → toon → dyn-tokens | ❌ no (LLM only) | ÷3.5 estimate |
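A rough sketch of how the two hooks might dispatch the pipelines; the pi extension API shape shown here is hypothetical, only the two pipeline functions come from this package:

```ts
// Hypothetical hook wiring; the real pi extension API may differ.
import { runImmediatePipeline, runDeferredPipeline } from "./pipeline";

export default function register(pi: any) {
  // Immediate: rewrite the tool result in place, so the compressed text
  // is what lands in both the TUI and the session file.
  pi.on("tool_result", (result: { content: string }) => {
    result.content = runImmediatePipeline(result.content);
  });

  // Deferred: operate on a deep copy, so only the LLM sees the
  // compressed form while the session keeps the originals.
  pi.on("context", (messages: { content: string }[]) => {
    const copy = structuredClone(messages);
    for (const msg of copy) msg.content = runDeferredPipeline(msg.content);
    return copy;
  });
}
```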
Immediate stages (tool_result: what you see in the TUI)
| Stage | What it removes | Why safe to persist |
|---|---|---|
| ansi | \x1b[31m…\x1b[0m escape codes | Noise in both UI and session files; stripping improves readability |
| whitespace | Trailing spaces; 3+ blank lines → 2 | Improves TUI output density |
Uses node:util.stripVTControlCharacters (Node 16.11+ built-in, zero deps).
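A minimal sketch of both immediate stages; stripVTControlCharacters is the real built-in, the stage function names are illustrative:

```ts
import { stripVTControlCharacters } from "node:util";

// ansi stage: remove \x1b[31m…\x1b[0m style escape sequences using the
// Node 16.11+ built-in, no extra dependency.
export function ansi(text: string): string {
  return stripVTControlCharacters(text);
}

// whitespace stage: drop trailing spaces/tabs, then collapse runs of 3+
// blank lines (4+ consecutive newlines) down to 2 blank lines.
export function whitespace(text: string): string {
  return text.replace(/[ \t]+$/gm, "").replace(/\n{4,}/g, "\n\n\n");
}
```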
Deferred stages (context hook: LLM only, originals stay in the TUI)
| Stage | What it does | Why deferred only |
|---|---|---|
| sep-norm | Long separator lines → 8 chars | Truncated separators are confusing to read in the TUI |
| dedup | [4×] same-line markers | Markers are confusing to read in the TUI |
| paths | $WS/$HOME sigils + legend | Sigils are confusing in the TUI |
| toon | JSON → TOON format | Different syntax, unreadable without knowing TOON |
| dyn-tokens | Dynamic $T1, $T2 sigils for repeated long tokens | Confusing without the legend context |
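As a concrete example, a minimal sketch of the dedup stage, folding runs of identical consecutive lines into one line plus a count marker; the function name and marker placement are illustrative:

```ts
// Fold consecutive duplicate lines into one line with an [N×] marker.
export function dedup(text: string): string {
  const out: string[] = [];
  let prev: string | null = null;
  let count = 0;
  const flush = () => {
    if (prev !== null) out.push(count > 1 ? `${prev} [${count}×]` : prev);
  };
  for (const line of text.split("\n")) {
    if (line === prev) {
      count++;
      continue;
    }
    flush();
    prev = line;
    count = 1;
  }
  flush();
  return out.join("\n");
}
```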
TOON (Token-Oriented Object Notation) collapses uniform JSON arrays into CSV-style tables:
# Before (JSON, 2680 chars)
[{"id":1,"name":"User 0","role":"admin","active":true},
 {"id":2,"name":"User 1","role":"user","active":true},
 ...]

# After (TOON, 858 chars, -68%)
users[20]{id,name,role,active}:
1,User 0,admin,true
2,User 1,user,true
...

Lossless round-trip. Only applied when the TOON form is ≥10% shorter.
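A sketch of how the toon stage might apply that threshold, assuming @toon-format/toon exports an encode() function:

```ts
import { encode } from "@toon-format/toon";

export function toon(text: string): string {
  try {
    const value = JSON.parse(text);
    const encoded = encode(value);
    // Swap in TOON only when it is at least 10% shorter than the JSON.
    return encoded.length <= text.length * 0.9 ? encoded : text;
  } catch {
    return text; // not valid JSON, leave the result untouched
  }
}
```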
Dependencies

| Package | Version | Purpose |
|---|---|---|
| @toon-format/toon | ^2.1.0 | JSON → TOON encoding |
| js-tiktoken | ^1.0.21 | Exact BPE token counting (cl100k_base) |
cache – Anthropic Prompt Caching
Adds cache_control: { type: "ephemeral" } to every Anthropic API request, enabling automatic caching.
How it works: Anthropic places the cache breakpoint at the last cacheable block automatically and moves it forward as the conversation grows. On a cache hit, the cached prefix is charged at 10% of the normal input-token price.
Economics: break-even after roughly 2-3 turns. Cache writes are billed at 1.25× the base input price and cache reads at 0.1×, so a single hit on a cached prefix already outweighs the 25% write premium; from turn 3 on, the system prompt, tool definitions, and early conversation history are all read from cache at the 10× discount.
Minimum prompt: 1024–4096 tokens depending on the model (Anthropic silently skips caching for shorter prompts: no error, no extra charge).
Provider guard: only adds cache_control when payload.model starts with "claude-". OpenAI, Google, and other providers are passed through unchanged.
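A minimal sketch of such a before_provider_request handler; the hook name comes from this package, the payload shape is hypothetical, and the cache_control field follows the Anthropic Messages API:

```ts
// Hypothetical payload shape; only cache_control follows the real API.
type Block = { type: string; text?: string; cache_control?: { type: "ephemeral" } };
type Payload = { model: string; system?: Block[]; messages?: unknown[] };

export function beforeProviderRequest(payload: Payload): Payload {
  // Provider guard: only Anthropic models understand cache_control.
  if (!payload.model.startsWith("claude-")) return payload;
  const blocks = payload.system;
  if (blocks && blocks.length > 0) {
    // Mark the last system block; Anthropic caches the request prefix
    // up to the last breakpoint it finds.
    blocks[blocks.length - 1].cache_control = { type: "ephemeral" };
  }
  return payload;
}
```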
Files
pi-airgun/
├── package.json
├── README.md
├── vitest.config.ts
├── tsconfig.json
├── tests/
│   ├── pipeline.test.ts       runImmediatePipeline / runDeferredPipeline integration
│   ├── stages/
│   │   ├── ansi.test.ts
│   │   ├── whitespace.test.ts
│   │   ├── dedup.test.ts
│   │   ├── separator.test.ts
│   │   ├── paths.test.ts
│   │   ├── toon.test.ts
│   │   ├── tokens.test.ts
│   │   └── tokens_dyn.test.ts
│   └── bench/
│       ├── stages.bench.ts    Per-stage microbenchmarks
│       └── pipeline.bench.ts  Full pipeline benchmarks + cache effectiveness
└── extensions/
    ├── compress/
    │   ├── index.ts           Extension entry: session_start / tool_result / context hooks + /compress-stats
    │   ├── pipeline.ts        runImmediatePipeline() + runDeferredPipeline()
    │   └── stages/
    │       ├── ansi.ts        node:util.stripVTControlCharacters wrapper
    │       ├── whitespace.ts  Normalize blank lines and trailing spaces
    │       ├── dedup.ts       Consecutive duplicate line folding
    │       ├── separator.ts   Separator line normalizer
    │       ├── paths.ts       Path → $WS/$HOME sigil compression
    │       ├── toon.ts        JSON → TOON encoding via @toon-format/toon
    │       ├── tokens.ts      js-tiktoken wrapper + fast ÷3.5 estimator
    │       └── tokens_dyn.ts  Dynamic repeated-token compressor
    └── cache/
        └── index.ts           before_provider_request → add Anthropic cache_control
Development
# Install dependencies
npm install
# Run tests
npm test
# Run benchmarks
npm run bench
# Type check
npm run typecheck
Ideas for future stages
- Log timestamp folding: group log lines by repeating prefix, show count + time range
- Import block dedup: for code file reads, deduplicate repeated import sections
- Cross-message dedup: if the same file is read twice, replace the second with a back-reference
- 1-hour cache TTL: add ttl: "1h" to cache_control for long sessions