pi-airgun
Pi extensions for LLM context compression and Anthropic prompt caching. Zero LLM inference cost.
Package details
Install pi-airgun from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-airgun

- Package: pi-airgun
- Version: 0.1.0
- Published: Mar 23, 2026
- Downloads: 49/mo · 15/wk
- Author: blai
- License: MIT
- Type: extension
- Size: 75.6 KB
- Dependencies: 2 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./extensions"
]
}

Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-airgun
Two pi extensions for LLM context compression and Anthropic prompt caching. Zero LLM inference cost. No build step.
Install
# From GitHub
pi install github:blai/pi-airgun
# From npm
pi install npm:pi-airgun
Extensions
compress – Context Compressor
Intercepts every tool result and the LLM context window to remove token waste through a two-pipeline architecture.
Footer widget: ~1,234 tok saved (28%)
Command: /compress-stats → per-session breakdown (tiktoken-exact for immediate, ÷3.5 estimate for deferred); both counters are sketched below.
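A minimal sketch of those two counters, assuming js-tiktoken's getEncoding API; the function names are illustrative:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base");

// Exact BPE count, used for the immediate pipeline where savings are
// reported precisely.
export function exactTokens(text: string): number {
  return enc.encode(text).length;
}

// Fast chars ÷ 3.5 estimate, used for the deferred pipeline, which
// re-runs over the whole context every turn.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);
}
```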
Two-pipeline architecture
The split exists because some compressions are safe to persist (they improve human readability), while others would make the TUI and session files confusing to read.
| Pipeline | Hook | Stages | Persisted to session? | Token tracking |
|---|---|---|---|---|
| Immediate | tool_result | ansi → whitespace | ✅ yes | tiktoken exact |
| Deferred | context (deep copy) | sep-norm → dedup → paths → toon → dyn-tokens | ❌ no (LLM only) | ÷3.5 estimate |
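A rough sketch of how the two hooks might dispatch the pipelines; the pi extension API shape shown here is hypothetical, only the two pipeline functions come from this package:

```ts
// Hypothetical hook wiring; the real pi extension API may differ.
import { runImmediatePipeline, runDeferredPipeline } from "./pipeline";

export default function register(pi: any) {
  // Immediate: rewrite the tool result in place, so the compressed text
  // is what lands in both the TUI and the session file.
  pi.on("tool_result", (result: { content: string }) => {
    result.content = runImmediatePipeline(result.content);
  });

  // Deferred: operate on a deep copy, so only the LLM sees the
  // compressed form while the session keeps the originals.
  pi.on("context", (messages: { content: string }[]) => {
    const copy = structuredClone(messages);
    for (const msg of copy) msg.content = runDeferredPipeline(msg.content);
    return copy;
  });
}
```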
Immediate stages (tool_result: what you see in the TUI)
| Stage | What it removes | Why safe to persist |
|---|---|---|
| ansi | \x1b[31m…\x1b[0m escape codes | Noise in both UI and session files; stripping improves readability |
| whitespace | Trailing spaces; 3+ blank lines → 2 | Improves TUI output density |
Uses node:util.stripVTControlCharacters (Node 16.11+ built-in, zero deps).
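A minimal sketch of both immediate stages; stripVTControlCharacters is the real built-in, the stage function names are illustrative:

```ts
import { stripVTControlCharacters } from "node:util";

// ansi stage: remove \x1b[31m…\x1b[0m style escape sequences using the
// Node 16.11+ built-in, no extra dependency.
export function ansi(text: string): string {
  return stripVTControlCharacters(text);
}

// whitespace stage: drop trailing spaces/tabs, then collapse runs of 3+
// blank lines (4+ consecutive newlines) down to 2 blank lines.
export function whitespace(text: string): string {
  return text.replace(/[ \t]+$/gm, "").replace(/\n{4,}/g, "\n\n\n");
}
```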
Deferred stages (context hook: LLM only, originals stay in the TUI)
| Stage | What it does | Why deferred only |
|---|---|---|
| sep-norm | Long separator lines → 8 chars | Truncated separators are confusing to read in the TUI |
| dedup | [4×] same-line markers | Markers are confusing to read in the TUI |
| paths | $WS/$HOME sigils + legend | Sigils are confusing in the TUI |
| toon | JSON → TOON format | Different syntax, unreadable without knowing TOON |
| dyn-tokens | Dynamic $T1, $T2 sigils for repeated long tokens | Confusing without the legend context |
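As a concrete example, a minimal sketch of the dedup stage, folding runs of identical consecutive lines into one line plus a count marker; the function name and marker placement are illustrative:

```ts
// Fold consecutive duplicate lines into one line with an [N×] marker.
export function dedup(text: string): string {
  const out: string[] = [];
  let prev: string | null = null;
  let count = 0;
  const flush = () => {
    if (prev !== null) out.push(count > 1 ? `${prev} [${count}×]` : prev);
  };
  for (const line of text.split("\n")) {
    if (line === prev) {
      count++;
      continue;
    }
    flush();
    prev = line;
    count = 1;
  }
  flush();
  return out.join("\n");
}
```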
TOON (Token-Oriented Object Notation) collapses uniform JSON arrays into CSV-style tables:
# Before (JSON, 2680 chars)
[{"id":1,"name":"User 0","role":"admin","active":true},
 {"id":2,"name":"User 1","role":"user","active":true},
 ...]

# After (TOON, 858 chars, -68%)
users[20]{id,name,role,active}:
1,User 0,admin,true
2,User 1,user,true
...

Lossless round-trip. Only applied when the TOON form is ≥10% shorter.
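A sketch of how the toon stage might apply that threshold, assuming @toon-format/toon exports an encode() function:

```ts
import { encode } from "@toon-format/toon";

export function toon(text: string): string {
  try {
    const value = JSON.parse(text);
    const encoded = encode(value);
    // Swap in TOON only when it is at least 10% shorter than the JSON.
    return encoded.length <= text.length * 0.9 ? encoded : text;
  } catch {
    return text; // not valid JSON, leave the result untouched
  }
}
```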
Dependencies

| Package | Version | Purpose |
|---|---|---|
| @toon-format/toon | ^2.1.0 | JSON → TOON encoding |
| js-tiktoken | ^1.0.21 | Exact BPE token counting (cl100k_base) |
cache – Anthropic Prompt Caching
Adds cache_control: { type: "ephemeral" } to every Anthropic API request, enabling automatic caching.
How it works: Anthropic places the cache breakpoint at the last cacheable block automatically and moves it forward as the conversation grows. On a cache hit, the cached prefix is charged at 10% of the normal input-token price.
Economics: break-even after roughly 2-3 turns. Cache writes are billed at 1.25× the base input price and cache reads at 0.1×, so a single hit on a cached prefix already outweighs the 25% write premium; from turn 3 on, the system prompt, tool definitions, and early conversation history are all read from cache at the 10× discount.
Minimum prompt: 1024–4096 tokens depending on the model (Anthropic silently skips caching for shorter prompts: no error, no extra charge).
Provider guard: only adds cache_control when payload.model starts with "claude-". OpenAI, Google, and other providers are passed through unchanged.
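A minimal sketch of such a before_provider_request handler; the hook name comes from this package, the payload shape is hypothetical, and the cache_control field follows the Anthropic Messages API:

```ts
// Hypothetical payload shape; only cache_control follows the real API.
type Block = { type: string; text?: string; cache_control?: { type: "ephemeral" } };
type Payload = { model: string; system?: Block[]; messages?: unknown[] };

export function beforeProviderRequest(payload: Payload): Payload {
  // Provider guard: only Anthropic models understand cache_control.
  if (!payload.model.startsWith("claude-")) return payload;
  const blocks = payload.system;
  if (blocks && blocks.length > 0) {
    // Mark the last system block; Anthropic caches the request prefix
    // up to the last breakpoint it finds.
    blocks[blocks.length - 1].cache_control = { type: "ephemeral" };
  }
  return payload;
}
```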
Files
pi-airgun/
├── package.json
├── README.md
├── vitest.config.ts
├── tsconfig.json
├── tests/
│   ├── pipeline.test.ts       runImmediatePipeline / runDeferredPipeline integration
│   ├── stages/
│   │   ├── ansi.test.ts
│   │   ├── whitespace.test.ts
│   │   ├── dedup.test.ts
│   │   ├── separator.test.ts
│   │   ├── paths.test.ts
│   │   ├── toon.test.ts
│   │   ├── tokens.test.ts
│   │   └── tokens_dyn.test.ts
│   └── bench/
│       ├── stages.bench.ts    Per-stage microbenchmarks
│       └── pipeline.bench.ts  Full pipeline benchmarks + cache effectiveness
└── extensions/
    ├── compress/
    │   ├── index.ts           Extension entry: session_start / tool_result / context hooks + /compress-stats
    │   ├── pipeline.ts        runImmediatePipeline() + runDeferredPipeline()
    │   └── stages/
    │       ├── ansi.ts        node:util.stripVTControlCharacters wrapper
    │       ├── whitespace.ts  Normalize blank lines and trailing spaces
    │       ├── dedup.ts       Consecutive duplicate line folding
    │       ├── separator.ts   Separator line normalizer
    │       ├── paths.ts       Path → $WS/$HOME sigil compression
    │       ├── toon.ts        JSON → TOON encoding via @toon-format/toon
    │       ├── tokens.ts      js-tiktoken wrapper + fast ÷3.5 estimator
    │       └── tokens_dyn.ts  Dynamic repeated-token compressor
    └── cache/
        └── index.ts           before_provider_request → add Anthropic cache_control
Development
# Install dependencies
npm install
# Run tests
npm test
# Run benchmarks
npm run bench
# Type check
npm run typecheck
Ideas for future stages
- Log timestamp folding: group log lines by repeating prefix, show count + time range
- Import block dedup: for code file reads, deduplicate repeated import sections
- Cross-message dedup: if the same file is read twice, replace the second with a back-reference
- 1-hour cache TTL: add ttl: "1h" to cache_control for long sessions