@rohaquinlop/pi-deepseek-cache
DeepSeek prefix cache optimization for pi — date/CWD freeze, hit-rate telemetry, cache-friendly compaction, and TUI overlays
Package details
Install @rohaquinlop/pi-deepseek-cache from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@rohaquinlop/pi-deepseek-cache
- Package: @rohaquinlop/pi-deepseek-cache
- Version: 1.3.1
- Published: Jun 21, 2026
- Downloads: not available
- Author: rohaquinlop
- License: MIT
- Types: extension
- Size: 39.3 KB
- Dependencies: 0 dependencies · 3 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions/index.ts"
  ],
  "appliesToModels": [
    "deepseek-*",
    "deepseek"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-deepseek-cache
Reduce DeepSeek API costs by 95%+ through multi-layered prefix cache optimization. Zero configuration — auto-detects DeepSeek models and applies best practices transparently.
The Problem
DeepSeek's API uses prefix caching: identical prompt prefixes are served from disk cache at roughly 50–120× lower cost than fresh computation. But the cache only works when every byte from position 0 is identical across requests.
Pi's default system prompt embeds `Current date: YYYY-MM-DD` and `Current working directory: <cwd>`. These dynamic values change daily and per session, silently busting the entire prefix cache.
What This Extension Does
| Layer | Feature | Impact |
|---|---|---|
| P0 | Date & CWD freeze | Root-cause fix — locks session date and directory, preventing daily/per-session cache bust |
| P1 | Hit-rate telemetry | Per-session hit rate shown as dimmed footer status line; /cache-stats & /cache-graph for detail |
| P2 | Prefix guard | SHA-256 hash diagnostics — tracks prefix breaks (viewable in /cache-stats) |
| P3 | Cache-friendly compaction | Deterministic summarization via deepseek-v4-flash at temperature 0, SHA-256 cached for stable replays |
| P4 | TUI overlays | /cache-stats popup with hit rate, tokens, cost savings. /cache-graph ASCII trend chart |
Cost Impact
| Model | Without Extension | With Extension |
|---|---|---|
| deepseek-v4-flash input | $0.14/M tokens | $0.003/M tokens (98% less) |
| deepseek-v4-pro input | $3.00/M tokens | $0.025/M tokens (99% less) |
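The percentages in the table can be checked with a little arithmetic. The sketch below assumes the "Without Extension" column is the price of a cache-miss token, the "With Extension" column is the price of a cache-hit token, and that the effective price blends the two linearly by hit rate. `blendedInputPrice` and `savingsFraction` are illustrative names, not part of the extension.

```typescript
// Illustrative arithmetic only; prices are the USD-per-million-token figures from the table above.
// Assumption: effective price is a linear blend of miss and hit prices, weighted by hit rate.
function blendedInputPrice(missPrice: number, hitPrice: number, hitRate: number): number {
  return (1 - hitRate) * missPrice + hitRate * hitPrice;
}

// Fraction of input cost saved relative to paying the miss price for every token.
function savingsFraction(missPrice: number, hitPrice: number, hitRate: number): number {
  return 1 - blendedInputPrice(missPrice, hitPrice, hitRate) / missPrice;
}

const flashSavings = savingsFraction(0.14, 0.003, 1.0);
const proSavings = savingsFraction(3.0, 0.025, 1.0);
console.log(`${(flashSavings * 100).toFixed(1)}%`, `${(proSavings * 100).toFixed(1)}%`);
// prints: 97.9% 99.2%
```

Under these assumptions the table's 98% and 99% figures correspond to a 100% hit rate; a lower hit rate gives proportionally smaller savings.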
Installation
pi install npm:@rohaquinlop/pi-deepseek-cache
Or via git:
pi install git:github.com/rohaquinlop/pi-deepseek-cache
The extension activates automatically. No configuration needed. The per-session cache hit rate appears as a dimmed status line (`Cache 96.2%`) in Pi's footer. Pi's native `CH:XX.X%` shows the per-turn rate in the stats line. Detailed stats are available via the `/cache-stats` and `/cache-graph` commands.
Each Pi session writes its own `stats-{sessionId}.json` and `history-{sessionId}.json` files, so concurrent sessions never race on the same file. Session files older than 30 days are cleaned up automatically.
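The 30-day cleanup rule can be sketched as a pure selection function. This is a hypothetical illustration: the names `SessionFile` and `selectExpiredSessionFiles` are invented here, and using the file's modification time as the age signal is an assumption, since the README does not say how age is measured.

```typescript
// Hypothetical sketch: the extension's real cleanup code may differ.
interface SessionFile {
  name: string;    // e.g. "stats-abc123.json"
  mtimeMs: number; // last-modified time in ms since the epoch (assumed age signal)
}

const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Only per-session stats and history files are candidates for cleanup.
const SESSION_FILE_PATTERN = /^(stats|history)-.+\.json$/;

// Returns the names of session files whose age exceeds 30 days at time `nowMs`.
function selectExpiredSessionFiles(files: SessionFile[], nowMs: number): string[] {
  return files
    .filter((f) => SESSION_FILE_PATTERN.test(f.name))
    .filter((f) => nowMs - f.mtimeMs > MAX_AGE_MS)
    .map((f) => f.name);
}
```

Keeping the selection separate from the actual deletion makes the rule easy to test without touching the file system.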
Provider Support
Works with any provider serving DeepSeek models:
- Any provider with deepseek-* model IDs (NaN Builders, OpenRouter, custom proxies, etc.)
- DeepSeek API (deepseek provider): direct API users
Non-DeepSeek models pass through unchanged.
Subagent Compatibility
This extension automatically applies to subagent processes that use DeepSeek models. It declares `appliesToModels: ["deepseek-*", "deepseek"]` in its `package.json`, which the pi-subagents extension detects and loads into child processes. No configuration is needed.
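How pi-subagents evaluates `appliesToModels` is not documented here. As a rough sketch, assuming each entry is a glob matched against the whole model ID with `*` as the only wildcard, the check could look like this (`globToRegExp` and `appliesToModel` are hypothetical names):

```typescript
// Hypothetical matcher: the real pi-subagents matching logic may differ.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, then turn each "*" into "match anything".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// True when the model ID matches at least one pattern in full.
function appliesToModel(patterns: string[], modelId: string): boolean {
  return patterns.some((p) => globToRegExp(p).test(modelId));
}

const manifestPatterns = ["deepseek-*", "deepseek"];
console.log(appliesToModel(manifestPatterns, "deepseek-v4-flash")); // true
console.log(appliesToModel(manifestPatterns, "gpt-4o"));            // false
```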
For the best cache performance, ensure both extensions are installed:
pi install npm:@rohaquinlop/pi-subagents
pi install npm:@rohaquinlop/pi-deepseek-cache
Commands
/cache-stats
Overlay popup showing two sections: this session's stats and an aggregate across all sessions (N sessions). Each section shows hit rate, cache read/write/input tokens, turns, and estimated cost savings.
/cache-graph
ASCII trend chart of hit rate over turns — helps spot regressions.
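As an illustration of what an ASCII trend chart of this kind might involve, here is a minimal sketch that draws one bar per turn. The layout and the name `renderHitRateChart` are assumptions made for this example; the extension's actual chart may look quite different.

```typescript
// Hypothetical renderer: one row per turn, bar length proportional to hit rate.
function renderHitRateChart(rates: number[], width = 20): string {
  return rates
    .map((rate, i) => {
      const clamped = Math.min(1, Math.max(0, rate)); // keep rates inside [0, 1]
      const bar = "#".repeat(Math.round(clamped * width));
      const pct = (clamped * 100).toFixed(1).padStart(5);
      return `turn ${String(i + 1).padStart(3)} ${pct}% |${bar}`;
    })
    .join("\n");
}

console.log(renderHitRateChart([0, 0.5, 0.9, 0.96], 20));
```

A sudden shorter bar after a run of long ones is the kind of regression the chart is meant to surface.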
/cache-reset
Deletes all per-session `stats-*.json` and `history-*.json` files plus the summary cache, clearing all cached statistics and history. Useful after major prompt changes.
How It Works
P0 (Date/CWD freeze): On `before_agent_start`, replaces the dynamic `Current date` and `Current working directory` lines with values frozen at session start. The system prompt prefix stays byte-identical across the entire session.
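The freeze step can be sketched as a pure string rewrite. This is a minimal illustration, not the extension's code: `FrozenContext` and `freezeSystemPrompt` are hypothetical names, and the sketch assumes each dynamic value sits on its own line in exactly the format quoted earlier. How the result is handed back to Pi from the `before_agent_start` hook is left out.

```typescript
// Hypothetical helper: values are captured once at session start and reused for every request.
interface FrozenContext {
  date: string; // e.g. "2026-06-21"
  cwd: string;  // working directory at session start
}

function freezeSystemPrompt(prompt: string, frozen: FrozenContext): string {
  // Function replacers avoid "$" in paths being read as a replacement pattern.
  return prompt
    .replace(/^Current date: .*$/m, () => `Current date: ${frozen.date}`)
    .replace(/^Current working directory: .*$/m, () => `Current working directory: ${frozen.cwd}`);
}

const frozen: FrozenContext = { date: "2026-06-21", cwd: "/work/project" };
const nextDayPrompt = "You are Pi.\nCurrent date: 2026-06-22\nCurrent working directory: /tmp/other\nBe concise.";
console.log(freezeSystemPrompt(nextDayPrompt, frozen));
```

Because the frozen values never change within a session, two prompts that differ only in date or directory rewrite to the same bytes.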
P1 (Telemetry): Accumulates cacheRead, input, cacheWrite, and turns from every assistant message's usage data. Each session stores its stats in stats-{sessionId}.json so concurrent sessions never race. /cache-stats shows both this session's stats and an aggregate across all sessions.
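A minimal sketch of the accumulation, using the field names from the paragraph above. The hit-rate formula is an assumption: it treats `input` as uncached prompt tokens, so the rate is `cacheRead / (cacheRead + input)`. The type and function names are hypothetical.

```typescript
// Hypothetical shapes; field names follow the README (cacheRead, input, cacheWrite, turns).
interface TurnUsage { input: number; cacheRead: number; cacheWrite: number; }
interface SessionStats extends TurnUsage { turns: number; }

function emptyStats(): SessionStats {
  return { input: 0, cacheRead: 0, cacheWrite: 0, turns: 0 };
}

// Adds one assistant message's usage to the running totals.
function recordTurn(stats: SessionStats, usage: TurnUsage): SessionStats {
  return {
    input: stats.input + usage.input,
    cacheRead: stats.cacheRead + usage.cacheRead,
    cacheWrite: stats.cacheWrite + usage.cacheWrite,
    turns: stats.turns + 1,
  };
}

// Assumption: `input` counts uncached prompt tokens only.
function hitRate(stats: SessionStats): number {
  const total = stats.cacheRead + stats.input;
  return total === 0 ? 0 : stats.cacheRead / total;
}

// One file per session, so concurrent sessions never write to the same path.
function statsFileName(sessionId: string): string {
  return `stats-${sessionId}.json`;
}
```

An aggregate view like the one in `/cache-stats` could then be produced by summing the per-session totals before computing the rate.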
P2 (Prefix guard): On before_provider_request, SHA-256 hashes all messages except the last to fingerprint the prefix. Tracks when the hash changes — the break count is visible in /cache-stats.
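The fingerprinting idea can be sketched with Node's built-in `crypto` module. The README does not say how a conversation that simply grows is told apart from a genuine break, so this sketch assumes a break means the previously sent prefix no longer matches the first messages of the new request. `Message`, `hashMessages`, and `PrefixGuard` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical message shape; the real request payload may carry more fields.
interface Message { role: string; content: string; }

function hashMessages(messages: Message[]): string {
  return createHash("sha256").update(JSON.stringify(messages)).digest("hex");
}

class PrefixGuard {
  private lastHash: string | null = null;
  private lastLength = 0;
  breaks = 0;

  // Call once per outgoing request; returns true when the earlier prefix was altered.
  check(messages: Message[]): boolean {
    const prefix = messages.slice(0, -1); // everything except the last message
    let broken = false;
    if (this.lastHash !== null) {
      // Re-hash only the part that was already sent last time (assumed break definition).
      broken = hashMessages(prefix.slice(0, this.lastLength)) !== this.lastHash;
      if (broken) this.breaks += 1;
    }
    this.lastHash = hashMessages(prefix);
    this.lastLength = prefix.length;
    return broken;
  }
}
```

Appending new turns leaves the earlier fingerprint intact, while editing or dropping any earlier message changes it and increments `breaks`.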
P3 (Compaction): On session_before_compact, summarizes conversation history with deepseek-v4-flash at temperature 0. Summaries are SHA-256 hashed and cached — identical histories produce byte-identical summaries, keeping compaction cache-stable.
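A content-addressed summary cache along these lines might look like the sketch below. It is an assumption-laden illustration: the cache lives in memory here, the history is treated as a single string, and `Summarizer` stands in for the model call (deepseek-v4-flash at temperature 0, per the paragraph above). None of these names come from the extension.

```typescript
import { createHash } from "node:crypto";

// Hypothetical summarizer signature; in practice this would wrap a model request.
type Summarizer = (history: string) => Promise<string>;

class SummaryCache {
  private readonly cache = new Map<string, string>();

  // The SHA-256 of the history text is the cache key.
  static keyFor(history: string): string {
    return createHash("sha256").update(history).digest("hex");
  }

  async summarize(history: string, summarizer: Summarizer): Promise<string> {
    const key = SummaryCache.keyFor(history);
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached; // replay: same bytes, no model call
    const summary = await summarizer(history);
    this.cache.set(key, summary);
    return summary;
  }
}
```

Temperature 0 alone does not guarantee identical output across calls, which is presumably why the summary is also cached: a replay of the same history returns the stored bytes instead of asking the model again.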
P4 (Overlays): /cache-stats and /cache-graph render as TUI overlay popups (Esc to dismiss) with formatted hit-rate data and ASCII trend charts.
License
MIT