pi-defluffer

Pi extension that safely defluffs user prompts before sending them to the model.

Install pi-defluffer from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-defluffer

Package: pi-defluffer
Version: 0.1.0
Published: Jun 8, 2026
Downloads: not available
Author: respectmathias
License: MIT
Types: extension
Size: 58.8 KB
Dependencies: 0 dependencies · 1 peer

Pi manifest JSON

{
  "extensions": [
    "./src/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-defluffer

Pi extension that trims polite/filler text from user prompts before sending them to the model.

Inspired by GrahamTheDev's Defluffer idea: https://dev.to/grahamthedev/defluffer-reduce-token-usage-by-45-26jj

This is not magic compression. It is conservative prompt cleanup:

removes pleasantries and filler
collapses common phrases
preserves code blocks, inline code, URLs, env vars, CLI flags, quoted strings
protects detected JSON/YAML snippets while compressing surrounding text
skips legal/policy/exact-file-output prompts
skips slash commands and steering input
shows a small TUI widget with estimated token savings
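
The protect-then-compress idea in the list above can be illustrated with a minimal sketch. This is not the extension's actual source: the function name `defluffSketch`, the `PROTECTED` pattern, and the short `FILLER` list are all assumptions made for illustration, and the real rules are likely broader and more careful.

```typescript
// Hypothetical sketch, not the extension's actual source.
// Spans matching this pattern are copied through unchanged:
// fenced code, inline code, URLs, double-quoted strings, env vars, long CLI flags.
const PROTECTED =
  /`{3}[\s\S]*?`{3}|`[^`\n]+`|https?:\/\/\S+|"[^"\n]*"|\$[A-Z_][A-Z0-9_]*|--[a-z][\w-]*/g;

// Illustrative filler patterns (assumed); a real list would be longer.
const FILLER: RegExp[] = [
  /\b(?:hi|hello|hey)(?: there)?\b[,!]?\s*/gi,
  /\b(?:could|can|would) you\b\s*/gi,
  /\b(?:please|kindly)\b,?\s*/gi,
  /\bthank(?:s| you)(?: so much| in advance)?\b[.!]?\s*/gi,
];

function defluffSketch(prompt: string): string {
  const saved: string[] = [];
  // 1. Swap every protected span for a NUL-delimited index placeholder.
  let text = prompt.replace(PROTECTED, (match) => {
    saved.push(match);
    return `\u0000${saved.length - 1}\u0000`;
  });
  // 2. Strip filler from the remaining prose only.
  for (const pattern of FILLER) {
    text = text.replace(pattern, "");
  }
  // 3. Tidy runs of spaces left behind by the removals.
  text = text.replace(/[ \t]{2,}/g, " ").trim();
  // 4. Put the protected spans back exactly as they were.
  return text.replace(/\u0000(\d+)\u0000/g, (_, i: string) => saved[Number(i)] ?? "");
}
```

With these assumed patterns, the word "please" inside inline code or a URL survives because it is masked before any filler is removed, while the same word in the surrounding prose is dropped.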

Install

pi install npm:pi-defluffer

From GitHub:

pi install git:github.com/RespectMathias/pi-defluffer

Commands

/defluff on
/defluff off
/defluff status
/defluff stats
/defluff profile off|safe|standardGuardedDedupe
/defluff min <0-80>
/defluff preview <text>
/defluff animation widget|off
/defluff reset-stats

Default profile: standardGuardedDedupe.

Testing

Run local tests:

npm test

This runs the Vitest unit tests plus a Pi extension load check.

We also tested whether prompt compression saved tokens without damaging intent.

Prompt-integrity proxy test

The test uses 20 prompt fixtures across code, JSON/YAML, debugging, legal, math, creative, transcript, and agent tasks.

Variant	Avg input savings	Quality proxy	Critical failures
no script	0.0%	5.00	0
original-ish/basic	13.7%	4.74	4
aggressive/extended	13.7%	4.78	3
standard/extended	8.6%	4.89	1
safe/extended	5.6%	5.00	0
guarded standard	8.1%	5.00	0
guarded + transcript dedupe	9.5%	5.00	0

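Takeaway: the basic and aggressive variants save the most input tokens (13.7%) but break exact terms, causing 4 and 3 critical failures respectively. The guarded variants save 8.1% to 9.5% with no critical failures and no drop in the quality proxy.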

LLM A/B test with Codex

We spawned Codex processes for baseline vs defluffed prompts, then used Codex as judge.

Metric	Result
fixtures	6
avg input savings	16.29%
avg baseline score	4.83
avg defluffed score	4.50
judge ties	5
baseline wins	1
compression-caused critical losses	0

Case breakdown:

Case	Category	Input saved	Judge result
demo_001	code refactor	33.65%	tie
code_002	code generation	4.23%	tie
json_004	structured	2.94%	tie
transcript_009	transcript	29.82%	tie
legal_013	legal	0.00%	baseline win, not compression-caused
copy_017	creative	27.08%	tie

Estimated total tokens in the LLM test fell from 1779 to 1707, about 4% saved. Because output length can grow, input savings do not always translate into equal total savings.
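
The figures above can be rechecked with a few lines of arithmetic. The sketch below only uses numbers reported in this README; the helper names `percentSaved` and `mean` are introduced here for illustration.

```typescript
// Percentage of tokens saved when going from `before` to `after`.
function percentSaved(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

// Arithmetic mean of a non-empty list.
function mean(values: number[]): number {
  if (values.length === 0) throw new RangeError("values must not be empty");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Total tokens in the LLM test: 1779 -> 1707.
const totalSaved = percentSaved(1779, 1707);

// Per-case input savings from the case breakdown table.
const inputSavings = [33.65, 4.23, 2.94, 29.82, 0.0, 27.08];
const avgInputSaved = mean(inputSavings);

console.log(totalSaved.toFixed(2));    // "4.05"
console.log(avgInputSaved.toFixed(2)); // "16.29"
```

The gap between the 16.29% average input saving and the roughly 4% total saving is the effect described above: total tokens include model output, which defluffing does not shrink.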

Practical conclusion

Use defluffing for:

polite prompts
noisy transcript prompts
long natural-language task descriptions
code tasks with fluff outside code blocks

Avoid or skip compression for:

legal text
policy text
exact file-output prompts
prompts where exact wording matters globally

For exact JSON prompts, the extension protects detected JSON/YAML snippets and still compresses safe surrounding prose.

By default, the extension skips these risky cases.
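
As a rough illustration of what such a skip guard could look like, the sketch below returns a reason whenever a prompt should be passed through untouched. The function name `skipReason` and both keyword lists are assumptions, not the extension's real heuristics, and steering-input detection is not covered because it depends on Pi internals.

```typescript
// Hypothetical sketch of a skip guard, not the extension's actual source.
type SkipReason = "slash-command" | "legal-or-policy" | "exact-output";

// Assumed keyword lists; a real detector would need to be more thorough.
const LEGAL_OR_POLICY =
  /\b(?:terms of service|privacy policy|contract|clause|liability|hereby)\b/i;
const EXACT_OUTPUT =
  /\b(?:verbatim|word for word|exactly as written|exact contents?)\b/i;

// Returns why the prompt should be left untouched, or null if it may be defluffed.
function skipReason(prompt: string): SkipReason | null {
  const text = prompt.trim();
  if (text.startsWith("/")) return "slash-command";
  if (LEGAL_OR_POLICY.test(text)) return "legal-or-policy";
  if (EXACT_OUTPUT.test(text)) return "exact-output";
  return null;
}
```

Returning a reason rather than a boolean makes it easy to surface in a status line why a given prompt was left alone.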