pi-defluffer

Pi extension that safely defluffs user prompts before sending them to the model.

Install pi-defluffer from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-defluffer

Package: pi-defluffer
Version: 0.1.0
Published: Jun 8, 2026
Downloads: not available
Author: respectmathias
License: MIT
Types: extension
Size: 58.8 KB
Dependencies: 0 dependencies · 1 peer

Pi manifest JSON:

{
  "extensions": [
    "./src/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-defluffer

Pi extension that trims polite/filler text from user prompts before sending them to the model.

Inspired by GrahamTheDev's Defluffer idea: https://dev.to/grahamthedev/defluffer-reduce-token-usage-by-45-26jj

This is not magic compression. It is conservative prompt cleanup:

  • removes pleasantries and filler
  • collapses common phrases
  • preserves code blocks, inline code, URLs, env vars, CLI flags, quoted strings
  • protects detected JSON/YAML snippets while compressing surrounding text
  • skips legal/policy/exact-file-output prompts
  • skips slash commands and steering input
  • shows a small TUI widget with estimated token savings
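
The protect-then-trim behavior listed above can be illustrated with a minimal sketch. This is not the extension's actual source: the pattern lists, the placeholder scheme, and the `defluff` function name are all assumptions made for illustration. Protected spans are swapped for placeholders, filler rules run on what is left, and the spans are then restored verbatim.

```typescript
// Hypothetical sketch, not the real pi-defluffer implementation.
// Spans that must survive verbatim; alternatives are tried in this order.
const PROTECTED_PATTERNS: RegExp[] = [
  /`{3}[\s\S]*?`{3}/,             // fenced code blocks
  /`[^`\n]+`/,                    // inline code
  /https?:\/\/\S+/,               // URLs
  /"[^"\n]*"/,                    // double-quoted strings
  /\$\{?[A-Z_][A-Z0-9_]*\}?/,     // env vars such as $HOME or ${API_KEY}
  /(?<![\w-])--?[A-Za-z][\w-]*/,  // CLI flags such as -v or --dry-run
];
const PROTECTED = new RegExp(
  PROTECTED_PATTERNS.map((p) => p.source).join("|"),
  "g",
);

// Illustrative filler rules only; a real rule set would be larger.
const FILLER: RegExp[] = [
  /\b(?:hi|hello|hey)\b(?: there\b)?[,!.]?\s*/gi,
  /\b(?:could|can|would) you (?:please |kindly )?/gi,
  /\bplease\b,?\s*/gi,
  /\bthank you\b(?: so much\b)?(?: in advance\b)?[.!]?\s*/gi,
  /\bthanks\b(?: in advance\b)?[.!]?\s*/gi,
];

function defluff(prompt: string): string {
  const saved: string[] = [];
  // Swap each protected span for a placeholder so filler rules cannot touch it.
  let text = prompt.replace(PROTECTED, (span) => {
    saved.push(span);
    return "\u0000" + (saved.length - 1) + "\u0000";
  });
  for (const rule of FILLER) {
    text = text.replace(rule, "");
  }
  // Collapse whitespace runs left behind by removed phrases.
  text = text.replace(/[ \t]{2,}/g, " ").trim();
  // Put the protected spans back exactly as they were.
  return text.replace(/\u0000(\d+)\u0000/g, (_m, i: string) => saved[Number(i)] ?? "");
}
```

In this sketch, the word "please" inside a quoted string or a URL is left alone because it is masked before any filler rule runs.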

Install

pi install npm:pi-defluffer

From GitHub:

pi install git:github.com/RespectMathias/pi-defluffer

Commands

/defluff on
/defluff off
/defluff status
/defluff stats
/defluff profile off|safe|standardGuardedDedupe
/defluff min <0-80>
/defluff preview <text>
/defluff animation widget|off
/defluff reset-stats

Default profile: standardGuardedDedupe.

Testing

Run local tests:

npm test

This runs the Vitest unit tests plus a Pi extension load check.

We also tested whether prompt compression saved tokens without damaging intent.

Prompt-integrity proxy test

The test used 20 prompt fixtures spanning code, JSON/YAML, debugging, legal, math, creative, transcript, and agent tasks.

Variant                     | Avg input savings | Quality proxy | Critical failures
----------------------------|-------------------|---------------|------------------
no script                   | 0.0%              | 5.00          | 0
original-ish/basic          | 13.7%             | 4.74          | 4
aggressive/extended         | 13.7%             | 4.78          | 3
standard/extended           | 8.6%              | 4.89          | 1
safe/extended               | 5.6%              | 5.00          | 0
guarded standard            | 8.1%              | 5.00          | 0
guarded + transcript dedupe | 9.5%              | 5.00          | 0

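Takeaway: the basic and aggressive variants save the most input tokens (13.7%) but break exact terms, with 4 and 3 critical failures respectively. The guarded variants save less (8.1% to 9.5%) but cause no critical failures.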

LLM A/B test with Codex

We spawned separate Codex processes for the baseline and defluffed versions of each prompt, then used Codex as the judge.

Metric                             | Result
-----------------------------------|-------
fixtures                           | 6
avg input savings                  | 16.29%
avg baseline score                 | 4.83
avg defluffed score                | 4.50
judge ties                         | 5
baseline wins                      | 1
compression-caused critical losses | 0

Case breakdown:

Case           | Category        | Input saved | Judge result
---------------|-----------------|-------------|-------------------------------------
demo_001       | code refactor   | 33.65%      | tie
code_002       | code generation | 4.23%       | tie
json_004       | structured      | 2.94%       | tie
transcript_009 | transcript      | 29.82%      | tie
legal_013      | legal           | 0.00%       | baseline win, not compression-caused
copy_017       | creative        | 27.08%      | tie

Estimated total tokens in the LLM test fell from 1779 to 1707, a saving of about 4%. Because output length can grow, input savings do not always translate into equal total savings.
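
The gap between the 16.29% average input savings and the roughly 4% total savings can be checked with a few lines of arithmetic. The figures below are copied from the results above. The variable names are illustrative, and treating the total as input plus output tokens is an assumption based on the note about output length.

```typescript
// Per-fixture input savings (%) from the case breakdown.
const inputSavedPct: number[] = [33.65, 4.23, 2.94, 29.82, 0.0, 27.08];
const avgInputSavedPct: number =
  inputSavedPct.reduce((sum, pct) => sum + pct, 0) / inputSavedPct.length;

// Estimated total tokens for the whole LLM test (assumed to be input plus output).
const baselineTotal = 1779;
const defluffedTotal = 1707;
const totalSavedPct: number =
  ((baselineTotal - defluffedTotal) / baselineTotal) * 100;

console.log(avgInputSavedPct.toFixed(2)); // 16.29
console.log(totalSavedPct.toFixed(2));    // 4.05
```

The average is unweighted, so short prompts with high percentage savings pull it up, while the total figure also counts tokens that defluffing does not reduce.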

Practical conclusion

Use defluffing for:

  • polite prompts
  • noisy transcript prompts
  • long natural-language task descriptions
  • code tasks with fluff outside code blocks

Avoid or skip compression for:

  • legal text
  • policy text
  • exact file-output prompts
  • prompts where exact wording matters globally

For exact JSON prompts, the extension protects detected JSON/YAML snippets and still compresses safe surrounding prose.
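
One way to detect JSON snippets embedded in prose is to look for bracketed spans that parse cleanly. The sketch below assumes this approach. The `findJsonSpans` helper is hypothetical, handles JSON only (not YAML), and is not taken from the extension's source.

```typescript
// Hypothetical helper: return [start, end) ranges of substrings that
// parse as JSON objects or arrays. Everything outside these ranges is
// prose that could be compressed.
function findJsonSpans(text: string): Array<[number, number]> {
  const spans: Array<[number, number]> = [];
  let i = 0;
  while (i < text.length) {
    const open = text[i];
    if (open !== "{" && open !== "[") {
      i++;
      continue;
    }
    const close = open === "{" ? "}" : "]";
    let matched = false;
    // Try each later closing bracket until the candidate parses as JSON.
    for (let j = text.indexOf(close, i + 1); j !== -1; j = text.indexOf(close, j + 1)) {
      try {
        JSON.parse(text.slice(i, j + 1));
        spans.push([i, j + 1]);
        i = j + 1; // continue scanning after the protected span
        matched = true;
        break;
      } catch {
        // Not valid JSON yet; extend to the next closing bracket.
      }
    }
    if (!matched) i++;
  }
  return spans;
}
```

Braces that do not enclose valid JSON, such as `{braces}` in ordinary prose, produce no span and stay eligible for cleanup. The repeated parse attempts make this quadratic in the worst case, which is acceptable for prompt-sized input but would need a real tokenizer for large documents.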

By default, the extension skips these risky cases.
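
A skip guard of this kind could look like the sketch below. The keyword lists and the `skipReason` function are hypothetical, since the extension's real detection rules are not described here. Steering input is left out because its format is not specified.

```typescript
type SkipReason = "slash-command" | "legal-or-policy" | "exact-output" | null;

// Hypothetical keyword hints; a real detector would need broader coverage.
const LEGAL_HINTS =
  /\b(?:contract|clause|terms of service|privacy policy|liability|hereby)\b/i;
const EXACT_HINTS =
  /\b(?:verbatim|word for word|exactly as written|exact (?:contents|output))\b/i;

// Return why a prompt should bypass defluffing, or null if it is safe to trim.
function skipReason(prompt: string): SkipReason {
  const text = prompt.trimStart();
  if (text.startsWith("/")) return "slash-command";
  if (LEGAL_HINTS.test(text)) return "legal-or-policy";
  if (EXACT_HINTS.test(text)) return "exact-output";
  return null;
}
```

Erring toward skipping is the safer design choice here: a false positive costs only a few unsaved tokens, while a false negative can alter text whose exact wording matters.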