pi-token-speed
Pi extension to measure tokens per second via sliding window.
Package details
Install pi-token-speed from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-token-speed- Package
pi-token-speed- Version
0.5.0- Published
- Jun 14, 2026
- Downloads
- 1,280/mo · 381/wk
- Author
- gsanhueza
- License
- MIT
- Types
- extension
- Size
- 36 KB
- Dependencies
- 0 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-token-speed
A Pi Coding Agent extension that displays real-time tokens-per-second (TPS) performance metrics in the status bar while the AI is streaming responses.
Features
- Real-time TPS tracking — measures token throughput as the assistant generates text and thinking content
- Time-to-first-token (TTFT) — measures latency from user message to the first token being generated
- Configurable sliding window — adjust the window size to suit your server speed (default: 1s)
- Color-coded speed indicators — visual feedback based on performance thresholds
- Provider-reported counting — opt in to using provider-reported counts (e.g. Anthropic, OpenAI) instead of the extension's own counter
- Fully configurable — customize display, thresholds and colors via
~/.pi/agent/settings.json
Speed Tiers
| Tier | TPS | Color |
|---|---|---|
| 🟥 Slow | 0–15 | #ff4444 (red) |
| 🟨 Medium | 15–30 | #ffaa00 (orange) |
| 🟩 Fast | 30–45 | #00ff88 (green) |
| 🟦 Blazing | 45+ | #44ddff (cyan) |
Installation
This package is a Pi extension. Install it with
npm install pi-token-speed
or
pi install https://github.com/gsanhueza/pi-token-speed
Configuration
You can customize the display, speed thresholds and colors by adding a tokenSpeed section to your ~/.pi/agent/settings.json:
{
"tokenSpeed": {
"display": "tps",
"tpsSlow": 0,
"tpsMedium": 15,
"tpsFast": 30,
"tpsBlazing": 45,
"colorSlow": "#ff4444",
"colorMedium": "#ffaa00",
"colorFast": "#00ff88",
"colorBlazing": "#44ddff",
"slidingWindow": 1000,
"useProviderTokens": false,
"countStrategy": "direct"
}
}
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
display |
tps, ttft, stats, full |
tps |
Which metrics to display (see Display Modes below) |
tpsSlow |
number | 0 |
Minimum TPS threshold ("slow") |
tpsMedium |
number | 15 |
TPS above this is "medium" |
tpsFast |
number | 30 |
TPS above this is "fast" |
tpsBlazing |
number | 45 |
TPS above this is "blazing" |
colorSlow |
string | "#ff4444" |
Color for slow tier |
colorMedium |
string | "#ffaa00" |
Color for medium tier |
colorFast |
string | "#00ff88" |
Color for fast tier |
colorBlazing |
string | "#44ddff" |
Color for blazing tier |
slidingWindow |
number | 1000 |
Sliding window duration in ms |
useProviderTokens |
boolean | false |
Opt-in: use provider-reported counts instead of this extension's own counter |
countStrategy |
estimate, direct |
direct |
Token counting strategy used by the extension's own counter |
Interactive Menu
A small interactive menu is available when running /tps in the editor, where you can adjust:
- Display mode — what to show in the status bar
- Use provider tokens — use provider-reported counts instead of the extension's counter
- Count strategy — how the extension counts tokens (
estimateordirect)
Sliding Window
The sliding window determines how many recent tokens are used to calculate TPS. A larger window produces smoother readings at the cost of responsiveness; a smaller window reacts faster but can be noisier.
| Server speed | Recommended window | Why |
|---|---|---|
| Fast (30+ tok/s) | 1000 (default) |
Plenty of tokens in the window — accurate and responsive |
| Medium (5–30 tok/s) | 1000–3000 |
Enough tokens for stable readings |
| Slow (< 5 tok/s) | 5000–15000 |
Captures more tokens, avoiding spiky or unreliable values |
For example, if your server streams at ~1 tok/s, a 10-second window gives ~10 tokens per window — enough for a reasonable calculation:
{
"tokenSpeed": {
"slidingWindow": 10000
}
}
Provider Token Counts
By default, this extension uses its own token counter — the same engine behind countStrategy. As an alternative, you can opt in to using the provider's own reported counts instead:
| Value | Behavior |
|---|---|
false (default) |
Use this extension's own counter (controlled by countStrategy) |
true |
Use the provider's reported counts instead; fall back to countStrategy when not available |
The extension's own counter is the default and always available. Enable useProviderTokens: true when your provider reports accurate token counts and you'd prefer to use them instead.
Count Strategy
When useProviderTokens is false (default) or when the provider doesn't report counts, the countStrategy determines how the extension's own counter works:
| Strategy | Behavior |
|---|---|
direct (default) |
Counts each delta as 1 token |
estimate |
Approximates tokens from delta text |
The direct strategy preserves the original behavior. Use estimate when your server streams in small chunks — it approximates the real token count from the delta text, giving a more meaningful TPS reading.
Display Modes
| Mode | Description |
|---|---|
tps |
⚡ TPS: 25.0 tok/s — TPS with color-coded speed tier |
ttft |
⚡ TPS: 25.0 tok/s (TTFT: 450 ms) — TPS + time-to-first-token |
stats |
⚡ TPS: 25.0 tok/s (150 tok in 6.0s) — TPS + token count and elapsed time |
full |
⚡ TPS: 25.0 tok/s (150 tok in 6.0s · TTFT: 450 ms) — everything |
Commands
| Command | Description |
|---|---|
/tps |
Open settings menu — configure display mode, provider tokens, and count strategy |
How It Works
- Session Start — Renders the initial status bar entry showing
⚡ TPS: -- - Message Start — When the assistant begins streaming, the engine starts tracking
- TTFT Measurement — When the user message starts, a timer begins. The moment the first token (text, thinking, or tool call) is emitted, the elapsed time is recorded as the time-to-first-token (TTFT) in milliseconds
- Token Update — Each text/thinking delta is recorded. If
useProviderTokensistrueand the provider reports token counts, those are used directly; otherwise the extension's own counter (controlled bycountStrategy) is used - Sliding Window — TPS is calculated using a configurable time window of token timestamps
- Message End — The authoritative token count (if available) is used to snap the total, ensuring the final average is exact
Dependencies
| Peer dependency | Purpose |
|---|---|
@earendil-works/pi-coding-agent |
Pi Coding Agent SDK |
@earendil-works/pi-tui |
Pi TUI SDK |