pi-freerouter
Pi coding agent extension — auto-routes to free OpenRouter models
Package details
Install pi-freerouter from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-freerouter- Package
pi-freerouter- Version
0.1.13- Published
- Jun 7, 2026
- Downloads
- not available
- Author
- parkjangwon
- License
- MIT
- Types
- extension
- Size
- 170.7 KB
- Dependencies
- 0 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-freerouter
Pi coding agent extension that routes every request through OpenRouter's free model tier — no paid API key needed beyond your OpenRouter account.
Quick start
1. Install
pi install npm:pi-freerouter
2. Set your OpenRouter API key
export OPENROUTER_API_KEY=sk-or-...
Free key at openrouter.ai/keys.
3. Start Pi and select FreeRouter
FreeRouter appears in the model picker as auto [free-router]. Select it when
you want requests routed through OpenRouter's free model tier. If
OPENROUTER_API_KEY is not set, Pi still starts normally; pi-freerouter shows
the key error only after you send a prompt with auto [free-router] selected.
How free model routing works
OpenRouter exposes dozens of free models (models with a :free suffix). Each has its own rate limit — typically a few requests per minute per model. The trick is to spread load across all of them automatically.
Parallel racing
Every time Pi sends a request, pi-freerouter doesn't pick one model and hope for the best. It picks the next 3 available models from its sorted list and fires all three requests simultaneously.
Request arrives
│
├── model A ──────────────────── first token ──▶ WINNER → stream to Pi
├── model B ───── (slower) → aborted
└── model C ─── (even slower) → aborted
Whichever model emits its first token wins. The other two are immediately cancelled. Pi sees a single clean stream — it has no idea a race happened.
Automatic fallback
Failed models are skipped for a short cooldown period and then return to the pool:
| Failure | Cooldown |
|---|---|
| Rate limit (429) or server error (5xx) | 90 s |
| No first token within 30 s | 15 s |
| Request rejected (400/422) | 90 s |
Batch 1: [model A, model B, model C] → all hit quota
Batch 2: [model D, model E, model F] → model D wins
↑ model A–C recover after 90s and rejoin the pool
Each model is tried at most once per request. Once a winner is streaming, a 30-second idle window per chunk ensures a stalled connection is aborted promptly rather than left open.
Provider priority
Free models are sorted so the lowest-latency inference providers are always tried first:
- Groq
- Cerebras
- Fireworks
- Together
- Mistral
- Everything else (sorted by context window ascending)
Model list refresh
The list of available free models is fetched when you first send a prompt with
auto [free-router] selected, then refreshed every hour in the background.
Requirements
- Pi coding agent v0.78+
- OpenRouter API key (free tier is sufficient)
License
MIT