pi-auto-router
Multi-provider auto-router for pi coding agent with same-request failover, budget-aware routing, live usage pacing, and policy-based routing
Package details
Install pi-auto-router from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-auto-router
- Package: pi-auto-router
- Version: 0.2.1
- Published: Jun 2, 2026
- Downloads: not available
- Author: danialr
- License: MIT
- Types: extension
- Size: 311.8 KB
- Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
],
"image": "https://imgzen.xyz/1780380056350-g5546y74.gif"
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-auto-router
pi-auto-router is a multi-provider auto-router for pi coding agent that keeps one stable set of Pi models while automatically failing over the same request across Claude, Gemini, Codex, DeepSeek, Ollama, and other configured targets.
It exposes opinionated routing profiles:
- auto-router/subscription-reasoning
- auto-router/subscription-swe
- auto-router/subscription-long-context
- auto-router/subscription-economy
- auto-router/subscription-fast

Install
pi install npm:pi-auto-router
Update
pi update npm:pi-auto-router
Try without installing
pi -e npm:pi-auto-router
Install from GitHub instead
pi install git:github.com/danialranjha/pi-auto-router
30-second quick start
Install the package:
pi install npm:pi-auto-router
Reload pi with
/reload
Open
/model
Select one of:
- auto-router/subscription-reasoning
- auto-router/subscription-swe
- auto-router/subscription-long-context
- auto-router/subscription-economy
- auto-router/subscription-fast
Verify it is working:
/auto-router list
/auto-router explain
Why people install it
- Same-request failover when a provider hits rate limits, overload, or transient errors
- Subscription-first routing so you can prefer bundled/OAuth access before per-token spend
- Budget- and quota-aware routing with daily/monthly budgets and live usage pacing, so the router backs off providers you’re burning through too quickly
- Policy-based model selection using shortcuts, intent heuristics, constraints, and route rules
- Stable Pi-facing model names so you can keep using one router profile instead of manually switching models
Highlights
- Subscription-first routing across multiple providers
- Same-request failover before substantive output starts
- Cooldown tracking for temporarily failing providers/models
- Circuit breaker pattern for repeatedly failing providers (closed→open→half-open)
- External JSON config for route definitions, aliases, and policy rules
- Intelligent routing policy engine: context analysis, @shortcuts, capability/constraint solving, time-of-day/weekday rule conditions
- Policy rules: force tiers, prefer/exclude providers, enforce billing/constraints, per-route scoping, dry-run traces
- Per-provider budget tracking with daily/monthly limits, persistent stats, and audit-driven failover
- Utilization Velocity Index (UVI) — real-time OAuth quota monitoring that adjusts routing priority on the fly
- Cost-aware ranking — estimated USD cost as secondary tiebreaker within latency-sorted UVI buckets
- Routing decision explainer so you can see why a target was selected
- Richer operator commands for status, route inspection, search, aliases, reloads, budgets, UVI, rules, circuit status, and explanations
Config file
auto-router reads its config from:
~/.pi/agent/extensions/auto-router.routes.json
If the file is missing or invalid, it falls back to built-in defaults.
A starter config is included in the repo as:
auto-router.routes.example.json
Copy it into place and customize:
mkdir -p ~/.pi/agent/extensions
cp auto-router.routes.example.json ~/.pi/agent/extensions/auto-router.routes.json
Example config
{
"routes": {
"subscription-reasoning": {
"name": "Reasoning & Agentic Router",
"reasoning": true,
"input": ["text", "image"],
"targets": [
{
"provider": "claude-agent-sdk",
"modelId": "claude-opus-4-7",
"label": "L1: Claude Opus 4.7 (Frontier)"
},
{
"provider": "google",
"modelId": "gemini-2.5-pro",
"label": "L2: Gemini 2.5 Pro (API Key)",
"billing": "per-token"
},
{
"provider": "openai-codex",
"modelId": "gpt-5.4",
"authProvider": "openai-codex",
"label": "L3: GPT-5.4"
},
{
"provider": "ollama",
"modelId": "glm-5.1:cloud",
"label": "L4: GLM-5.1 (Ollama Cloud Last Resort)"
}
]
}
},
"aliases": {
"reasoning": ["auto-router/subscription-reasoning"],
"swe": ["auto-router/subscription-swe"],
"claude": [
"claude-agent-sdk/claude-opus-4-7",
"claude-agent-sdk/claude-opus-4-6"
]
}
}
Target fields
Each route target supports:
- provider: pi provider id
- modelId: model id under that provider
- label: human-readable label
- authProvider: optional auth provider lookup in ~/.pi/agent/auth.json
- billing: optional; "per-token" for pay-per-token endpoints (default: "subscription")
- balanceEndpoint: optional custom balance API URL (falls back to built-in registry)
Use authProvider for providers whose OAuth/access token should be read from pi auth storage.
Skip it for providers that authenticate internally or don’t require pi-managed tokens for the request path.
For Gemini API-key routes, use your installed Gemini provider id (examples here use google), omit authProvider, set billing to "per-token", and provide GOOGLE_API_KEY or GOOGLE_KEY in the environment.
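For reference, the target fields above can be written down as a TypeScript shape. This is a sketch inferred from the example config and the field list, not the package's actual definitions in src/types.ts; the names RouteTargetSketch, RouteSketch, and effectiveBilling are hypothetical.

```typescript
// Hypothetical shapes inferred from the example config; the real types live in src/types.ts.
type Billing = "subscription" | "per-token";

interface RouteTargetSketch {
  provider: string;         // pi provider id
  modelId: string;          // model id under that provider
  label: string;            // human-readable label
  authProvider?: string;    // optional lookup key in ~/.pi/agent/auth.json
  billing?: Billing;        // treated as "subscription" when omitted
  balanceEndpoint?: string; // optional custom balance API URL
}

interface RouteSketch {
  name: string;
  reasoning: boolean;
  input: string[];
  targets: RouteTargetSketch[];
}

// Apply the documented default: billing is "subscription" unless stated otherwise.
function effectiveBilling(target: RouteTargetSketch): Billing {
  return target.billing ?? "subscription";
}

const geminiTarget: RouteTargetSketch = {
  provider: "google",
  modelId: "gemini-2.5-pro",
  label: "L2: Gemini 2.5 Pro (API Key)",
  billing: "per-token",
};

const codexTarget: RouteTargetSketch = {
  provider: "openai-codex",
  modelId: "gpt-5.4",
  authProvider: "openai-codex",
  label: "L3: GPT-5.4",
};

const reasoningRoute: RouteSketch = {
  name: "Reasoning & Agentic Router",
  reasoning: true,
  input: ["text", "image"],
  targets: [geminiTarget, codexTarget],
};
```

Typing the config this way makes the optional fields explicit: only the Gemini target carries billing, and only the Codex target carries authProvider.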
Commands
auto-router registers:
- /auto-router
- /auto-router status
- /auto-router switch <route|alias|provider/model>
- /auto-router list
- /auto-router show <routeId>
- /auto-router search <query>
- /auto-router aliases
- /auto-router resolve <alias>
- /auto-router models
- /auto-router explain [routeId]: show the last routing decision (tier, target, confidence, reasoning)
- /auto-router shortcuts: list available @shortcuts
- /auto-router balance [show|fetch]: view/fetch balances for pay-per-token providers
- /auto-router budget [show|set <provider> <usd> [monthly]|clear <provider> [monthly]]: view/manage daily/monthly per-provider budgets
- /auto-router uvi [show|enable|disable|refresh]: view/manage Utilization Velocity Index monitoring
- /auto-router shadow [show|enable|disable]: run pipeline in shadow mode (log but don't change routing)
- /auto-router rules: show active policy rules and last applied strategy hints
- /auto-router circuit: show circuit breaker state for all providers
- /auto-router reload
- /auto-router reset: clears cooldowns, decision history, and budget warnings
Example operator flows
/auto-router switch reasoning
/auto-router switch claude
/auto-router switch subscription-swe
/auto-router list
/auto-router show subscription-reasoning
/auto-router search gemini
/auto-router aliases
/auto-router resolve reasoning
/auto-router explain
/auto-router shortcuts
/auto-router budget show
/auto-router budget set google 20.00 monthly
/auto-router budget set deepseek 20.00 monthly
/auto-router balance show
/auto-router balance fetch
/auto-router uvi show
/auto-router uvi enable
/auto-router reload
Troubleshooting with routing analytics scripts
The router also writes an append-only event log at:
~/.pi/agent/extensions/auto-router.events.jsonl
You can inspect that log with three repo scripts:
- node scripts/routing-stats.mjs: top-level routing/event counters
- node scripts/routing-quality-stats.mjs: feedback and quality breakdowns
- node scripts/routing-session-stats.mjs: per-session routing behavior, UVI progression, failover drift, latency, and cost
routing-session-stats.mjs
Use this when you want to answer questions like:
- Is UVI actually changing provider selection?
- Which providers/models are dominating by day?
- Are failovers planner-driven or error-driven?
- What recurring provider errors are being masked by failover?
- Which model is faster or cheaper over the current window?
Basic usage:
node scripts/routing-session-stats.mjs
Useful filters:
# Last 14 section rows, top 5 models/providers per daily chart
node scripts/routing-session-stats.mjs --limit 14 --daily-top 5
# Only one route
node scripts/routing-session-stats.mjs --route subscription-swe
# Only recent activity
node scripts/routing-session-stats.mjs --since 2026-05-28T00:00:00
# JSON for further scripting
node scripts/routing-session-stats.mjs --json
What the report shows:
- Daily routing composition: actual provider/model mix by day
- Session-start UVI timeline: latest local day’s UVI state over time, grouped by actual model
- UVI selection mix by day: how much of each day ran under ok vs surplus vs other UVI states
- Latency distribution by model: how often each model landed in latency buckets (0-2s, 2-5s, …)
- Cost distribution by model: how often each model landed in cost buckets
- Drift overview: counts planner drift vs failover drift and the dominant drift codes
- Top drift-triggering errors: recurring upstream errors that caused failover
- Planned → actual drift: concrete routed requests where the final model differed from the planner’s first choice
Sample output (real troubleshooting use case):
Routing session stats from /Users/danial/.pi/agent/extensions/auto-router.events.jsonl
Sessions: 1349 success=99.0% failover=1.9% latency=8500ms ttft=4658ms cost=$0.0422
Daily routing composition (window: 2026-05-08T12:47:59 → 2026-05-29T22:20:10 (local))
2026-05-29 total=161 ███████████████████▓ █ openai-codex/gpt-5.4 92.5% | ▓ deepseek/deepseek-v4-flash 7.5%
providers=2 models=2 success=98.1% latency=8573ms
2026-05-28 total=100 ██████████████████▓▓ █ deepseek/deepseek-v4-flash 89.0% | ▓ openai-codex/gpt-5.4 11.0%
providers=2 models=2 success=98.0% latency=5073ms
Session-start UVI timeline (latest local day: 2026-05-29)
00h 04h 08h 12h 16h 20h 24h
┼───────────┼───────────┼───────────┼───────────┼───────────┼──────────┼
▓▓▓▓ ▓▓ ▓▓ openai-codex/gpt-5.4 n=149 12:07-22:20
▓ ▓ ▓▓ ▓ deepseek/deepseek-v4-flash n=12 10:24-20:16
legend: █ ok ▓ surplus ▒ stressed ░ critical ▁ unknown
Drift overview (window: 2026-05-08T12:47:59 → 2026-05-29T22:20:10 (local))
total=26 failover=26 planner=0
actual_cheaper n= 26 share=100.0% ████████████
actual_promoted n= 26 share=100.0% ████████████
failover_after_error n= 26 share=100.0% ████████████
rank_fallback n= 26 share=100.0% ████████████
Top drift-triggering errors (window: 2026-05-08T12:47:59 → 2026-05-29T22:20:10 (local))
n= 22 share=84.6% error=L3: GPT-5.4 (Alternative SOTA): Codex error: {"type":"error","error":{"type":"invalid_request_error","message":"Duplicate item found with id msg_3..."
How to use it to troubleshoot:
- Start with Daily routing composition to see which model/provider actually got traffic.
- Check Session-start UVI timeline and UVI selection mix by day to see whether UVI state coincides with routing shifts.
- If Planned → actual drift is non-empty, inspect Drift overview first:
  - planner > 0 suggests routing logic itself is choosing alternates
  - failover > 0 suggests runtime/provider errors are forcing the switch
- Use Top drift-triggering errors to find the dominant upstream/provider failure signature.
- Compare Latency distribution by model and Cost distribution by model to decide whether a fallback provider is merely surviving errors or is also a better latency/cost target.
In practice, this script is best for debugging questions like:
- “Why did OpenAI end up on DeepSeek?”
- “Is UVI promotion actually changing traffic share?”
- “Are we masking a provider bug with failover?”
- “Should a fallback become a primary candidate?”
@ shortcuts
Prefix any prompt with one of these tokens to bias routing toward a specific tier. The shortcut is parsed off the front of the prompt (so the model never sees it) and translated into capability requirements before constraint solving:
| Shortcut | Tier | Effect |
|---|---|---|
| @reasoning | reasoning | Requires reasoning-capable models |
| @swe | swe | Requires reasoning-capable models (software-engineering oriented) |
| @long | long | Requires contextWindow ≥ max(estimatedTokens, 100k) |
| @vision | vision | Requires multimodal/vision-capable models |
| @fast | fast | Hint only; currently does not constrain candidates |
Example:
@vision describe what's in this screenshot
@long summarize this 80-page document …
@reasoning prove that there are infinitely many primes
Use /auto-router explain after a request to see how the shortcut influenced the decision.
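To make the "parsed off the front of the prompt" behavior concrete, here is a minimal sketch of how a leading shortcut could be detected and stripped. It is an illustration only: parseShortcut, ParsedPrompt, and the regular expression are hypothetical and may not match what shortcut-parser.ts actually does.

```typescript
// Hypothetical sketch of shortcut parsing; the package's shortcut-parser.ts may differ.
type ShortcutTier = "reasoning" | "swe" | "long" | "vision" | "fast";

const SHORTCUT_TIERS: readonly ShortcutTier[] = ["reasoning", "swe", "long", "vision", "fast"];

interface ParsedPrompt {
  tier: ShortcutTier | null; // null when the prompt has no recognized leading shortcut
  prompt: string;            // prompt text with the shortcut removed
}

function parseShortcut(raw: string): ParsedPrompt {
  // Match "@word" at the very start, followed by whitespace or end of input.
  const match = /^\s*@([a-z]+)(?:\s+|$)/.exec(raw);
  if (match) {
    const tier = SHORTCUT_TIERS.find((t) => t === match[1]);
    if (tier) {
      return { tier, prompt: raw.slice(match[0].length) };
    }
  }
  // Unknown or missing shortcut: leave the prompt untouched.
  return { tier: null, prompt: raw };
}

const parsed = parseShortcut("@vision describe what's in this screenshot");
// parsed.tier is "vision"; parsed.prompt is "describe what's in this screenshot"
```

Only a token at the start of the prompt counts in this sketch, so an @fast that appears mid-sentence is passed through to the model unchanged.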
Intent classification
When no @ shortcut is used, the router automatically classifies your prompt into one of four categories using keyword/pattern heuristics:
| Intent | Routing hint | Trigger examples |
|---|---|---|
| code | swe tier | "implement a function", "debug the error", code blocks, file paths |
| creative | economy tier | "write a poem", "draft a blog post", "create a story" |
| analysis | long tier | "analyze this code", "summarize the document", "compare X and Y" |
| general | (no hint) | Short prompts, greetings, meta-questions |
The intent classification appears in the reasoning shown by /auto-router explain (e.g. intent code (71%) → tier=swe). Classification is purely heuristic and makes no LLM calls, so it adds no meaningful latency.
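A keyword classifier of this kind can be sketched in a few lines. The patterns below are made up from the trigger examples in the table and are far cruder than whatever intent-classifier.ts really uses (it also reports a confidence score, which this sketch omits); classifyIntent, INTENT_PATTERNS, and TIER_FOR_INTENT are hypothetical names.

```typescript
// Hypothetical keyword heuristics; the real intent-classifier.ts uses its own patterns and scoring.
type Intent = "code" | "creative" | "analysis" | "general";
type TierHint = "swe" | "economy" | "long" | null;

const INTENT_PATTERNS: ReadonlyArray<{ intent: Intent; pattern: RegExp }> = [
  { intent: "code", pattern: /\b(implement|debug|refactor|function)\b|\.(ts|js|py)\b/i },
  { intent: "creative", pattern: /\b(poem|story|blog post)\b/i },
  { intent: "analysis", pattern: /\b(analyze|summarize|compare)\b/i },
];

const TIER_FOR_INTENT: Record<Intent, TierHint> = {
  code: "swe",
  creative: "economy",
  analysis: "long",
  general: null,
};

function classifyIntent(prompt: string): { intent: Intent; tier: TierHint } {
  // First matching pattern wins; anything unmatched is "general" with no hint.
  const hit = INTENT_PATTERNS.find(({ pattern }) => pattern.test(prompt));
  const intent: Intent = hit ? hit.intent : "general";
  return { intent, tier: TIER_FOR_INTENT[intent] };
}

const codeIntent = classifyIntent("debug the error in index.ts");
// codeIntent.intent is "code"; codeIntent.tier is "swe"
```

Because this is plain pattern matching, it can run on every prompt without a network round trip.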
Budgets
auto-router tracks daily and monthly input/output tokens and estimated cost per provider, persisted at:
~/.pi/agent/extensions/auto-router.stats.json
Daily budgets (subscription providers)
When you set a daily limit for a subscription provider, the budget auditor runs before each request:
- ≥ 80% of limit → soft warning (surfaces in routing reasoning and the status line)
- ≥ 100% of limit → that provider is excluded from the candidate set; routing falls back to the next allowed target
- If all candidates are over budget, routing falls back to the healthy list (so you’re never fully blocked) but the reasoning records the budget event
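The three rules above reduce to a small threshold check plus a fallback. The following is a sketch under assumptions: classifyBudget and filterByBudget are hypothetical helpers, and the package's own auditBudget(provider, state) has a different signature and also factors in UVI.

```typescript
// Hypothetical sketch of the threshold logic; not the package's auditBudget implementation.
type BudgetStatus = "ok" | "warning" | "blocked";

function classifyBudget(spentUsd: number, limitUsd: number | undefined): BudgetStatus {
  // Assumption: a missing or non-positive limit means no budget is enforced.
  if (limitUsd === undefined || limitUsd <= 0) return "ok";
  const used = spentUsd / limitUsd;
  if (used >= 1.0) return "blocked";
  if (used >= 0.8) return "warning";
  return "ok";
}

// Drop blocked providers, but fall back to the full healthy list if that would leave nothing.
function filterByBudget(
  healthy: string[],
  spent: Record<string, number>,
  limits: Record<string, number | undefined>,
): string[] {
  const allowed = healthy.filter((p) => classifyBudget(spent[p] ?? 0, limits[p]) !== "blocked");
  return allowed.length > 0 ? allowed : healthy;
}

const afterAudit = filterByBudget(
  ["claude-agent-sdk", "openai-codex"],
  { "claude-agent-sdk": 12.0, "openai-codex": 1.5 },
  { "claude-agent-sdk": 10.0, "openai-codex": 10.0 },
);
// afterAudit is ["openai-codex"]: claude-agent-sdk is over its $10.00 limit
```

The fallback in filterByBudget mirrors the "never fully blocked" behavior: when every candidate is over budget, the healthy list is returned as-is.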
Manage budgets with:
/auto-router budget show
/auto-router budget set claude-agent-sdk 10.00
/auto-router budget set google 20.00 monthly
/auto-router budget clear openai-codex
Monthly budgets (per-token providers)
For pay-per-token providers like DeepSeek, set a monthly budget. The system auto-detects per-token providers when a monthly limit is set — no config tag needed:
/auto-router budget set deepseek 20.00 monthly
/auto-router budget clear deepseek monthly
The auditor uses the same thresholds (80% → warning, 100% → block) against monthly spend. Balance data is fetched from the provider's API (e.g. GET https://api.deepseek.com/user/balance) and API keys are resolved from ~/.pi/agent/auth.json first, then environment variables (DEEPSEEK_API_KEY, DEEPSEEK_KEY).
View balances with:
/auto-router balance show
/auto-router balance fetch
UVI for per-token providers
Per-token UVI is computed the same way as subscription UVI:
UVI = (monthly_spend / monthly_budget) / elapsed_fraction_of_month
This means per-token providers appear in /auto-router uvi show and the status line alongside subscription providers. Per-token UVI is always computed when a monthly budget is set, regardless of whether subscription UVI is enabled.
The selected target’s remaining budget is reported in decision.metadata.budgetRemaining and visible via /auto-router explain.
UVI interplay with budgets
When UVI is enabled, the budget auditor layers quota-based dynamic reallocation on top of USD limits:
| UVI status | Threshold | Effect |
|---|---|---|
| critical | UVI ≥ 2.0 | Blocks the provider (excluded from routing) |
| stressed | UVI ≥ 1.5 | Demotes all targets from that provider to the end of the trial order |
| surplus | UVI ≤ 0.5 and window ≥ 70% elapsed | Promotes targets to the front of the trial order |
Critical UVI overrides a healthy USD budget. A provider with UVI=2.0 is blocked even if it's only spent $0.20 of a $10.00 daily limit.
UVI status also appears in /auto-router budget and /auto-router explain output.
Utilization Velocity Index (UVI)
UVI measures how fast you're consuming quota or budget and adjusts routing priority in real time. For subscription providers, it fetches usage data from provider quota APIs (openai-codex, anthropic). For per-token providers such as Gemini API-key routes or DeepSeek, it uses monthly spend vs. budget. UVI is computed as:
UVI = consumed_fraction / elapsed_fraction_of_window
- UVI ≈ 1.0 → on pace (e.g., 50% used at 50% elapsed)
- UVI ≥ 1.5 → burning fast — stressed (candidates demoted)
- UVI ≥ 2.0 → on track to exhaust early — critical (provider blocked)
- UVI ≤ 0.5 and window ≥ 70% elapsed → underutilized — surplus (candidates promoted)
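The formula and thresholds above translate directly into code. This is a sketch, not the package's uvi.ts: computeUvi and classifyUvi are hypothetical names, and the handling of a window that has not started yet is an assumption.

```typescript
// Hypothetical sketch of the UVI formula and thresholds; names are illustrative.
type UviStatus = "critical" | "stressed" | "ok" | "surplus";

function computeUvi(consumedFraction: number, elapsedFraction: number): number {
  // Assumption: at the very start of a window, any usage counts as infinitely fast.
  if (elapsedFraction <= 0) return consumedFraction > 0 ? Infinity : 0;
  return consumedFraction / elapsedFraction;
}

function classifyUvi(uvi: number, elapsedFraction: number): UviStatus {
  if (uvi >= 2.0) return "critical";
  if (uvi >= 1.5) return "stressed";
  // Surplus needs both a low UVI and a window that is mostly over.
  if (uvi <= 0.5 && elapsedFraction >= 0.7) return "surplus";
  return "ok";
}

// 50% used at 50% elapsed is on pace.
const onPace = computeUvi(0.5, 0.5); // 1
// 20% of a monthly budget spent with 80% of the month elapsed.
const slow = computeUvi(0.2, 0.8); // 0.25
const slowStatus = classifyUvi(slow, 0.8); // "surplus"
```

Note that a low UVI early in the window stays ok rather than surplus, which keeps the router from promoting a provider just because the window has barely started.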
Enabling / Disabling UVI
UVI is enabled by default. To opt out:
/auto-router uvi disable
# or set the environment variable:
# AUTO_ROUTER_UVI=0
Re-enable:
/auto-router uvi enable
UVI refreshes automatically before each prompt (throttled to once per 30 seconds). You can also force a refresh:
/auto-router uvi refresh
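The refresh behavior described here (automatic but throttled, with a manual override) follows a common pattern. A minimal sketch, assuming a 30-second interval and a hypothetical makeThrottled helper; the package may implement this differently inside quota-cache.ts.

```typescript
// Hypothetical throttle helper; an injectable clock keeps it easy to test.
function makeThrottled(
  refresh: () => void,
  minIntervalMs: number = 30000,
  now: () => number = Date.now,
): (force?: boolean) => boolean {
  let lastRun = -Infinity;
  return (force = false): boolean => {
    // Skip when the last refresh was too recent, unless the caller forces it.
    if (!force && now() - lastRun < minIntervalMs) return false;
    lastRun = now();
    refresh();
    return true;
  };
}

let fakeClockMs = 0;
let refreshCount = 0;
const refreshUvi = makeThrottled(() => { refreshCount += 1; }, 30000, () => fakeClockMs);

const first = refreshUvi();      // true: nothing has run yet
fakeClockMs = 10000;
const second = refreshUvi();     // false: only 10s since the last refresh
const forced = refreshUvi(true); // true: a forced refresh ignores the throttle
```

In this sketch a forced refresh also resets the throttle window, so the next automatic refresh waits a full interval again.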
Viewing UVI state
/auto-router uvi show
Example output:
UVI (enabled):
anthropic UVI= 1.64 stressed | 5hr@38%, 7d@68%
openai-codex UVI= 0.81 ok | 1m@5%, 1d@61%
google UVI= 0.22 ok | monthly@18%
When a provider’s UVI is stressed or critical, it also appears in the status line:
| uvi: anthropic=1.64 stressed
Disabling
/auto-router uvi disable
Note: UVI requires valid OAuth tokens in ~/.pi/agent/auth.json. If a token is expired and can't be refreshed, that provider shows an error in uvi show.
UVI Hard Mode
By default, UVI uses a tiebreaker strategy: stressed providers are deprioritized but still tried if all other candidates fail. Enable hard mode to completely exclude stressed providers:
AUTO_ROUTER_UVI_HARD=1
When active, the status line shows 🛡️ uvi-hard. Demoted providers will not be tried at all — useful when you want strict quota protection near exhaustion. Surplus promotions still use tiebreaker ordering (promoted first, normal as fallback).
Shadow mode
Shadow mode runs the full routing pipeline (shortcut parsing, context analysis, constraint solving, budget auditing, UVI reordering) but uses legacy config-order targets for actual routing. This lets you validate new routing logic without affecting your experience.
/auto-router shadow enable
# or set the environment variable:
# AUTO_ROUTER_SHADOW=1
Once enabled, the status line shows 🔬 shadow. Use /auto-router shadow show to compare what the pipeline would have picked vs. what was actually used:
Shadow mode: 🟢 enabled
Last shadow comparison:
Route: subscription-reasoning
Pipeline would pick: Gemini 2.5 Pro → Claude Opus 4.6 → GPT-5.4
Actually used: Claude Opus 4.6 → Gemini 2.5 Pro → GPT-5.4
Match: ❌ different
Disable with /auto-router shadow disable.
Performance-based ranking
The router tracks per-provider request latency (time-to-response) using a rolling average and uses it as a tiebreaker within UVI buckets. Candidates are ordered:
- Promoted (UVI surplus), sorted fastest → slowest
- Normal, sorted fastest → slowest
- Demoted (UVI stressed), sorted fastest → slowest
Providers with no latency history sort last within their bucket (cold start). Data persists in ~/.pi/agent/extensions/auto-router.latency.json and survives restarts.
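The ordering rules above can be expressed as a single sort with a bucket rank and a latency tiebreak. This is a sketch under assumptions: Candidate, orderCandidates, and the hardMode flag (modeling the UVI Hard Mode exclusion of demoted providers) are hypothetical names, and the cost-aware secondary tiebreaker is left out.

```typescript
// Hypothetical sketch of bucketed ordering; the package's partitioner may differ in detail.
type Bucket = "promoted" | "normal" | "demoted";

interface Candidate {
  label: string;
  bucket: Bucket;
  avgLatencyMs?: number; // undefined means no latency history yet (cold start)
}

const BUCKET_ORDER: Record<Bucket, number> = { promoted: 0, normal: 1, demoted: 2 };

function orderCandidates(candidates: Candidate[], hardMode: boolean = false): Candidate[] {
  // In hard mode, demoted (UVI-stressed) candidates are excluded entirely.
  const pool = hardMode ? candidates.filter((c) => c.bucket !== "demoted") : candidates;
  return [...pool].sort((a, b) => {
    const byBucket = BUCKET_ORDER[a.bucket] - BUCKET_ORDER[b.bucket];
    if (byBucket !== 0) return byBucket;
    // Within a bucket: fastest first, unknown latency last.
    const la = a.avgLatencyMs ?? Infinity;
    const lb = b.avgLatencyMs ?? Infinity;
    if (la === lb) return 0;
    return la < lb ? -1 : 1;
  });
}

const ordered = orderCandidates([
  { label: "A", bucket: "normal", avgLatencyMs: 900 },
  { label: "B", bucket: "demoted", avgLatencyMs: 300 },
  { label: "C", bucket: "normal" },
  { label: "D", bucket: "promoted", avgLatencyMs: 1500 },
  { label: "E", bucket: "normal", avgLatencyMs: 400 },
]).map((c) => c.label);
// ordered is ["D", "E", "A", "C", "B"]
```

The bucket always dominates: B is the fastest candidate overall but still goes last because its provider is demoted.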
View latency data in /auto-router list (shows per-target ⏱ avg) and /auto-router explain (includes avg latency in reasoning). Reset with /auto-router reset.
User feedback
Rate routing decisions to help improve selection over time:
/auto-router rate good
/auto-router rate bad
/auto-router rate good "fast and accurate"
/auto-router rate bad "too verbose"
Ratings are persisted in ~/.pi/agent/extensions/auto-router.ratings.json. Per-provider stats appear in /auto-router explain (e.g. ratings: 12👍 3👎 (15 total, 80% good)). Reset with /auto-router reset.
Status line
The status line surfaces routing state at a glance:
auto-router Subscription Premium Router 🔬 shadow | tier=reasoning (0.90) | current: GPT-5.4 | healthy: …, … | ⚠ google: 87% of $20.00 monthly budget used | uvi: anthropic=1.64 stressed
- 🔬 shadow appears when shadow mode is enabled
- tier=<tier> (<confidence>) appears once a routing decision has been recorded
- ⚠ … appears when one or more candidate providers are at 80%+ of their daily limit
- uvi: … appears when one or more providers have stressed or critical UVI status
Behavior notes
- Only retryable errors trigger automatic failover
- Route targets that can’t be resolved from the registry are also treated as failoverable so the chain can keep moving
- Failover happens only before substantive output starts
- Once a provider/model emits real content, the router stays on that target
- Retryable failures put the target on a temporary cooldown
- Cooldowns are currently in-memory and reset on pi reload/restart
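The notes above describe a loop that is easy to get subtly wrong, so here is a minimal sketch of it. Everything in it is an assumption made for illustration: a callback-based run stands in for a real async provider stream, errors are marked retryable with a plain retryable flag, and the 60-second cooldown is an invented value.

```typescript
// Hypothetical sketch of same-request failover; names and the cooldown value are illustrative.
interface FailoverTarget {
  label: string;
  run: (emit: (chunk: string) => void) => void; // calls emit once per chunk, may throw
}

const COOLDOWN_MS = 60000; // assumed value, not the package's actual cooldown
const cooldownUntil: Record<string, number> = {}; // in-memory only, like the real cooldowns

function isRetryable(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { retryable?: boolean }).retryable === true;
}

function runWithFailover(
  targets: FailoverTarget[],
  emit: (chunk: string) => void,
  now: () => number = Date.now,
): string {
  let lastError: unknown = new Error("no targets available");
  for (const target of targets) {
    if ((cooldownUntil[target.label] ?? 0) > now()) continue; // still cooling down
    let emitted = false;
    try {
      target.run((chunk) => {
        emitted = true;
        emit(chunk);
      });
      return target.label; // completed on this target
    } catch (err) {
      // Fail over only for retryable errors raised before any real output was emitted.
      if (emitted || !isRetryable(err)) throw err;
      cooldownUntil[target.label] = now() + COOLDOWN_MS;
      lastError = err;
    }
  }
  throw lastError;
}

const chunks: string[] = [];
const usedTarget = runWithFailover(
  [
    { label: "primary", run: () => { throw Object.assign(new Error("rate limited"), { retryable: true }); } },
    { label: "fallback", run: (emit) => { emit("hello"); emit(" world"); } },
  ],
  (chunk) => chunks.push(chunk),
);
// usedTarget is "fallback"; chunks is ["hello", " world"]; "primary" is now on cooldown
```

The emitted flag is what keeps a half-finished answer from being restarted on another model: once any chunk has reached the caller, errors propagate instead of triggering failover.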
Default routes
The repository ships with opinionated defaults oriented around subscription-backed providers plus API-key Gemini and Ollama Cloud fallback:
- Claude Code via claude-agent-sdk
- OpenAI Codex
- Google Gemini via API key (google, billed per-token)
- NVIDIA DeepSeek (deepseek-ai/deepseek-v3.2)
- Ollama Cloud (glm-5.1:cloud)
You should edit ~/.pi/agent/extensions/auto-router.routes.json to match your own environment.
Development
To work on auto-router in your local dev environment:
# 1. Clone the repo
git clone git@github.com:danialranjha/pi-auto-router.git
cd pi-auto-router
# 2. Install dependencies
npm install
# 3. Copy the example config into place
mkdir -p ~/.pi/agent/extensions
cp auto-router.routes.example.json ~/.pi/agent/extensions/auto-router.routes.json
# 4. Run pi with the local extension loaded
pi -e /absolute/path/to/pi-auto-router
After making changes to index.ts, reload the extension inside pi without restarting:
/auto-router reload
Use the built-in debug commands to verify routing and model resolution:
/auto-router status
/auto-router list
/auto-router debug
/auto-router test-resolve <alias>
Tests
The routing policy modules under src/ are covered by a node:test + tsx suite:
npm test
Architecture
The intelligent routing layer lives in src/ and is composed of small, focused modules:
| Module | Responsibility |
|---|---|
| types.ts | Shared types: Tier, RouteTarget, RoutingContext, RoutingDecision, RoutingHints, PolicyRuleConfig, etc. |
| context-analyzer.ts | Token estimation (chars/4), context classification (short/medium/long/epic), RoutingContext build |
| shortcut-parser.ts | Parses @reasoning/@swe/@long/@vision/@fast from prompts; strips the token before dispatch |
| constraint-solver.ts | Filters candidates by capability, cooldown, health, circuit breaker state, and tier-derived requirements |
| policy-engine.ts | Priority-ordered rule engine: 5 rule types (force-tier, prefer/exclude-provider, force-billing, force-constraint); time-of-day/weekday conditions; per-route scoping; dry-run traces via /auto-router explain |
| budget-tracker.ts | Persistent daily/monthly token/cost stats per provider with atomic writes; daily limits |
| budget-auditor.ts | Pure auditBudget(provider, state) returning ok, warning, or blocked; integrates UVI for dynamic reallocation |
| balance-fetcher.ts | Fetches balances from pay-per-token providers (DeepSeek) with exponential backoff retry; builds synthetic monthly UVI windows |
| uvi.ts | Computes UVI from quota windows (consumed_fraction / elapsed_fraction); classifies as critical, stressed, ok, or surplus |
| quota-fetcher.ts | Pulls real-time usage data from OpenAI, Anthropic, and Google OAuth quota APIs; token refresh + error handling |
| quota-cache.ts | TTL-gated cache for quota snapshots; batches fetches, emits per-provider UtilizationSnapshot |
| health-check.ts | Provider health cache: verifies OAuth tokens; independent of UVI; feeds isHealthy into constraint solver |
| circuit-breaker.ts | Circuit breaker state machine (closed→open→half-open) for repeatedly failing providers; /auto-router circuit command + status line segment |
| candidate-partitioner.ts | Partitions candidates into [promoted, normal, demoted] buckets based on budget audit + UVI; supports hard mode exclusion; cost-aware secondary tiebreaker |
| latency-tracker.ts | Tracks per-provider request latency (rolling average, max 100 samples); used for performance-based ranking within UVI buckets |
| intent-classifier.ts | Heuristic intent classifier (code/creative/analysis/general) with file extension, documentation pattern, and conversation depth awareness |
| feedback-tracker.ts | User ratings of routing decisions (/auto-router rate); persists to auto-router.ratings.json; per-provider stats |
| display.ts | Pure display utilities: model spec parsing, target description, hints formatting, cooldown helpers, token normalization |
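Of the modules listed, the circuit breaker is the one whose behavior is hardest to picture from a one-line description. The sketch below shows a generic closed→open→half-open state machine. It is not the code in circuit-breaker.ts: the class name, the failure threshold, and the reset timeout are all hypothetical.

```typescript
// Hypothetical closed→open→half-open state machine; thresholds are illustrative only.
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreakerSketch {
  private state: CircuitState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number = 3, resetAfterMs: number = 30000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  // Whether a request may be attempted; an open circuit turns half-open once the timeout elapses.
  canAttempt(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed probe in half-open, or too many failures while closed, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  current(): CircuitState {
    return this.state;
  }
}

let clockMs = 0;
const breaker = new CircuitBreakerSketch(2, 1000, () => clockMs);
breaker.recordFailure();
breaker.recordFailure();                   // second failure opens the circuit
const blocked = !breaker.canAttempt();     // true while open
clockMs = 1000;
const probeAllowed = breaker.canAttempt(); // true: the circuit is now half-open
breaker.recordSuccess();                   // a successful probe closes it again
```

Unlike a per-request cooldown, the breaker keeps a provider out of rotation until a probe request proves it has recovered.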
index.ts wires these together inside streamAutoRouter:
- Parse @shortcut from the last user message
- Build RoutingContext (prompt, history, healthy targets, budget state, feedback stats)
- Run PolicyEngine pre-constraint evaluation (tier overrides, provider exclusions, constraint tuning)
- Run solveConstraints over healthy targets with capability data from the model registry
- Run auditBudget per remaining candidate; drop blocked, warn at 80%+; apply UVI-based demote/promote reordering
- Run PolicyEngine post-partition hints (requireProvider, preferProviders sorting, cost tiebreaker)
- Order candidates: […promoted (surplus UVI), …normal, …demoted (stressed UVI)] with latency + cost sort
- Record a RoutingDecision (phase, tier, target, confidence, reasoning, estimated tokens, budget remaining, hints trace)
- Stream from the selected target with same-request failover; circuit breaker tracks success/failure
Roadmap
High-priority future directions for pi-auto-router:
| Area | Feature | Priority |
|---|---|---|
| Policies | Feedback-driven rules — wire user ratings into PolicyEngine conditions | ⭐⭐⭐ |
| Architecture | Continue extracting from index.ts: testable modules for auth, config, cooldowns, model resolution | ⭐⭐⭐ |
| Testing | Performance benchmarks + Chaos testing for the hot path | ⭐⭐⭐ |
| Provider support | Provider-agnostic UVI — custom/self-hosted providers get quota awareness | ⭐⭐ |
| Config | JSON Schema validation + Export/import configs | ⭐⭐ |
| Advanced routing | Multi-step routing, Weighted A/B selection, ML intent classifier | ⭐ |
| Observability | Web dashboard / TUI integration, Resilience dashboard | ⭐ |
See ROADMAP.md for full details on each item.
License
MIT