pi-high-availability

High Availability extension for pi - automatic failover when quota or capacity is exhausted

Package details

extension

Install pi-high-availability from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-high-availability

Package: pi-high-availability
Version: 2.3.0
Published: Mar 19, 2026
Downloads: 56/mo · 18/wk
Author: burggraf
License: MIT
Types: extension
Size: 52.2 KB
Dependencies: 0 dependencies · 2 peers

Pi manifest JSON

{
  "extensions": [
    "extensions/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-high-availability 🔄

pi-high-availability automatically switches to fallback LLM providers when your primary provider hits quota limits or capacity constraints. Never get stuck waiting for quota resets again.

🆕 What's New in v2.3.0

Password-Store Integration — You can now use !pass references for API keys stored in password-store. The /ha UI automatically discovers matching pass entries and shows them as one-click options. API key input is now masked with a hint for !pass syntax.

Usage

# In /ha UI, you'll see matching pass entries like:
#   🔑 Use: api/anthropic/key
# Click to add as a credential

# Or manually enter:
!pass show api/anthropic/key

The !pass reference is stored in auth.json and resolved by pi at API call time.
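To make the reference format concrete, here is a minimal sketch that only recognizes a stored value of the form shown above and extracts the password-store entry. It does not call pass, and it is not pi's actual resolver: the function name parsePassReference and the exact matching rule are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of pi or this extension).
// Returns the password-store entry for a "!pass show <entry>" reference,
// or null when the value looks like a plain API key.
function parsePassReference(value: string): string | null {
  const match = /^!pass\s+show\s+(\S+)$/.exec(value.trim());
  return match?.[1] ?? null;
}

console.log(parsePassReference("!pass show api/anthropic/key")); // api/anthropic/key
console.log(parsePassReference("sk-plain-key"));                  // null
```

Keeping the reference rather than the secret in auth.json means the key itself never has to be written to that file.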

v2.2.0

Per-Session Group Selection — You can now specify which HA group to use via the --ha-group CLI flag. This is useful when running multiple pi instances with different failover chains.

Usage

# Use default group (from ha.json defaultGroup)
pi -e .pi/extensions/gastown-hooks.js

# Use specific group via --ha-group
pi -e .pi/extensions/gastown-hooks.js --ha-group paid --model openai-codex/gpt-5.3-codex

This allows you to configure different gastown workers to use different HA groups based on their role.

How It Works

When pi starts with --ha-group paid:

1. The extension reads the flag value (paid).
2. It validates that the group exists in ha.json.
3. It sets state.activeGroup = "paid".
4. All failover events use models from the "paid" group.

When pi starts without --ha-group:

Falls back to defaultGroup from ha.json
All failover events use models from that group
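The selection logic described above can be sketched roughly as follows. This is an illustration, not the extension's source: the HaGroupConfig type and the resolveActiveGroup function are hypothetical names, and throwing on an unknown group is an assumption, since the README does not say what happens when validation fails.

```typescript
// Hypothetical, reduced view of ha.json: only the fields this sketch needs.
interface HaGroupConfig {
  groups: Record<string, unknown>;
  defaultGroup: string;
}

// Returns the group to use for this session.
// Assumption: an unknown --ha-group value is treated as an error.
function resolveActiveGroup(config: HaGroupConfig, flagValue?: string): string {
  if (flagValue === undefined) {
    return config.defaultGroup; // no flag: fall back to defaultGroup
  }
  if (!Object.prototype.hasOwnProperty.call(config.groups, flagValue)) {
    throw new Error(`Unknown HA group: ${flagValue}`);
  }
  return flagValue;
}

const exampleConfig: HaGroupConfig = { groups: { paid: {}, free: {} }, defaultGroup: "free" };
console.log(resolveActiveGroup(exampleConfig, "paid")); // paid
console.log(resolveActiveGroup(exampleConfig));         // free
```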

Configurable Error Handling — You can now control how the extension responds to different types of errors:

Capacity Errors (e.g., "out of capacity", "engine overloaded"): These affect all accounts for a provider equally, so switching accounts doesn't help. Now you can choose to stop, retry after a timeout, or jump to next_provider.
Quota Errors (e.g., "rate limit exceeded", "insufficient quota"): These are per-account, so switching to another OAuth key or API key may solve the problem. Choose from stop, retry, next_provider, or next_key_then_provider (default).

Configure these in /ha under ⚙️ Settings or directly in ha.json (see Error Handling Configuration).

✨ Features

Unified HA Manager: A beautiful interactive TUI (/ha) with accordion-style navigation to manage all your groups and credentials in one place.
Automatic Multi-Tier Failover:
1. Account Failover: Seamlessly switches between multiple accounts for the same provider.
2. Provider Failover: Automatically jumps to the next provider in your group if all accounts for the current provider are exhausted.
Exhaustion Tracking: Intelligent cooldown management marks specific accounts or providers as "exhausted" on 429/capacity errors, preventing retries until they recover.
Dynamic Provider Discovery: Automatically detects all supported Pi providers (Anthropic, OpenAI, Gemini, Moonshot, Zai, etc.) without configuration.
Group Management: Create custom failover chains (e.g., "Fast Tier" → "Backup Tier") and rearrange model priority with simple keybindings.
Credential Sync & Storage: Automatically capture OAuth logins or manually add API keys for backup accounts.
Smart Error Detection: Distinguishes between quota errors and transient capacity issues, including full support for Google Gemini's internal retry patterns.

🚀 Quick Start

1. Install the Extension

pi install npm:pi-high-availability

2. Open the Manager

Run the High Availability manager to initialize your configuration:

/ha

3. Configure Your First Group

1. Select 📂 Groups.
2. Add or select a group (e.g., default).
3. Add model IDs (e.g., anthropic/claude-3-5-sonnet) to the group.
4. Use the u and d keys to rearrange the priority.

🎮 The HA Manager (`/ha`)

The interactive manager is your control center for high availability.

Keyboard Navigation

Key	Action
`↑` / `↓`	Navigate items
`Space` / `→`	Expand/collapse section or toggle item
`Enter`	Select/activate item
`x` / `Delete`	Delete currently selected item (with confirmation)
`u`	Move item up (reorder)
`d`	Move item down (reorder)
`Esc`	Cancel / Exit

📂 Group Management

Add/Rename/Delete groups.
Rearrange Priority: Use u (up) and d (down) keys to set the failover order of models within a group.
Per-Entry Cooldown: Set custom recovery times for specific models.
Delete Models: Navigate to any model entry and press x to remove it from the group.

🔑 Credential Management

Auto-Sync: Credentials from /login are automatically synced when you open /ha.
Add API Providers: Use "+ Add API Provider" to manually add providers that use API keys.
Add API Keys: For non-OAuth providers, add additional API keys as backups.
Account Priority: Use u and d keys to decide which account is primary and which are backup-1, backup-2, etc.
Delete Keys: Navigate to any key entry and press x to delete it.
Delete Providers: Navigate to a provider header (e.g., 🔌 google-gemini-cli) and press x to delete the entire provider and all its keys.

⏱️ Settings

Default Cooldown: Set the default recovery time (e.g., 3600000ms for 1 hour) for exhausted providers.
Default Group: Choose which failover chain Pi uses when it starts up.
Error Handling: Configure how different error types are handled:
- Capacity Error Action: What to do when a provider reports "out of capacity" (switching accounts for the same provider does not help)
- Quota Error Action: What to do when a provider reports quota/rate limit exceeded (switching accounts may help)
- Retry Timeout: How long to wait before retrying when using "retry" action (default: 300000ms = 5 minutes)

🔍 How Failover Works

The Failover Chain

When a quota or capacity error is detected:

Try Next Account: The extension looks for another credential for the same provider (e.g., your second Google account).
Mark Exhausted: The current account is marked as exhausted and won't be used again until its cooldown expires.
Switch Provider: If all accounts for that provider are exhausted, the extension looks at the Active Group and switches to the next provider/model in the list.
Automatic Retry: Pi automatically resends your last message using the new provider and primary account, making the transition transparent.
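The chain above can be pictured as a search over (model, account) pairs that skips anything still in cooldown. The sketch below is a simplified model under stated assumptions, not the extension's implementation: ExhaustionTracker, nextTarget, and the data shapes are hypothetical, and it always scans the group from the top, so a higher-priority entry is picked again once its cooldown has expired.

```typescript
type Target = { modelId: string; provider: string; account: string };

// Hypothetical tracker: remembers until when each provider/account pair is exhausted.
class ExhaustionTracker {
  private until = new Map<string, number>();

  markExhausted(provider: string, account: string, cooldownMs: number, now: number): void {
    this.until.set(`${provider}:${account}`, now + cooldownMs);
  }

  isExhausted(provider: string, account: string, now: number): boolean {
    const expiry = this.until.get(`${provider}:${account}`);
    return expiry !== undefined && now < expiry;
  }
}

// Walks the group in priority order; within a provider, tries accounts in priority order.
// Returns null when every account of every provider in the group is exhausted.
function nextTarget(
  entries: string[],                              // model IDs such as "anthropic/claude-3-5-sonnet"
  accounts: Partial<Record<string, string[]>>,    // provider -> account names, primary first
  tracker: ExhaustionTracker,
  now: number,
): Target | null {
  for (const modelId of entries) {
    const provider = modelId.split("/")[0] ?? modelId;
    for (const account of accounts[provider] ?? []) {
      if (!tracker.isExhausted(provider, account, now)) {
        return { modelId, provider, account };
      }
    }
  }
  return null;
}

const tracker = new ExhaustionTracker();
const entries = ["anthropic/claude-3-5-sonnet", "google-gemini-cli/gemini-1.5-pro"];
const accounts = { anthropic: ["primary", "backup-1"], "google-gemini-cli": ["primary"] };

tracker.markExhausted("anthropic", "primary", 3_600_000, 0);
console.log(nextTarget(entries, accounts, tracker, 1_000)?.account); // backup-1
```

Passing the current time in as a parameter, instead of reading the clock inside the tracker, keeps the logic easy to test.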

Error Detection

The extension detects:

Quota Errors: HTTP 429, "rate limit", "insufficient quota", etc.
Capacity Errors: "No capacity available", "Engine Overloaded", etc.
Gemini Awareness: Correctly waits for Google's internal retry attempts before triggering a failover.
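A rough sketch of this kind of classification is shown below. The patterns are taken from the example messages quoted in this README; the function name classifyError, the pattern lists, and the choice to check capacity patterns before the 429 status are assumptions, and the Gemini retry handling is not modelled.

```typescript
type ErrorKind = "quota" | "capacity" | "other";

// Illustrative patterns only; the extension's real matching rules may differ.
const CAPACITY_PATTERNS = [/no capacity/i, /out of capacity/i, /engine overloaded/i, /temporarily unavailable/i];
const QUOTA_PATTERNS = [/rate limit/i, /insufficient quota/i, /daily limit/i];

function classifyError(message: string, status?: number): ErrorKind {
  // Assumption: capacity wording wins even if the HTTP status is 429.
  if (CAPACITY_PATTERNS.some((pattern) => pattern.test(message))) return "capacity";
  if (status === 429 || QUOTA_PATTERNS.some((pattern) => pattern.test(message))) return "quota";
  return "other";
}

console.log(classifyError("Rate limit exceeded (429)")); // quota
console.log(classifyError("Engine Overloaded"));         // capacity
```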

⚙️ Configuration Guide (`ha.json`)

The /ha UI is the recommended way to change settings, but you can also edit ~/.pi/agent/ha.json by hand:

{
  "groups": {
    "pro": {
      "name": "Professional Tier",
      "entries": [
        { "id": "anthropic/claude-3-5-sonnet" },
        { "id": "google-gemini-cli/gemini-1.5-pro", "cooldownMs": 1800000 }
      ]
    }
  },
  "defaultGroup": "pro",
  "defaultCooldownMs": 3600000,
  "errorHandling": {
    "capacityErrorAction": "next_provider",
    "quotaErrorAction": "next_key_then_provider",
    "retryTimeoutMs": 300000
  },
  "credentials": {
    "anthropic": {
      "primary": { "type": "oauth", "refresh": "...", "access": "..." },
      "backup-1": { "type": "api_key", "key": "..." }
    }
  }
}
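For readers who prefer types to raw JSON, the example file could be described along these lines. The type names (HaFile, HaGroup, GroupEntry) and the helper effectiveCooldownMs are hypothetical and inferred only from the sample above; the credentials section is left out, and the assumption that a per-entry cooldownMs overrides defaultCooldownMs is based on the "Per-Entry Cooldown" setting rather than on the extension's source.

```typescript
type ErrorAction = "stop" | "retry" | "next_provider" | "next_key_then_provider";

interface GroupEntry { id: string; cooldownMs?: number }
interface HaGroup { name: string; entries: GroupEntry[] }
interface HaFile {
  groups: Record<string, HaGroup>;
  defaultGroup: string;
  defaultCooldownMs: number;
  errorHandling?: {
    capacityErrorAction?: ErrorAction;
    quotaErrorAction?: ErrorAction;
    retryTimeoutMs?: number;
  };
}

// Assumed rule: use the entry's own cooldown if set, otherwise the file-wide default.
function effectiveCooldownMs(config: HaFile, groupName: string, modelId: string): number {
  const entry = config.groups[groupName]?.entries.find((e) => e.id === modelId);
  return entry?.cooldownMs ?? config.defaultCooldownMs;
}

const sample: HaFile = {
  groups: {
    pro: {
      name: "Professional Tier",
      entries: [
        { id: "anthropic/claude-3-5-sonnet" },
        { id: "google-gemini-cli/gemini-1.5-pro", cooldownMs: 1_800_000 },
      ],
    },
  },
  defaultGroup: "pro",
  defaultCooldownMs: 3_600_000,
};

console.log(effectiveCooldownMs(sample, "pro", "google-gemini-cli/gemini-1.5-pro")); // 1800000
console.log(effectiveCooldownMs(sample, "pro", "anthropic/claude-3-5-sonnet"));      // 3600000
```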

Error Handling Configuration

The errorHandling section in ha.json lets you customize how the extension responds to different error types:

Setting	Description	Default
`capacityErrorAction`	Action when provider has no capacity (affects all accounts)	`next_key_then_provider`
`quotaErrorAction`	Action when account hits rate limit (may not affect other accounts)	`next_key_then_provider`
`retryTimeoutMs`	How long to wait before retrying (in milliseconds)	`300000` (5 minutes)

Understanding the Error Types

Capacity Errors occur when a provider's servers are overloaded. Examples:

"No capacity available for this model"
"Engine overloaded"
"Service temporarily unavailable"

These errors affect the provider's infrastructure, so switching to a different account for the same provider typically won't help. Recommended action: next_provider or retry.

Quota Errors occur when an account exceeds its limits. Examples:

"Rate limit exceeded (429)"
"Insufficient quota"
"Daily limit reached"

These errors are per-account, so switching to another OAuth entry or API key for the same provider may solve the problem. Recommended action: next_key_then_provider (default) or next_provider if you don't have backup accounts.

Available Actions

The following actions can be configured for both capacityErrorAction and quotaErrorAction:

Action	Description
`stop`	Stop the process and display the error (the same behavior as pi without pi-high-availability installed)
`retry`	Wait for `retryTimeoutMs` milliseconds, then retry the same request
`next_provider`	Immediately switch to the next provider in the current group
`next_key_then_provider`	Try the next account/key for the current provider, then move to next provider if all exhausted (default)

Note: For capacity errors, next_key_then_provider is often not helpful since all accounts for the same provider typically share the same capacity pool. Use next_provider or retry for capacity errors instead.
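One way to read the action table is as a mapping from a configured action to the next step the extension takes. The sketch below illustrates that mapping only; the names Action, Step, and planRecovery are hypothetical, and the hasSpareKey parameter stands in for whatever check the extension performs to see whether another non-exhausted account exists.

```typescript
type Action = "stop" | "retry" | "next_provider" | "next_key_then_provider";

type Step =
  | { kind: "stop" }
  | { kind: "wait_and_retry"; delayMs: number }
  | { kind: "switch_key" }
  | { kind: "switch_provider" };

// Maps a configured action to the next step. The 300000 ms default mirrors retryTimeoutMs.
function planRecovery(action: Action, hasSpareKey: boolean, retryTimeoutMs = 300_000): Step {
  switch (action) {
    case "stop":
      return { kind: "stop" };
    case "retry":
      return { kind: "wait_and_retry", delayMs: retryTimeoutMs };
    case "next_provider":
      return { kind: "switch_provider" };
    case "next_key_then_provider":
      // Try another account first; move on only when none is left.
      return hasSpareKey ? { kind: "switch_key" } : { kind: "switch_provider" };
  }
}

console.log(planRecovery("next_key_then_provider", true).kind);  // switch_key
console.log(planRecovery("next_key_then_provider", false).kind); // switch_provider
```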

📄 License

MIT

🤝 Contributing

Contributions welcome! Please open an issue or PR on GitHub.

🙏 Credits

Built for the pi coding agent community.
