pi-ollama-cloud-provider
Ollama Cloud provider extension for pi coding agent with dynamic model discovery
Package details
Install pi-ollama-cloud-provider from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-ollama-cloud-provider

- Package: pi-ollama-cloud-provider
- Version: 0.1.0
- Published: May 2, 2026
- Downloads: not available
- Author: mario-gc
- License: MIT
- Types: extension
- Size: 36.2 KB
- Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./extensions/ollama-cloud/index.ts"
]
}

Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-ollama-cloud-provider
Ollama Cloud provider extension for pi coding agent with dynamic model discovery.
Features
- Dynamic model discovery — fetches all available Ollama Cloud models at startup
- Interactive management — /ollama-cloud menu for refresh, status, and cache inspection
- Persistent cache — model details cached for 1 hour for instant subsequent startups
- Capability detection — reasoning (thinking) and vision support from /api/show
- Accurate context windows — extracted from model metadata, not hardcoded
- Fallback chain — when /api/show fails, resolves from models.dev or name inference
- Source tracking — cache records where each model's metadata came from
- Zero-cost tracking — Ollama Cloud uses flat subscription pricing
- OpenAI-compatible endpoint via the openai-completions API
Installation
# npm (recommended — versioned, respects pi update)
pi install npm:pi-ollama-cloud-provider
# git (bleeding edge — always pulls main)
pi install git:github.com/mario-gc/pi-ollama-cloud-provider
# local path (development)
pi install /path/to/pi-ollama-cloud-provider
Quick Start
1. Get an API key
Sign up at ollama.com and generate an API key from your account settings.
2. Configure the API key
Option A: Set the environment variable:
export OLLAMA_CLOUD_API_KEY="your-key"
Option B: Add to ~/.pi/agent/auth.json:
{
"ollama-cloud": {
"type": "api_key",
"key": "your-key"
}
}
3. Select a model
Start pi and use /model, Ctrl+P (cycle), or Ctrl+L (list) to select an Ollama Cloud model. All available models appear under the ollama-cloud provider.
Available Models
Models are fetched dynamically from the Ollama Cloud API at startup. All available models are registered with accurate context windows and capability detection (reasoning, vision).
Run pi --list-models | grep ollama-cloud to see the full list.
The catalog includes models from various families: GLM, Qwen, DeepSeek, Kimi, GPT OSS, MiniMax, Gemma, Mistral, Nemotron, Cogito, Gemini, and more.
Commands
/ollama-cloud
Opens an interactive TUI menu with the following options:
| Option | Description |
|---|---|
| Refresh Models | Submenu to update model list |
| Status | View connection info, source breakdown, and cache status |
| Cache Info | Cache age, size, and model count |
Refresh Models Submenu
| Option | Description |
|---|---|
| From Ollama API | Fetches /api/show for all models, falls back to models.dev if needed (default) |
| From models.dev | Bypasses /api/show, uses models.dev metadata directly for all models |
After refresh, the menu shows the source breakdown: e.g., Registered 39 models (28 ollama, 10 modelsdev, 1 inference).
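For illustration, a breakdown line like that can be derived from the cache's per-model source tags. The sketch below is a minimal TypeScript illustration, not the extension's actual code; the entry shape is an assumption:

```ts
// Hypothetical sketch: tally cached entries by their recorded source
// and format the post-refresh summary line.
type Source = "ollama" | "modelsdev" | "inference";

function summarize(entries: { source: Source }[]): string {
  const counts: Record<Source, number> = { ollama: 0, modelsdev: 0, inference: 0 };
  for (const e of entries) counts[e.source]++;
  const parts = Object.entries(counts)
    .filter(([, n]) => n > 0)
    .map(([src, n]) => `${n} ${src}`)
    .join(", ");
  return `Registered ${entries.length} models (${parts})`;
}
```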
Status Submenu
Displays:
- Total registered models
- Source breakdown (how many models from Ollama API, models.dev, or inference)
- API endpoint URL
- Cache status (age, size, model count)
- Cache TTL (1 hour)
How it Works
Discovery Flow
On first startup (or when cache expires):
1. Fetch model IDs — GET https://ollama.com/v1/models returns all available model IDs
2. Fetch per-model details — POST https://ollama.com/api/show for each model (parallel, 10s timeout each)
3. Extract metadata — context length from model_info.*.context_length, capabilities from the capabilities array
4. Register provider — all models registered with pi under the ollama-cloud provider
5. Write cache — results cached to ~/.pi/agent/cache/ollama-cloud/models.json
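In outline, the flow might look like the TypeScript sketch below. The endpoints, parallelism, and 10s timeout come from the steps above; the response shapes (data[].id, the capability strings "thinking" and "vision") and the helper structure are assumptions, not the extension's actual code:

```ts
// Minimal sketch of the discovery flow, assuming an OpenAI-style list
// response ({ data: [{ id }] }) and /api/show fields named as in the
// steps above.
async function discoverModels(apiKey: string) {
  const headers = { Authorization: `Bearer ${apiKey}` };

  // 1. Fetch all available model IDs.
  const list = await fetch("https://ollama.com/v1/models", { headers });
  const ids: string[] = (await list.json()).data.map((m: { id: string }) => m.id);

  // 2. Fetch per-model details in parallel, 10s timeout each.
  const details = await Promise.all(
    ids.map(async (id) => {
      try {
        const res = await fetch("https://ollama.com/api/show", {
          method: "POST",
          headers: { ...headers, "Content-Type": "application/json" },
          body: JSON.stringify({ model: id }),
          signal: AbortSignal.timeout(10_000),
        });
        const info = await res.json();
        // 3. Extract context length (model_info.*.context_length) and capabilities.
        const ctxKey = Object.keys(info.model_info ?? {}).find((k) =>
          k.endsWith(".context_length"),
        );
        return {
          id,
          contextLength: ctxKey ? Number(info.model_info[ctxKey]) : undefined,
          reasoning: info.capabilities?.includes("thinking") ?? false,
          vision: info.capabilities?.includes("vision") ?? false,
          source: "ollama" as const,
        };
      } catch {
        // Handled later by the fallback chain (see below).
        return { id, source: "fallback" as const };
      }
    }),
  );
  // 4-5. Registration under the ollama-cloud provider and the cache write are not shown.
  return details;
}
```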
Fallback Chain
If /api/show fails for a model (network issue, rate limit, new model not yet indexed), metadata is resolved through:
1. models.dev — fetches the ollama-cloud section of https://models.dev/api.json (cached separately for 24h). Only fetched when at least one model fails /api/show.
2. Name-based inference — pattern matching on the model ID (e.g., kimi-* → 262K context, reasoning enabled)
3. Safe defaults — 128K context, 32K max output, text-only, no reasoning
All fallback metadata uses zero cost since Ollama Cloud uses flat subscription pricing, not per-token billing.
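A minimal sketch of the chain, assuming illustrative metadata shapes: the kimi-* rule and the safe defaults come from the list above, while the Meta type and the models.dev lookup structure are hypothetical.

```ts
// Sketch of the fallback chain under stated assumptions.
type Meta = {
  contextLength: number;
  maxOutput: number;
  reasoning: boolean;
  vision: boolean;
  source: "modelsdev" | "inference";
};

function resolveFallback(id: string, modelsDev: Record<string, Partial<Meta>>): Meta {
  // 1. models.dev section (fetched once, cached 24h), keyed by model ID.
  const md = modelsDev[id];
  if (md?.contextLength) {
    return {
      contextLength: md.contextLength,
      maxOutput: md.maxOutput ?? 32_768,
      reasoning: md.reasoning ?? false,
      vision: md.vision ?? false,
      source: "modelsdev",
    };
  }
  // 2. Name-based inference, e.g. kimi-* → 262K context with reasoning.
  if (id.startsWith("kimi-")) {
    return { contextLength: 262_144, maxOutput: 32_768, reasoning: true, vision: false, source: "inference" };
  }
  // 3. Safe defaults: 128K context, 32K max output, text-only, no reasoning.
  return { contextLength: 131_072, maxOutput: 32_768, reasoning: false, vision: false, source: "inference" };
}
```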
Cache
| File | TTL | Purpose |
|---|---|---|
| ~/.pi/agent/cache/ollama-cloud/models.json | 1 hour | Raw /api/show responses per model |
| ~/.pi/agent/cache/ollama-cloud/models-dev.json | 24 hours | Full models.dev ollama-cloud section |
Each cache entry tracks its source: ollama (from /api/show), modelsdev (from models.dev), or inference (name-based).
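For illustration, a cache entry with source tracking and the 1-hour TTL check could look like this sketch; the field names are assumptions, not the extension's actual schema:

```ts
// Illustrative cache entry shape with source tracking.
interface CacheEntry {
  model: string;
  source: "ollama" | "modelsdev" | "inference";
  fetchedAt: number; // epoch milliseconds
  data: unknown;     // raw /api/show response or fallback metadata
}

const MODELS_TTL_MS = 60 * 60 * 1000; // 1 hour for models.json

function isFresh(entry: CacheEntry, now = Date.now()): boolean {
  return now - entry.fetchedAt < MODELS_TTL_MS;
}
```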
Refresh Sources
The /ollama-cloud menu lets you choose the refresh source:
- From Ollama API — hits /api/show for all models, uses the fallback chain for failures. Most accurate but slowest.
- From models.dev — bypasses /api/show entirely, uses models.dev metadata for all models. Fast, but may lack the latest models.
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| OLLAMA_CLOUD_API_KEY | Ollama Cloud API key | (required) |
| PI_CODING_AGENT_DIR | Custom pi agent directory | ~/.pi/agent |
API Key Resolution
The extension resolves the API key in this order:
1. Environment variable OLLAMA_CLOUD_API_KEY
2. ~/.pi/agent/auth.json entry for ollama-cloud
3. pi's built-in auth storage
Troubleshooting
No models appear under ollama-cloud
- Check your API key is set: echo $OLLAMA_CLOUD_API_KEY
- Run /ollama-cloud → Status to verify connectivity
- Try Refresh Models → From Ollama API
- Check pi's logs for error messages
Cache not working
- Check the cache directory exists: ls -la ~/.pi/agent/cache/ollama-cloud/
- Delete the cache to force a fresh fetch: rm -rf ~/.pi/agent/cache/ollama-cloud/
- Restart pi
Models show incorrect context window
Context windows come from /api/show (primary) or models.dev (fallback). If you see unexpected values:
- Run Refresh Models → From Ollama API to get fresh data
- Check the Status submenu for source breakdown
"400 developer is not one of ['system', 'assistant', 'user', 'tool']" error
Some Ollama Cloud models may reject the developer message role. The extension sets compat.supportsDeveloperRole: false at the provider level to prevent this. If you still see this error, report it as an issue.
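For reference, the relevant part of a provider registration might look like the sketch below. The compat.supportsDeveloperRole flag, the endpoint, and the openai-completions API name come from this README; the surrounding object shape is hypothetical:

```ts
// Illustrative provider registration (shape is an assumption).
const ollamaCloudProvider = {
  id: "ollama-cloud",
  baseUrl: "https://ollama.com/v1",
  api: "openai-completions",
  compat: {
    supportsDeveloperRole: false, // avoid sending the "developer" message role
  },
};
```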
How is this different from ollama launch pi?
ollama launch pi is Ollama's built-in one-command setup that configures pi to talk to your local Ollama server. This extension takes a different approach: it connects pi directly to Ollama Cloud's hosted API at ollama.com.
| | ollama launch pi | pi-ollama-cloud-provider |
|---|---|---|
| Provider name | ollama | ollama-cloud |
| Endpoint | Local Ollama server (http://localhost:11434/v1) | Ollama Cloud (https://ollama.com/v1) |
| Local models | Yes | No |
| Cloud models | Proxied through local server | Connected directly |
| Local Ollama required? | Yes | No |
| Authentication | Handled by local server | Ollama Cloud API key |
| Model discovery | ollama launch pi or --model qwen3.5:cloud | Dynamic — fetches all available cloud models |
| Use when | You're running Ollama locally and want the default experience | You want direct cloud access without a local server |
You can use both at the same time. The providers live under different names, so you can switch between them with /model, Ctrl+P, or Ctrl+L.
Contributing
Contributions are welcome! Please open an issue or pull request on GitHub.
Development
# Clone the repo
git clone https://github.com/mario-gc/pi-ollama-cloud-provider.git
cd pi-ollama-cloud-provider
# Install dependencies
npm install
# Test locally
pi install /path/to/pi-ollama-cloud-provider
Project structure
├── extensions/
│ └── ollama-cloud/
│ ├── index.ts # Entry point, command registration, main menu
│ ├── discovery.ts # API fetch, model assembly, provider registration
│ ├── cache.ts # Persistent cache with TTL and source tracking
│ ├── fallback.ts # models.dev fetch, name inference
│ └── menu.ts # Interactive TUI menu with SettingsList
├── package.json # pi package manifest + release-it config
├── CHANGELOG.md # Auto-generated by release-it
└── README.md # This file
License
MIT