ollama-graceful
Pi extension that gracefully starts and stops the Ollama service on demand when switching between local and cloud models
Package details
Install ollama-graceful from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:ollama-graceful
- Package: ollama-graceful
- Version: 0.2.3
- Published: Apr 19, 2026
- Downloads: 590/mo · 15/wk
- Author: 3mrgnc3
- License: MIT
- Type: extension
- Size: 420.8 KB
- Dependencies: 0 dependencies · 0 peers
Pi manifest JSON
{
"extensions": [
"./ollama-graceful.ts"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
ollama-graceful
Pi extension that gracefully manages the Ollama process lifecycle when switching between local and cloud models in pi.
Ollama is great for running local models, but leaving it running wastes GPU memory and power when you're using cloud models. This extension starts ollama serve automatically when you select a local model and kills it when you switch away — so your GPU is only busy when it needs to be.
Features
- Auto-start — spawns ollama serve when you switch to a local model via /model or Ctrl+P
- Auto-stop — kills the process when you switch to a cloud model, freeing GPU/RAM
- Shutdown cleanup — stops Ollama when pi exits (if this extension started it)
- Readiness check — waits for the Ollama API to be ready before the model is used (up to 30s; see the sketch after this list)
- Status notifications — shows start/stop/ready status via pi's notification system
- Footer widget — displays 🦙 Ollama running in the pi status bar while active
- Manual commands — /ollama-start, /ollama-stop, /ollama-status for on-demand control
- No-op safety — never kills Ollama if it was already running before pi started
- No root required — runs entirely in user space, no sudo or systemd
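The readiness check can be pictured as a simple HTTP poll. Here is a minimal sketch in TypeScript, assuming plain Node.js 18+ (global fetch); waitForOllama and its defaults are illustrative names, not the extension's actual internals:

// Sketch of a readiness poll: hit Ollama's HTTP endpoint until it answers
// or the 30 s deadline passes. Requires Node 18+ for global fetch.
async function waitForOllama(
  url = "http://localhost:11434",
  timeoutMs = 30_000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      // Ollama answers on GET / once the server is accepting requests.
      const res = await fetch(url);
      if (res.ok) return true;
    } catch {
      // Connection refused: server not listening yet, keep waiting.
    }
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  return false; // timed out
}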
Installation
pi install npm:ollama-graceful
Or from GitHub:
pi install git:github.com/3mrgnc3/ollama-graceful
Setup
1. Configure local models
Create or edit ~/.pi/agent/models.json to register your Ollama models:
{
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434/v1",
"api": "openai-completions",
"apiKey": "ollama",
"compat": {
"supportsDeveloperRole": false,
"supportsReasoningEffort": false,
"supportsUsageInStreaming": false,
"maxTokensField": "max_tokens",
"supportsStrictMode": false
},
"models": [
{ "id": "qwen3.5:latest" }
]
}
}
}
2. Pull a model
ollama serve &
ollama pull qwen3.5
kill %1
3. Use
Open /model in pi and pick an Ollama model — the extension handles the rest. Switch to a cloud model and Ollama stops automatically.
Commands
| Command | Description |
|---|---|
| /ollama-start | Manually start the Ollama server |
| /ollama-stop | Manually stop the Ollama server |
| /ollama-status | Show current Ollama state |
How it works
/model → select ollama model
│
├─► model_select fires
├─► extension detects provider = "ollama"
├─► spawns `ollama serve` as child process
├─► polls localhost:11434 until ready (up to 30s)
├─► notify: "🦙 Ollama ready"
└─► widget: "🦙 Ollama running"
/model → select cloud model (or exit pi)
│
├─► model_select fires (or session_shutdown)
├─► extension detects provider ≠ "ollama"
├─► sends SIGTERM to child process (SIGKILL after 5s)
├─► notify: "🦙 Ollama stopped"
└─► widget cleared
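The spawn and teardown steps in the diagram map onto standard Node.js child-process calls. A rough sketch with illustrative names follows; the real extension wires these into pi's model_select and session_shutdown events, and the details here are assumptions, not the extension's actual code:

import { spawn, type ChildProcess } from "node:child_process";

let ollama: ChildProcess | null = null;

// Start `ollama serve` as a child process this extension owns.
function startOllama(): void {
  if (ollama) return; // already started by us
  ollama = spawn("ollama", ["serve"], { stdio: "ignore" });
  ollama.on("exit", () => { ollama = null; });
}

// Ask the server to exit cleanly; escalate to SIGKILL after 5 s.
function stopOllama(): void {
  if (!ollama) return; // nothing we own to stop
  const child = ollama;
  child.kill("SIGTERM");
  const hardKill = setTimeout(() => child.kill("SIGKILL"), 5_000);
  child.once("exit", () => clearTimeout(hardKill));
  ollama = null;
}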
The extension tracks whether it started Ollama itself. If Ollama was already running when you launched pi, the extension won't kill it — it only cleans up what it started.
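One plausible way to implement that ownership check is to probe the port before spawning: if something already answers, the extension records that it does not own the process and never signals it. The function name and URL below are assumptions for illustration:

// Probe whether an Ollama server is already listening before spawning one.
// If it is, mark it as externally owned so it is never sent a signal.
async function ollamaAlreadyRunning(
  url = "http://localhost:11434",
): Promise<boolean> {
  try {
    const res = await fetch(url);
    return res.ok;
  } catch {
    return false; // nothing listening on the port
  }
}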
License
MIT
