ollama-graceful

Pi extension that gracefully starts and stops the Ollama service on demand when switching between local and cloud models

Package details

Install ollama-graceful from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:ollama-graceful
Package: ollama-graceful
Version: 0.2.3
Published: Apr 19, 2026
Downloads: 590/mo · 15/wk
Author: 3mrgnc3
License: MIT
Types: extension
Size: 420.8 KB
Dependencies: 0 dependencies · 0 peers
Pi manifest JSON
{
  "extensions": [
    "./ollama-graceful.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

ollama-graceful

Pi extension that gracefully manages the Ollama process lifecycle when you switch between local and cloud models.

Ollama is great for running local models, but leaving it running wastes GPU memory and power when you're using cloud models. This extension starts `ollama serve` automatically when you select a local model and kills it when you switch away — so your GPU is only busy when it needs to be.

Features

  • Auto-start — spawns `ollama serve` when you switch to a local model via /model or Ctrl+P
  • Auto-stop — kills the process when you switch to a cloud model, freeing GPU/RAM
  • Shutdown cleanup — stops Ollama when pi exits (if this extension started it)
  • Readiness check — waits for the Ollama API to be ready before the model is used (up to 30s)
  • Status notifications — shows start/stop/ready status via pi's notification system
  • Footer widget — displays "🦙 Ollama running" in the pi status bar while active
  • Manual commands — /ollama-start, /ollama-stop, /ollama-status for on-demand control
  • No-op safety — never kills Ollama if it was already running before pi started
  • No root required — runs entirely in user space, no sudo or systemd

Installation

pi install npm:ollama-graceful

Or from GitHub:

pi install git:github.com/3mrgnc3/ollama-graceful

Setup

1. Configure local models

Create or edit ~/.pi/agent/models.json to register your Ollama models:

{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false,
        "supportsUsageInStreaming": false,
        "maxTokensField": "max_tokens",
        "supportsStrictMode": false
      },
      "models": [
        { "id": "qwen3.5:latest" }
      ]
    }
  }
}
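
To confirm the baseUrl and model ids line up, you can query Ollama's OpenAI-compatible model listing while the server is running (step 2 below starts it briefly). This check isn't part of the extension, and Ollama accepts any non-empty string as the API key. A minimal TypeScript sketch:

// List the models Ollama exposes through its OpenAI-compatible API.
// Run with `ollama serve` active; any bearer token is accepted.
const res = await fetch("http://localhost:11434/v1/models", {
  headers: { Authorization: "Bearer ollama" },
});
const { data } = (await res.json()) as { data: { id: string }[] };
console.log(data.map((m) => m.id)); // should include "qwen3.5:latest"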

2. Pull a model

ollama serve &        # start the server temporarily, in the background
ollama pull qwen3.5   # download the model
kill %1               # stop the background server; the extension manages it from here

3. Use

Open /model in pi and pick an Ollama model — the extension handles the rest. Switch to a cloud model and Ollama stops automatically.

Commands

  • /ollama-start — Manually start the Ollama server
  • /ollama-stop — Manually stop the Ollama server
  • /ollama-status — Show current Ollama state

How it works

/model → select ollama model
  │
  ├─► model_select fires
  ├─► extension detects provider = "ollama"
  ├─► spawns `ollama serve` as child process
  ├─► polls localhost:11434 until ready (up to 30s)
  ├─► notify: "🦙 Ollama ready"
  └─► widget: "🦙 Ollama running"
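
In code, the start path reduces to spawning a child process and polling the local API until it answers. A minimal TypeScript sketch using Node's child_process and global fetch (names here are illustrative; the actual extension source may differ):

import { spawn, type ChildProcess } from "node:child_process";

// Spawn `ollama serve` and wait until its HTTP API answers, up to 30 s.
async function startOllama(): Promise<ChildProcess> {
  const proc = spawn("ollama", ["serve"], { stdio: "ignore" });
  const deadline = Date.now() + 30_000;
  while (Date.now() < deadline) {
    try {
      // The root endpoint responds once the server is up.
      const res = await fetch("http://localhost:11434/");
      if (res.ok) return proc;
    } catch {
      // Not listening yet; keep polling.
    }
    await new Promise((r) => setTimeout(r, 500));
  }
  proc.kill("SIGTERM"); // give up and clean up the half-started child
  throw new Error("Ollama did not become ready within 30 s");
}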

/model → select cloud model (or exit pi)
  │
  ├─► model_select fires (or session_shutdown)
  ├─► extension detects provider ≠ "ollama"
  ├─► sends SIGTERM to child process (SIGKILL after 5s)
  ├─► notify: "🦙 Ollama stopped"
  └─► widget cleared
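
The stop path is a standard graceful shutdown: SIGTERM first, SIGKILL as a fallback. A sketch, assuming the ChildProcess handle returned by the start sketch above:

import type { ChildProcess } from "node:child_process";

// Ask nicely with SIGTERM; escalate to SIGKILL if the process lingers past 5 s.
function stopOllama(proc: ChildProcess): void {
  proc.kill("SIGTERM");
  const timer = setTimeout(() => proc.kill("SIGKILL"), 5_000);
  timer.unref(); // don't keep pi's event loop alive just for the fallback timer
  proc.once("exit", () => clearTimeout(timer));
}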

The extension tracks whether it started Ollama itself. If Ollama was already running when you launched pi, the extension won't kill it — it only cleans up what it started.
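
A sketch of that ownership rule, reusing the start and stop helpers above. The model_select event and the provider check come from the diagrams; the hook signature and everything else here is hypothetical, since pi's extension API isn't shown in this README:

import type { ChildProcess } from "node:child_process";

let child: ChildProcess | null = null; // non-null only if *we* spawned the server

async function alreadyRunning(): Promise<boolean> {
  try {
    return (await fetch("http://localhost:11434/")).ok;
  } catch {
    return false;
  }
}

// Called on every model_select with the new model's provider id.
async function onModelSelect(provider: string): Promise<void> {
  if (provider === "ollama") {
    // A pre-existing server stays untouched; we only own what we start.
    if (!child && !(await alreadyRunning())) child = await startOllama();
  } else if (child) {
    stopOllama(child); // only ever kill what we spawned
    child = null;
  }
}

Probing the port before spawning is what makes the no-op safety cheap: if anything already answers on 11434, the extension never takes ownership, so it never kills a server it didn't start.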

License

MIT