ollama-graceful

Pi extension that gracefully starts and stops the Ollama service on demand when switching between local and cloud models

Package details

Install ollama-graceful from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:ollama-graceful
Package: ollama-graceful
Version: 0.2.3
Published: Apr 19, 2026
Downloads: 590/mo · 15/wk
Author: 3mrgnc3
License: MIT
Types: extension
Size: 420.8 KB
Dependencies: 0 dependencies · 0 peers
Pi manifest JSON
{
  "extensions": [
    "./ollama-graceful.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

ollama-graceful

Pi extension that gracefully manages the Ollama process lifecycle when you switch between local and cloud models.

Ollama is great for running local models, but leaving it running wastes GPU memory and power when you're using cloud models. This extension starts `ollama serve` automatically when you select a local model and kills it when you switch away — so your GPU is only busy when it needs to be.

Features

  • Auto-start — spawns `ollama serve` when you switch to a local model via /model or Ctrl+P
  • Auto-stop — kills the process when you switch to a cloud model, freeing GPU/RAM
  • Shutdown cleanup — stops Ollama when pi exits (if this extension started it)
  • Readiness check — waits for the Ollama API to be ready before the model is used (up to 30s)
  • Status notifications — shows start/stop/ready status via pi's notification system
  • Footer widget — displays "🦙 Ollama running" in the pi status bar while active
  • Manual commands — /ollama-start, /ollama-stop, /ollama-status for on-demand control
  • No-op safety — never kills Ollama if it was already running before pi started
  • No root required — runs entirely in user space, no sudo or systemd

Installation

pi install npm:ollama-graceful

Or from GitHub:

pi install git:github.com/3mrgnc3/ollama-graceful

Setup

1. Configure local models

Create or edit ~/.pi/agent/models.json to register your Ollama models:

{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false,
        "supportsUsageInStreaming": false,
        "maxTokensField": "max_tokens",
        "supportsStrictMode": false
      },
      "models": [
        { "id": "qwen3.5:latest" }
      ]
    }
  }
}
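
To confirm the baseUrl and model ids line up, you can query Ollama's OpenAI-compatible model listing while the server is running (step 2 below starts it briefly). This check isn't part of the extension, and Ollama accepts any non-empty string as the API key. A minimal TypeScript sketch:

// List the models Ollama exposes through its OpenAI-compatible API.
// Run with `ollama serve` active; any bearer token is accepted.
const res = await fetch("http://localhost:11434/v1/models", {
  headers: { Authorization: "Bearer ollama" },
});
const { data } = (await res.json()) as { data: { id: string }[] };
console.log(data.map((m) => m.id)); // should include "qwen3.5:latest"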

2. Pull a model

ollama serve &        # start the server temporarily, in the background
ollama pull qwen3.5   # download the model
kill %1               # stop the background server; the extension manages it from here

3. Use

Open /model in pi and pick an Ollama model — the extension handles the rest. Switch to a cloud model and Ollama stops automatically.

Commands

  • /ollama-start — Manually start the Ollama server
  • /ollama-stop — Manually stop the Ollama server
  • /ollama-status — Show current Ollama state

How it works

/model → select ollama model
  │
  ├─► model_select fires
  ├─► extension detects provider = "ollama"
  ├─► spawns `ollama serve` as child process
  ├─► polls localhost:11434 until ready (up to 30s)
  ├─► notify: "🦙 Ollama ready"
  └─► widget: "🦙 Ollama running"
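
In code, the start path reduces to spawning a child process and polling the local API until it answers. A minimal TypeScript sketch using Node's child_process and global fetch (names here are illustrative; the actual extension source may differ):

import { spawn, type ChildProcess } from "node:child_process";

// Spawn `ollama serve` and wait until its HTTP API answers, up to 30 s.
async function startOllama(): Promise<ChildProcess> {
  const proc = spawn("ollama", ["serve"], { stdio: "ignore" });
  const deadline = Date.now() + 30_000;
  while (Date.now() < deadline) {
    try {
      // The root endpoint responds once the server is up.
      const res = await fetch("http://localhost:11434/");
      if (res.ok) return proc;
    } catch {
      // Not listening yet; keep polling.
    }
    await new Promise((r) => setTimeout(r, 500));
  }
  proc.kill("SIGTERM"); // give up and clean up the half-started child
  throw new Error("Ollama did not become ready within 30 s");
}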

/model → select cloud model (or exit pi)
  │
  ├─► model_select fires (or session_shutdown)
  ├─► extension detects provider ≠ "ollama"
  ├─► sends SIGTERM to child process (SIGKILL after 5s)
  ├─► notify: "🦙 Ollama stopped"
  └─► widget cleared
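
The stop path is a standard graceful shutdown: SIGTERM first, SIGKILL as a fallback. A sketch, assuming the ChildProcess handle returned by the start sketch above:

import type { ChildProcess } from "node:child_process";

// Ask nicely with SIGTERM; escalate to SIGKILL if the process lingers past 5 s.
function stopOllama(proc: ChildProcess): void {
  proc.kill("SIGTERM");
  const timer = setTimeout(() => proc.kill("SIGKILL"), 5_000);
  timer.unref(); // don't keep pi's event loop alive just for the fallback timer
  proc.once("exit", () => clearTimeout(timer));
}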

The extension tracks whether it started Ollama itself. If Ollama was already running when you launched pi, the extension won't kill it — it only cleans up what it started.
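
A sketch of that ownership rule, reusing the start and stop helpers above. The model_select event and the provider check come from the diagrams; the hook signature and everything else here is hypothetical, since pi's extension API isn't shown in this README:

import type { ChildProcess } from "node:child_process";

let child: ChildProcess | null = null; // non-null only if *we* spawned the server

async function alreadyRunning(): Promise<boolean> {
  try {
    return (await fetch("http://localhost:11434/")).ok;
  } catch {
    return false;
  }
}

// Called on every model_select with the new model's provider id.
async function onModelSelect(provider: string): Promise<void> {
  if (provider === "ollama") {
    // A pre-existing server stays untouched; we only own what we start.
    if (!child && !(await alreadyRunning())) child = await startOllama();
  } else if (child) {
    stopOllama(child); // only ever kill what we spawned
    child = null;
  }
}

Probing the port before spawning is what makes the no-op safety cheap: if anything already answers on 11434, the extension never takes ownership, so it never kills a server it didn't start.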

License

MIT