@aittalam/pi-llamafile
Pi extension that supervises local llamafile-served model processes — start, stop, adopt, with progress visible on quit
Package details
Install @aittalam/pi-llamafile from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@aittalam/pi-llamafile
- Package: @aittalam/pi-llamafile
- Version: 1.0.0
- Published: May 21, 2026
- Downloads: not available
- Author: aittalam
- License: MIT
- Types: extension
- Size: 76.1 KB
- Dependencies: 0 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
Llamafiles provider extension for pi
A pi extension that supervises locally-run llamafile-style model servers
(or any OpenAI-compatible server you can launch from a binary). It
registers a llamafiles provider with pi, starts the configured binary
when you pick one of its models, and stops it when you switch away or
quit.
Status
Implementation complete; covered by 55 automated tests (42 unit + 13
integration via the SDK driver). npm test runs in ~15s and exits 0.
See SPECS.md for the behavioral contract, PLAN.md
for the implementation plan, and NOTES.md for the pi API
findings that informed the design.
Features
- Process supervision: starts the configured binary on `/model`, waits for `/v1/models` to respond, then reports ready. One process per pi session.
- Per-model port: each model declares its own `port`; pi sends requests there. Default `8080`.
- `{{port}}` substitution in `args`: `port` is the single source of truth; reference it in your binary's arg list as `{{port}}`.
- Adoption: if a compatible server is already running on the port, pi adopts it instead of spawning a duplicate.
- Foreign-port safety: if a different server holds the port, pi surfaces the conflict and tells you to free it manually. It never kills processes it did not start.
- Visible quit progress: when you exit pi while it owns a running process, the extension prints "Stopping llamafile ..." and "Stopped llamafile ..." to stderr so the user can see the wait.
- Transparent reload: `/reload` does not prompt; the process keeps running and the freshly loaded extension instance re-adopts it.
- Cleanup-by-design: adopted processes are never stopped without your consent.
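The adopt, spawn, and conflict rules in the list above can be read as one decision about what is found on the configured port. The following is a minimal sketch under that reading, not the extension's actual code: `PortProbe`, `PortAction`, and `decidePortAction` are hypothetical names, and how the real supervisor decides that a server is "compatible" is not specified here.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the extension's API.

/** Assumed classification of what a probe found on the configured port. */
type PortProbe = "free" | "compatible" | "foreign";

/** What the supervisor would do next under the rules listed above. */
type PortAction =
  | { kind: "spawn"; owned: true } // pi starts the process, so pi may stop it later
  | { kind: "adopt"; owned: false } // already running; never stopped without consent
  | { kind: "conflict"; message: string };

function decidePortAction(probe: PortProbe, port: number): PortAction {
  switch (probe) {
    case "free":
      // Nothing is listening: spawn the configured binary.
      return { kind: "spawn", owned: true };
    case "compatible":
      // A compatible server already answers: reuse it instead of duplicating.
      return { kind: "adopt", owned: false };
    case "foreign":
      // Something else holds the port: report it, never kill it.
      return {
        kind: "conflict",
        message: `Port ${port} is held by a different server; free it manually.`,
      };
  }
}
```

Keeping an `owned` flag on the result is one way to make the "never kills processes it did not start" rule checkable at quit time.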
Installation
Three options, in order of recommendation:
From npm (most convenient, gets gallery indexing):
pi install npm:@aittalam/pi-llamafile
From git (pins to a tag, no npm account needed by you or by pi):
pi install git:github.com/aittalam/pi-llamafile@v1.0.0
From source (for hacking on the extension):
git clone https://github.com/aittalam/pi-llamafile ~/.pi/agent/extensions/pi-llamafile
cd ~/.pi/agent/extensions/pi-llamafile
npm install
In all cases, use /reload from a running pi session, or restart pi, to pick
up changes.
Configuration
Define your llamafile models in ~/.pi/agent/models.json under the
llamafiles provider:
{
"providers": {
"llamafiles": {
"models": [
{
"id": "qwen3-9b",
"name": "Qwen3 9B",
"command": "sh",
"args": [
"/path/to/qwen3-9b.llamafile",
"--server",
"--port",
"{{port}}",
"--jinja"
],
"port": 8080,
"reasoning": false,
"input": ["text"],
"contextWindow": 32768,
"maxTokens": 4096
}
]
}
}
}
Extension-specific per-model fields, beyond pi's standard set (all optional except command):
| Field | Description |
|---|---|
| `command` | Executable to spawn. Required. |
| `args` | Argument list. `{{port}}` is substituted at spawn. |
| `port` | TCP port the server listens on. Default `8080`. |
| `env` | Extra environment variables for the spawned process. |
| `cwd` | Working directory for the spawned process. |
See SPECS.md §3.2 for the complete schema and defaults.
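To make the `{{port}}` rule concrete, here is a minimal sketch of how the port default and the placeholder substitution could fit together. It is an illustration, not the code in src/template.ts: `LlamafileModelFields` and `resolveSpawnArgs` are hypothetical names, treating `args` as optional is an assumption, and so is replacing the placeholder when it appears inside a longer argument.

```typescript
// Hypothetical sketch; the authoritative schema is in SPECS.md §3.2.

/** Extension-specific fields from the table above (pi's standard fields omitted). */
interface LlamafileModelFields {
  command: string; // required
  args?: string[]; // assumed optional in this sketch
  port?: number; // defaults to 8080
  env?: Record<string, string>;
  cwd?: string;
}

const DEFAULT_PORT = 8080;
const PORT_PLACEHOLDER = "{{port}}";

/** Resolve the effective port and replace every {{port}} occurrence in args. */
function resolveSpawnArgs(model: LlamafileModelFields): { port: number; args: string[] } {
  const port = model.port ?? DEFAULT_PORT;
  const args = (model.args ?? []).map((arg) =>
    // split/join replaces all occurrences without needing a regex.
    arg.split(PORT_PLACEHOLDER).join(String(port)),
  );
  return { port, args };
}

// Using the example model from the configuration above:
const resolved = resolveSpawnArgs({
  command: "sh",
  args: ["/path/to/qwen3-9b.llamafile", "--server", "--port", "{{port}}", "--jinja"],
  port: 8080,
});
// resolved.args is now:
// ["/path/to/qwen3-9b.llamafile", "--server", "--port", "8080", "--jinja"]
```

Because the port appears only once in the config, changing `port` updates both where pi sends requests and what the binary is told to listen on.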
Usage
- `pi --list-models` should show your llamafile models under the `llamafiles` provider.
- Use `/model` to pick one. The extension spawns the binary, waits for readiness, and reports `<name> is ready`.
- Switch with `/model` again. The current process is stopped and the new one started.
- Switch to any non-llamafile model: the running llamafile is stopped.
- `/quit` (or `Ctrl+D`): the running llamafile is stopped. Two lines appear on the terminal (stderr): `Stopping llamafile "<name>" ...` then `Stopped llamafile "<name>".` Adopted servers (started outside pi) are left running silently.
- `/llamafiles`: prints the current state.
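The quit behaviour described above depends only on whether pi started the running server. A minimal sketch of that rule follows; `Ownership` and `quitProgressLines` are hypothetical names, and the exact timing of the second line relative to process exit is an assumption.

```typescript
// Hypothetical sketch; the real messages are produced by the extension itself.

/** Whether pi started the running server, adopted it, or has none. */
type Ownership = "owned" | "adopted" | "none";

/** Lines written to stderr on quit, in order, for a given ownership state. */
function quitProgressLines(ownership: Ownership, name: string): string[] {
  if (ownership !== "owned") {
    // Adopted servers are left running and nothing is printed.
    return [];
  }
  return [
    `Stopping llamafile "${name}" ...`, // shown while the stop is in progress
    `Stopped llamafile "${name}".`, // assumed to follow once the process is gone
  ];
}
```

For a model named `Qwen3 9B` that pi spawned, this yields the two lines from the list above; for an adopted server it yields none.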
Development
npm install # install dev deps
npm test # 55 tests, unit + integration
npm run test:unit # 42 unit tests, sub-second
npm run dev # pi -e . — smoke-load against your real HOME
Logs from spawned binaries are appended to ~/.pi/llamafile_logs/<modelId>.log.
Layout
SPECS.md # behavioral contract (single source of truth)
PLAN.md # implementation plan
NOTES.md # pi API findings
index.ts # thin wiring: events, commands
src/
  config.ts # models.json loader
  template.ts # {{port}} substitution
  process.ts # LlamafileSupervisor
  log.ts # file log streams
  notify.ts # notification text
  types.ts # shared types
tests/
  unit/ # 3 files, 42 tests
  integration/ # 13 files, all SDK-driven
  helpers/ # fake-server, harness, port allocator
MANUAL.md # checklist for things automation cannot reach
License
MIT.