pi-ghost-autocomplete

Inline ghost-text autocomplete extension for the Pi coding agent (pi.dev). Cloud and local LLM providers, zero-flicker rendering, no popup interference.

Package details

Install pi-ghost-autocomplete from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-ghost-autocomplete
Package: pi-ghost-autocomplete
Version: 0.5.0
Published: Jun 19, 2026
Downloads: 102/mo · 36/wk
Author: ngsoftware
License: SSPL-1.0
Types: extension
Size: 1,014.2 KB
Dependencies: 2 dependencies · 4 peers
Pi manifest JSON
{
  "extensions": [
    "./dist/index.js"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-ghost-autocomplete

Inline ghost-text autocomplete for the Pi coding agent (@earendil-works/pi-coding-agent).

While you type at the Pi prompt, an LLM predicts the most likely continuation of the current line and renders it as dim grey "ghost text" right of the cursor. Press Right Arrow to accept. The cursor never moves until you do, the existing slash-command popup keeps Tab, and a slow or unreachable backend silently disables itself rather than spamming the chat.

Four provider modes ship in the same package:

| Mode | Backend | Default debounce | Default model |
|---|---|---|---|
| cloud | Groq / Cerebras / OpenRouter via @earendil-works/pi-ai | 150 ms | groq/llama-3.1-8b-instant |
| local | Ollama / vLLM (OpenAI-compatible) | 400 ms | qwen2.5-coder:1.5b on http://localhost:11434 |
| openai-compat | Any OpenAI-compatible REST API (e.g. Mercury Edit); also used as the default provider | 150 ms | mercury-edit-2 on https://api.inceptionlabs.ai/v1 |
| race | Multiple providers in parallel; first valid response wins | 400 ms | per-member defaults |

Getting started

1. Get a Mercury Edit 2 API key

The default provider is Mercury Edit 2 by Inception Labs — a diffusion-based model optimised for low-latency code completions. Sign up at platform.inceptionlabs.ai to get an API key.

Set the INCEPTION_API_KEY environment variable so the extension can reach the API. Add it to your shell profile (e.g. ~/.bashrc, ~/.zshrc, or PowerShell $PROFILE) for persistence:

# ~/.bashrc or ~/.zshrc
export INCEPTION_API_KEY="your-key"
# PowerShell $PROFILE
$env:INCEPTION_API_KEY = "your-key"

Or pass it inline for a single session:

# bash / zsh
INCEPTION_API_KEY="your-key" pi
# PowerShell
$env:INCEPTION_API_KEY = "your-key"; pi

To use a different provider instead (Groq, Cerebras, a local model, …), see Configuration.

2. Install the extension

pi install npm:pi-ghost-autocomplete

This registers the extension globally. Pi picks it up automatically on the next launch. To install it only for a single project, add -l:

pi install npm:pi-ghost-autocomplete -l   # writes to .pi/ in the current directory

3. Launch Pi and try it

pi

Start typing at the prompt. Dim ghost text should appear to the right of your cursor within ~150 ms. Press Right Arrow to accept it, or keep typing to dismiss it.

The first session feels "cold." The trie, cache, and profile all start empty, so every ghost is a fresh LLM call — slower and more generic. It gets noticeably faster and more accurate as you accept completions (the trie learns your patterns) and the cache warms up. Give it ~15–20 min of real use before judging quality.

Run /ghost to confirm the extension loaded and see the active configuration:

Pi Ghost: enabled
  minChars=3
  maxLineLength=240
  mode=openai-compat
  provider=mercury-edit
  baseUrl=https://api.inceptionlabs.ai/v1
  model=mercury-edit-2
  apiKey=4e24***
  debounceMs=150
  maxTokens=64
  maxRecentMessages=4

4. (Optional) Load the extension without installing

Pass -e to load the package for a single session without a permanent install:

pi -e npm:pi-ghost-autocomplete

Install

npm install pi-ghost-autocomplete

Configuration

All knobs are environment variables — no extra config file needed:

| Variable | Default | Effect |
|---|---|---|
| PI_GHOST_DISABLED | unset | Set to 1 to disable on launch |
| PI_GHOST_PROVIDER | unset (Mercury Edit 2) | cloud, local, or router; set explicitly to switch away from Mercury Edit 2 (ignored when PI_GHOST_RACE_PROVIDERS is set). See Register routing for router |
| PI_GHOST_PROVIDER_NAME | groq | Cloud only (PI_GHOST_PROVIDER=cloud): any KnownProvider from pi-ai (groq, cerebras, openrouter, …) |
| PI_GHOST_MODEL | provider-specific | Model id, e.g. llama-3.1-8b-instant, qwen2.5-coder:1.5b |
| PI_GHOST_API_KEY | env var lookup | Cloud only: explicit API key (else pi-ai env lookup is used) |
| PI_GHOST_BASE_URL | http://localhost:11434 | Local only: Ollama / vLLM base URL |
| PI_GHOST_DEBOUNCE_MS | 150 (cloud) / 400 (local) | Inputs are debounced this long before a request is sent |
| PI_GHOST_MAX_TOKENS | 64 (cloud) / 48 (local) | Cap on completion tokens |
| PI_GHOST_RECENT | 4 (cloud) / 2 (local) | Number of recent user/assistant messages bundled as context |
| PI_GHOST_MIN_CHARS | 3 | Minimum draft length before a prediction fires |
| PI_GHOST_MAX_LINE | 240 | Predictions skipped if the current line exceeds this length |
| PI_GHOST_TEMPERATURE | 0 | Sampling temperature [0, 2]. 0 is best-effort deterministic; see note below |
| PI_GHOST_MERCURY_PARALLEL_N | 1 | Mercury FIM only: fire N parallel candidate completions for cycling. Requires PI_GHOST_TEMPERATURE > 0 (downgrades silently to 1 at temp=0). Costs scale linearly with N |
| PI_GHOST_CHAT_MODEL | mercury-2 | Router only: chat model id for the natural-language leg (/chat/completions) |
| PI_GHOST_CHAT_BASE_URL | = FIM base URL | Router only: override the chat leg's endpoint base (defaults to the same vendor as the FIM leg) |
| PI_GHOST_CHAT_API_KEY | = Inception/FIM key | Router only: override the chat leg's API key (defaults to the FIM leg's key) |
| PI_GHOST_CHAT_MAX_TOKENS | = PI_GHOST_MAX_TOKENS | Router only: cap on chat-leg completion tokens |
| PI_GHOST_CONTEXT_MAX_SCAN | 200 | M6b: cap on the candidate pool scanned for relevance ranking |
| PI_GHOST_CONTEXT_MAX_SELECTED | 8 | M6b: max number of selected context entries |
| PI_GHOST_CONTEXT_MAX_CHARS | 4000 | M6b: total char budget for the selected context window |
| PI_GHOST_CONTEXT_BM25_WEIGHT | 0.3 | M6b: weight of BM25 content score vs recency (0=pure recency, 1=pure relevance) |
| PI_GHOST_CONTEXT_TOOL_MAX | 3 | M6c: max number of tool entries to include in selected context (0 disables tools entirely) |
| PI_GHOST_FIM_CONTEXT_ALL_INTENTS | unset | Mercury FIM only: by default the conversation-context block is dropped from the FIM prompt for prose/question intents (a code-FIM edit model returns empty far more often when fed a large natural-language context; see session 51 investigation). Set to 1 to keep context for every intent (A/B against the empty-rate fix) |
| PI_GHOST_ALLOW_INSECURE | unset | Set to 1 to allow http:// provider base URLs on non-localhost hosts. Off by default, because API keys and prompts would otherwise be sent in the clear |
| PI_GHOST_PROFILE | unset | Set to off to disable persistent profile capture + reranking. Default: enabled. Profile is local-only; see User profile below |
| PI_GHOST_PROFILE_DECAY_DAYS | 90 | Days before an unused profile record's weight halves during compaction |
| PI_GHOST_PROVISIONAL | unset | Set to off to disable the 150 ms provisional fallback. Default: enabled |
| PI_GHOST_PROVISIONAL_PREFER | unset | Set to 1/true/on/yes to bias swap decisions toward the already-displayed provisional. Raises the LLM-vs-provisional swap margin from 0.1 to 0.3, reducing perceived flicker at the cost of recall. Bench reports a flicker=X% rate so this knob can be evaluated |
| PI_GHOST_MIN_CONFIDENCE | 0.55 | M3c.1: minimum reranker score for the top candidate to be shown. Range [0, 1]. Lower = more lenient, more ghosts shown |
| PI_GHOST_RERANK_WEIGHTS | 0.45,0.25,0.15,0.15 | M3c: reranker weights as logprob,profileBias,lengthPrior,agreement. Negative values clamped to 0 |
| PI_GHOST_EVAL_CAPTURE | unset | Set to 1 to capture committed prompts to .pi/ghost-autocomplete/eval.jsonl for offline replay (pi-ghost-replay). Sensitive: contains raw prompt text. Off by default |
| PI_GHOST_REGRET_WINDOW_MS | 3000 | After accept, window in ms during which a >50% deletion of inserted text flags the record acceptRegret: true |
| PI_GHOST_SHORT_HOVER_MS | 800 | Enhancement #3: a ghost dismissed within this many ms is flagged shortHoverDismiss: true and triggers a soft trie penalty (magnitude 0.25). Tunes the bench short-hover rate |
| PI_GHOST_METRICS | unset | Set to 1 to write metrics to .pi/ghost-autocomplete/metrics.jsonl |
| PI_GHOST_DEBUG_LOG | unset | Set to 1 to write debug records to .pi/ghost-autocomplete/debug.jsonl |
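
The two reranker knobs (PI_GHOST_RERANK_WEIGHTS and PI_GHOST_MIN_CONFIDENCE) are easier to reason about together. The following is a minimal sketch, not the extension's implementation: it assumes the score is a plain weighted sum of four features already normalised to [0, 1], in the order the table gives. The Features type and all function names are hypothetical, and treating unparsable entries as 0 is an assumption (the table only states that negative values are clamped).

```typescript
// Hypothetical feature vector; axis order follows PI_GHOST_RERANK_WEIGHTS.
type Features = {
  logprob: number;
  profileBias: number;
  lengthPrior: number;
  agreement: number;
};

// Parse "a,b,c,d". Negative entries are clamped to 0 (documented);
// unparsable entries are also treated as 0 (assumption).
function parseRerankWeights(raw: string): number[] {
  return raw.split(",").map((part) => {
    const value = Number(part.trim());
    return Number.isFinite(value) && value > 0 ? value : 0;
  });
}

// Weighted sum of the per-axis features.
function rerankScore(f: Features, weights: number[]): number {
  const axes = [f.logprob, f.profileBias, f.lengthPrior, f.agreement];
  return axes.reduce((sum, value, i) => sum + value * (weights[i] ?? 0), 0);
}

// A ghost is shown only when the top score reaches the confidence gate.
function shouldShowGhost(score: number, minConfidence: number = 0.55): boolean {
  return score >= minConfidence;
}
```

With the default weights, a candidate that is strong on logprob alone scores at most 0.45 and stays hidden at the default gate of 0.55, which is one way to see why lowering PI_GHOST_MIN_CONFIDENCE shows more ghosts.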

Note on temperature=0

temperature=0 is the default and asks the provider for greedy decoding so the same prefix produces the same completion. This is best-effort, not guaranteed: vendors batch requests across users and kernel scheduling differences can produce different greedy paths from one call to the next. The session's in-memory completion cache is the real determinism guarantee — retyping the same prefix in the same context will reuse the previous ghost rather than re-querying. Raise temperature only if you want cache-miss diversity (e.g. PI_GHOST_TEMPERATURE=0.5).

Run /ghost inside Pi to see the effective config; /ghost on and /ghost off toggle the feature for the current session.

Register routing

You are a developer typing a message to an AI coding agent — a request, an instruction, a question. A code fill-in-the-middle model (Mercury Edit), fed that message, tends to complete it like code: it writes the solution instead of finishing your sentence. That's the wrong register.

Router mode fixes this by splitting the work between two models by intent:

| Intent | Leg | Why |
|---|---|---|
| command, path-ref, unclassified | FIM (mercury-edit-2, /fim/completions) | code/command/path drafts, which is what a code model is good at |
| instruction, question, prose | chat (mercury-2, /chat/completions) | finishes your message to the agent with a persona prompt: "continue the developer's request, don't perform it" |

Both legs default to the same vendor (Inception) and the same key, so a single INCEPTION_API_KEY is enough. If the chat leg is unreachable or unconfigured, the router degrades to FIM-only — identical to the default, so enabling it never regresses.
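
The routing rule and its fallback fit in a few lines. This is an illustrative sketch only: the intent names come from the table above, while the Intent and Leg types, pickLeg, and the chatLegAvailable flag are hypothetical names rather than the extension's API.

```typescript
type Intent =
  | "command"
  | "path-ref"
  | "unclassified"
  | "instruction"
  | "question"
  | "prose";

type Leg = "fim" | "chat";

// Intents that the table above routes to the chat leg.
const CHAT_INTENTS: Intent[] = ["instruction", "question", "prose"];

// Choose a leg by intent; degrade to FIM-only when the chat leg is
// unreachable or unconfigured, which matches the default behaviour.
function pickLeg(intent: Intent, chatLegAvailable: boolean): Leg {
  if (!chatLegAvailable) return "fim";
  return CHAT_INTENTS.indexOf(intent) !== -1 ? "chat" : "fim";
}
```

Because the fallback branch always returns the FIM leg, a router with a broken chat leg behaves like the default provider.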

# Single-vendor (Inception) — FIM for code, Mercury chat for instructions
PI_GHOST_PROVIDER=router INCEPTION_API_KEY=your-key pi ...

# Point the chat leg at a different model / endpoint / key
PI_GHOST_PROVIDER=router \
  INCEPTION_API_KEY=your-key \
  PI_GHOST_CHAT_MODEL=mercury-3 pi ...

See PI_GHOST_CHAT_* in the Configuration table for the chat-leg overrides.

If PI_GHOST_CHAT_MODEL is set to an id the endpoint doesn't recognise, the chat leg is preflighted at startup and Pi shows a one-shot warning naming the bad model — rather than silently falling back to the code model. In metrics, router events are tagged by the served register (ghost-router:chat vs ghost-router:fim), so pi-ghost-bench reports each leg's valid/bad-rate separately.

Race mode

Race mode fires requests to multiple providers simultaneously and uses the first valid response. This reduces perceived latency when one provider is slow or rate-limited.

# Race Groq against Cerebras
PI_GHOST_RACE_PROVIDERS=groq,cerebras pi ...

# Race Groq, Cerebras, and Mercury Edit (Inception Labs)
PI_GHOST_RACE_PROVIDERS=groq,cerebras,mercury-edit \
  INCEPTION_API_KEY=your-key pi ...
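
As a rough sketch of the "first valid response wins" idea, the helper below resolves with the first non-empty completion, aborts the remaining requests, and gives up after a cutoff. It is an assumption-laden illustration: raceFirstValid and the Completion type are hypothetical, "valid" is reduced to "non-empty text", and the staggered start (PI_GHOST_RACE_PAUSE_MS) and cooldown (PI_GHOST_RACE_COOLDOWN_MS) are left out.

```typescript
type Completion = { provider: string; text: string };
type ProviderCall = (signal: AbortSignal) => Promise<Completion>;

// Resolve with the first non-empty completion, or null when every
// provider misses or the cutoff elapses first.
function raceFirstValid(
  calls: ProviderCall[],
  cutoffMs: number,
): Promise<Completion | null> {
  const controller = new AbortController();
  return new Promise((resolve) => {
    let remaining = calls.length;
    let timer: ReturnType<typeof setTimeout> | undefined;

    const finish = (winner: Completion | null): void => {
      if (timer !== undefined) clearTimeout(timer);
      controller.abort(); // cancel the requests that lost the race
      resolve(winner); // later calls are no-ops once resolved
    };
    const onMiss = (): void => {
      remaining -= 1;
      if (remaining === 0) finish(null);
    };

    if (remaining === 0) {
      finish(null);
      return;
    }
    timer = setTimeout(() => finish(null), cutoffMs);
    for (const call of calls) {
      call(controller.signal).then(
        (c) => (c.text.trim() !== "" ? finish(c) : onMiss()),
        onMiss, // a rejected request counts as a miss
      );
    }
  });
}
```

An empty or failed response does not end the race; it only removes that provider from the set still being waited on.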

Race-specific environment variables:

| Variable | Default | Effect |
|---|---|---|
| PI_GHOST_RACE_PROVIDERS | unset | Comma-separated list of providers to race (groq, cerebras, local, mercury-edit, mercury-edit-2, or any KnownProvider) |
| PI_GHOST_RACE_PAUSE_MS | 400 | Milliseconds before the second+ provider is unblocked |
| PI_GHOST_RACE_COOLDOWN_MS | 1500 | Quiet period after a winner before the next race starts |
| PI_GHOST_RACE_CUTOFF_MS | 700 | Maximum time to wait for any provider before giving up |
| PI_GHOST_RACE_<NAME>_MODEL | provider default | Override the model for a specific race member (e.g. PI_GHOST_RACE_GROQ_MODEL) |
| PI_GHOST_RACE_<NAME>_API_KEY | env var lookup | Override the API key for a specific race member |
| PI_GHOST_RACE_<NAME>_BASE_URL | provider default | Override the base URL for a specific race member |
| INCEPTION_API_KEY | unset | API key for Mercury Edit; used when mercury-edit or mercury-edit-2 appears in the race list |
| INCEPTION_BASE_URL | https://api.inceptionlabs.ai/v1 | Base URL override for Mercury Edit |

<NAME> in the per-member variables is the provider name uppercased with non-alphanumeric characters replaced by _ (e.g. mercury-edit → MERCURY_EDIT).
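
The naming rule can be written as a small helper. This is a sketch of the rule as stated, with a hypothetical function name (raceMemberEnvName); it is not taken from the package.

```typescript
type RaceSuffix = "MODEL" | "API_KEY" | "BASE_URL";

// Uppercase the provider name, replace every non-alphanumeric
// character with "_", then wrap it in the PI_GHOST_RACE_ prefix.
function raceMemberEnvName(provider: string, suffix: RaceSuffix): string {
  const name = provider.toUpperCase().replace(/[^A-Z0-9]/g, "_");
  return `PI_GHOST_RACE_${name}_${suffix}`;
}
```

For example, raceMemberEnvName("mercury-edit", "MODEL") yields PI_GHOST_RACE_MERCURY_EDIT_MODEL.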

Metrics and debug logging

Enable structured JSONL logs for acceptance tracking and performance analysis:

PI_GHOST_METRICS=1 pi ...        # writes .pi/ghost-autocomplete/metrics.jsonl
PI_GHOST_DEBUG_LOG=1 pi ...      # writes .pi/ghost-autocomplete/debug.jsonl

Metrics records include: timestamp, request id, provider latencies, completion mode (deterministic or llm), result (produced, empty, rejected, aborted, error), rejection reason, completion length, whether the ghost was shown, and acceptance outcome (accepted, dismissed, stale, expired). Raw prompt text and LLM output are never written to the metrics file. Both files rotate at 10 MB.
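
For ad-hoc analysis outside pi-ghost-bench, the JSONL file can be read line by line. The sketch below computes an acceptance rate over shown ghosts. The README lists which facts a record carries but not the JSON key names, so shown and outcome are hypothetical keys; check a real metrics.jsonl before relying on them.

```typescript
// Hypothetical record shape: key names are assumptions.
type MetricsRecord = {
  shown?: boolean;
  outcome?: "accepted" | "dismissed" | "stale" | "expired";
};

// Fraction of shown ghosts that were accepted, or null if none were shown.
function acceptanceRate(jsonl: string): number | null {
  let shown = 0;
  let accepted = 0;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // skip a truncated or corrupt line
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const record = parsed as MetricsRecord;
    if (!record.shown) continue;
    shown += 1;
    if (record.outcome === "accepted") accepted += 1;
  }
  return shown === 0 ? null : accepted / shown;
}
```

Skipping unparsable lines keeps the helper usable on a file that is still being appended to.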

User profile

The extension keeps a long-lived JSONL store at .pi/ghost-autocomplete/profile.jsonl to learn slash-command, path, and trigram frequencies across sessions. The profile feeds three local-only signals:

  1. Slash fast-path. When the active line starts with /, the editor serves the most-typed extension as a ghost before the LLM is queried (mode trie, provider profile-slash). Press Right to accept.
  2. Profile-bias rerank (M3c hook). A profileBias score plugs into the multi-candidate reranker. Wired but only takes effect when the provider returns >1 candidate.
  3. Bounded path boost. Persistent path frequency adds a small capped boost to deterministic path/fuzzy ranking, capped so it cannot beat a fresh exact path-prefix match.

What's stored: paths (src/foo.ts), slash commands (/review), trigrams from committed prompts, and acceptance records keyed by the SHA-256 hash of the prefix. Raw prefixes and raw completions are never written. Token tuples are tokenized and capped at 4 for prefix tails / 8 for completions.

Sanitization runs before persistence: messages with API keys (sk-…, ghp_…, JWT, etc.) or high-entropy tokens (Shannon > 4.0 bits/char, length ≥ 20) are dropped entirely. The check is shared with the session trie (src/trie-sanitize.ts).
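
The high-entropy half of that check can be illustrated as follows. This is a sketch under assumptions, not the code in src/trie-sanitize.ts: it uses per-character Shannon entropy over whitespace-separated tokens, and it leaves out the prefix patterns (sk-…, ghp_…, JWT). All function names are hypothetical.

```typescript
// Shannon entropy of a token in bits per character.
function shannonEntropyBitsPerChar(token: string): number {
  const chars = Array.from(token);
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

// Thresholds from the text: entropy > 4.0 bits/char and length >= 20.
function looksHighEntropy(token: string): boolean {
  return token.length >= 20 && shannonEntropyBitsPerChar(token) > 4.0;
}

// Drop the whole message if any token trips the check.
function shouldDropMessage(message: string): boolean {
  return message.split(/\s+/).some(looksHighEntropy);
}
```

The length floor matters: short tokens cannot carry much entropy in total, so ordinary words and short paths pass through even when their characters are all distinct.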

Rotation: when the JSONL exceeds 2 MB, the store compacts in place — sums weights per (kind, key), halves records older than 90 days (configurable via PI_GHOST_PROFILE_DECAY_DAYS), drops weights below 0.1. The original file is preserved as profile.jsonl.1.
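
One way to picture the compaction pass is the sketch below. The record shape, the field names, and the order of operations (merge first, then halve a merged record whose most recent sighting is older than the decay window) are assumptions made for illustration; the README only states the three steps and their thresholds.

```typescript
// Hypothetical record shape for illustration.
type ProfileRecord = {
  kind: string;
  key: string;
  weight: number;
  lastSeenMs: number;
};

const DAY_MS = 24 * 60 * 60 * 1000;

function compactProfile(
  records: ProfileRecord[],
  nowMs: number,
  decayDays: number = 90,
  minWeight: number = 0.1,
): ProfileRecord[] {
  // 1. Sum weights per (kind, key), keeping the latest timestamp.
  const merged = new Map<string, ProfileRecord>();
  for (const r of records) {
    const id = `${r.kind}\u0000${r.key}`;
    const prev = merged.get(id);
    if (prev) {
      prev.weight += r.weight;
      prev.lastSeenMs = Math.max(prev.lastSeenMs, r.lastSeenMs);
    } else {
      merged.set(id, { ...r });
    }
  }
  // 2. Halve stale records; 3. drop anything below the floor.
  const cutoff = nowMs - decayDays * DAY_MS;
  const out: ProfileRecord[] = [];
  merged.forEach((r) => {
    const weight = r.lastSeenMs < cutoff ? r.weight / 2 : r.weight;
    if (weight >= minWeight) out.push({ ...r, weight });
  });
  return out;
}
```

Under these assumptions, a rarely used entry fades away over successive compactions, while anything used within the window keeps its full weight.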

Set PI_GHOST_PROFILE=off to disable capture + use entirely.

Benchmark CLI

The pi-ghost-bench binary reads a metrics JSONL file and reports p50/p95 latency, acceptance rates, and whether PRD thresholds pass:

pi-ghost-bench                                    # reads .pi/ghost-autocomplete/metrics.jsonl
pi-ghost-bench path/to/metrics.jsonl              # explicit path
pi-ghost-bench --json                             # machine-readable output

Exit code is 0 when all thresholds pass, 1 when any fail, 2 on I/O error or when the file contains no records.
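
When pi-ghost-bench runs from a script or CI step, the exit code is the contract. A small sketch of how a wrapper might interpret it; interpretBenchExit and the outcome labels are hypothetical names:

```typescript
type BenchOutcome = "pass" | "threshold-failure" | "io-or-empty" | "unknown";

// Map the documented exit codes to a label a CI script can branch on.
function interpretBenchExit(code: number): BenchOutcome {
  switch (code) {
    case 0:
      return "pass"; // all thresholds pass
    case 1:
      return "threshold-failure"; // at least one threshold failed
    case 2:
      return "io-or-empty"; // I/O error or no records in the file
    default:
      return "unknown"; // not documented in this README
  }
}
```

Keeping 1 and 2 apart lets a pipeline treat "no data yet" differently from a real regression.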

How it works

  • The extension registers a GhostEditor via ctx.ui.setEditorComponent. GhostEditor extends Pi's CustomEditor so all built-in app keybindings (escape, ctrl+d, model switching, slash-command popup) keep working.
  • After every text change, a debounced request is sent through pi-ai's complete() with an AbortSignal that is canceled on the next keystroke.
  • Before calling the LLM, the editor checks the path index for a deterministic match. If the current token is a file path prefix or a fuzzy filename with a strong enough score, the deterministic result is shown immediately and the LLM call is suppressed.
  • The path index is built asynchronously at session start via git ls-files from the repository root, capped at 20 000 files. High-confidence basename-prefix matches (score ≥85) preempt the LLM unconditionally; lower-confidence fuzzy matches only activate when an action word (open, edit, read, show, view, find, check, look, load, import, include, see, run) precedes the token.
  • Recent agent messages come from ctx.sessionManager.getBranch() — only user/assistant text, no tool calls. Both providers truncate to a small, fixed payload so cloud requests don't ship the whole file.
  • Path mentions are extracted from recent conversation messages (slashed paths, backticked path-like tokens, and bare filenames with known code extensions). These give a ranking boost (+10 exact, +7 basename) to deterministic path completions, so recently discussed files surface higher.
  • For fuzzy filename matches where the typed token is not a prefix of the match, the editor enters replace mode: accepting the ghost replaces the typed token with the full matched path. A prefix is rendered before the ghost text to signal this behaviour. Prefix-suffix matches still append as usual.
  • render() searches for CURSOR_MARKER in the lines returned by super.render() and injects ANSI dim text directly after it. Pi's differential renderer redraws only the changed cells — no flicker, no full repaint.
  • Right Arrow is intercepted only if a ghost is shown, the popup autocomplete is not open, and the cursor sits at the end of the text. We use matchesKey(data, "right") from @earendil-works/pi-tui, which handles legacy CSI/SS3 sequences and the Kitty keyboard protocol uniformly, so kitty / WezTerm / iTerm2 (Kitty-mode) all work. Otherwise the input passes through to super.handleInput.
  • Ctrl+Right accepts the next word of the ghost (PRD §M3f). A "word" is a run of \w+ characters, or a single non-word non-whitespace char with optional leading whitespace — VS Code's Cursor Word Right semantics. Repeated presses peel the ghost off one token at a time; the final press finalises the metric as accepted-partial.
  • Alt+] / Alt+[ cycle through ranked candidates when the provider returns more than one (e.g. openai-compat with PI_GHOST_MERCURY_PARALLEL_N>1 and PI_GHOST_TEMPERATURE>0). Single-candidate sources (cache, trie, speculative warm, deterministic path, default LLM fire) treat both keys as no-ops. The selected candidate's index is logged as candidateRank on accept along with cycledCount.
  • Visible-column math uses string-width + strip-ansi. Ghost text is clipped grapheme-aware to the remaining columns on the current visual line so wide CJK characters and emoji never overflow the row.
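
The Ctrl+Right rule from the list above can be sketched with a single regular expression. This is an illustration, not the extension's code: it assumes the optional leading whitespace applies to both kinds of "word", and splitNextWord / peelAll are hypothetical names.

```typescript
// Optional leading whitespace, then either a run of word characters
// or exactly one non-word, non-whitespace character.
const NEXT_WORD = /^\s*(?:\w+|[^\w\s])/;

// Split the ghost into the piece accepted by one Ctrl+Right press
// and the part that stays displayed as ghost text.
function splitNextWord(ghost: string): { accepted: string; remaining: string } {
  const match = NEXT_WORD.exec(ghost);
  // Whitespace-only ghosts have no word; accept what is left.
  const accepted = match ? match[0] : ghost;
  return { accepted, remaining: ghost.slice(accepted.length) };
}

// Repeated presses peel the ghost off one token at a time.
function peelAll(ghost: string): string[] {
  const pieces: string[] = [];
  let rest = ghost;
  while (rest.length > 0) {
    const { accepted, remaining } = splitNextWord(rest);
    pieces.push(accepted);
    rest = remaining;
  }
  return pieces;
}
```

Under this rule a ghost such as foo.bar baz is accepted in four presses: foo, then the dot, then bar, then baz together with its leading space.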

Development

npm install
npm run typecheck   # tsc --noEmit against pi-tui / pi-coding-agent v0.74
npm test            # vitest, ~970 unit tests (requires Rust: builds native crate)
npm run build:dev   # plain tsc -> dist/ WITH source maps (local debugging)

The test suite and publish build delegate the crown-jewel logic (session trie, reranker, sanitize) to a native Rust binary, so Rust is a dev-time dependency (install via rustup). End users do not need Rust — they receive a prebuilt, platform-matched .node via optionalDependencies.

CI runs all three on Node 20 and 22 (.github/workflows/ci.yml).

Source protection (publish build)

This package ships compiled + obfuscated runtime, never raw TypeScript and never source maps. A private GitHub repo does not make an npm package private — anything in files is downloadable from the public registry — so the publish pipeline hardens the runtime against casual copying and reverse-engineering.

npm run build (and the auto-running prepack hook on npm pack / npm publish) runs scripts/build.mjs, which:

  1. builds the native crown-jewel binary (Rust, release, symbol-stripped) for the current platform and copies it to dist/ as pi-ghost-native.<triplet>.node (skip with PI_GHOST_SKIP_NATIVE=1) — this is for local dev runs only; the main package tarball excludes it (see the !dist/**/*.node negation in package.json "files"), since end users get the binary from a per-platform optionalDependency, never bundled in main;
  2. emits .d.ts declarations only (tsconfig.build.json) — no .js, no .js.map, no .d.ts.map;
  3. prunes internal declarations — keeps only the .d.ts transitively reachable from the public entries (index, bench) and deletes the rest. Crown-jewel modules (fast-path-engine, session-trie, reranker, provisional, regret-tracker, profile, edit-history-ring, …) are internal and never re-exported, so their type declarations are removed from the tarball — they would otherwise be a symbol-level roadmap into the obfuscated bundle. A dangling-reference guard fails the build if pruning would break consumer type resolution;
  4. bundles every entry point with esbuild into a single minified ESM chunk per entry (collapsing ~40 source modules into opaque bundles), keeping @earendil-works/* peers + string-width/strip-ansi external;
  5. obfuscates each bundle with javascript-obfuscator;
  6. re-adds exactly one #!/usr/bin/env node shebang to the two bin CLIs;
  7. fails loudly if any *.map reaches dist/.

Obfuscation is tuned to be latency-safe: this is a ghost-text engine where milliseconds matter, so control-flow flattening (the one transform with real runtime cost) is OFF. What's ON: hexadecimal identifier renaming, string-array encoding (base64) with two wrapper layers and hexadecimal indexes, string splitting (5-char chunks), numeric-literal-to-expression conversion (V8 constant-folds these at parse time, so zero runtime cost), object-key transformation, and light dead-code injection. Public export names are preserved so the pi extension loader and TS consumers keep working.

Tune the trade-off in OBFUSCATOR_OPTIONS inside scripts/build.mjs:

  • Smaller tarball / faster parse → lower deadCodeInjectionThreshold (e.g. 0) and stringArrayThreshold (e.g. 0.4). index.js is ~575 kB with the hardened settings; dropping dead-code injection brings it back near the ~110 kB pre-obfuscation size.
  • Stronger protection (test thoroughly first) → enable controlFlowFlattening, but expect a measurable latency hit on the autocomplete fast path.
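
To make the trade-off concrete, the transforms described above map roughly onto javascript-obfuscator options as shown below. This is a sketch of what OBFUSCATOR_OPTIONS might look like, not a copy of scripts/build.mjs: the option names are real javascript-obfuscator options, but the mapping from the prose (for example "two wrapper layers" to stringArrayWrappersCount) and both threshold values are assumptions.

```typescript
// Hypothetical reconstruction of the hardened settings.
const OBFUSCATOR_OPTIONS_SKETCH = {
  controlFlowFlattening: false, // OFF: the one transform with real runtime cost
  identifierNamesGenerator: "hexadecimal",
  stringArray: true,
  stringArrayEncoding: ["base64"],
  stringArrayWrappersCount: 2, // assumed meaning of "two wrapper layers"
  stringArrayIndexesType: ["hexadecimal-number"],
  stringArrayThreshold: 0.75, // placeholder; the README does not state it
  splitStrings: true,
  splitStringsChunkLength: 5,
  numbersToExpressions: true,
  transformObjectKeys: true,
  deadCodeInjection: true,
  deadCodeInjectionThreshold: 0.1, // placeholder for "light" injection
} as const;
```

The two bullets above correspond to lowering deadCodeInjectionThreshold and stringArrayThreshold for a smaller bundle, or setting controlFlowFlattening to true for stronger protection.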

What still ships readable: the .d.ts declarations for the public API only (internal module declarations are pruned in step 3), plus the JS wiring that calls into the native binary.

Native crown jewels (tier-C hardening)

The tuned algorithm values — the reranker weights (0.40/0.25/0.15/0.10/0.10), the prefix-aware lengthPrior curve, the confidence gates (0.55/0.05), the session-trie lookup gates + typeahead + decay, and the secret/volatile regex patterns — live only in a prebuilt, symbol-stripped native .node binary (native/, Rust + napi-rs). No tuned literal appears in the shipped JS: the wrapper fetches constants from the binary at load time, and the JS trie-sanitize.ts / session-trie.ts / reranker.ts are thin delegates. Feature extraction (merging, profile/edit-proximity closures) stays in JS because it needs JS closures; the scoring math crosses the boundary.

A reverser with the obfuscated bundle alone sees only calls into a native binary, not the weights or the curves as static literals. Reading them out of the binary requires disassembling the stripped Rust (Ghidra/IDA over hours-to-days), not pasting into obfuscator-io-deobfuscator or asking an LLM. This raises the cost of static recovery versus obfuscation alone — but see the honest residual below: the scoring is recoverable dynamically regardless of where the constants live.

Distribution. The binary ships per-platform as optionalDependencies (pi-ghost-autocomplete-<os>-<cpu>[-<libc>]), following the @napi-rs convention. End users on supported platforms need no Rust toolchain; npm installs only the matching prebuild. Unsupported platforms fail at load (by design: there is no recoverable JS fallback for the crown jewels). The trade-off is a per-platform CI/release matrix (.github/workflows/release.yml).

Honest residual. This is not cryptographic protection, and two of the "hidden" pieces are weaker than the binary boundary implies:

  • The reranker scoring is linear and black-box recoverable. The score is a weighted sum of per-axis normalized features, and the public API returns the per-axis scoreBreakdown alongside the final score. Feeding ~5+ candidate sets and solving the resulting linear system recovers the weight vector and gate thresholds without ever touching the binary — no disassembly needed. Hiding the literals raises the bar against copy-paste, not against a competitor who probes the I/O.
  • The secret/volatile regexes are public patterns. sk-ant, ghp_, AKIA, AIza, JWT shape, etc. are the same prefixes published by gitleaks / trufflehog. Moving them into the binary hides nothing of value; it's done for consistency, not secrecy.

What genuinely benefits from the boundary is the session-trie lookup algorithm shape (gates, typeahead-delta, decay, volatile verdict) — that structure is not trivially recoverable from I/O. Runtime instrumentation (--inspect, LD_PRELOAD, process.dlopen hooks) can still observe inputs/outputs of everything. What tier-C removes is the automated, minutes-to-recover static attack surface that pure JS obfuscation leaves. The real enforcement remains the SSPL-1.0 license; the native boundary removes the temptation for the honest-but-curious and raises the cost for a casual competitor from "paste into a deobfuscator" to "learn Rust disassembly (or write a solver)."

Platform support. Prebuilt binaries ship for win32-x64-msvc, darwin-arm64, linux-x64-gnu, linux-arm64-gnu. Intel Mac (darwin-x64) is not supported: Apple began moving the Mac to Apple silicon in 2020 and has since discontinued its Intel models. Rosetta 2 does not help here, because it translates x64 code to run on Apple silicon, not arm64 code to run on Intel. musl (Alpine) is not supported either. There is no JS fallback, so unsupported platforms fail at load by design.
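
The per-platform package name follows directly from the triplet. A minimal sketch, assuming the optional dependency is named pi-ghost-autocomplete- followed by the triplet exactly as listed above; nativePackageName is a hypothetical helper, not part of the package.

```typescript
// Triplets with a prebuilt binary, as listed above.
const SUPPORTED_TRIPLETS: readonly string[] = [
  "win32-x64-msvc",
  "darwin-arm64",
  "linux-x64-gnu",
  "linux-arm64-gnu",
];

// Returns the optional-dependency name for a platform, or null when
// no prebuild exists (the extension would then fail at load).
function nativePackageName(os: string, cpu: string, abi?: string): string | null {
  const triplet = abi ? `${os}-${cpu}-${abi}` : `${os}-${cpu}`;
  return SUPPORTED_TRIPLETS.indexOf(triplet) !== -1
    ? `pi-ghost-autocomplete-${triplet}`
    : null;
}
```

A null result corresponds to the unsupported cases named above, such as darwin-x64 or a musl-based Linux.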

Verify what a release contains at any time:

npm pack --dry-run
# expect: obfuscated dist/*.js + public dist/*.d.ts + README only.
# must show NO *.map, NO raw *.ts source (note: *.d.ts declarations are fine
# and expected), and NO *.node (the binary ships via optionalDependencies).

Limitations

  • No native Ollama provider in pi-ai; we configure an openai-compat model pointed at Ollama's /v1 endpoint. Any OpenAI-compatible local server works the same way.
  • Pin to ^0.74 of pi-tui / pi-coding-agent / pi-ai / pi-agent-core (all under @earendil-works). The relevant interfaces are pre-1.0; expect occasional follow-up bumps.
  • Predictions are shown only at the end of the buffer to avoid splitting lines mid-stream.
  • The path index covers only tracked Git files. Untracked files are not indexed in v1.
  • Three-provider racing may create elevated cost or rate-limit pressure at high typing speeds.

License

SSPL-1.0 (Server Side Public License v1) — see LICENSE.