@jmcombs/pi-qwen-guard

Auto-enables strict incremental mode for Qwen 3.6 (Ollama) to prevent 'terminated' and streaming errors.

Install @jmcombs/pi-qwen-guard from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@jmcombs/pi-qwen-guard
Package: @jmcombs/pi-qwen-guard
Version: 1.0.0
Published: May 31, 2026
Downloads: 246/mo · 25/wk
Author: jmcombs
License: MIT
Types: extension
Size: 8.3 KB
Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
  "image": "https://raw.githubusercontent.com/jmcombs/pi-extensions/main/assets/qwen-guard/preview.png",
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

Automatically detects Qwen 3.6 (or any Qwen model via Ollama) and injects strict incremental-mode rules to prevent "error: terminated" and "Stream ended without finish_reason".

Install it once and it applies to every session.

Quick Start

pi install npm:@jmcombs/pi-qwen-guard

The guard activates automatically when you start a session with any Qwen model. No commands, no configuration, no secrets.

How It Works

On session_start:

  • Inspects ctx.model.id.
  • If it contains "qwen", sets an internal flag and shows a one-time success notification:

    🛡️ pi-qwen-guard: Qwen3.6 incremental mode enabled

On every before_agent_start (i.e. before each agent turn):

  • When the flag is set, appends a block of strict incremental-mode instructions to the system prompt.
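The two hooks described above can be sketched as follows. This is a minimal, self-contained illustration and not the package's actual source: the `SessionContext` and `AgentStartEvent` shapes, the `createQwenGuard` factory, and the `notify` callback are hypothetical stand-ins for whatever the Pi extension API provides, and the case-insensitive match on "qwen" is an assumption.

```typescript
// Hypothetical minimal shapes; the real Pi extension types are not shown in this README.
interface SessionContext {
  model?: { id: string };
  notify(message: string): void;
}

interface AgentStartEvent {
  systemPrompt: string;
}

// Abridged rule text, taken from the README's description of the injected block.
const QWEN_RULES = [
  "CRITICAL QWEN3.6 / OLLAMA INCREMENTAL MODE (enforced every turn):",
  "- Never output more than ~70–80 lines of code in any single response.",
  "- Prefer the edit tool over write for any file that already exists.",
  "- Work in small logical chunks.",
].join("\n");

// Assumption: the match is case-insensitive; the README only says the id "contains" qwen.
function isQwenModel(modelId: string | undefined): boolean {
  return (modelId ?? "").toLowerCase().includes("qwen");
}

function createQwenGuard() {
  let enabled = false; // the internal flag set at session start

  return {
    // Mirrors the session_start step: inspect the model id, set the flag, notify once.
    onSessionStart(ctx: SessionContext): void {
      enabled = isQwenModel(ctx.model?.id);
      if (enabled) {
        ctx.notify("🛡️ pi-qwen-guard: Qwen3.6 incremental mode enabled");
      }
    },

    // Mirrors the before_agent_start step: append the rules only when the flag is set.
    onBeforeAgentStart(event: AgentStartEvent): AgentStartEvent {
      if (!enabled) return event; // no-op for non-Qwen models
      return { ...event, systemPrompt: `${event.systemPrompt}\n\n${QWEN_RULES}` };
    },
  };
}
```

Keeping the flag in a closure means the model id is checked once per session, while the prompt is extended on every turn, which matches the split between the two events described above.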

The injected rules (abridged):

CRITICAL QWEN3.6 / OLLAMA INCREMENTAL MODE (enforced every turn):

  • Never output more than ~70–80 lines of code in any single response.
  • Prefer the edit tool over write for any file that already exists.
  • Work in small logical chunks.
  • After completing a chunk, emit a progress signal that starts with exactly: 🛡️ pi-qwen-guard: ✅ Chunk complete. File is now X lines.
  • You may then continue directly to the next chunk (no need to wait for user approval).

These rules are meant to keep each response within Ollama's streaming limits, so that the "error: terminated" and "Stream ended without finish_reason" failures no longer occur.

The guard is a no-op for all non-Qwen models.
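Because the progress signal has a fixed prefix, it is easy to recognise in model output. The helper below is a hypothetical sketch of how a script might pull the line count out of such a message; `parseChunkSignal` and `SIGNAL_PREFIX` are illustrative names and are not part of the package.

```typescript
// Fixed prefix from the injected rules; the "X" that follows is the file's line count.
const SIGNAL_PREFIX = "🛡️ pi-qwen-guard: ✅ Chunk complete. File is now ";

// Returns the reported line count, or null when the text is not a progress signal.
function parseChunkSignal(line: string): number | null {
  if (!line.startsWith(SIGNAL_PREFIX)) return null;
  const match = /^(\d+) lines\b/.exec(line.slice(SIGNAL_PREFIX.length));
  return match ? Number(match[1]) : null;
}
```

A caller could, for example, log the returned count after each chunk to watch a file grow across turns.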

Development

This package lives in the pi-extensions monorepo.

# From the repo root
npm ci
npm run check

To try local changes against a real Pi session:

pi -e ./packages/qwen-guard

License

MIT © Jeremy Combs