@sincspecv/pi-chutes

pi extension that adds chutes.ai as a model provider

Package details


Install @sincspecv/pi-chutes from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@sincspecv/pi-chutes

Package: @sincspecv/pi-chutes
Version: 1.0.0
Published: Apr 11, 2026
Downloads: 296/mo · 10/wk
Author: sincspecv
License: MIT
Types: extension
Size: 40.3 KB
Dependencies: 0 · 1 peer

Pi manifest JSON
{
  "extensions": [
    "./src/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

@sincspecv/pi-chutes

A pi extension that adds chutes.ai as a model provider.

Summary

This extension:

  • registers a chutes-ai provider in pi
  • targets the Chutes hosted LLM endpoint: https://llm.chutes.ai/v1
  • authenticates with either /login chutes-ai or CHUTES_API_KEY
  • uses pi's openai-completions provider mode
  • ships with a bundled fallback model catalog
  • loads a cached model catalog from ~/.pi/agent/extensions/pi-chutes/models.json when available
  • refreshes the live model catalog from https://llm.chutes.ai/v1/models on session start
  • provides /chutes-refresh-models for manual refreshes with change summaries

Authentication pattern

This extension now supports both of pi's common auth flows for provider integrations:

  • interactive login via /login chutes-ai — pi persists credentials in ~/.pi/agent/auth.json
  • environment variable fallback via CHUTES_API_KEY

Expected environment variable when using shell-based auth:

export CHUTES_API_KEY=your_chutes_api_key

Optional variables used by the bundled test scripts:

# optional: force a specific model in smoke tests
export CHUTES_MODEL=Qwen/Qwen3-32B-TEE

# optional: override the API base URL used by test scripts
export CHUTES_BASE_URL=https://llm.chutes.ai/v1

# optional: comma-separated model list for reasoning comparison
export CHUTES_COMPARE_MODELS=Qwen/Qwen3-32B-TEE,deepseek-ai/DeepSeek-V3.2-TEE

This keeps secrets out of source code while still allowing users to choose either persisted pi auth or shell-based env auth.

Installation / loading

Option 1: install from npm

pi install npm:@sincspecv/pi-chutes

Then restart pi or run /reload.

Option 2: load directly during development

From the package root:

pi -e ./src/index.ts

Option 3: install into a pi extension location

For regular use, place this package in one of pi's discovered extension locations, then use /reload or restart pi. The copied directory name can stay pi-chutes; it does not need to include the npm scope.

Example project-local install from the package root:

mkdir -p .pi/extensions/pi-chutes
cp -R ./src ./scripts ./package.json ./README.md ./CHANGELOG.md ./LICENSE ./.pi/extensions/pi-chutes/

Option 4: package-style usage

This directory is structured like a small distributable extension package:

  • npm package name: @sincspecv/pi-chutes
  • package entrypoint: ./src/index.ts
  • package.json includes the pi extension entrypoint
  • LICENSE is included
  • .gitignore avoids checking in local secrets and node_modules

Usage

After installing from npm, loading directly, or copying into a pi extension directory:

  1. Authenticate either with /login chutes-ai or by setting CHUTES_API_KEY
  2. Run /model
  3. Find provider chutes-ai
  4. Select one of the available Chutes models

To log out and clear persisted credentials:

/logout chutes-ai

To refresh the live catalog manually:

/chutes-refresh-models

Manual refreshes show a change summary for added, removed, and updated models, and save the refreshed catalog to:

~/.pi/agent/extensions/pi-chutes/models.json
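
For illustration, the change summary amounts to a catalog diff plus a write to that cache path. The sketch below is not the actual code in src/index.ts; diffModels, saveCatalog, and the CatalogModel shape are made up for the example:

import { mkdir, writeFile } from "node:fs/promises";
import { homedir } from "node:os";
import { join } from "node:path";

interface CatalogModel { id: string; [key: string]: unknown }

// Illustrative diff between the previous catalog and a freshly fetched one.
function diffModels(previous: CatalogModel[], latest: CatalogModel[]) {
  const prevIds = new Set(previous.map((m) => m.id));
  const latestIds = new Set(latest.map((m) => m.id));
  return {
    added: latest.filter((m) => !prevIds.has(m.id)),
    removed: previous.filter((m) => !latestIds.has(m.id)),
    updated: latest.filter((m) => {
      const prev = previous.find((p) => p.id === m.id);
      return prev !== undefined && JSON.stringify(prev) !== JSON.stringify(m);
    }),
  };
}

// Persist the refreshed catalog to the documented cache location.
async function saveCatalog(models: CatalogModel[]) {
  const cacheDir = join(homedir(), ".pi", "agent", "extensions", "pi-chutes");
  await mkdir(cacheDir, { recursive: true });
  await writeFile(join(cacheDir, "models.json"), JSON.stringify(models, null, 2));
}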

Optional pi-chutes.json

This extension now supports a small non-secret config file named pi-chutes.json.

Supported locations:

  1. the current working directory where pi is launched
  2. the parent directory of the extension source folder

Minimal format:

{
  "baseUrl": "https://llm.chutes.ai/v1",
  "autoRefreshModels": true,
  "recommendedModels": [
    "Qwen/Qwen3-32B-TEE",
    "deepseek-ai/DeepSeek-V3.2-TEE",
    "zai-org/GLM-5-Turbo"
  ],
  "hideNonRecommendedModels": false
}

Field meanings:

  • baseUrl — override the Chutes endpoint used by the provider
  • autoRefreshModels — if true, fetch the live model catalog on session start; if false, use the bundled fallback catalog until manual refresh
  • recommendedModels — model IDs to prioritize in registration order
  • hideNonRecommendedModels — if true, only register models from recommendedModels when that list is non-empty

This file configures extension behavior only; authentication still comes from /login credentials or the CHUTES_API_KEY environment variable.
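
As a rough illustration of how such a file might be discovered and merged with defaults (loadConfig, ChutesConfig, and the defaults object below are hypothetical, not the real exports of src/index.ts; only the file name, the two lookup locations, and the field names come from this README):

import { existsSync, readFileSync } from "node:fs";
import { dirname, join } from "node:path";

interface ChutesConfig {
  baseUrl: string;
  autoRefreshModels: boolean;
  recommendedModels: string[];
  hideNonRecommendedModels: boolean;
}

const defaults: ChutesConfig = {
  baseUrl: "https://llm.chutes.ai/v1",
  autoRefreshModels: true,
  recommendedModels: [],
  hideNonRecommendedModels: false,
};

// Check the launch directory first, then the parent of the extension source folder.
function loadConfig(extensionSourceDir: string): ChutesConfig {
  const candidates = [
    join(process.cwd(), "pi-chutes.json"),
    join(dirname(extensionSourceDir), "pi-chutes.json"),
  ];
  for (const path of candidates) {
    if (existsSync(path)) {
      return { ...defaults, ...JSON.parse(readFileSync(path, "utf8")) };
    }
  }
  return defaults;
}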

Recommended ways to authenticate

Option 1: /login chutes-ai

Use pi's built-in provider login flow:

/login chutes-ai

This prompts for a Chutes API key and stores it in:

~/.pi/agent/auth.json

Option 2: CHUTES_API_KEY

Any approach that puts the variable into the environment before pi starts is fine. Common options:

Shell profile

export CHUTES_API_KEY=your_chutes_api_key

Add that to .zshrc, .bashrc, or similar if you want it available in every shell.

One-off command

CHUTES_API_KEY=your_chutes_api_key pi -e ./src/index.ts

Sourced .env

If you keep secrets in a local .env file, load it into the shell before starting pi:

set -a
source .env
set +a
pi -e ./src/index.ts

Provider settings

The extension registers:

pi.registerProvider("chutes-ai", {
  baseUrl: config.baseUrl,
  apiKey: "CHUTES_API_KEY",
  authHeader: true,
  api: "openai-completions",
  models: [...]
})

Why this implementation

This implementation is based on confirmed public behavior and source evidence:

  • https://llm.chutes.ai/v1/models returns an OpenAI-style model list
  • invalid bearer auth against the Chutes LLM endpoint returns 401 with {"detail":"Invalid token."}
  • Chutes API code routes the llm hostname to its MegaLLM handler for:
    • POST requests
    • GET /v1/models
  • Chutes examples and docs show bearer-authenticated /v1/chat/completions usage

Compatibility notes

This extension currently uses conservative OpenAI-compat defaults on registered models:

  • supportsDeveloperRole: false
  • supportsReasoningEffort: false
  • supportsUsageInStreaming: false
  • maxTokensField: "max_tokens"

For reasoning-capable Chutes models, the extension also sets:

  • thinkingFormat: "qwen-chat-template"

Why:

  • direct probes against multiple Chutes reasoning backends showed that chat_template_kwargs.enable_thinking is the most reliable way to control reasoning output
  • this avoids common failures such as visible <think>...</think> tags or reasoning appearing only in reasoning_content when pi expects normal assistant output
  • it keeps pi's thinking controls aligned with actual Chutes behavior instead of forcing provider-specific prompts or custom post-processing

These defaults are intentionally cautious until more end-to-end request behavior is validated across a wider set of Chutes models.
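
To make those defaults concrete, a registered reasoning-capable model entry would look roughly like the sketch below; the model id is only an example and the object shape is illustrative, with just the flag names and values taken from this section:

// Illustrative model entry; the id is an example, not the bundled catalog.
const exampleModel = {
  id: "Qwen/Qwen3-32B-TEE",
  // Conservative OpenAI-compat defaults:
  supportsDeveloperRole: false,
  supportsReasoningEffort: false,
  supportsUsageInStreaming: false,
  maxTokensField: "max_tokens",
  // Reasoning-capable models additionally map pi's thinking controls
  // to chat_template_kwargs.enable_thinking via the chat template:
  thinkingFormat: "qwen-chat-template",
};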

Project files

  • package.json — package metadata and pi extension entrypoint for the published npm package @sincspecv/pi-chutes
  • src/index.ts — provider registration, config loading, and live-catalog refresh logic
  • src/models.ts — bundled fallback model catalog snapshot
  • scripts/ — smoke-test and reasoning verification helpers
  • LICENSE — MIT license
  • .gitignore — ignores local env files and node_modules

Smoke test

A small smoke-test script is included to verify that your key works and that Chutes accepts a minimal completion request.

Run it from this extension directory:

CHUTES_API_KEY=your_chutes_api_key npm run smoke-test

Optional overrides:

  • CHUTES_MODEL — force a specific model ID instead of using the first returned model
  • CHUTES_BASE_URL — override the default base URL (https://llm.chutes.ai/v1)

What it checks (sketched below):

  • authenticated GET /models
  • authenticated POST /chat/completions
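
Conceptually, those two checks reduce to a pair of authenticated requests. The sketch below assumes Node 18+ fetch and an OpenAI-style { data: [...] } model list response; the actual script in scripts/ may differ:

const baseUrl = process.env.CHUTES_BASE_URL ?? "https://llm.chutes.ai/v1";
const headers = {
  Authorization: `Bearer ${process.env.CHUTES_API_KEY}`,
  "Content-Type": "application/json",
};

// Check 1: authenticated model listing.
const modelsRes = await fetch(`${baseUrl}/models`, { headers });
if (!modelsRes.ok) throw new Error(`GET /models failed: ${modelsRes.status}`);
const models = ((await modelsRes.json()) as { data: { id: string }[] }).data;

// Check 2: minimal non-streaming completion against the chosen model.
const model = process.env.CHUTES_MODEL ?? models[0].id;
const chatRes = await fetch(`${baseUrl}/chat/completions`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    model,
    messages: [{ role: "user", content: "Say hello in one word." }],
    max_tokens: 16,
  }),
});
if (!chatRes.ok) throw new Error(`POST /chat/completions failed: ${chatRes.status}`);
console.log("smoke test passed for", model);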

Streaming smoke test

A second script verifies basic SSE-style streaming behavior:

CHUTES_API_KEY=your_chutes_api_key npm run stream-smoke-test

This checks (sketched below):

  • authenticated GET /models
  • authenticated streaming POST /chat/completions
  • basic SSE parsing
  • reconstruction of streamed text deltas
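
A stripped-down version of that flow might look like the following sketch, which assumes OpenAI-style data: chunks over SSE; the bundled script's parsing is more defensive:

const baseUrl = process.env.CHUTES_BASE_URL ?? "https://llm.chutes.ai/v1";
const res = await fetch(`${baseUrl}/chat/completions`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.CHUTES_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: process.env.CHUTES_MODEL ?? "Qwen/Qwen3-32B-TEE",
    stream: true,
    messages: [{ role: "user", content: "Count to three." }],
  }),
});
if (!res.ok || !res.body) throw new Error(`streaming request failed: ${res.status}`);

// Reassemble the assistant text from OpenAI-style "data:" chunks.
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let text = "";
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? ""; // keep any partial line for the next chunk
  for (const raw of lines) {
    const line = raw.trim();
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (!payload || payload === "[DONE]") continue;
    text += JSON.parse(payload).choices?.[0]?.delta?.content ?? "";
  }
}
console.log(text);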

Reasoning behavior comparison

A third script compares default Chutes behavior against chat_template_kwargs.enable_thinking: false for several representative models:

CHUTES_API_KEY=your_chutes_api_key npm run reasoning-compare

Optional override:

  • CHUTES_COMPARE_MODELS — comma-separated model IDs to test instead of the built-in comparison set

This is useful for documenting and verifying why the extension uses pi compatibility settings that map thinking control to qwen-chat-template.
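
The comparison essentially sends the same prompt twice per model, once with defaults and once with thinking disabled through the chat template kwargs. The request bodies below are a sketch; only chat_template_kwargs.enable_thinking comes from the observed behavior described in this README:

// Hypothetical request bodies for one model in the comparison set.
const model = "Qwen/Qwen3-32B-TEE";
const prompt = [{ role: "user", content: "What is 17 * 23?" }];

const defaultRequest = { model, messages: prompt };

// Same request, but with reasoning explicitly disabled through the chat template.
const noThinkingRequest = {
  model,
  messages: prompt,
  chat_template_kwargs: { enable_thinking: false },
};

Comparing the two responses shows whether reasoning appears as visible <think> tags, lands in reasoning_content, or is suppressed, which is what motivated mapping pi's thinking controls to qwen-chat-template.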

Troubleshooting

/model does not show chutes-ai

  • make sure the extension was loaded successfully
  • if using an installed extension location, run /reload
  • if loading directly, restart pi with the -e flag

auth errors from Chutes

  • if you used /login, try /logout chutes-ai and then /login chutes-ai again
  • if you use env auth, confirm CHUTES_API_KEY is set in the environment seen by pi
  • restart pi after changing the environment
  • verify the key is valid on the Chutes side

model list seems outdated

Run:

/chutes-refresh-models

If the live refresh fails, the extension keeps using the cached catalog from ~/.pi/agent/extensions/pi-chutes/models.json or the bundled fallback catalog.
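
That fallback chain can be sketched like this; resolveCatalog and the empty bundled-catalog stand-in are illustrative, not the actual implementation in src/index.ts:

import { existsSync, readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

type CatalogModel = { id: string };
const bundledFallbackCatalog: CatalogModel[] = []; // stand-in for the snapshot in src/models.ts

// Illustrative fallback chain: live endpoint, then cached file, then bundled snapshot.
async function resolveCatalog(baseUrl: string, apiKey: string): Promise<CatalogModel[]> {
  try {
    const res = await fetch(`${baseUrl}/models`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`refresh failed: ${res.status}`);
    return ((await res.json()) as { data: CatalogModel[] }).data;
  } catch {
    const cachePath = join(homedir(), ".pi", "agent", "extensions", "pi-chutes", "models.json");
    if (existsSync(cachePath)) return JSON.parse(readFileSync(cachePath, "utf8"));
    return bundledFallbackCatalog;
  }
}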

Future improvements

  • validate more end-to-end request behavior (streaming and non-streaming completions) across a wider set of Chutes models
  • relax or tune compat flags if Chutes supports more of the OpenAI surface reliably
  • optionally provide a curated recommended-model subset for easier selection