pi-whisper-voice

Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.

Package details

extension

Install pi-whisper-voice from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-whisper-voice

Package: pi-whisper-voice
Version: 0.2.0
Published: Apr 27, 2026
Downloads: 269/mo · 269/wk
Author: kengbailey
License: MIT
Types: extension
Size: 59.9 KB
Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

Hold SPACE to record and release it to transcribe; the transcript is then inserted into Pi's editor for review. The extension does not auto-send the message: edit the text and submit it manually when ready.

Features

  • Hold SPACE push-to-talk inside Pi
  • Local microphone capture via ffmpeg
  • OpenAI-compatible STT endpoint: POST /v1/audio/transcriptions
  • In-TUI settings for server URL, model, and token
  • Transcript inserted into the editor for review/editing
  • Persistent footer state: 🎤 ready, 🎤 recording, 🎤 transcribing
  • No cloud-provider lock-in
  • No fallback shortcut or global daemon
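
To illustrate the ffmpeg-based capture listed above, here is a minimal sketch of how an argument list for recording the microphone could be built. The helper name buildFfmpegArgs, the input formats (avfoundation on macOS, pulse on Linux), the device names, and the 16 kHz mono WAV output are all assumptions for illustration; the extension's actual ffmpeg invocation is not documented here and may differ.

```typescript
// Hypothetical helper, not taken from the extension's source.
// Builds ffmpeg arguments that record a microphone to a 16 kHz mono WAV file.
type CapturePlatform = "darwin" | "linux";

function buildFfmpegArgs(platform: CapturePlatform, outputPath: string): string[] {
  // Input selection is an assumption: audio device index 0 via avfoundation
  // on macOS, the "default" PulseAudio source on Linux. Whether these input
  // formats are available depends on how ffmpeg was built.
  const input =
    platform === "darwin"
      ? ["-f", "avfoundation", "-i", ":0"]
      : ["-f", "pulse", "-i", "default"];

  return [
    "-hide_banner",
    "-loglevel", "error",
    ...input,
    "-ac", "1",      // one audio channel (mono)
    "-ar", "16000",  // 16 kHz sample rate
    "-y",            // overwrite the output file if it already exists
    outputPath,
  ];
}
```

The resulting array could be passed to spawn("ffmpeg", args) from node:child_process, with the process stopped when SPACE is released.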

Usage

Start Pi. If the terminal supports the Kitty keyboard protocol, the footer should show:

🎤 ready

Then:

  1. Hold SPACE until recording starts.
  2. Speak.
  3. Release SPACE.
  4. Wait for 🎤 transcribing to finish.
  5. Review/edit the transcript inserted in the editor.
  6. Send manually when ready.
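
The steps above can be read as a three-state cycle that matches the footer states (ready, recording, transcribing). The following is a minimal sketch of that cycle as a pure transition function; the event names and the function nextVoiceState are hypothetical and are not taken from the extension's source.

```typescript
// Hypothetical model of the push-to-talk cycle; names are illustrative only.
type VoiceState = "ready" | "recording" | "transcribing";
type VoiceEvent = "space-down" | "space-up" | "transcript-done";

function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  switch (state) {
    case "ready":
      // Holding SPACE starts a recording.
      return event === "space-down" ? "recording" : state;
    case "recording":
      // Releasing SPACE stops recording and starts transcription.
      return event === "space-up" ? "transcribing" : state;
    case "transcribing":
      // When the transcript arrives it goes to the editor; nothing is sent.
      return event === "transcript-done" ? "ready" : state;
  }
}
```

Events that do not apply to the current state are ignored in this sketch, so a stray key release while idle leaves the state at ready.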

Toggle voice input:

/voice

Configure the STT server URL, model name, and token:

/voice-settings

Alias:

/voice settings

Show the active configuration:

/voice status

Settings are saved under the piWhisperVoice key in the global Pi settings JSON file (~/.pi/agent/settings.json). The following environment variables override the saved values:

PI_VOICE_STT_BASE_URL
PI_VOICE_STT_MODEL
PI_VOICE_STT_TOKEN

Project-local voice settings are ignored for safety so a repository cannot redirect microphone audio or supply a token.
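
The precedence described above (environment variables over saved global settings, project-local settings ignored) can be sketched as follows. The field names baseUrl, model, and token and the function resolveVoiceConfig are assumptions for illustration; the actual key names under piWhisperVoice are not documented here. Treating an empty environment variable as unset is also an assumption.

```typescript
// Hypothetical shape of the saved settings; the real key names may differ.
interface VoiceConfig {
  baseUrl?: string;
  model?: string;
  token?: string;
}

// Merge saved global settings with environment overrides.
// Project-local settings are deliberately not an input to this function.
function resolveVoiceConfig(
  saved: VoiceConfig,
  env: Record<string, string | undefined>,
): VoiceConfig {
  return {
    // Assumption: an empty environment variable counts as "not set".
    baseUrl: env["PI_VOICE_STT_BASE_URL"] || saved.baseUrl,
    model: env["PI_VOICE_STT_MODEL"] || saved.model,
    token: env["PI_VOICE_STT_TOKEN"] || saved.token,
  };
}
```

In a Node-based extension, the env argument would typically be process.env and saved would be the piWhisperVoice object read from the global settings file.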

Current requirements

  • Pi coding agent
  • A terminal/session with Kitty keyboard protocol key-release support
  • ffmpeg installed and microphone permission granted
  • An OpenAI-compatible transcription server
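
Key-release support matters because push-to-talk needs to know when SPACE is let go, which ordinary terminal input does not report. Under the Kitty keyboard protocol, keys can arrive as CSI sequences of the form ESC [ code ; modifiers : event-type u, where event type 3 marks a release. The sketch below parses that form and checks for a SPACE release (code 32). The function names are hypothetical, and which protocol flags the extension or Pi actually requests from the terminal is not stated here.

```typescript
// Hypothetical parser for Kitty keyboard protocol "CSI ... u" key sequences.
type KeyEventType = "press" | "repeat" | "release";

interface KittyKeyEvent {
  codepoint: number;  // Unicode code of the key, e.g. 32 for SPACE
  modifiers: number;  // raw modifier field; 1 means no modifiers
  type: KeyEventType;
}

// ESC [ code[:alternates] [; modifiers[:event-type]] [; text] u
const KITTY_CSI_U = /^\x1b\[(\d+)(?::\d*)*(?:;(\d*)(?::(\d+))?)?(?:;[\d:]*)?u$/;

function parseKittyKey(seq: string): KittyKeyEvent | null {
  const m = KITTY_CSI_U.exec(seq);
  if (!m) return null; // not a CSI-u key sequence (e.g. plain text input)

  const codepoint = Number(m[1]);
  const modifiers = m[2] ? Number(m[2]) : 1; // omitted field defaults to 1
  const eventCode = m[3] ? Number(m[3]) : 1; // omitted field means "press"
  const type = eventCode === 3 ? "release" : eventCode === 2 ? "repeat" : "press";
  return { codepoint, modifiers, type };
}

function isSpaceRelease(seq: string): boolean {
  const ev = parseKittyKey(seq);
  return ev !== null && ev.codepoint === 32 && ev.type === "release";
}
```

A terminal without this protocol sends SPACE as a plain " " character with no release event, which parseKittyKey reports as null.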

Example STT endpoint shape:

POST http://localhost:8000/v1/audio/transcriptions
Authorization: Bearer dummy
Content-Type: multipart/form-data

Response:

{ "text": "transcribed text" }
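
A request of the shape above could be issued from TypeScript roughly as follows. This is a sketch under stated assumptions, not the extension's implementation: the transcribe function, the FetchLike type, and the file name recording.wav are hypothetical, and the base URL is assumed not to include the /v1 prefix. The multipart field names file and model follow the OpenAI transcription API. The Content-Type header is left unset so that fetch can add the multipart boundary itself.

```typescript
// Narrow fetch signature so the sketch can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: FormData },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function transcribe(
  audio: Blob,
  opts: { baseUrl: string; model: string; token?: string },
  fetchImpl: FetchLike = fetch as unknown as FetchLike, // global fetch (Node 18+)
): Promise<string> {
  const form = new FormData();
  form.append("file", audio, "recording.wav"); // file name is illustrative
  form.append("model", opts.model);

  // Only send the Authorization header when a token is configured.
  const headers: Record<string, string> = {};
  if (opts.token) headers["Authorization"] = `Bearer ${opts.token}`;

  // Assumption: baseUrl has no /v1 suffix; trailing slashes are trimmed.
  const url = `${opts.baseUrl.replace(/\/+$/, "")}/v1/audio/transcriptions`;

  const res = await fetchImpl(url, { method: "POST", headers, body: form });
  if (!res.ok) throw new Error(`STT request failed with status ${res.status}`);

  const data = (await res.json()) as { text?: unknown };
  if (typeof data.text !== "string") throw new Error("STT response has no text field");
  return data.text;
}
```

Passing the fetch implementation as a parameter keeps the function easy to test against a stub server response.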

Install

Install from npm:

pi install npm:pi-whisper-voice

Or test without installing:

pi -e npm:pi-whisper-voice

Install from GitHub:

pi install git:github.com/kengbailey/pi-whisper-voice

Local development install

This repository can also be loaded directly from disk:

pi -e /path/to/pi-whisper-voice

For global auto-discovery during local development, place it at:

~/.pi/agent/extensions/pi-whisper-voice/

License

MIT