pi-whisper-voice
Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.
Package details
Install pi-whisper-voice from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-whisper-voice
- Package: pi-whisper-voice
- Version: 0.2.0
- Published: Apr 27, 2026
- Downloads: 269/mo · 269/wk
- Author: kengbailey
- License: MIT
- Types: extension
- Size: 59.9 KB
- Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-whisper-voice
Minimal hold-SPACE voice input for Pi using an OpenAI-compatible Whisper/STT endpoint.
Hold SPACE to record, release to transcribe, and the transcript is inserted into Pi's editor for review. It does not auto-send the message; edit the text and submit manually when ready.
Features
- Hold SPACE push-to-talk inside Pi
- Local microphone capture via ffmpeg
- OpenAI-compatible STT endpoint: POST /v1/audio/transcriptions
- In-TUI settings for server URL, model, and token
- Transcript inserted into the editor for review/editing
- Persistent footer state: 🎤 ready, 🎤 recording, 🎤 transcribing
- No cloud-provider lock-in
- No fallback shortcut or global daemon
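The ffmpeg capture mentioned above can be sketched as a child-process invocation. This is an illustrative helper, not the extension's actual code, and the input-device flags are platform-specific (`-f alsa -i default` assumes Linux; macOS would use `-f avfoundation` instead):

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Sketch: capture mono 16 kHz WAV from the default microphone via ffmpeg.
// Hypothetical helper; device flags shown are Linux/ALSA-specific.
function startRecording(outPath: string): ChildProcess {
  return spawn("ffmpeg", [
    "-y",            // overwrite any stale file
    "-f", "alsa",    // input format: ALSA (Linux)
    "-i", "default", // default capture device
    "-ac", "1",      // mono
    "-ar", "16000",  // 16 kHz, typical for Whisper-style models
    outPath,
  ]);
}

// Releasing SPACE would end capture by asking ffmpeg to finish cleanly:
// startRecording("/tmp/clip.wav").kill("SIGINT");
```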
Usage
Start Pi. If the terminal supports Kitty keyboard protocol, the footer should show:
🎤 ready
Then:
- Hold SPACE until recording starts.
- Speak.
- Release SPACE.
- Wait for 🎤 transcribing to finish.
- Review/edit the transcript inserted in the editor.
- Send manually when ready.
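The Kitty keyboard protocol support checked above can be probed by writing the query escape (CSI ? u) to the terminal and parsing the CSI ? flags u reply. A minimal parser sketch (`parseKittyReply` is an illustrative name, not the extension's API):

```typescript
// Sketch: parse a terminal's reply to the Kitty keyboard protocol
// query (sent as "\x1b[?u"). A supporting terminal answers with
// "\x1b[?<flags>u"; no reply means key-release events, and therefore
// hold-SPACE detection, are unavailable.
function parseKittyReply(reply: string): number | null {
  const m = /\x1b\[\?(\d+)u/.exec(reply);
  return m ? parseInt(m[1], 10) : null;
}
```

For example, a reply of `"\x1b[?1u"` yields flags = 1 (the protocol's "disambiguate escape codes" enhancement bit).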
Toggle voice input:
/voice
Configure the STT server URL, model name, and token:
/voice-settings
Alias:
/voice settings
Show the active configuration:
/voice status
Settings are saved under piWhisperVoice in global Pi settings JSON (~/.pi/agent/settings.json). Environment variables can override saved values:
PI_VOICE_STT_BASE_URL
PI_VOICE_STT_MODEL
PI_VOICE_STT_TOKEN
Project-local voice settings are ignored for safety, so a repository cannot redirect microphone audio or supply a token.
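The override order described above (environment variable over saved setting) can be sketched as follows; `resolve` is a hypothetical helper, not the extension's actual code:

```typescript
// Sketch of the precedence this README describes:
// environment variable > value saved under piWhisperVoice.
function resolve(
  envValue: string | undefined,
  saved: string | undefined,
): string | undefined {
  return envValue ?? saved;
}

// Example: PI_VOICE_STT_MODEL, when set, wins over the saved model name.
const model = resolve(process.env.PI_VOICE_STT_MODEL, "whisper-1"); // saved value is illustrative
```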
Current requirements
- Pi coding agent
- A terminal/session with Kitty keyboard protocol key-release support
- ffmpeg installed and microphone permission granted
- An OpenAI-compatible transcription server
Example STT endpoint shape:
POST http://localhost:8000/v1/audio/transcriptions
Authorization: Bearer dummy
Content-Type: multipart/form-data
Response:
{ "text": "transcribed text" }
Install
Install from npm:
pi install npm:pi-whisper-voice
Or test without installing:
pi -e npm:pi-whisper-voice
Install from GitHub:
pi install git:github.com/kengbailey/pi-whisper-voice
Local development install
This repository can also be loaded directly from disk:
pi -e /path/to/pi-whisper-voice
For global auto-discovery during local development, place it at:
~/.pi/agent/extensions/pi-whisper-voice/
License
MIT