pi-vox
Lightweight voice dictation for Pi: /voice-toggle records locally and transcribes with ElevenLabs.
Package details
Install pi-vox from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-vox
- Package: pi-vox
- Version: 0.1.0
- Published: Jun 11, 2026
- Downloads: not available
- Author: denismrvoljak
- License: MIT
- Types: extension
- Size: 36.3 KB
- Dependencies: 0 dependencies · 1 peer
Pi manifest JSON
{
  "extensions": [
    "./extensions"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-vox
Voice input for the pi coding agent.
It records your microphone, sends the audio to ElevenLabs speech-to-text, and puts the transcript into the current pi input box.
ElevenLabs is the only speech provider right now. Their free tier is generous enough for normal testing and light use.
Install
From npm:
pi install npm:pi-vox
Or from GitHub:
pi install https://github.com/denismrvoljak/pi-vox
Setup
1. Add your ElevenLabs API key
Create an API key in ElevenLabs, then set it before starting pi:
export ELEVENLABS_API_KEY="your-key-here"
You can also put it in a .env file in the directory where you launch pi:
ELEVENLABS_API_KEY=your-key-here
pi-vox redacts this key from status messages and common error output.
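As an illustration of what redaction like this can look like, here is a minimal sketch. The helper name redactSecret, the "[redacted]" placeholder, and the minimum-length guard are assumptions made for this example; they are not taken from pi-vox's source.

```typescript
// Hypothetical helper, not pi-vox's actual implementation.
// Replaces every occurrence of a secret in a message with a placeholder.
function redactSecret(message: string, secret: string | undefined): string {
  // Skip missing or very short secrets so ordinary text is never blanked out.
  if (!secret || secret.length < 8) {
    return message;
  }
  // split/join replaces all occurrences without needing regex escaping.
  return message.split(secret).join("[redacted]");
}

// Example: an error message that accidentally echoes the key.
const sampleKey = "sk_example_1234567890";
const safeMessage = redactSecret("Request failed for key " + sampleKey, sampleKey);
// safeMessage is "Request failed for key [redacted]"
```

A plain string replacement like this only catches the key when it appears verbatim, so treat it as a safety net rather than a guarantee.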
2. Install a recorder
pi-vox needs a local command-line recorder. On macOS, install one of these:
brew install sox
or:
brew install ffmpeg
If you install sox, pi-vox can use rec or sox. If you install ffmpeg, it can use ffmpeg.
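The recorder choice can be pictured as a simple preference list. The sketch below assumes the order rec, sox, ffmpeg (the order shown in the /voice-status output); the function name pickRecorder and the way installed commands are passed in are hypothetical, not pi-vox's actual detection logic.

```typescript
// Recorders named in this README.
type Recorder = "rec" | "sox" | "ffmpeg";

// Assumed preference order; pi-vox's real order may differ.
const RECORDER_PREFERENCE: Recorder[] = ["rec", "sox", "ffmpeg"];

// Hypothetical helper: returns the first preferred recorder that is installed,
// or undefined when none of them is available.
function pickRecorder(installed: string[]): Recorder | undefined {
  for (const candidate of RECORDER_PREFERENCE) {
    if (installed.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return undefined;
}

// With only ffmpeg installed, ffmpeg is chosen.
const chosen = pickRecorder(["ffmpeg"]);
```

If the function returns undefined, the natural next step is to show the install hint from this section.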
3. Reload pi
Inside pi:
/reload
Then check that everything is connected:
/voice-status
You should see something like:
Voice input: version=..., provider=elevenlabs, key=configured, autoSubmit=off, cleanup=on, audio=rec/sox/ffmpeg
How to use it
Start recording:
/voice-toggle
Speak your prompt.
Stop recording and insert the transcript:
/voice-toggle
Cancel the recording:
/voice-cancel
That's the main workflow.
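The workflow above amounts to a two-state machine. The following sketch models it as a pure function; the type names, action labels, and nextStep itself are invented for illustration and do not describe pi-vox internals.

```typescript
type VoiceState = "idle" | "recording";
type VoiceCommand = "toggle" | "cancel";
type VoiceAction =
  | "start-recording"
  | "stop-and-transcribe"
  | "discard-recording"
  | "none";

interface VoiceStep {
  state: VoiceState;
  action: VoiceAction;
}

// Hypothetical transition function for /voice-toggle and /voice-cancel.
function nextStep(state: VoiceState, command: VoiceCommand): VoiceStep {
  if (command === "toggle") {
    // Toggle starts a recording when idle, otherwise stops and transcribes.
    return state === "idle"
      ? { state: "recording", action: "start-recording" }
      : { state: "idle", action: "stop-and-transcribe" };
  }
  // Cancel only has an effect while a recording is in progress.
  return state === "recording"
    ? { state: "idle", action: "discard-recording" }
    : { state: "idle", action: "none" };
}
```

Keeping the transition logic free of side effects makes it easy to test without a microphone or network access.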
Commands
/voice-toggle
Starts recording when idle. Stops recording when active, transcribes, and inserts the text into the editor.
/voice-cancel
Stops the current recording and deletes the temporary audio file.
/voice-status
Shows whether the API key is configured, which recorder is available, and a few current settings.
/voice-glossary
Adds custom cleanup rules for words speech-to-text gets wrong.
/voice-glossary list
/voice-glossary add pi-vox pyvox "bye vox"
/voice-glossary add pi-coding-agent pycodingagent "bye coding agent"
/voice-glossary clear
Settings are saved here:
~/.pi/pi-vox/config.json
You can use another config file with:
export PI_VOX_CONFIG=/path/to/config.json
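A minimal sketch of how the config location could be resolved, assuming that PI_VOX_CONFIG takes precedence whenever it is set to a non-empty value. The function name resolveConfigPath and its parameters are hypothetical; the environment and home directory are passed in so the example stays self-contained.

```typescript
// Hypothetical helper, not pi-vox's actual code.
// env: environment variables; homeDir: the user's home directory.
function resolveConfigPath(
  env: Record<string, string | undefined>,
  homeDir: string
): string {
  const override = env["PI_VOX_CONFIG"];
  // Assumption: a non-empty override always wins over the default location.
  if (override && override.trim() !== "") {
    return override;
  }
  // Default documented above: ~/.pi/pi-vox/config.json
  return homeDir.replace(/\/+$/, "") + "/.pi/pi-vox/config.json";
}

// Default location for a user whose home directory is /home/dev.
const defaultPath = resolveConfigPath({}, "/home/dev");
// defaultPath is "/home/dev/.pi/pi-vox/config.json"
```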
Transcript cleanup
Speech-to-text often gets project names wrong, so pi-vox cleans up common mistakes before inserting the text.
Examples:
- py-coding agent → pi-coding-agent
- pie coding agent → pi-coding-agent
- bye coding agent → pi-coding-agent
- pyvox → pi-vox
- pytutor → pi-tutor
- pyoverwatch → pi-overwatch
You can add your own glossary entries in config:
{
  "transcriptGlossary": [
    { "canonical": "my-product", "aliases": ["my product", "mai product"] }
  ]
}
Or use the command:
/voice-glossary add my-product "my product" "mai product"
To turn cleanup off:
{
  "transcriptCleanup": false
}
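To make the glossary format concrete, here is a sketch of how entries with canonical and aliases fields could be applied to a transcript. The whole-phrase, case-insensitive matching and the names applyGlossary and escapeRegExp are assumptions for this example; pi-vox's real cleanup rules may work differently.

```typescript
// Mirrors the documented transcriptGlossary entry shape.
interface GlossaryEntry {
  canonical: string;
  aliases: string[];
}

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical cleanup pass: replaces each alias with its canonical form.
// The enabled flag plays the role of the transcriptCleanup setting.
function applyGlossary(
  transcript: string,
  glossary: GlossaryEntry[],
  enabled: boolean = true
): string {
  if (!enabled) {
    return transcript;
  }
  let result = transcript;
  for (const entry of glossary) {
    for (const alias of entry.aliases) {
      // Assumption: match the alias as a whole phrase, ignoring case.
      const pattern = new RegExp("\\b" + escapeRegExp(alias) + "\\b", "gi");
      // A replacer function keeps "$" in the canonical text literal.
      result = result.replace(pattern, () => entry.canonical);
    }
  }
  return result;
}

const cleaned = applyGlossary("open My Product and mai product", [
  { canonical: "my-product", aliases: ["my product", "mai product"] },
]);
// cleaned is "open my-product and my-product"
```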
Why it uses commands instead of hold-space
Some terminals handle key press/release events differently. Holding space can be unreliable, and it can interfere with normal typing.
So the default is simple and safe:
/voice-toggle
There is internal support for shortcuts and hold-to-talk, but the command workflow is the supported default.
Config defaults
{
  provider: 'elevenlabs',
  holdKey: 'space',
  holdToTalk: false,
  holdThresholdMs: 350,
  fallbackToggleShortcut: 'ctrl+v',
  cancelShortcut: 'escape',
  autoSubmit: false,
  appendMode: 'append',
  recorder: 'auto',
  transcriptCleanup: true,
  transcriptGlossary: undefined,
  transcriptReplacements: undefined
}
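One common way to combine defaults like these with a user's config.json is a shallow merge. The sketch below assumes that approach; the VoxConfig interface is inferred from the default values shown above, and the withDefaults helper is hypothetical rather than part of pi-vox's API.

```typescript
// Field types are inferred from the documented defaults (an assumption).
interface VoxConfig {
  provider: string;
  holdKey: string;
  holdToTalk: boolean;
  holdThresholdMs: number;
  fallbackToggleShortcut: string;
  cancelShortcut: string;
  autoSubmit: boolean;
  appendMode: string;
  recorder: string;
  transcriptCleanup: boolean;
  transcriptGlossary?: { canonical: string; aliases: string[] }[];
  // The shape of this setting is not documented here, so it stays opaque.
  transcriptReplacements?: unknown;
}

// The two optional fields are left unset, matching undefined in the defaults.
const DEFAULTS: VoxConfig = {
  provider: "elevenlabs",
  holdKey: "space",
  holdToTalk: false,
  holdThresholdMs: 350,
  fallbackToggleShortcut: "ctrl+v",
  cancelShortcut: "escape",
  autoSubmit: false,
  appendMode: "append",
  recorder: "auto",
  transcriptCleanup: true,
};

// Hypothetical helper: user settings override defaults key by key.
function withDefaults(user: Partial<VoxConfig>): VoxConfig {
  return { ...DEFAULTS, ...user };
}

// Only autoSubmit changes; every other field keeps its default.
const merged = withDefaults({ autoSubmit: true });
```

A shallow merge means a user-supplied transcriptGlossary replaces the whole list rather than extending it.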
Privacy notes
When you stop recording, pi-vox sends that audio to ElevenLabs for transcription.
It does not keep a recording history. Temporary audio files are deleted after transcription or cancellation.
Still, don't dictate secrets into any networked voice tool.
Development
pnpm install
pnpm test
pnpm check
pnpm pack:smoke
Local install while developing:
pi install /absolute/path/to/pi-vox
Inside pi:
/reload
/voice-status
/voice-toggle
Known limitations
- ElevenLabs is the only provider right now
- command-based toggle is the supported path
- hold-space is disabled by default
- no streaming partial transcripts yet
- no text-to-speech, wake word, or daemon
License
MIT