pi-extension-stt
Pi extension package that adds local microphone speech-to-text via faster-whisper.
Package details
Install pi-extension-stt from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-extension-stt
- Package: pi-extension-stt
- Version: 0.1.0-beta.1
- Published: Mar 28, 2026
- Downloads: 37/mo · 10/wk
- Author: zerone0x
- License: MIT
- Types: extension
- Size: 118.9 KB
- Dependencies: 0 dependencies · 1 peer
Pi manifest JSON
{
"extensions": [
"./dist/extension/index.js"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
Local-first, privacy-first speech-to-text for Pi.
This package adds a small set of slash commands that let you capture microphone
audio, transcribe it locally with faster-whisper, and insert the final text
into Pi's input editor.
This is intentionally not a desktop dictation app. v1 stays inside Pi:
voice -> text -> insert into Pi editor
The npm package installs the Pi extension itself. It does not install your
local Python runtime, ffmpeg, PortAudio, or Python modules, and it does not
grant microphone permissions for you.
The current beta path is optimized for macOS, especially Homebrew + Python venv.
Status
What works today:
- local microphone capture through a Python bridge
- local transcription through faster-whisper
- Pi slash commands for start, stop, cancel, and status
- transcript insertion into Pi's input editor
- preflight checks for Python, ffmpeg, and Python module availability
What is intentionally out of scope in v1:
- TTS
- cloud STT providers
- system-wide paste into other apps
- global hotkeys
- desktop UI outside Pi
- multi-backend support
- automatic message sending after transcription
Beta Quick Start
This is the shortest path for a real user on macOS with Homebrew available:
- Install the Pi extension:
pi install npm:pi-extension-stt
- Launch Pi:
pi
- Inside Pi, run:
/stt-bootstrap full
This creates or refreshes a dedicated bridge virtualenv for you, installs
ffmpeg and portaudio via Homebrew, and then prints a relaunch command with
PI_STT_PYTHON=....
If ffmpeg and portaudio are already available, /stt-bootstrap without
full is enough.
- Relaunch Pi with the generated command. A typical result looks like this:
PI_STT_PYTHON=$HOME/.venvs/pi-stt/bin/python \
PI_STT_MODEL=tiny \
pi
- Inside the relaunched Pi session, run:
/stt-setup
- After setup succeeds, use:
Ctrl+Alt+M
If you prefer not to prepare the model during setup, choose the quick path. If you want the smoothest first recording, choose the full path.
Requirements
You need these local dependencies:
- python3
- ffmpeg
- Python packages from src/bridge/requirements.txt
On macOS, a typical setup is:
brew install ffmpeg portaudio
python3 -m pip install \
"faster-whisper>=1.0.0" \
"numpy>=1.26.0" \
"sounddevice>=0.4.7" \
"socksio>=1.0.0"
If you prefer an isolated Python environment for the bridge:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -r src/bridge/requirements.txt
Install
For npm users, the normal extension install flow is:
pi install npm:pi-extension-stt
That only installs the Pi package. You still need the local dependencies from
the sections above. On macOS beta setups, the normal first-run path is
/stt-bootstrap, then a Pi relaunch, then /stt-setup.
To install from a source checkout instead, run these from the package directory:
npm install
npm run build
pi install .
Or load it directly while developing:
npm run build
pi -e ./dist/extension/index.js
Commands
After the extension is loaded, these commands are available:
- /stt-toggle
- /stt-bootstrap
- /stt-launch-command
- /stt-setup
- /stt-start
- /stt-stop
- /stt-cancel
- /stt-status
- /stt-prepare
- /stt-devices
- /stt-device
Recommended First Run
For the smoothest setup, use this order:
- Run /stt-bootstrap.
- Relaunch Pi with the generated PI_STT_PYTHON=... pi command.
- Run /stt-setup.
- Pick Quick check or Full setup.
- If needed, run /stt-device afterward to pin a specific microphone.
- Press Ctrl+Alt+M or run /stt-toggle to start, speak, then press it again to stop.
Shortcut
Ctrl+Alt+M
- idle or error: start STT
- starting: cancel startup
- listening: stop and insert transcript
If you prefer commands, /stt-toggle follows the same behavior.
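The shortcut behavior above amounts to a small state-to-action lookup. The following is a minimal sketch of that mapping, written in Python for illustration only; the extension itself ships as JavaScript, and the names SttState, TOGGLE_ACTIONS, and toggle_action, as well as the action labels, are hypothetical and not taken from the package source.

```python
from enum import Enum


class SttState(Enum):
    """STT states named in the shortcut description."""
    IDLE = "idle"
    ERROR = "error"
    STARTING = "starting"
    LISTENING = "listening"


# Hypothetical action labels; only the state-to-behavior pairing
# comes from the shortcut description.
TOGGLE_ACTIONS = {
    SttState.IDLE: "start",
    SttState.ERROR: "start",
    SttState.STARTING: "cancel",
    SttState.LISTENING: "stop_and_insert",
}


def toggle_action(state: SttState) -> str:
    """Return the action a single toggle press triggers in the given state."""
    return TOGGLE_ACTIONS[state]
```

A table lookup keeps the shortcut and the /stt-toggle command on one code path, so the two cannot drift apart.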
/stt-start
Runs preflight checks, starts the local bridge, and begins listening on the default microphone.
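As a rough illustration of what such preflight checks might look like, here is a minimal sketch using only the Python standard library. It is an assumption about the general shape of the checks, not the extension's actual implementation; the function name preflight and the message wording are hypothetical. The module names are the import names of the packages listed under Requirements.

```python
import importlib.util
import shutil


def preflight(python_cmd="python3",
              modules=("faster_whisper", "numpy", "sounddevice")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Executables are looked up on PATH, the same way a shell would find them.
    if shutil.which(python_cmd) is None:
        problems.append(f"{python_cmd} is unavailable")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg is unavailable")
    # Note: find_spec inspects the interpreter running this sketch, which may
    # differ from the interpreter named by python_cmd.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    if missing:
        problems.append("missing Python modules: " + ", ".join(missing))
    return problems
```

Running preflight() before starting a recording session surfaces every missing dependency at once instead of failing on the first one.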
/stt-stop
Stops listening, finalizes queued transcription, and inserts the transcript into Pi's current input editor. It does not auto-send the message.
/stt-cancel
Stops listening and discards the current transcript buffer.
/stt-status
Shows the current STT state, model, language, selected device, and transcript summary.
/stt-toggle
Single-command STT toggle. It starts recording when idle, stops and inserts the transcript when listening, and cancels startup while the bridge is still preparing.
/stt-setup
Guided first-run setup. This is the normal user-facing entry point.
quick
- checks Python, ffmpeg, microphones, and the selected input path
- skips model download
full
- does the same checks
- also prepares the local model cache
device
- opens the microphone picker directly
Examples:
/stt-setup
/stt-setup quick
/stt-setup full
/stt-setup device
/stt-bootstrap
Guided local bootstrap for beta users on macOS.
python
- creates or refreshes a dedicated virtualenv
- installs bridge Python dependencies from src/bridge/requirements.txt
- leaves Homebrew packages unchanged
full
- does the same Python bootstrap
- also runs brew install ffmpeg portaudio
Examples:
/stt-bootstrap
/stt-bootstrap python
/stt-bootstrap full
/stt-bootstrap ~/.venvs/pi-stt
Notes:
- this command does not mutate the current Pi process environment in-place
- full assumes Homebrew is already installed
- if the generated Python path differs from the current PI_STT_PYTHON, you need to relaunch Pi with the printed command
- after relaunch, run /stt-setup
/stt-launch-command
Shows the most recent relaunch command generated by /stt-bootstrap.
Use this when you missed the original success notification but still need the
exact PI_STT_PYTHON=... pi command.
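For readers who want to reconstruct a relaunch command by hand, the sketch below shows one way such a line could be assembled from a virtualenv directory. It assumes the POSIX venv layout (bin/python inside the venv directory); the function name relaunch_command is hypothetical, and the real command printed by /stt-bootstrap may include other variables.

```python
import os
import shlex
from pathlib import PurePosixPath


def relaunch_command(venv_dir: str, model: str = "") -> str:
    """Build a 'PI_STT_PYTHON=... pi' line for a POSIX shell (assumed layout)."""
    # Expand a leading ~ so the printed path is absolute.
    python_path = PurePosixPath(os.path.expanduser(venv_dir)) / "bin" / "python"
    parts = [f"PI_STT_PYTHON={shlex.quote(str(python_path))}"]
    if model:
        parts.append(f"PI_STT_MODEL={shlex.quote(model)}")
    parts.append("pi")
    return " ".join(parts)
```

Quoting with shlex.quote keeps the command safe to paste even when the virtualenv path contains spaces.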
/stt-prepare
Pre-downloads and validates the configured faster-whisper model before you
start recording. This is the best way to avoid a long first /stt-start.
/stt-devices
Lists the currently detected input devices and marks the default microphone.
/stt-device
Opens an interactive picker for the active STT input device. You can also pass an explicit device id or partial device name. The extension verifies that the selected device can actually be opened before saving it:
/stt-device 0
/stt-device MacBook
/stt-device clear
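The three argument forms (numeric id, partial name, clear) can be pictured as the following lookup. This is a minimal sketch under stated assumptions: the function name resolve_device, the (id, name) tuple shape, case-insensitive matching, and the rejection of ambiguous names are all illustrative choices, not confirmed behavior of the extension. The sketch also omits the step where the extension verifies that the device can actually be opened.

```python
def resolve_device(selector, devices):
    """Map a /stt-device style argument to a device id.

    devices is a list of (id, name) tuples. Returns None for "clear",
    meaning "fall back to the system default microphone".
    """
    selector = selector.strip()
    if selector.lower() == "clear":
        return None
    if selector.isdigit():
        wanted = int(selector)
        if any(dev_id == wanted for dev_id, _name in devices):
            return wanted
        raise ValueError(f"no input device with id {wanted}")
    # Assumed behavior: case-insensitive substring match on the device name.
    matches = [dev_id for dev_id, name in devices
               if selector.lower() in name.lower()]
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} devices match {selector!r}")
    return matches[0]
```

Refusing an ambiguous partial name is a deliberate choice in this sketch: pinning the wrong microphone silently would be harder to diagnose than an explicit error.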
Environment Variables
Minimal configuration is done through environment variables:
- PI_STT_PYTHON
- PI_STT_BOOTSTRAP_VENV
- PI_STT_MODEL
- PI_STT_LANGUAGE
- PI_STT_DEVICE
- PI_STT_SILENCE_MS
- PI_STT_BLOCK_MS
- PI_STT_MIN_SPEECH_MS
- PI_STT_PRE_ROLL_MS
- PI_STT_STOP_GRACE_MS
- PI_STT_SPEECH_THRESHOLD
- PI_STT_PREPARE_TIMEOUT_MS
- PI_STT_START_TIMEOUT_MS
Defaults:
- Python: python3
- model: base
- language: auto-detect
- silence window: 1200ms
- block size: 200ms
- minimum speech window: 200ms
- pre-roll: 300ms
- stop grace window: 300ms
- speech threshold: 0.006
- prepare timeout: 300000ms
- startup timeout: 30000ms
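To make the relationship between variables and defaults concrete, here is a minimal sketch of a loader that overlays the environment on the documented defaults. Pairing each default with a variable is inferred from the names and is an assumption; PI_STT_LANGUAGE and PI_STT_DEVICE are omitted because their defaults (auto-detect, system microphone) are not literal values. The names DEFAULTS, load_config, and silence_blocks are hypothetical, and reading the silence window as a count of consecutive quiet blocks is one plausible interpretation, not documented behavior.

```python
import math
import os

# Default values as documented; the variable pairing is inferred from names.
DEFAULTS = {
    "PI_STT_PYTHON": "python3",
    "PI_STT_MODEL": "base",
    "PI_STT_SILENCE_MS": "1200",
    "PI_STT_BLOCK_MS": "200",
    "PI_STT_MIN_SPEECH_MS": "200",
    "PI_STT_PRE_ROLL_MS": "300",
    "PI_STT_STOP_GRACE_MS": "300",
    "PI_STT_SPEECH_THRESHOLD": "0.006",
    "PI_STT_PREPARE_TIMEOUT_MS": "300000",
    "PI_STT_START_TIMEOUT_MS": "30000",
}


def load_config(env=None):
    """Return the effective settings: environment values override defaults."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


def silence_blocks(config):
    """Quiet blocks needed to fill the silence window (assumed interpretation)."""
    return math.ceil(int(config["PI_STT_SILENCE_MS"]) / int(config["PI_STT_BLOCK_MS"]))
```

With the documented defaults, a 1200ms silence window over 200ms blocks works out to 6 blocks under this reading.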
Notes:
- PI_STT_MODEL can be a faster-whisper model name like tiny, base, or a local model directory path
- PI_STT_PYTHON is useful when the bridge runs from a project-local virtualenv, for example PI_STT_PYTHON=/path/to/pi-extension-stt/.venv/bin/python
- PI_STT_BOOTSTRAP_VENV changes the default target path used by /stt-bootstrap
- the recommended beta path is a dedicated virtualenv, for example PI_STT_PYTHON=$HOME/.venvs/pi-stt/bin/python
- if direct access to Hugging Face is blocked, you can set HF_ENDPOINT=https://hf-mirror.com before running Pi
- on proxy-heavy setups, model downloads may work better after unsetting http_proxy, https_proxy, and all_proxy
- if STT hears the microphone but still misses speech, try a slightly lower gate, for example PI_STT_SPEECH_THRESHOLD=0.004
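To see why lowering the threshold helps with quiet speech, consider a simple level gate. The sketch below assumes the gate compares the RMS amplitude of a block of float samples in the range -1.0 to 1.0 against the threshold; the README does not state which level measure the bridge actually uses, so treat this as an illustration of the idea rather than the bridge's algorithm. The names rms_level and is_speech are hypothetical.

```python
import math


def rms_level(samples):
    """Root-mean-square amplitude of a block of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples, threshold=0.006):
    """Assumed gate: a block counts as speech when its RMS reaches the threshold."""
    return rms_level(samples) >= threshold
```

Under this assumption, a block with an RMS level of about 0.005 is rejected at the default 0.006 but accepted at 0.004, which matches the advice to try a slightly lower gate.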
Failure Modes
Common failures and how they surface:
- missing python3: /stt-start reports that Python is unavailable
- missing ffmpeg: /stt-start reports that ffmpeg is unavailable
- missing Python modules: /stt-start points you to pip install -r src/bridge/requirements.txt
- first-time setup confusion: /stt-setup now gives a guided path and clear next steps
- first-time local environment setup: /stt-bootstrap can create the bridge virtualenv and print the exact relaunch command for Pi
- missing Homebrew in /stt-bootstrap full: install Homebrew first, or use /stt-bootstrap plus manual system dependencies
- slow or blocked model download: /stt-prepare or /stt-start tells you to try HF_ENDPOINT=https://hf-mirror.com or a local model path
- microphone permission or device problems: the bridge reports a startup error
- bad device choice: /stt-devices shows available inputs and /stt-device clear resets back to the system default microphone
- quiet speech or quick stop: the bridge now shows the observed mic level and suggested next steps; if needed, lower PI_STT_SPEECH_THRESHOLD or wait a fraction longer before stopping
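For the slow or blocked download case, the two workarounds (mirror endpoint, unset proxies) can be combined when launching Pi from a script. This is a minimal sketch that only prepares an environment mapping; the function name download_env and the use_mirror flag are hypothetical, and whether a mirror or proxy change helps depends on your network.

```python
import os

# Lowercase proxy variables named in the notes on proxy-heavy setups.
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy")


def download_env(base_env=None, use_mirror=False):
    """Return a copy of the environment adjusted for model downloads."""
    env = dict(os.environ if base_env is None else base_env)
    # Drop proxy settings so downloads go direct.
    for name in PROXY_VARS:
        env.pop(name, None)
    if use_mirror:
        env["HF_ENDPOINT"] = "https://hf-mirror.com"
    return env
```

The returned mapping can be passed as the env argument of subprocess.run when starting pi, which leaves the calling shell's own environment untouched.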
UX Notes
- /stt-bootstrap is the normal first-run automation path on macOS beta setups.
- /stt-setup is the guided diagnostic and model-prep path after bootstrap.
- /stt-launch-command replays the exact relaunch command if you missed the original bootstrap notification.
- While STT is starting or listening, the extension adds a small widget above the Pi editor with current state and next-step hints.
- The widget also stays visible while idle, so you always have a compact “start dictation” affordance in-session, plus bootstrap/setup guidance if the local environment is still incomplete.
- While listening, the widget now shows a live mic/gate hint so you can tell whether the bridge is hearing enough signal to start a segment.
- /stt-stop inserts the final transcript into the current Pi editor buffer and never auto-sends the message.
- /stt-cancel can abort both active listening and long startup/model-prepare steps.
Development
Type-check and build:
npm run check
npm run build
Run the lightweight tests:
npm test