@dtmirizzi/pi-openrouter-multimodal

OpenRouter multimodal tools for Pi — search, fetch, image gen, vision, video, PDF, TTS, STT

Package details

Install @dtmirizzi/pi-openrouter-multimodal from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@dtmirizzi/pi-openrouter-multimodal

Package: @dtmirizzi/pi-openrouter-multimodal
Version: 1.6.0
Published: May 30, 2026
Downloads: not available
Author: dtizzal
License: MIT
Types: extension
Size: 69.1 KB
Dependencies: 0 dependencies · 4 peers

Pi manifest JSON

{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

@dtmirizzi/pi-openrouter-multimodal

OpenRouter multimodal tool integration for Pi. Provides 8 independently toggleable tools with per-modality model selection and session-persistent settings.

Tool	What it does
`web_search`	Server-side web search with real-time results
`web_fetch`	Fetch page content from a URL (web, docs, PDFs)
`image_generate`	Text-to-image generation via OpenRouter chat completions
`image_understand`	Analyze images via vision models
`video_understand`	Analyze videos (YouTube links work with Gemini)
`pdf_read`	Extract and analyze PDF content
`tts_speak`	Text-to-speech via OpenRouter `/audio/speech` endpoint
`stt_transcribe`	Speech-to-text via OpenRouter `/audio/transcriptions` endpoint

Install

pi install npm:@dtmirizzi/pi-openrouter-multimodal

Or from a local checkout:

pi install /path/to/pi-openrouter-multimodal

API Key

The extension resolves the OpenRouter API key from the following sources, in priority order:

1. OPENROUTER_API_KEY environment variable
2. Pi model registry (provider openrouter)
3. ~/.pi/agent/models.json under providers.openrouter.apiKey
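
The priority order above can be sketched as a small pure function. This is an illustration only, not the extension's actual code: `KeySources`, `ModelsFile`, and `resolveOpenRouterKey` are hypothetical names, and `registryKey` stands in for whatever the Pi model registry returns for the openrouter provider.

```typescript
// Hypothetical shape of models.json; only the providers.openrouter.apiKey
// path is taken from the README.
interface ModelsFile {
  providers?: { openrouter?: { apiKey?: string } };
}

// Hypothetical bundle of the three sources, passed in so the lookup stays pure.
interface KeySources {
  env: Record<string, string | undefined>; // e.g. process.env
  registryKey?: string; // stand-in for the Pi model registry lookup
  modelsJson?: string; // raw contents of ~/.pi/agent/models.json, if readable
}

function resolveOpenRouterKey(sources: KeySources): string | undefined {
  // 1. The environment variable has the highest priority.
  const fromEnv = sources.env["OPENROUTER_API_KEY"];
  if (fromEnv) return fromEnv;

  // 2. Next, the key registered for the openrouter provider.
  if (sources.registryKey) return sources.registryKey;

  // 3. Finally, providers.openrouter.apiKey from models.json.
  if (sources.modelsJson) {
    try {
      const parsed = JSON.parse(sources.modelsJson) as ModelsFile;
      return parsed.providers?.openrouter?.apiKey || undefined;
    } catch {
      return undefined; // malformed file: treat as "no key"
    }
  }
  return undefined;
}
```

Passing the sources in as arguments keeps the ordering easy to test without touching the real environment or file system.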

Commands

Command	Description
`/web-tools`	Toggle tools on/off and configure search/fetch engines
`/web-models`	Select models per modality (image, vision, video, PDF, TTS voice, STT)
`/web-search`	Toggle `web_search` and configure search engine
`/web-fetch`	Toggle `web_fetch` and configure fetch engine

Each command opens an interactive overlay. Use ↑↓ to navigate, ←→ to cycle values, Esc to close. Settings persist across sessions and survive compaction, shutdown, and tree navigation.

`/web-tools`

Toggle each tool on/off and set search/fetch engine preferences. Also includes a verbose/compact status-bar display toggle.

`/web-models`

Select the model for each modality from a list fetched live from the OpenRouter API at startup. Falls back to a comprehensive built-in list if the API is unavailable.

Tools

web_search

Parameter	Type	Default	Description
`query`	string	required	Search query
`engine`	string	auto	auto, native, exa, firecrawl, parallel
`max_results`	integer	5	Results per search (1-25)
`search_context_size`	string	—	low (5K), medium (15K), high (30K)
`allowed_domains`	string[]	—	Only return results from these domains
`excluded_domains`	string[]	—	Exclude results from these domains
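
As an illustration of the parameter table above, the arguments could be typed and normalized roughly as follows. The extension's real validation is not shown in this README, so `WebSearchArgs` and `normalizeWebSearchArgs` are hypothetical names and the clamping behavior is an assumption.

```typescript
type SearchEngine = "auto" | "native" | "exa" | "firecrawl" | "parallel";
type SearchContextSize = "low" | "medium" | "high";

// Mirrors the web_search parameter table; the type name is hypothetical.
interface WebSearchArgs {
  query: string;
  engine?: SearchEngine;
  max_results?: number;
  search_context_size?: SearchContextSize;
  allowed_domains?: string[];
  excluded_domains?: string[];
}

// Applies the documented defaults (engine "auto", max_results 5) and
// clamps max_results into the documented 1-25 range.
function normalizeWebSearchArgs(args: WebSearchArgs): WebSearchArgs {
  if (args.query.trim() === "") throw new Error("query is required");
  const requested = Math.trunc(args.max_results ?? 5);
  return {
    ...args,
    engine: args.engine ?? "auto",
    max_results: Math.min(25, Math.max(1, requested)),
  };
}

// Example arguments restricted to a single (hypothetical) domain.
const exampleSearch = normalizeWebSearchArgs({
  query: "biome formatter configuration",
  allowed_domains: ["example.com"],
});
```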

web_fetch

Parameter	Type	Default	Description
`url`	string	required	URL to fetch content from
`engine`	string	auto	auto, native, exa, openrouter, firecrawl, parallel
`max_content_tokens`	integer	—	Max content length (approximate tokens)

image_generate

Parameter	Type	Default	Description
`prompt`	string	required	Text prompt describing the image
`model`	string	state	Override the default model from `/web-models`

The default model is selected via /web-models. Models are fetched live from OpenRouter at startup; the fallback list includes Gemini Flash Image, GPT-5 Image, FLUX.2, Seedream, Riverflow, Recraft, Grok Imagine, and more.

image_understand

Parameter	Type	Default	Description
`url`	string	required	Image URL or base64 data URL
`prompt`	string	Describe this image in detail	Analysis prompt
`model`	string	state	Override default from `/web-models`

video_understand

Parameter	Type	Default	Description
`url`	string	required	Video URL (YouTube links work with Gemini)
`prompt`	string	Describe what happens in this video	Analysis prompt
`model`	string	state	Override default from `/web-models`

pdf_read

Parameter	Type	Default	Description
`url`	string	required	URL of the PDF document
`prompt`	string	Summarize this document	Analysis prompt
`model`	string	state	Override default from `/web-models`
`engine`	string	cloudflare-ai	cloudflare-ai (free), mistral-ocr (scanned docs), or native
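
A rough sketch of how these parameters might map onto a chat completions request body. The field names (`plugins`, `file-parser`, `file_data`) follow OpenRouter's published PDF request format as best understood here; the extension's actual payload is not shown in this README, and `pdfReadBody`, the `document.pdf` filename, and the model ID used below are hypothetical.

```typescript
type PdfEngine = "cloudflare-ai" | "mistral-ocr" | "native";

// Builds a request body with a file content block and the file-parser plugin.
// Defaults for prompt and engine are taken from the parameter table above.
function pdfReadBody(
  model: string,
  url: string,
  prompt: string = "Summarize this document",
  engine: PdfEngine = "cloudflare-ai",
) {
  return {
    model,
    plugins: [{ id: "file-parser", pdf: { engine } }],
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // Assumed: the PDF URL is passed as file_data; filename is a placeholder.
          { type: "file", file: { filename: "document.pdf", file_data: url } },
        ],
      },
    ],
  };
}

// "example-org/example-model" is a placeholder, not a real model ID.
const pdfBody = pdfReadBody(
  "example-org/example-model",
  "https://example.com/report.pdf",
  "List the key findings",
  "mistral-ocr",
);
```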

tts_speak

Parameter	Type	Default	Description
`text`	string	required	Text to convert to speech
`model`	string	state	Override default from `/web-models`
`voice`	string	state	Override default from `/web-models`

stt_transcribe

Parameter	Type	Default	Description
`audio`	string	required	Base64-encoded audio data
`format`	string	required	wav, mp3, flac, m4a, ogg, webm, aac
`model`	string	state	Override default from `/web-models`
`language`	string	—	ISO-639-1 language code (optional)
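
The following sketch shows one way a caller might assemble these arguments from already-encoded audio, inferring `format` from a file name. It is illustrative only: `SttArgs`, `sttFormatFromFilename`, and `buildSttArgs` are hypothetical helpers, not part of the extension.

```typescript
// The formats listed in the parameter table above.
const STT_FORMATS = ["wav", "mp3", "flac", "m4a", "ogg", "webm", "aac"] as const;
type SttFormat = (typeof STT_FORMATS)[number];

interface SttArgs {
  audio: string; // base64-encoded audio data
  format: SttFormat;
  model?: string;
  language?: string; // ISO-639-1 code, e.g. "en"
}

// Returns the supported format matching the file extension, if any.
function sttFormatFromFilename(filename: string): SttFormat | undefined {
  const ext = filename.split(".").pop()?.toLowerCase();
  return STT_FORMATS.find((f) => f === ext);
}

// Builds the tool arguments, rejecting unsupported extensions early.
function buildSttArgs(base64Audio: string, filename: string, language?: string): SttArgs {
  const format = sttFormatFromFilename(filename);
  if (!format) throw new Error(`unsupported audio format: ${filename}`);
  return language
    ? { audio: base64Audio, format, language }
    : { audio: base64Audio, format };
}
```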

How It Works

All tools proxy requests through OpenRouter's API:

web_search / web_fetch — Chat completions with server tool definitions (openrouter:web_search / openrouter:web_fetch)
image_generate — Chat completions with modalities: ["image", "text"] on the selected image generation model
image_understand / video_understand — Chat completions with multimodal content blocks (image_url, video_url)
pdf_read — Chat completions with file content block and file-parser plugin
tts_speak — Direct call to /api/v1/audio/speech
stt_transcribe — Direct call to /api/v1/audio/transcriptions
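
To make the image_generate and image_understand entries above concrete, here is a minimal sketch of the two request bodies. It assumes the standard OpenRouter chat completions message shape; the function names are hypothetical, the caller supplies the model ID, and the extension's real payloads may carry additional fields.

```typescript
// Body for image generation: a plain text prompt plus the
// modalities: ["image", "text"] field mentioned above.
function imageGenerateBody(model: string, prompt: string) {
  return {
    model,
    modalities: ["image", "text"],
    messages: [{ role: "user", content: prompt }],
  };
}

// Body for image understanding: a text block followed by an image_url
// content block. The default prompt matches the image_understand table.
function imageUnderstandBody(
  model: string,
  url: string,
  prompt: string = "Describe this image in detail",
) {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url } },
        ],
      },
    ],
  };
}
```

Either object would be serialized with `JSON.stringify` and sent as the body of a chat completions request, with the API key supplied as a bearer token.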

Model Discovery

On startup, the extension fetches available models from GET /api/v1/models?output_modalities=... and caches them for use in the /web-models settings panel. If the API is unreachable, a comprehensive set of fallback models is used.
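
The fetch-then-fall-back behavior described above might look roughly like this. It is a sketch under assumptions: the response is taken to have the `{ data: [{ id }] }` shape of OpenRouter's models endpoint, the `Fetcher` type is a hypothetical seam so the function can run without a network, and `modelIdsOrFallback` and `discoverModels` are not names from the extension.

```typescript
// Minimal fetch-like interface so a stub can replace the global fetch.
type Fetcher = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Extracts model IDs from a response body, or returns the fallback list
// when the body has no usable entries.
function modelIdsOrFallback(body: unknown, fallback: string[]): string[] {
  const data = (body as { data?: unknown } | null)?.data;
  if (!Array.isArray(data)) return fallback;
  const ids = data
    .map((m) => (m as { id?: unknown } | null)?.id)
    .filter((id): id is string => typeof id === "string");
  return ids.length > 0 ? ids : fallback;
}

// Queries the models endpoint for one output modality; any network error or
// non-OK status yields the fallback list instead of throwing.
async function discoverModels(
  modality: string,
  fallback: string[],
  fetcher: Fetcher,
): Promise<string[]> {
  const url =
    "https://openrouter.ai/api/v1/models?output_modalities=" +
    encodeURIComponent(modality);
  try {
    const res = await fetcher(url);
    if (!res.ok) return fallback;
    return modelIdsOrFallback(await res.json(), fallback);
  } catch {
    return fallback;
  }
}
```

In real use the global `fetch` could be passed as the fetcher; injecting it keeps the fallback path testable offline.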

Development

# Install dependencies
npm install

# Run tests
npm test               # all tests
npm run test:unit      # unit tests only
npm run test:integration  # requires OPENROUTER_API_KEY

# Format and lint
npm run fmt            # format all files
npm run lint           # lint + fix all files
npm run check          # format + lint + organize imports
npm run check:ci       # strict CI check (format + lint, no writes)

The repo uses Biome for formatting and linting. CI enforces both on every push and PR.
