pi-inspect-image
Pi extension that registers an inspect_image tool — analyzes local image files using a configurable vision-capable model via OpenAI-compatible API
Package details
Install pi-inspect-image from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-inspect-image
- Package: pi-inspect-image
- Version: 1.0.1
- Published: May 27, 2026
- Downloads: not available
- Author: tanjeeschuan
- License: MIT
- Types: extension
- Size: 16.2 KB
- Dependencies: 0 dependencies · 2 peers
Pi manifest JSON
{
"extensions": [
"./extensions"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-inspect-image
A pi extension that analyzes images using a separate vision-capable model — independent of your main chat model. This enables vision capabilities for non-vision models (e.g. Haiku, Sonnet) via tool calls to a selected vision model.
When to use it:
- Your primary chat model doesn't support vision — route image analysis to a dedicated model.
- You want to keep large image payloads out of your main conversation context — only the derived description enters your chat.
- You need specialized image tasks: describing screenshots, extracting text via OCR, analyzing UI mockups, reading diagrams or charts, debugging visual layout issues, or inspecting error screenshots.
- You're working with visual artifacts (designs, photos, graphs) and want pi to understand them before generating code.
Supported providers: OpenAI, OpenRouter, and any OpenAI-compatible API.
Setup
Run /setup-vision for a 2-step interactive configuration:
/setup-vision
- Pick a provider: auto-populated from providers you've authenticated via /login that have vision models.
- Enter the model ID: e.g. gpt-4o, openai/gpt-4o, or any model available on your provider.
The extension validates your entry against the model registry and warns if the model isn't recognized or lacks vision support (both are advisory — the call will still be attempted). Configuration is saved to .pi/settings.json.
You can also configure manually:
{
"visionConfig": {
"provider": "openai",
"model": "gpt-4o"
}
}
- Project settings (.pi/settings.json): shared with your team, set via /setup-vision.
- Global settings (~/.pi/agent/settings.json): personal, edit manually.
API keys
Configure via /login or set the appropriate environment variable:
| Provider | Env var |
|---|---|
| OpenAI | OPENAI_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
| Custom | Any — handled by /login or auth.json |
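The table above can be read as a simple lookup from provider name to environment variable. The following is a minimal sketch of that lookup, not the extension's actual code: `resolveApiKey`, `PROVIDER_ENV_VARS`, and the `Env` type are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper: maps the built-in providers to the env vars listed in the table.
const PROVIDER_ENV_VARS = new Map<string, string>([
  ["openai", "OPENAI_API_KEY"],
  ["openrouter", "OPENROUTER_API_KEY"],
]);

// Assumption: the environment is supplied by the caller (e.g. process.env in Node).
type Env = Record<string, string | undefined>;

function resolveApiKey(provider: string, env: Env): string | undefined {
  const varName = PROVIDER_ENV_VARS.get(provider.toLowerCase());
  if (varName === undefined) {
    // Custom provider: no fixed env var, credentials come from /login or auth.json.
    return undefined;
  }
  const value = env[varName];
  // Treat an empty string the same as an unset variable.
  return value !== undefined && value.length > 0 ? value : undefined;
}
```

In Node, a call such as `resolveApiKey("openai", process.env)` would return the key if `OPENAI_API_KEY` is set, and `undefined` otherwise.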
Configuration
| Field | Required | Description |
|---|---|---|
| provider | Yes | "openai", "openrouter", or any custom provider name |
| model | Yes | Model ID (e.g. "gpt-4o", "openai/gpt-4o") |
| baseUrl | No | Custom API base URL for compatible providers (auto-resolved for OpenAI/OpenRouter) |
| maxTokens | No | Max tokens in response (default: 4096) |
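As an illustration of how these fields fit together, here is a minimal sketch that validates a `visionConfig` object and applies the documented `maxTokens` default. The `VisionConfig` interface and `parseVisionConfig` function are hypothetical names, and the exact checks the extension performs may differ.

```typescript
// Hypothetical shape mirroring the fields in the configuration table.
interface VisionConfig {
  provider: string;
  model: string;
  baseUrl?: string;
  maxTokens: number;
}

const DEFAULT_MAX_TOKENS = 4096; // default stated in the table

function requiredString(value: unknown, field: string): string {
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`${field} is required and must be a non-empty string`);
  }
  return value;
}

function optionalString(value: unknown, field: string): string | undefined {
  if (value === undefined) return undefined;
  if (typeof value !== "string") throw new Error(`${field} must be a string`);
  return value;
}

function parseVisionConfig(raw: unknown): VisionConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("visionConfig must be an object");
  }
  const cfg = raw as Record<string, unknown>;
  const rawMax = cfg["maxTokens"];
  let maxTokens = DEFAULT_MAX_TOKENS;
  if (rawMax !== undefined) {
    // Assumption: maxTokens must be a positive integer when present.
    if (typeof rawMax !== "number" || !Number.isInteger(rawMax) || rawMax <= 0) {
      throw new Error("maxTokens must be a positive integer");
    }
    maxTokens = rawMax;
  }
  return {
    provider: requiredString(cfg["provider"], "provider"),
    model: requiredString(cfg["model"], "model"),
    baseUrl: optionalString(cfg["baseUrl"], "baseUrl"),
    maxTokens,
  };
}
```

Passing `{ provider: "openai", model: "gpt-4o" }` yields a config with `maxTokens` set to 4096, while omitting `model` throws.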
Supported Image Formats
PNG, JPEG, GIF, WebP, BMP — up to 20 MB.
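A pre-flight check for these limits might look like the sketch below. It is an assumption-laden illustration rather than the extension's implementation: `checkImage` is a hypothetical name, the format is inferred from the file extension only (the real tool may inspect file contents), and "20 MB" is interpreted as 20 × 1024 × 1024 bytes.

```typescript
// Supported formats from the list above, keyed by lowercase file extension.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
  ["bmp", "image/bmp"],
]);

// Assumption: 20 MB means 20 MiB; the extension may count bytes differently.
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

type ImageCheck =
  | { ok: true; mimeType: string }
  | { ok: false; reason: string };

function checkImage(path: string, sizeBytes: number): ImageCheck {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
  const mimeType = MIME_BY_EXTENSION.get(ext);
  if (mimeType === undefined) {
    return { ok: false, reason: `unsupported format: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    return { ok: false, reason: `file is ${sizeBytes} bytes, limit is ${MAX_IMAGE_BYTES}` };
  }
  return { ok: true, mimeType };
}
```

The caller supplies the file size (for example from `fs.statSync(path).size` in Node), which keeps the check itself free of file-system access.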
Usage
The extension registers an inspect_image tool. Once configured, ask pi to describe an image and it will use this tool automatically.
You can also pass a custom prompt to guide the analysis:
inspect_image("screenshot.png", "Extract all text visible in this image")
If no vision model is configured when the tool runs, you'll be guided through setup on the spot.
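For readers curious about what a call to an OpenAI-compatible vision model typically involves, the sketch below builds a Chat Completions request body with the image embedded as a base64 data URL. This shows the general request format, not necessarily how this extension constructs its requests; `buildVisionRequest` and the type names are hypothetical.

```typescript
// Content parts in the OpenAI-style chat format: text plus an image reference.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequest {
  model: string;
  max_tokens: number;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Hypothetical builder; imageBase64 is the already-encoded file content.
function buildVisionRequest(
  model: string,
  prompt: string,
  imageBase64: string,
  mimeType: string,
  maxTokens: number = 4096, // mirrors the documented maxTokens default
): VisionRequest {
  return {
    model,
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // The image travels inline as a data URL rather than a remote link.
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
        ],
      },
    ],
  };
}
```

Such a body would normally be sent as JSON in a POST to the provider's `/chat/completions` endpoint under the configured base URL, with the API key in an `Authorization: Bearer` header. Only the text of the model's reply would then need to be returned to the main conversation.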
License
MIT