pi-inspect-image

Pi extension that registers an inspect_image tool, which analyzes local image files using a configurable vision-capable model via an OpenAI-compatible API.

Install pi-inspect-image from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-inspect-image

Package: pi-inspect-image
Version: 1.0.1
Published: May 27, 2026
Downloads: not available
Author: tanjeeschuan
License: MIT
Types: extension
Size: 16.2 KB
Dependencies: 0 dependencies · 2 peers

Pi manifest JSON

{
  "extensions": [
    "./extensions"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-inspect-image

A pi extension that analyzes images using a separate vision-capable model — independent of your main chat model. This enables vision capabilities for non-vision models (e.g. Haiku, Sonnet) via tool calls to a selected vision model.

When to use it:

- Your primary chat model doesn't support vision: route image analysis to a dedicated model.
- You want to keep large image payloads out of your main conversation context: only the derived description enters your chat.
- You need specialized image tasks: describing screenshots, extracting text via OCR, analyzing UI mockups, reading diagrams or charts, debugging visual layout issues, or inspecting error screenshots.
- You're working with visual artifacts (designs, photos, graphs) and want pi to understand them before generating code.

Supported providers: OpenAI, OpenRouter, and any OpenAI-compatible API.

Setup

Run /setup-vision for a 2-step interactive configuration:

/setup-vision

1. Pick a provider: the list is auto-populated from providers you've authenticated via /login that have vision models.
2. Enter the model ID: e.g. gpt-4o, openai/gpt-4o, or any model available on your provider.

The extension validates your entry against the model registry and warns if the model isn't recognized or lacks vision support (both are advisory — the call will still be attempted). Configuration is saved to .pi/settings.json.

You can also configure manually:

{
  "visionConfig": {
    "provider": "openai",
    "model": "gpt-4o"
  }
}

- Project settings (.pi/settings.json): shared with your team, set via /setup-vision.
- Global settings (~/.pi/agent/settings.json): personal, edit manually.

API keys

Configure via /login or set the appropriate environment variable:

| Provider | Env var |
| --- | --- |
| OpenAI | `OPENAI_API_KEY` |
| OpenRouter | `OPENROUTER_API_KEY` |
| Custom | Any (handled by `/login` or `auth.json`) |
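
As a rough illustration of the table above, the lookup for the two built-in providers could be sketched like this. The helper name `resolveApiKeyFromEnv` and the `Env` type are hypothetical and are not part of the extension's API; only the two environment variable names come from the table. Custom providers return `undefined` here because their keys are handled by `/login` or `auth.json`.

```typescript
// Env var names are taken from the table above; everything else is illustrative.
const ENV_VAR_BY_PROVIDER = new Map<string, string>([
  ["openai", "OPENAI_API_KEY"],
  ["openrouter", "OPENROUTER_API_KEY"],
]);

// A plain key/value view of the environment, e.g. process.env in Node.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the key for a built-in provider, or undefined
// when the provider is custom or the variable is unset or empty.
function resolveApiKeyFromEnv(provider: string, env: Env): string | undefined {
  const varName = ENV_VAR_BY_PROVIDER.get(provider);
  if (varName === undefined) {
    return undefined; // custom provider: not resolved from a fixed env var
  }
  const value = env[varName];
  return value !== undefined && value.length > 0 ? value : undefined;
}
```

In a Node process, passing `process.env` as the second argument would be the natural call.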

Configuration

| Field | Required | Description |
| --- | --- | --- |
| `provider` | Yes | `"openai"`, `"openrouter"`, or any custom provider name |
| `model` | Yes | Model ID (e.g. `"gpt-4o"`, `"openai/gpt-4o"`) |
| `baseUrl` | No | Custom API base URL for compatible providers (auto-resolved for OpenAI/OpenRouter) |
| `maxTokens` | No | Max tokens in response (default: 4096) |
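
The fields above can be expressed as a type, together with a small validator that applies the documented default. This is a minimal sketch, not the extension's own code: `ResolvedVisionConfig`, `parseVisionConfig`, and the error messages are assumptions made for illustration, and the rule that `maxTokens` must be a positive integer is a guess rather than documented behavior.

```typescript
// Shape of a validated "visionConfig" entry, based on the field table above.
interface ResolvedVisionConfig {
  provider: string;
  model: string;
  baseUrl: string | undefined; // optional in settings.json
  maxTokens: number;           // falls back to the documented default
}

const DEFAULT_MAX_TOKENS = 4096; // default stated in the table

// Hypothetical validator: checks required fields and fills in the default.
function parseVisionConfig(raw: unknown): ResolvedVisionConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("visionConfig must be an object");
  }
  const cfg = raw as Record<string, unknown>;

  const provider = cfg["provider"];
  if (typeof provider !== "string" || provider === "") {
    throw new Error("visionConfig.provider is required");
  }
  const model = cfg["model"];
  if (typeof model !== "string" || model === "") {
    throw new Error("visionConfig.model is required");
  }

  const rawBaseUrl = cfg["baseUrl"];
  let baseUrl: string | undefined;
  if (rawBaseUrl !== undefined) {
    if (typeof rawBaseUrl !== "string") {
      throw new Error("visionConfig.baseUrl must be a string");
    }
    baseUrl = rawBaseUrl;
  }

  const rawMaxTokens = cfg["maxTokens"];
  let maxTokens = DEFAULT_MAX_TOKENS;
  if (rawMaxTokens !== undefined) {
    // Assumption: only positive integers make sense as a token limit.
    if (typeof rawMaxTokens !== "number" || !Number.isInteger(rawMaxTokens) || rawMaxTokens <= 0) {
      throw new Error("visionConfig.maxTokens must be a positive integer");
    }
    maxTokens = rawMaxTokens;
  }

  return { provider, model, baseUrl, maxTokens };
}
```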

Supported Image Formats

PNG, JPEG, GIF, WebP, BMP — up to 20 MB.
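
A pre-flight check against these limits might look like the sketch below. It rests on several assumptions: the format is inferred from the file extension (the extension may inspect file contents instead), `.jpg` is treated as JPEG, and "20 MB" is read as 20 × 1024 × 1024 bytes. The names `checkImage` and `MAX_IMAGE_BYTES` are hypothetical.

```typescript
// Extensions for the formats listed above, mapped to their MIME types.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
  ["bmp", "image/bmp"],
]);

// Assumption: "20 MB" means 20 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

type ImageCheck =
  | { ok: true; mimeType: string }
  | { ok: false; reason: string };

// Hypothetical helper: validates a file name and size before any upload.
function checkImage(fileName: string, sizeBytes: number): ImageCheck {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  const mimeType = MIME_BY_EXTENSION.get(ext);
  if (mimeType === undefined) {
    return { ok: false, reason: `unsupported image format: "${ext}"` };
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "image exceeds the 20 MB limit" };
  }
  return { ok: true, mimeType };
}
```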

Usage

The extension registers an inspect_image tool. Once configured, ask pi to describe an image and it will use this tool automatically.

You can also pass a custom prompt to guide the analysis:

inspect_image("screenshot.png", "Extract all text visible in this image")

If no vision model is configured when the tool runs, you'll be guided through setup on the spot.
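
For readers curious what a call to an OpenAI-compatible endpoint could look like, the sketch below builds a Chat Completions request body that carries the prompt and the image as a base64 data URL. The README does not show the extension's actual request code, so this is only one plausible shape; `buildVisionRequest` and `VisionRequestBody` are hypothetical names, and the caller is assumed to have already read the file and base64-encoded it.

```typescript
// One user message with a text part and an image part, in the
// Chat Completions format used by OpenAI-compatible APIs.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequestBody {
  model: string;
  max_tokens: number;
  messages: Array<{ role: "user"; content: ContentPart[] }>;
}

// Hypothetical builder: base64Data is the already-encoded image content.
function buildVisionRequest(
  model: string,
  prompt: string,
  mimeType: string,
  base64Data: string,
  maxTokens: number = 4096, // documented default for maxTokens
): VisionRequestBody {
  return {
    model,
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // The image travels inline as a data URL rather than a remote link.
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${base64Data}` } },
        ],
      },
    ],
  };
}
```

Because only the model's text reply would be returned to the main conversation, the base64 payload itself never enters your chat context, which matches the motivation given earlier in this README.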

License

MIT