vision-handoff
Vision handoff extension for pi - send images to a vision-capable model for analysis
Package details
Install vision-handoff from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:vision-handoff
- Package: vision-handoff
- Version: 1.0.0
- Published: May 21, 2026
- Downloads: not available
- Author: scavanger2221
- License: MIT
- Types: extension
- Size: 12.6 KB
- Dependencies: 0 dependencies · 0 peers
Pi manifest JSON
{
"extensions": [
"./extensions/vision-handoff"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
Vision Handoff Extension
A pi extension that sends images to a separate vision-capable model for analysis when the current model cannot see images.
Features
- Analyze images using a configured vision model
- Support for single or multiple image files
- Automatic file reading and base64 conversion
- Works with any vision-capable model (GPT-4V, Claude 3, etc.)
Installation
Install as a pi package from a local checkout (replace the path with your own):
pi install /home/dwi/Project/vision-handoff-package
Or install from a git repository after publishing.
Setup
Create a vision.json configuration file in your project root or .pi/ directory:
{
"provider": "anthropic",
"model": "claude-3-5-sonnet-20241022",
"apiKey": "your-api-key-here"
}
Or place it in ~/.pi/vision.json for global configuration.
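The three locations mentioned above can be sketched as a simple lookup. This is only an illustration: the helper names `candidateConfigPaths` and `findConfigPath` are hypothetical, and the search order (project root, then .pi/, then ~/.pi/) is an assumption, since the README does not say which file wins when several exist.

```typescript
// Hypothetical helper: lists the places the README says vision.json may live.
// The order (project root, project .pi/, then ~/.pi/) is an assumption.
function candidateConfigPaths(projectRoot: string, homeDir: string): string[] {
  return [
    `${projectRoot}/vision.json`,
    `${projectRoot}/.pi/vision.json`,
    `${homeDir}/.pi/vision.json`,
  ];
}

// Returns the first candidate for which `exists` reports true, or undefined.
// `exists` is injected so the lookup can be exercised without touching disk.
function findConfigPath(
  candidates: string[],
  exists: (path: string) => boolean,
): string | undefined {
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return undefined;
}
```

In real use, `exists` could be Node's `existsSync` from `node:fs`.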
Configuration options:
- provider - Provider name (e.g., "anthropic", "openai", "google")
- model - Model ID that supports vision (must have "image" in input capabilities)
- baseUrl - Optional custom base URL for the provider
- api - Optional API type override
- apiKey - Optional API key (can also use environment variables or auth storage)
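As a rough sketch, the options above could be described and validated like this. The `VisionConfig` interface and `parseVisionConfig` function are hypothetical names inferred from the option list, not taken from the extension's source, and treating provider and model as required is an assumption.

```typescript
// Shape inferred from the documented options; the extension's own type may differ.
interface VisionConfig {
  provider: string;
  model: string;
  baseUrl?: string;
  api?: string;
  apiKey?: string;
}

// Parses the text of a vision.json file and checks field types.
// Assumption: provider and model are required, the rest are optional strings.
function parseVisionConfig(jsonText: string): VisionConfig {
  const raw: unknown = JSON.parse(jsonText);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("vision.json must contain a JSON object");
  }
  const obj = raw as Record<string, unknown>;
  for (const key of ["provider", "model"]) {
    if (typeof obj[key] !== "string" || obj[key] === "") {
      throw new Error(`vision.json: "${key}" must be a non-empty string`);
    }
  }
  for (const key of ["baseUrl", "api", "apiKey"]) {
    if (obj[key] !== undefined && typeof obj[key] !== "string") {
      throw new Error(`vision.json: "${key}" must be a string when present`);
    }
  }
  return obj as unknown as VisionConfig;
}
```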
Usage
The extension registers a vision_handoff tool that the LLM can use when it needs to analyze images but cannot see them itself.
Tool Parameters
- prompt (required): Text prompt/question to send to the vision model
- imagePath: Single image file path
- imagePaths: Array of image file paths (for multiple images)
- images: Array of base64-encoded image data (advanced use)
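One way to picture how these parameters fit together is a small normalization step that merges the path fields into a single list. The `VisionHandoffParams` type and both helpers below are hypothetical; in particular, merging imagePath and imagePaths when both are supplied is an assumption, as the README does not describe that case.

```typescript
// Parameter shape inferred from the list above; the real tool schema may differ.
interface VisionHandoffParams {
  prompt: string;
  imagePath?: string;
  imagePaths?: string[];
  images?: string[];
}

// Gathers every file path from imagePath and imagePaths into one list.
// Assumption: when both fields are present, both are used.
function collectImagePaths(params: VisionHandoffParams): string[] {
  const paths: string[] = [];
  if (params.imagePath) paths.push(params.imagePath);
  if (params.imagePaths) paths.push(...params.imagePaths);
  return paths;
}

// True when the call carries at least one image, as a path or as base64 data.
function hasAnyImage(params: VisionHandoffParams): boolean {
  const inlineCount = params.images ? params.images.length : 0;
  return collectImagePaths(params).length > 0 || inlineCount > 0;
}
```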
Examples
Single image:
vision_handoff({
prompt: "What's in this image?",
imagePath: "/path/to/image.png"
})
Multiple images:
vision_handoff({
prompt: "Compare these images and describe the differences",
imagePaths: ["/path/to/image1.png", "/path/to/image2.jpg"]
})
With existing base64 data:
vision_handoff({
prompt: "Analyze this image",
images: ["data:image/png;base64,iVBORw0KGgo..."]
})
How It Works
- LLM determines it needs to analyze an image but can't see it
- LLM calls the vision_handoff tool with file paths and a prompt
- Extension reads the files and converts them to base64
- Extension sends to configured vision model
- Vision model's analysis is returned to the LLM
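The "reads files, converts to base64" step can be sketched with Node's standard library. The function names below are hypothetical, and producing a data URL (as in the images example earlier) is an assumption; the README does not show the format the extension uses internally.

```typescript
import { readFile } from "node:fs/promises";

// Encodes raw bytes as a data URL such as "data:image/png;base64,...".
// The data-URL form is an assumption modeled on the `images` example.
function bytesToDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Reads an image from disk and returns it as a data URL.
// The caller supplies the MIME type, for example "image/png".
async function fileToDataUrl(path: string, mimeType: string): Promise<string> {
  const bytes = await readFile(path);
  return bytesToDataUrl(bytes, mimeType);
}
```

Keeping the encoding separate from the file read makes the conversion easy to check without a real file.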
Supported Image Formats
- PNG (.png)
- JPEG (.jpg, .jpeg)
- GIF (.gif)
- WebP (.webp)
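For illustration, the formats above map to standard MIME types as follows. The table and the `mimeTypeForPath` helper are a hypothetical sketch; matching extensions case-insensitively is an assumption about how the extension behaves.

```typescript
// Extension-to-MIME table for the formats listed above.
const MIME_BY_EXTENSION: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

// Returns the MIME type for a file path, or undefined if the extension
// is not in the table. Case-insensitive matching is an assumption.
function mimeTypeForPath(path: string): string | undefined {
  const dot = path.lastIndexOf(".");
  if (dot === -1) return undefined;
  return MIME_BY_EXTENSION[path.slice(dot).toLowerCase()];
}
```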
Notes
- The extension will show a notification on session start if a vision model is configured
- If no config is found, the tool returns an error
- The tool checks that the configured model supports image input
- API keys are resolved via the same auth system as pi (environment variables, auth.json, etc.)
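The two checks noted above (missing config, model without image input) could look roughly like this. `ModelInfo`, `CheckResult`, and `checkVisionModel` are hypothetical; pi's actual model registry type and error messages are not shown in this README.

```typescript
// Hypothetical model descriptor; only the `input` capability list matters here.
interface ModelInfo {
  id: string;
  input: string[]; // for example ["text", "image"]
}

type CheckResult = { ok: true } | { ok: false; error: string };

// Returns an error result when no model is configured or when the model
// does not list "image" among its input capabilities.
function checkVisionModel(model: ModelInfo | undefined): CheckResult {
  if (!model) {
    return { ok: false, error: "No vision model configured" };
  }
  if (model.input.indexOf("image") === -1) {
    return { ok: false, error: `Model ${model.id} does not accept image input` };
  }
  return { ok: true };
}
```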
License
MIT