vision-handoff

Vision handoff extension for pi - send images to a vision-capable model for analysis

Install vision-handoff from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:vision-handoff

Package: vision-handoff
Version: 1.0.0
Published: May 21, 2026
Downloads: not available
Author: scavanger2221
License: MIT
Types: extension
Size: 12.6 KB
Dependencies: 0 dependencies · 0 peers

Pi manifest JSON
{
  "extensions": [
    "./extensions/vision-handoff"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

Vision Handoff Extension

A pi extension that sends images to a separate vision-capable model for analysis when the current model cannot see images.

Features

  • Analyze images using a configured vision model
  • Support for single or multiple image files
  • Automatic file reading and base64 conversion
  • Works with any vision-capable model (GPT-4V, Claude 3, etc.)

Installation

Install as a pi package from npm:

pi install npm:vision-handoff

Or install from a local checkout, replacing the path with your own:

pi install /path/to/vision-handoff-package

Or install from a git repository after publishing.

Setup

Create a vision.json configuration file in your project root or .pi/ directory:

{
  "provider": "anthropic",
  "model": "claude-3-5-sonnet-20241022",
  "apiKey": "your-api-key-here"
}

Or place it in ~/.pi/vision.json for global configuration.

Configuration options:

  • provider - Provider name (e.g., "anthropic", "openai", "google")
  • model - Model ID that supports vision (must have "image" in input capabilities)
  • baseUrl - Optional custom base URL for the provider
  • api - Optional API type override
  • apiKey - Optional API key (can also use environment variables or auth storage)
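
The options above can be read as a small typed schema plus a lookup order. The following is a minimal sketch, not the extension's actual code: the names VisionConfig, candidateConfigPaths, and parseVisionConfig are hypothetical, the precedence (project root, then .pi/, then ~/.pi/) is an assumption, and treating provider and model as required is inferred only from the fact that the other three options are marked optional.

```typescript
import { join } from "node:path";

// Shape of vision.json, following the configuration options listed above.
interface VisionConfig {
  provider: string;
  model: string;
  baseUrl?: string;
  api?: string;
  apiKey?: string;
}

// Candidate locations for vision.json, nearest first.
// ASSUMPTION: the README names these three places but not their precedence.
function candidateConfigPaths(projectRoot: string, homeDir: string): string[] {
  return [
    join(projectRoot, "vision.json"),
    join(projectRoot, ".pi", "vision.json"),
    join(homeDir, ".pi", "vision.json"),
  ];
}

// Parse the JSON text of a config file and check the field types.
// ASSUMPTION: provider and model are required; the rest are optional strings.
function parseVisionConfig(text: string): VisionConfig {
  const raw = JSON.parse(text) as Record<string, unknown>;
  for (const key of ["provider", "model"]) {
    if (typeof raw[key] !== "string" || raw[key] === "") {
      throw new Error(`vision.json: "${key}" must be a non-empty string`);
    }
  }
  for (const key of ["baseUrl", "api", "apiKey"]) {
    if (raw[key] !== undefined && typeof raw[key] !== "string") {
      throw new Error(`vision.json: "${key}" must be a string when present`);
    }
  }
  return raw as unknown as VisionConfig;
}
```

A caller would try each candidate path in turn and pass the first file it finds to parseVisionConfig.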

Usage

The extension registers a vision_handoff tool that the LLM can use when it needs to analyze images but cannot see them itself.

Tool Parameters

  • prompt (required): Text prompt/question to send to the vision model
  • imagePath: Single image file path
  • imagePaths: Array of image file paths (for multiple images)
  • images: Array of base64-encoded image data (advanced use)
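
To make the parameter list concrete, here is a hedged sketch of how the three image inputs might be folded into one request. The type and function names (VisionHandoffParams, NormalizedRequest, normalizeParams) are hypothetical, and whether the real tool accepts imagePath, imagePaths, and images together in one call is an assumption.

```typescript
// Parameters of the vision_handoff tool, as listed above.
interface VisionHandoffParams {
  prompt: string;
  imagePath?: string;
  imagePaths?: string[];
  images?: string[];
}

// Hypothetical internal form: file paths still to be read, plus inline base64 data.
interface NormalizedRequest {
  prompt: string;
  paths: string[];
  inlineImages: string[];
}

// Merge the single-path and multi-path forms and reject calls with no image.
// ASSUMPTION: the three image parameters may be combined in one call.
function normalizeParams(p: VisionHandoffParams): NormalizedRequest {
  if (!p.prompt || p.prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  const paths = [...(p.imagePath ? [p.imagePath] : []), ...(p.imagePaths ?? [])];
  const inlineImages = p.images ?? [];
  if (paths.length === 0 && inlineImages.length === 0) {
    throw new Error("at least one image is required");
  }
  return { prompt: p.prompt, paths, inlineImages };
}
```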

Examples

Single image:

vision_handoff({
  prompt: "What's in this image?",
  imagePath: "/path/to/image.png"
})

Multiple images:

vision_handoff({
  prompt: "Compare these images and describe the differences",
  imagePaths: ["/path/to/image1.png", "/path/to/image2.jpg"]
})

With existing base64 data:

vision_handoff({
  prompt: "Analyze this image",
  images: ["data:image/png;base64,iVBORw0KGgo..."]
})

How It Works

  1. The LLM determines that it needs to analyze an image it cannot see
  2. The LLM calls the vision_handoff tool with the file paths and a prompt
  3. The extension reads the files and converts them to base64
  4. The extension sends the prompt and images to the configured vision model
  5. The vision model's analysis is returned to the LLM
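
Steps 3 to 5 can be sketched as a short pipeline. This is an illustration under stated assumptions rather than the extension's source: handoff, toBase64, ReadBytes, and AskVisionModel are hypothetical names, and the call to the vision model is passed in as a function because the real provider API is not described here.

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical seams so the flow can be exercised without disk or network access.
type ReadBytes = (path: string) => Promise<Uint8Array>;
type AskVisionModel = (prompt: string, base64Images: string[]) => Promise<string>;

// Default reader: load the raw bytes of a file from disk.
const readFromDisk: ReadBytes = (path) => readFile(path);

// Step 3: convert raw bytes to a base64 string.
function toBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64");
}

// Steps 3-5: read each file, encode it, send everything to the vision model,
// and return the model's text analysis to the caller.
async function handoff(
  prompt: string,
  paths: string[],
  ask: AskVisionModel,
  read: ReadBytes = readFromDisk,
): Promise<string> {
  const encoded: string[] = [];
  for (const path of paths) {
    encoded.push(toBase64(await read(path)));
  }
  return ask(prompt, encoded);
}
```

Injecting ask and read keeps the flow testable; in the extension itself, ask would wrap whatever client talks to the configured provider.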

Supported Image Formats

  • PNG (.png)
  • JPEG (.jpg, .jpeg)
  • GIF (.gif)
  • WebP (.webp)
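
The format list maps naturally to a lookup from file extension to MIME type, which is also what a data URL such as the data:image/png;base64,... example above needs. A minimal sketch follows; mimeTypeFor and toDataUrl are hypothetical helpers, and matching extensions case-insensitively is an assumption.

```typescript
import { extname } from "node:path";

// MIME types for the supported formats listed above.
const MIME_BY_EXTENSION: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

// Return the MIME type for a path, or undefined for an unsupported extension.
// ASSUMPTION: extensions are compared case-insensitively.
function mimeTypeFor(path: string): string | undefined {
  return MIME_BY_EXTENSION[extname(path).toLowerCase()];
}

// Build a data URL from a file path and its already-encoded base64 content.
function toDataUrl(path: string, base64: string): string {
  const mime = mimeTypeFor(path);
  if (mime === undefined) {
    throw new Error(`Unsupported image format: ${path}`);
  }
  return `data:${mime};base64,${base64}`;
}
```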

Notes

  • The extension will show a notification on session start if a vision model is configured
  • If no config is found, the tool returns an error
  • The tool checks that the configured model supports image input
  • API keys are resolved via the same auth system as pi (environment variables, auth.json, etc.)
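
The second and third notes describe two guard checks. A rough sketch of that logic is below; ModelInfo, ToolResult, and checkVisionModel are hypothetical, and the input field name is an assumption based only on the phrase "image" in input capabilities.

```typescript
// Hypothetical model descriptor. ASSUMPTION: capabilities live in an `input` array.
interface ModelInfo {
  id: string;
  input: string[];
}

// Hypothetical result shape: either usable text or an error message.
type ToolResult = { ok: true; text: string } | { ok: false; error: string };

// Return an error when no model is configured or the model cannot take images.
function checkVisionModel(model: ModelInfo | undefined): ToolResult {
  if (!model) {
    return { ok: false, error: "No vision model configured (vision.json not found)" };
  }
  if (!model.input.includes("image")) {
    return { ok: false, error: `Model ${model.id} does not accept image input` };
  }
  return { ok: true, text: `Vision model ready: ${model.id}` };
}
```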

License

MIT