@lokiyou/modelscope-vision

Simple image understanding for Pi Coding Agent. Install it, set your API key, and ask normal questions about images.

Packages

Package details

extensionskill

Install @lokiyou/modelscope-vision from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@lokiyou/modelscope-vision
Package
@lokiyou/modelscope-vision
Version
0.1.3
Published
Jun 4, 2026
Downloads
not available
Author
lokiyou
License
MIT
Types
extension, skill
Size
15.2 KB
Dependencies
0 dependencies · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions/index.ts"
  ],
  "skills": [
    "./skills"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

@lokiyou/modelscope-vision

Simple image understanding for Pi Coding Agent.

Install it, set your ModelScope API key, choose a model, and ask normal questions about images. The extension is designed to feel straightforward for everyday use.

What it does

It adds image understanding to Pi.

After installation, you can ask Pi to:

  • describe an image
  • answer questions about an image
  • read visible text in an image
  • inspect either a public image URL or a local image file

Most users do not need to call the tools manually.

Installation

pi install npm:@lokiyou/modelscope-vision
/reload

Configuration

The extension stores its configuration at:

~/.pi/agent/extensions/modelscope-vision/config.json

In most cases, you only need to do two things:

  1. set your API key
  2. choose the model you want to use

Set the API key

/modelscope-vision key

Enter your ModelScope access token when prompted.

Set the model

/modelscope-vision model Qwen/Qwen3-VL-32B-Instruct

Optional: set a custom base URL

/modelscope-vision base-url https://api-inference.modelscope.cn/v1

View the current configuration

/modelscope-vision config

After changing configuration, run /reload if needed.

How to use it

After installation, use normal prompts such as:

  • "Describe this image in detail."
  • "How many people are in this photo?"
  • "What does the text in this image say?"
  • "What error message is shown in this screenshot?"

The extension supports either of the following inputs:

  • image_url for a public image URL
  • image_path for an absolute local file path

License

MIT