@k3_2o/pi-read-image

OCR image-to-text tool for Pi — extracts text from screenshots, terminal output, and code images using Tesseract + ImageMagick

Install @k3_2o/pi-read-image from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@k3_2o/pi-read-image

Package: @k3_2o/pi-read-image
Version: 0.2.1
Published: Jun 22, 2026
Downloads: not available
Author: k3_2o
License: MIT
Types: extension
Size: 31.3 KB
Dependencies: 0 dependencies · 4 peers

Pi manifest JSON

{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

What it does

When Pi encounters an image the model can't see, this extension OCRs the image locally. It preprocesses the image with ImageMagick (adaptive upscaling, sharpening, contrast adjustment, grayscale conversion), runs Tesseract with the LSTM engine, extracts word-level confidence, and cleans up OCR artifacts. If confidence is low and the model has vision, the image is sent directly to the model instead.
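
The confidence check described above can be sketched as follows. This is a minimal illustration, not the package's actual code: it assumes the word-level confidence comes from Tesseract's TSV output (for example `tesseract in.png stdout --oem 1 --psm 6 -l eng tsv`), and the helper names and the 60% threshold are hypothetical.

```typescript
// Hypothetical helper names; the default threshold of 60 is an illustrative
// assumption, not a value taken from the package.
interface OcrWord {
  text: string;
  confidence: number; // 0-100, as reported by Tesseract
}

// Parse Tesseract TSV output. Columns, in order:
// level, page_num, block_num, par_num, line_num, word_num,
// left, top, width, height, conf, text
function parseTsvWords(tsv: string): OcrWord[] {
  const words: OcrWord[] = [];
  const lines = tsv.split("\n").slice(1); // skip the header row
  for (const line of lines) {
    const cols = line.replace(/\r$/, "").split("\t");
    if (cols.length < 12) continue;
    const level = Number(cols[0]);
    const confidence = Number(cols[10]);
    const text = cols.slice(11).join("\t").trim();
    // Level 5 rows are words; structural rows (page, block, line) have conf = -1.
    if (level !== 5 || confidence < 0 || text === "") continue;
    words.push({ text, confidence });
  }
  return words;
}

// Mean confidence over all recognized words; 0 when nothing was recognized.
function meanConfidence(words: OcrWord[]): number {
  if (words.length === 0) return 0;
  const total = words.reduce((sum, w) => sum + w.confidence, 0);
  return total / words.length;
}

// Fall back to sending the image to the model only when it can actually see it.
function shouldFallBackToVision(
  words: OcrWord[],
  modelHasVision: boolean,
  threshold = 60,
): boolean {
  return modelHasVision && meanConfidence(words) < threshold;
}
```

Averaging per-word confidence is only one possible aggregate. A median or a "share of words below some cutoff" rule would be less sensitive to a few badly recognized tokens.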

Pass an array of paths to OCR multiple images in one call. The images are processed concurrently, and a failure on one image does not discard the results for the others.
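
One way to get this behavior is to catch errors per image so that each path yields its own outcome. The sketch below is an assumption about how such a batch could be structured, not the package's implementation; `ocrBatch`, `OcrOutcome`, and the injected `ocrOne` function are hypothetical names.

```typescript
// Hypothetical types and names, for illustration only.
type OcrOutcome =
  | { path: string; ok: true; text: string }
  | { path: string; ok: false; error: string };

// Run OCR on every path concurrently. Each image gets its own try/catch,
// so a rejected promise for one path never rejects the whole batch.
async function ocrBatch(
  paths: string[],
  ocrOne: (path: string) => Promise<string>,
): Promise<OcrOutcome[]> {
  return Promise.all(
    paths.map(async (path): Promise<OcrOutcome> => {
      try {
        return { path, ok: true, text: await ocrOne(path) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { path, ok: false, error };
      }
    }),
  );
}
```

Results come back in the same order as the input paths, so a caller can report which images succeeded and which failed without extra bookkeeping.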

Requirements

tesseract-ocr
imagemagick

# Debian/Ubuntu
sudo apt install tesseract-ocr imagemagick -y
# macOS
brew install tesseract imagemagick

Install

# via npm (recommended)
pi install npm:@k3_2o/pi-read-image

# via GitHub
pi install git:github.com/k3-2o/pi-read-image

Usage

The model uses read_image automatically when it can't see an image. Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `path` | `string` \| `string[]` | — | Path to image file, or array of paths for batch OCR |
| `language` | `string` | `"eng"` | OCR language code (ISO 639-3) |
| `psm` | `number` | `6` | Page Segmentation Mode (3=auto, 4=single column, 6=block of text, 7=single line, 11=sparse, 13=raw) |
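
To make the defaults concrete, here is a small sketch of how the parameters in the table could be typed and normalized. The interface and function names are hypothetical and the validation is an assumption. Only the parameter names, types, and defaults come from the table, and the 0 to 13 range reflects Tesseract's documented page segmentation modes.

```typescript
// Hypothetical shapes mirroring the parameter table; not the package's own types.
interface ReadImageParams {
  path: string | string[];
  language?: string;
  psm?: number;
}

interface NormalizedParams {
  paths: string[];
  language: string;
  psm: number;
}

// Apply the documented defaults and turn a single path into a one-element batch.
function normalizeParams(params: ReadImageParams): NormalizedParams {
  const paths = Array.isArray(params.path) ? params.path : [params.path];
  if (paths.length === 0) {
    throw new Error("path must contain at least one image path");
  }
  const psm = params.psm ?? 6;
  // Tesseract accepts page segmentation modes 0 through 13.
  if (!Number.isInteger(psm) || psm < 0 || psm > 13) {
    throw new Error("psm must be an integer between 0 and 13");
  }
  return { paths, language: params.language ?? "eng", psm };
}

// Single screenshot with defaults:
//   normalizeParams({ path: "shot.png" })
//   -> { paths: ["shot.png"], language: "eng", psm: 6 }
// Batch of single-line crops in German:
//   normalizeParams({ path: ["a.png", "b.png"], language: "deu", psm: 7 })
```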

License

MIT