@k3_2o/pi-read-image

OCR image-to-text tool for Pi — extracts text from screenshots, terminal output, and code images using Tesseract + ImageMagick

Install @k3_2o/pi-read-image from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@k3_2o/pi-read-image
Package: @k3_2o/pi-read-image
Version: 0.2.1
Published: Jun 22, 2026
Downloads: not available
Author: k3_2o
License: MIT
Types: extension
Size: 31.3 KB
Dependencies: 0 dependencies · 4 peers
Pi manifest JSON
{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-read-image

OCR image-to-text tool for Pi — extracts text from screenshots, terminal output, and code images using Tesseract + ImageMagick.

What it does

When Pi encounters an image the model can't see, it OCRs the image locally. It preprocesses the image with ImageMagick (adaptive upscaling, sharpening, contrast enhancement, grayscale conversion), runs Tesseract with the LSTM engine, extracts word-level confidence scores, and cleans up OCR artifacts. If confidence is low and the model has vision, the image is sent directly to the model instead.
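
As a rough illustration of that pipeline, the sketch below builds the command-line arguments for both tools and averages word confidence from Tesseract's TSV output. It is not taken from the package source: the function names, the fixed 200% upscale (the package describes its upscale as adaptive), the use of `-normalize` for contrast, and the threshold of 60 are all assumptions made for the example.

```typescript
// Hypothetical sketch; none of these names come from the package source.

/** ImageMagick arguments: grayscale, upscale, sharpen, stretch contrast. */
function buildPreprocessArgs(input: string, output: string, scalePercent = 200): string[] {
  return [
    input,
    "-colorspace", "Gray",          // grayscale conversion
    "-resize", `${scalePercent}%`,  // fixed upscale (assumption; the package adapts this)
    "-sharpen", "0x1",              // mild sharpening
    "-normalize",                   // contrast stretch
    output,
  ];
}

/** Tesseract arguments: LSTM engine (--oem 1), TSV written to stdout. */
function buildTesseractArgs(image: string, language = "eng", psm = 6): string[] {
  return [image, "stdout", "-l", language, "--oem", "1", "--psm", String(psm), "tsv"];
}

/** Mean confidence over the word-level rows (level 5) of Tesseract TSV output. */
function meanWordConfidence(tsv: string): number {
  const rows = tsv.trim().split(/\r?\n/).slice(1); // drop the header row
  const confs: number[] = [];
  for (const row of rows) {
    const cols = row.split("\t");
    const conf = Number(cols[10]);          // column 11 is "conf"
    const text = (cols[11] ?? "").trim();   // column 12 is "text"
    if (cols[0] === "5" && conf >= 0 && text.length > 0) confs.push(conf);
  }
  if (confs.length === 0) return 0;
  return confs.reduce((a, b) => a + b, 0) / confs.length;
}

/** Fall back to the model only when it has vision and OCR confidence is low. */
function shouldSendToModel(meanConf: number, modelHasVision: boolean, threshold = 60): boolean {
  return modelHasVision && meanConf < threshold; // threshold is an assumed value
}
```

The argument arrays could be passed to Node's `child_process.execFile`, with `convert` (ImageMagick 6) or `magick` (ImageMagick 7) as the first binary and `tesseract` as the second. The actual options and cutoff used by the package may differ.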

Pass an array of paths to OCR multiple images in one call. The images are processed concurrently, and a failure on one image does not discard the results for the others.
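
One way to get that behavior is to catch errors per image so that every path yields its own result. The following is a minimal sketch under that assumption, not the package's implementation; `ocrBatch`, `OcrResult`, and the `ocrOne` callback are hypothetical names.

```typescript
// Hypothetical sketch of failure-tolerant batch OCR; names are illustrative.

type OcrResult =
  | { path: string; ok: true; text: string }
  | { path: string; ok: false; error: string };

/**
 * Runs ocrOne on every path concurrently. A rejection for one path is
 * captured as an { ok: false } entry instead of rejecting the whole batch.
 */
async function ocrBatch(
  paths: string[],
  ocrOne: (path: string) => Promise<string>,
): Promise<OcrResult[]> {
  return Promise.all(
    paths.map(async (path): Promise<OcrResult> => {
      try {
        return { path, ok: true, text: await ocrOne(path) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { path, ok: false, error };
      }
    }),
  );
}
```

Results come back in the same order as the input paths, so a caller can report which images failed while still using the text from the rest.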

Requirements

  • tesseract-ocr
  • imagemagick

# Debian/Ubuntu
sudo apt install tesseract-ocr imagemagick -y
# macOS
brew install tesseract imagemagick

Install

# via npm (recommended)
pi install npm:@k3_2o/pi-read-image

# via GitHub
pi install git:github.com/k3-2o/pi-read-image

Usage

The model uses read_image automatically when it can't see an image. Parameters:

Parameter  Type               Default  Description
path       string | string[]  -        Path to image file, or array of paths for batch OCR
language   string             "eng"    OCR language code (ISO 639-3)
psm        number             6        Page Segmentation Mode (3=auto, 4=single column, 6=block of text, 7=single line, 11=sparse, 13=raw)
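
To make the parameter shapes concrete, the sketch below models them as a TypeScript type and fills in the documented defaults. The `ReadImageParams` interface and `normalizeParams` helper are hypothetical and only mirror the table above; the range check reflects the 0 to 13 range that Tesseract itself accepts for `--psm`, and the package may validate differently.

```typescript
// Hypothetical types mirroring the parameter table; not the package's own code.

interface ReadImageParams {
  path: string | string[]; // single image path or a batch of paths
  language?: string;       // defaults to "eng"
  psm?: number;            // defaults to 6 (block of text)
}

interface NormalizedParams {
  paths: string[];
  language: string;
  psm: number;
}

/** Applies the documented defaults and always returns paths as an array. */
function normalizeParams(params: ReadImageParams): NormalizedParams {
  const paths = Array.isArray(params.path) ? params.path : [params.path];
  if (paths.length === 0) throw new Error("path must contain at least one image");
  const psm = params.psm ?? 6;
  // Tesseract accepts page segmentation modes 0 through 13.
  if (!Number.isInteger(psm) || psm < 0 || psm > 13) {
    throw new Error(`psm must be an integer between 0 and 13, got ${psm}`);
  }
  return { paths, language: params.language ?? "eng", psm };
}

// Example inputs: a single screenshot, and a batch of single-line images.
const single = normalizeParams({ path: "screenshot.png" });
const batch = normalizeParams({ path: ["a.png", "b.png"], language: "deu", psm: 7 });
```

Here `single` resolves to one path with language "eng" and psm 6, while `batch` keeps both paths and the explicitly supplied values.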

License

MIT