@injaneity/pi-computer-use

Codex-style computer-use tools for Pi on macOS.

Package details

Install @injaneity/pi-computer-use from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@injaneity/pi-computer-use

Package: @injaneity/pi-computer-use
Version: 0.2.2
Published: Apr 28, 2026
Downloads: 1,228/mo · 274/wk
Author: injaneity
License: MIT
Types: extension, skill
Size: 817.1 KB
Dependencies: 0 dependencies · 3 peers

Pi manifest JSON
{
  "extensions": [
    "./extensions"
  ],
  "skills": [
    "./skills"
  ],
  "image": "./assets/img.jpg"
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-computer-use

Codex-style computer use for Pi on macOS.

pi-computer-use gives Pi agents a semantic computer-use surface for visible macOS windows. It prefers Accessibility (AX) targets such as @e1, returns semantic state after every action, and attaches screenshots only when AX coverage is too weak for reliable operation.

Quick Start

Install the Pi package:

pi install git:github.com/injaneity/pi-computer-use@v0.2.1

Start Pi in interactive mode. In the first session, grant macOS permissions to the helper at:

~/.pi/agent/helpers/pi-computer-use/bridge

Required permissions:

  • Accessibility
  • Screen Recording

Some browser automation paths run JavaScript through Apple Events. If the browser blocks this, Pi surfaces a model-readable hint asking the user to enable "Allow JavaScript from Apple Events" in the browser's developer menu and then retry.

In a Pi session, call screenshot first. It selects the controlled window and returns the latest semantic state, including AX refs such as @e1 when available. If the target app or window is ambiguous, call list_apps and list_windows before screenshot.

list_apps()
list_windows({ app: "Safari" })
screenshot({ window: "@w1" })
click({ window: "@w1", ref: "@e1" })
set_text({ ref: "@e2", text: "hello" })
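
The "list first when ambiguous" rule can be sketched as a small helper. This is only an illustration: the WindowInfo shape and the pickWindow function are hypothetical and are not the package's actual list_windows result format.

```typescript
// Hypothetical shape for one entry returned by a window listing.
// The real list_windows result may carry different fields.
interface WindowInfo {
  ref: string;   // window ref such as "@w1"
  app: string;   // owning application name
  title: string; // window title
}

// Returns the window ref when exactly one window belongs to the app,
// or null when the choice is ambiguous or no window matches.
function pickWindow(windows: WindowInfo[], app: string): string | null {
  const [only, ...rest] = windows.filter((w) => w.app === app);
  return only !== undefined && rest.length === 0 ? only.ref : null;
}
```

A null result is the cue to inspect the list and pass an explicit window ref instead of guessing.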

Use /computer-use in Pi to inspect the effective config and config sources.

What It Adds to Pi

  • Public tools: list_apps, list_windows, screenshot, click, double_click, move_mouse, drag, scroll, keypress, type_text, set_text, wait, arrange_window, navigate_browser, computer_actions.
  • AX target refs in tool results, with capabilities such as canSetValue, canPress, canFocus, canScroll, and adjust.
  • Stable window refs from list_windows, with explicit targeting such as screenshot({ window: "@w1" }) and click({ window: "@w1", ref: "@eN" }).
  • State IDs for stale-action detection.
  • Deterministic window layout through arrange_window presets or explicit frames.
  • Optional screenshot attachment mode with image: "auto" | "always" | "never".
  • Ref-first actions such as click({ ref: "@eN" }), scroll({ ref: "@eN" }), and set_text({ ref: "@eN", text }).
  • Batched actions through computer_actions, with one post-action semantic state update plus per-action execution metadata.
  • Execution metadata that reports stealth for background-safe AX paths and default for focus/raw-event fallbacks.
  • Full pointer and keyboard primitive coverage for common GUI flows, with AX-first equivalents where available.
  • Browser-aware targeting, including isolated browser window preference where appropriate.
  • Optional strict AX mode for background-safe operation without foreground focus, raw pointer events, raw keyboard events, or cursor takeover.
  • Official QA benchmark harness in benchmarks/.
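
As an illustration of the image: "auto" | "always" | "never" option, the sketch below shows one way an attachment decision could be made. The coverage metric, the threshold, and the function name are assumptions made for this example. The package's real heuristic for "AX coverage too weak" is not documented here.

```typescript
// Mode values taken from the feature list; everything else is hypothetical.
type ImageMode = "auto" | "always" | "never";

interface AxCoverage {
  // Assumed metric: number of actionable AX targets found in the window.
  actionableTargets: number;
}

// Assumed threshold: below this many targets, treat AX coverage as weak.
const MIN_TARGETS_FOR_SEMANTIC_ONLY = 3;

// Decides whether a screenshot should accompany the semantic state.
function shouldAttachScreenshot(mode: ImageMode, coverage: AxCoverage): boolean {
  switch (mode) {
    case "always":
      return true;
    case "never":
      return false;
    case "auto":
      return coverage.actionableTargets < MIN_TARGETS_FOR_SEMANTIC_ONLY;
  }
}
```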

Examples

Prefer AX refs over coordinates when a matching target exists:

click({ ref: "@e1" })
scroll({ ref: "@e3", scrollY: 600 })

Use coordinates from the latest screenshot only when there is no suitable AX target:

click({ x: 320, y: 180, stateId: "..." })
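
The stateId ties a coordinate click to the screenshot it was computed from. The sketch below is a simplified model of stale-action detection. StateTracker and guardedClick are hypothetical names, and the real bridge may generate and compare state IDs differently.

```typescript
// Hypothetical model of stale-action detection; not the package's bridge code.
class StateTracker {
  private counter = 0;
  private current = "s0";

  // Call whenever the observed UI state changes (after an action or capture).
  advance(): string {
    this.counter += 1;
    this.current = `s${this.counter}`;
    return this.current;
  }

  // An action is stale when it was planned against an older state.
  // Actions that carry no stateId are not checked.
  isStale(stateId: string | undefined): boolean {
    return stateId !== undefined && stateId !== this.current;
  }
}

// Refuses a coordinate click whose stateId no longer matches the current state.
function guardedClick(tracker: StateTracker, x: number, y: number, stateId?: string): string {
  if (tracker.isStale(stateId)) {
    return "stale: take a new screenshot before clicking";
  }
  return `click at (${x}, ${y})`;
}
```

Coordinates are only meaningful relative to one capture, so rejecting a stale click is safer than clicking where a target used to be.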

Replace text through AX value semantics:

set_text({ ref: "@e2", text: "https://example.com" })
keypress({ keys: ["Enter"] })
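
Whether set_text applies depends on the target's capabilities, such as canSetValue. As a rough sketch, a caller could choose between AX value replacement and typed input as below. The AxTarget and TextPlan shapes are hypothetical, and the type_text plan fields are illustrative rather than that tool's real parameters.

```typescript
// Capability names come from the README; the object shape is an assumption.
interface AxTarget {
  ref: string;
  canSetValue?: boolean;
  canFocus?: boolean;
}

// Hypothetical plan objects describing which tool to call next.
type TextPlan =
  | { tool: "set_text"; ref: string; text: string }
  | { tool: "type_text"; text: string; focusRef?: string };

// Prefers AX value replacement, falling back to typed input otherwise.
function planTextEntry(target: AxTarget, text: string): TextPlan {
  if (target.canSetValue) {
    return { tool: "set_text", ref: target.ref, text };
  }
  return target.canFocus
    ? { tool: "type_text", text, focusRef: target.ref }
    : { tool: "type_text", text };
}
```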

Batch obvious actions when no intermediate inspection is needed:

computer_actions({
  stateId: "...",
  actions: [
    { type: "click", ref: "@e1" },
    { type: "set_text", ref: "@e2", text: "https://example.com" },
    { type: "keypress", keys: ["Enter"] }
  ]
})
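
Because a batch runs without intermediate inspection, it can help to check it before sending. The sketch below types the three action shapes used in the example above and validates them. The Action union, the ref pattern, and validateBatch are assumptions for illustration, not the package's published schema.

```typescript
// Hypothetical types mirroring the action shapes in the example above.
type Action =
  | { type: "click"; ref: string }
  | { type: "set_text"; ref: string; text: string }
  | { type: "keypress"; keys: string[] };

interface Batch {
  stateId: string;
  actions: Action[];
}

// Assumed ref format: "@e" followed by a positive integer, as in "@e1".
const AX_REF = /^@e[1-9]\d*$/;

// Returns a list of human-readable problems; an empty list means the batch looks valid.
function validateBatch(batch: Batch): string[] {
  const problems: string[] = [];
  batch.actions.forEach((action, index) => {
    if (action.type !== "keypress" && !AX_REF.test(action.ref)) {
      problems.push(`action ${index}: bad ref ${action.ref}`);
    }
    if (action.type === "keypress" && action.keys.length === 0) {
      problems.push(`action ${index}: keypress needs at least one key`);
    }
  });
  return problems;
}
```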

See docs/usage.md for the full workflow and tool patterns.

How It Works

pi-computer-use has three pieces:

  1. The Pi extension in extensions/computer-use.ts registers the public tools and /computer-use command.
  2. The TypeScript bridge in src/bridge.ts manages the current window, capture IDs, AX refs, fallback policy, batching, and execution metadata.
  3. The native Swift helper in native/macos/bridge.swift talks to macOS Accessibility, ScreenCaptureKit, AppKit, and CoreGraphics.

The result is semantic-first GUI control: Pi sees useful AX targets first, falls back to screenshots only when needed, and reports whether each action stayed background-safe.
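
The semantic-first policy can be pictured as a small decision function. This is a simplified sketch under stated assumptions: Target, Outcome, and planClick are hypothetical names, and only the stealth and default labels and the idea of strict AX mode come from the README.

```typescript
// Execution labels from the README: "stealth" for background-safe AX paths,
// "default" for focus or raw-event fallbacks.
type ExecutionMode = "stealth" | "default";

// Hypothetical target: either an AX ref with capabilities, or raw coordinates.
interface Target {
  ref?: string;
  canPress?: boolean;
  x?: number;
  y?: number;
}

interface Outcome {
  mode: ExecutionMode;
  via: "ax" | "raw-pointer";
}

// Picks the AX path when possible; otherwise falls back to raw pointer events,
// unless strict AX mode forbids anything that is not background-safe.
function planClick(target: Target, strictAx: boolean): Outcome {
  if (target.ref !== undefined && target.canPress) {
    return { mode: "stealth", via: "ax" };
  }
  if (strictAx) {
    throw new Error("strict AX mode: no background-safe path for this click");
  }
  return { mode: "default", via: "raw-pointer" };
}
```

Failing loudly in strict mode, rather than silently taking over the cursor, keeps the background-safe guarantee explicit.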

Documentation

  • Usage guide: tool workflow, AX refs, text input, browser flows, batching, and strict AX mode.
  • Configuration: config files, environment overrides, browser control, and stealth mode.
  • Development: local setup, helper builds, validation, release signing notes, and PR workflow.
  • Troubleshooting: permissions, helper setup, stale refs, browser refusal, and strict mode errors.
  • Benchmarks: benchmark commands, metrics, regression policy, and local comparison workflow.
  • Contributing: issue-first contribution rules and PR checklist.

Development & Benchmarks

Install dependencies:

npm install

Run checks:

npm test

Run the local checkout in Pi without loading another installed copy:

pi --no-extensions -e .

Run the default QA benchmark:

npm run benchmark:qa

Run the wider benchmark that may open apps:

npm run benchmark:qa:full

Release & Install Notes

The package is published on npm as @injaneity/pi-computer-use.

npm install @injaneity/pi-computer-use
npm install @injaneity/pi-computer-use@0.2.1

Pi installs should pin a GitHub release tag:

pi install git:github.com/injaneity/pi-computer-use@v0.2.1
pi install -l git:github.com/injaneity/pi-computer-use@v0.2.1
pi install /absolute/path/to/pi-computer-use

Remove:

pi remove git:github.com/injaneity/pi-computer-use@v0.2.1
npm remove @injaneity/pi-computer-use

For a different release, replace v0.2.1 or 0.2.1 with the version you want to pin.

Screenshots

[Image: pi-computer-use screenshot]

License

MIT
