pi-flows

First-party pi extension for delegating work to local flow agents.

Install pi-flows from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-flows

Package: pi-flows
Version: 0.1.0
Published: Jun 6, 2026
Downloads: not available
Author: thulr
License: MIT
Types: extension
Size: 207 KB
Dependencies: 0 dependencies · 5 peers

Pi manifest JSON

{
  "extensions": [
    "./extensions/pi-flows/index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-flows

Delegate work to isolated sub-agents from inside pi — with proven multi-agent patterns, safety, cost limits, and tracing built in.

pi-flows adds a single flow tool to the pi coding agent. It runs your task in separate, disposable pi subprocesses (a single specialist, a parallel fan-out, a generate-and-critique loop, or a full decompose-and-synthesize run), so heavy exploration and verification happen in clean contexts instead of bloating your main session.

Why use it

Keep your main context clean. Sub-agents explore, build, and review in their own subprocess and hand back a compact result — not a wall of tool output.
Proven patterns, one tool. single, parallel, chain, evaluate, vote, route, and orchestrate — each a named agent-design pattern (from Anthropic's Building Effective Agents, Andrew Ng, and Google's ADK), not an ad-hoc prompt. See Patterns.
Safe by default. Repo-controlled agent prompts fail closed in headless runs, secrets and home paths are redacted, inter-agent handoffs are scanned for prompt injection, concurrent write agents cannot share one checkout unless you opt in, and read-only agents (recon, analyst) ship with no shell. See Safety model.
Bounded — including cost. Every run is capped on count, time, and nesting depth; maxCostUsd / maxTokens cap total spend across the whole flow tree.
Inspectable. Structured errors that name the fix, an offline test suite, and OpenInference-shaped trace export you can read with jq or any OpenTelemetry backend.

What it looks like

Load the extension in pi, then delegate a single read-only task:

Use flow with {"agent":"recon","task":"Find the API routes for billing"}

recon runs in its own subprocess and hands back just the findings. When you need a verified result instead of a single pass, reach for another mode:

{
  "task": "Add a /health endpoint that returns 200 and a JSON status, with a test",
  "evaluate": { "checkCommand": "npm test", "maxIterations": 3 }
}

The operator builds the change, a separate redteam critic judges the result, and npm test must exit 0 — the loop revises until both pass or it hits maxIterations. → Quickstart

Install

pi-flows runs inside the pi coding agent, so you install it as a pi package — no clone required.

Prerequisites: Node.js >=24, npm >=11, and the pi CLI >=0.78.0 on your PATH. Don't have pi? It ships in @earendil-works/pi-coding-agent:

npm i -g @earendil-works/pi-coding-agent

Install it with the pi CLI — from npm for the published release, or from GitHub to track main:

# From npm (recommended) — the published release
pi install npm:pi-flows

# Add -l to install into the current project only (.pi/settings.json)
pi install -l npm:pi-flows

# Or track the latest main straight from GitHub, no clone required
pi install git:github.com/Thulr/pi-flows

Reload pi with /reload (or restart it), then verify:

/flows version
Use flow with {"list":true}

Success looks like all nine bundled agents in the flow list output — recon, strategist, overwatch, operator, analyst, redteam, controller, commander, and debrief. If pi isn't found, see Troubleshooting → pi: command not found. → Quickstart

Run from a clone (development)

To hack on pi-flows or try unreleased main, work from a checkout:

git clone https://github.com/Thulr/pi-flows
cd pi-flows
npm ci
npm run preflight   # verify the pi CLI is installed and on PATH
pi -e ./extensions/pi-flows/index.ts   # load the local extension in pi

Inside pi, smoke-test with no model call:

/flows help
/flows status
Use flow with {"list":true}
Use flow with {"showConfig":true}

Alternatively, install your working copy as a project-local package by running pi install -l ./ from the checkout. See Development for the build/test loop, and Contributing.

What it adds

flow tool: runs isolated pi subprocesses for single, parallel, chain, evaluate (generator-evaluator), vote, route, and orchestrate delegation.
/flows command: lists available flow agents and shows help/status/version output.
Bundled agents in agents/: recon, strategist, overwatch, operator, analyst, redteam, controller, commander, and debrief.
User agents: ~/.pi/agent/flow-agents/*.md.
Project agents: .pi/flow-agents/*.md when agentScope: "project" or "all" is used.

Safety model

Project-local agents are repo-controlled prompts. In interactive pi sessions, pi-flows asks before running them. In headless (non-UI) runs, pi-flows fails closed by default and refuses project-local agents unless you explicitly pass confirmProjectAgents:false after reviewing the files.
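The decision described above can be pictured as a small gate. This is a minimal sketch, not the pi-flows source: the function name, the `hasUi` field, and the three outcome labels are hypothetical, and it assumes that an explicit confirmProjectAgents:false also skips the prompt in interactive sessions.

```typescript
// Illustrative sketch only: all names here are hypothetical.
type Decision = "ask-user" | "run" | "refuse";

interface GateInput {
  hasUi: boolean;                 // true in an interactive pi session
  confirmProjectAgents?: boolean; // per-call flag; undefined means "use the default"
}

function gateProjectAgents(input: GateInput): Decision {
  // Explicit opt-out: the caller states the agent files have been reviewed.
  if (input.confirmProjectAgents === false) return "run";
  // Interactive session: a human can approve the repo-controlled prompts.
  if (input.hasUi) return "ask-user";
  // Headless with no explicit opt-out: fail closed.
  return "refuse";
}
```

The point of the ordering is that the headless path has no way to reach "run" by accident; it needs the explicit flag.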

pi-flows also redacts secret-shaped content and home paths from returned content/details by default. Inter-agent handoffs — where one child's output becomes another child's prompt ({previous} in chain, the evaluate artifact, vote ballots, orchestrate findings) — are an indirect prompt-injection surface, so pi-flows strips invisible/bidi characters and flags instruction-override markers in that content before reuse, surfacing a warning rather than silently trusting it. See Privacy & telemetry.
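To make the handoff scan concrete, here is a rough sketch of the two steps named above: stripping invisible and bidi characters, then flagging instruction-override markers. The function name, the result shape, and the marker phrases are assumptions chosen for illustration; the real scanner's patterns are not documented here and are likely broader.

```typescript
// Illustrative sketch only: names and patterns are hypothetical.
interface HandoffScan {
  text: string;       // content with invisible/bidi characters removed
  warnings: string[]; // one entry per suspicious marker found
}

// Zero-width characters, Unicode bidi controls, invisible operators, and the BOM.
const INVISIBLE = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u2069\uFEFF]/g;

// A few example phrases; a real list would be longer and maintained over time.
const OVERRIDE_MARKERS: RegExp[] = [
  /ignore (all )?(previous|prior|above) instructions/i,
  /disregard (the )?(system|previous) prompt/i,
  /you are now\b/i,
];

function scanHandoff(raw: string): HandoffScan {
  // Strip first, so hidden characters cannot be used to split a marker phrase.
  const text = raw.replace(INVISIBLE, "");
  const warnings = OVERRIDE_MARKERS
    .filter((re) => re.test(text))
    .map((re) => `possible instruction override: ${re.source}`);
  return { text, warnings };
}
```

Note that the sketch warns rather than rejects, mirroring the "surface a warning" behaviour described above.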

Cost is bounded as well as count and time: pass maxCostUsd / maxTokens to cap cumulative spend across the whole flow tree (BUDGET_EXCEEDED once reached). Concurrent fan-out also refuses multiple write-capable agents in the same cwd (SHARED_WRITE_CWD) unless allowSharedWriteCwd:true is explicit. Read-only agents (recon, analyst) ship without a shell, so their read-only boundary is enforced by the toolset, not by prompt instructions alone.
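One way to picture a cumulative budget is a single tracker shared by every child in the flow tree. The sketch below is an assumption-laden illustration: the class names and the usage shape are hypothetical, and it checks the limit after each child reports usage, which may differ from when pi-flows actually checks. Only the maxCostUsd / maxTokens inputs and the BUDGET_EXCEEDED code come from the text above.

```typescript
// Illustrative sketch only: class and field names are hypothetical.
interface BudgetLimits { maxCostUsd?: number; maxTokens?: number }
interface Usage { costUsd: number; tokens: number }

class BudgetExceededError extends Error {
  readonly code = "BUDGET_EXCEEDED";
}

class FlowBudget {
  private limits: BudgetLimits;
  private costUsd = 0;
  private tokens = 0;

  constructor(limits: BudgetLimits) {
    this.limits = limits;
  }

  // Add one child's usage to the running total, then fail once a limit is reached.
  record(usage: Usage): void {
    this.costUsd += usage.costUsd;
    this.tokens += usage.tokens;
    const { maxCostUsd, maxTokens } = this.limits;
    if (maxCostUsd !== undefined && this.costUsd >= maxCostUsd) {
      throw new BudgetExceededError(`cost ${this.costUsd} reached limit ${maxCostUsd}`);
    }
    if (maxTokens !== undefined && this.tokens >= maxTokens) {
      throw new BudgetExceededError(`tokens ${this.tokens} reached limit ${maxTokens}`);
    }
  }
}
```

Because the same instance is passed down to nested flows, a deep tree cannot spend more than a flat one.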

`flow` tool quick reference

List

{ "list": true }

Show effective config

{ "showConfig": true }

Single

{ "agent": "recon", "task": "Find the API routes for billing" }

Parallel

{
  "tasks": [
    { "agent": "recon", "task": "Find frontend auth code" },
    { "agent": "recon", "task": "Find backend auth code" }
  ],
  "concurrency": 2
}

Defaults: concurrency=4 (per-call). maxParallelTasks is a fixed hard cap of 8, not a per-call input.
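The relationship between the two limits might look like the following. This is a sketch under stated assumptions: it reads maxParallelTasks as a ceiling on the number of tasks in one call and rejects calls over it, and it clamps concurrency to the task count. The function name is hypothetical and the real validation may behave differently.

```typescript
// Illustrative sketch only: function name and rejection behaviour are assumptions.
const DEFAULT_CONCURRENCY = 4; // per-call default from the text above
const MAX_PARALLEL_TASKS = 8;  // fixed hard cap from the text above

// Returns how many children would run at once for a given call.
function planParallel(taskCount: number, concurrency?: number): number {
  if (!Number.isInteger(taskCount) || taskCount < 1) {
    throw new Error("tasks must contain at least one entry");
  }
  if (taskCount > MAX_PARALLEL_TASKS) {
    throw new Error(`at most ${MAX_PARALLEL_TASKS} tasks per call`);
  }
  const requested = concurrency ?? DEFAULT_CONCURRENCY;
  if (!Number.isInteger(requested) || requested < 1) {
    throw new Error("concurrency must be a positive integer");
  }
  // There is no point in more workers than tasks.
  return Math.min(requested, taskCount);
}
```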

Chain

{
  "task": "Add Redis caching to the session store",
  "chain": [
    { "agent": "recon", "task": "Research this task: {task}" },
    { "agent": "strategist", "task": "Plan using this context:\n\n{previous}" }
  ]
}

Chain {previous} handoffs are capped, redacted, and scanned for injection before they become the next prompt.
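Placeholder substitution for a chain step can be sketched as below. The function name and types are hypothetical; only the {task} and {previous} placeholders come from the example above. The sketch substitutes in a single pass, so placeholder-looking text inside a child's output is not expanded a second time.

```typescript
// Illustrative sketch only: names are hypothetical.
interface ChainVars {
  task: string;     // the top-level task of the flow call
  previous: string; // the prior step's output, already capped, redacted, and scanned
}

function renderStep(template: string, vars: ChainVars): string {
  // One global pass over the template; substituted values are never re-scanned.
  return template.replace(
    /\{(task|previous)\}/g,
    (_match: string, key: "task" | "previous") => vars[key],
  );
}
```

For the first step of a chain, `previous` would simply be an empty string.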

Evaluate (generator-evaluator loop)

{
  "task": "Add a /health endpoint that returns 200 and a JSON status, with a test",
  "evaluate": {
    "operator": { "agent": "operator" },
    "redteam": { "agent": "redteam" },
    "checkCommand": "npm test",
    "maxIterations": 3
  }
}

The operator builds against task; a separate redteam judges the artifact (not the builder's trace) and returns VERDICT: PASS or VERDICT: REVISE with critique. On REVISE the operator is re-shown its prior artifact plus the critique and revises in place. The loop revises until it passes or hits maxIterations (default 3, cap 8).

Two optional reliability levers are available. checkCommand is a deterministic gate: a shell command that must exit 0, giving level-1 code assertions alongside the LLM critic; a non-zero exit is an automatic REVISE. redteam may also be an array of critics (a decomposed panel, for example one per dimension), in which case PASS requires all of them. See Flow reference.
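The control flow of the loop can be sketched as follows. Everything here is illustrative: the synchronous callbacks stand in for subprocess runs, the names are hypothetical, and two details are assumptions rather than documented behaviour (critics still run when the check fails, and any output without an explicit VERDICT: PASS line counts as REVISE).

```typescript
// Illustrative sketch only: names and shapes are hypothetical.
interface Prior { artifact: string; critique: string }

interface EvaluateDeps {
  build: (task: string, prior?: Prior) => string; // stands in for the operator
  critics: Array<(artifact: string) => string>;   // each stands in for a redteam critic
  check?: () => number;                           // exit code of checkCommand, if set
}

interface EvaluateResult { artifact: string; passed: boolean; iterations: number }

// Only an explicit "VERDICT: PASS" line counts as a pass.
function isPass(criticOutput: string): boolean {
  return /^VERDICT:\s*PASS\b/m.test(criticOutput);
}

function evaluateLoop(task: string, deps: EvaluateDeps, maxIterations = 3): EvaluateResult {
  const limit = Math.min(Math.max(1, Math.floor(maxIterations)), 8); // default 3, cap 8
  let artifact = "";
  let prior: Prior | undefined;
  for (let i = 1; i <= limit; i++) {
    // On a revision round the builder sees its prior artifact plus the critique.
    artifact = deps.build(task, prior);
    const critiques: string[] = [];
    if (deps.check !== undefined && deps.check() !== 0) {
      critiques.push("checkCommand exited non-zero"); // automatic REVISE
    }
    for (const critic of deps.critics) {
      const output = critic(artifact);
      if (!isPass(output)) critiques.push(output);
    }
    // PASS requires the gate and every critic on the panel.
    if (critiques.length === 0) return { artifact, passed: true, iterations: i };
    prior = { artifact, critique: critiques.join("\n\n") };
  }
  return { artifact, passed: false, iterations: limit };
}
```

The critics receive only the artifact, never the builder's trace, which is what keeps the judgement independent.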

Vote (parallelization / voting)

{
  "task": "Is /^(a+)+$/ vulnerable to catastrophic backtracking?",
  "vote": { "voters": [{ "agent": "recon" }, { "agent": "overwatch" }], "debrief": { "agent": "debrief" } }
}

Runs the same task across two or more voters (use different models so their errors are less correlated). If debrief is set, that aggregator synthesizes one answer from the ballots; if it is omitted, all voter answers are returned.

Route (classify → dispatch)

{ "task": "The billing webhook returns 500s in prod", "route": { "candidates": ["recon", "strategist", "overwatch"], "fallback": "recon" } }

The controller picks one candidate (ROUTE: <agent>) and runs it. When nothing fits it emits ROUTE: none, and the fallback agent runs instead of a forced guess.
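Interpreting the controller's reply could look like the sketch below. The function name is hypothetical, and treating an unknown agent name the same as ROUTE: none is an assumption made for safety, not documented behaviour.

```typescript
// Illustrative sketch only: function name and unknown-name handling are assumptions.
function resolveRoute(
  controllerOutput: string,
  candidates: string[],
  fallback?: string,
): string | undefined {
  // Look for a line of the form "ROUTE: <agent>" anywhere in the output.
  const match = /^ROUTE:\s*(\S+)/m.exec(controllerOutput);
  const choice = match !== null ? match[1] : "none";
  // Accept only names from the offered candidate list.
  if (choice !== undefined && choice !== "none" && candidates.includes(choice)) {
    return choice;
  }
  // "none", a missing line, or an unknown name all fall back.
  return fallback;
}
```

Restricting the result to the candidate list means the controller's text can select an agent but never introduce one.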

Orchestrate (decompose → fan out → synthesize)

{
  "task": "Document how auth works across the codebase",
  "returnContract": "Return sections for login, token refresh, session storage, and gaps.",
  "requireEvidence": true,
  "orchestrate": {
    "recon": { "agent": "recon" },
    "verify": { "agent": "overwatch" },
    "verifyPolicy": "revise",
    "maxSubtasks": 5
  }
}

The commander splits the task into a JSON list of subtasks, recon workers run them in parallel, and the debrief agent merges the findings. An optional verify critic checks the merged answer against the goal in the same call. verifyPolicy:"note" keeps the verdict advisory, "fail" hard-fails on REVISE, and "revise" reruns debrief with the critique until pass or verifyMaxIterations.

Cost budget and tracing

Any mode accepts a cumulative spend ceiling and a trace sink:

{ "task": "...", "orchestrate": {}, "maxCostUsd": 0.50, "traceFile": "flow-trace.jsonl", "traceLabel": "release-gate" }

maxCostUsd / maxTokens cap total spend across the whole flow tree (BUDGET_EXCEEDED once reached). traceFile (or PI_FLOWS_TRACE_FILE) appends one OpenInference-shaped JSON span per child plus a root span — JSONL any OpenTelemetry backend, or a coding agent, can read. Summarize local traces with /flows report flow-trace.jsonl or npm run trace:report -- flow-trace.jsonl from a checkout.
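Since the trace is plain JSONL, a few lines of code are enough to summarize it outside of /flows report. The sketch below assumes each line is a JSON object with an `attributes` map that carries the OpenInference `llm.token_count.total` attribute; the exact span shape pi-flows writes is not shown in this document, so check a real trace line before relying on these field names. The function name is hypothetical.

```typescript
// Illustrative sketch only: the span shape is an assumption.
interface SpanLine {
  name?: string;
  attributes?: Record<string, unknown>;
}

interface TraceSummary { spans: number; totalTokens: number; malformed: number }

function summarizeTrace(jsonl: string): TraceSummary {
  const summary: TraceSummary = { spans: 0, totalTokens: 0, malformed: 0 };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines and a trailing newline
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      summary.malformed++;
      continue;
    }
    if (typeof parsed !== "object" || parsed === null) {
      summary.malformed++;
      continue;
    }
    const span = parsed as SpanLine;
    summary.spans++;
    const tokens = span.attributes?.["llm.token_count.total"];
    if (typeof tokens === "number") summary.totalTokens += tokens;
  }
  return summary;
}
```

In a Node script the input string would come from reading the trace file, for example the flow-trace.jsonl path used above.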

Agent definition format

Create markdown files with YAML frontmatter:

---
name: my-agent
description: What this agent does
tools: read,grep,find,ls
tier: capable
---

System prompt for the delegated agent.

tier keeps agents portable: no vendor model is hard-coded. capable runs on your pi default model. fast runs on PI_FLOWS_FAST_MODEL if you set one (for example a cheaper model from your provider, such as openai-codex/gpt-5.4-mini) and otherwise falls back to your default model. Flows therefore use whatever model pi is configured with, and the extension does not need updating as providers ship new models.

Pin an explicit model: in the frontmatter to override the tier; a model passed on the flow call overrides it too. tools: none disables built-in tools, and omitting tools uses the pi defaults. Invalid agent files are reported in /flows status and flow showConfig:true.
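The model-selection rules can be summarized as a small resolution function. This is a sketch, not the extension's code: the names are hypothetical, the environment is passed in as a plain object to keep the example self-contained, and it assumes a flow-call model takes precedence over a model: pinned in the agent file, which the text above does not state explicitly.

```typescript
// Illustrative sketch only: names and the call-over-agent precedence are assumptions.
type Tier = "capable" | "fast";

interface ModelInputs {
  callModel?: string;   // model passed on the flow call
  agentModel?: string;  // model: pinned in the agent frontmatter
  tier?: Tier;          // tier: from the agent frontmatter
  defaultModel: string; // whatever pi is configured to use
  env: Record<string, string | undefined>;
}

function resolveModel(inputs: ModelInputs): string {
  if (inputs.callModel) return inputs.callModel;   // explicit per-call override
  if (inputs.agentModel) return inputs.agentModel; // explicit pin beats the tier
  const fastModel = inputs.env["PI_FLOWS_FAST_MODEL"];
  if (inputs.tier === "fast" && fastModel) return fastModel;
  // capable, fast without PI_FLOWS_FAST_MODEL, or no tier at all.
  return inputs.defaultModel;
}
```

Because every branch ends in a model the user configured, no agent file needs editing when a provider ships a new model.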

Documentation ladder

Development

npm ci
npm run check

Useful individual checks:

npm run typecheck
npm test
npm run validate:agents
npm run pack:dry-run
