pi-flows
First-party pi extension for delegating work to local flow agents.
Package details
Install pi-flows from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-flows
- Package: pi-flows
- Version: 0.1.0
- Published: Jun 6, 2026
- Downloads: not available
- Author: thulr
- License: MIT
- Types: extension
- Size: 207 KB
- Dependencies: 0 dependencies · 5 peers
Pi manifest JSON
{
  "extensions": [
    "./extensions/pi-flows/index.ts"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-flows
Delegate work to isolated sub-agents from inside pi — with proven multi-agent patterns, safety, cost limits, and tracing built in.
pi-flows adds a single flow tool to the pi coding agent. It runs your task in separate, disposable pi subprocesses (a single specialist, a parallel fan-out, a generate-and-critique loop, or a full decompose-and-synthesize run), so heavy exploration and verification happen in clean contexts instead of bloating your main session.
Why use it
- Keep your main context clean. Sub-agents explore, build, and review in their own subprocess and hand back a compact result — not a wall of tool output.
- Proven patterns, one tool. single, parallel, chain, evaluate, vote, route, and orchestrate: each is a named agent-design pattern (from Anthropic's Building Effective Agents, Andrew Ng, and Google's ADK), not an ad-hoc prompt. See Patterns.
- Safe by default. Repo-controlled agent prompts fail closed in headless runs, secrets and home paths are redacted, inter-agent handoffs are scanned for prompt injection, concurrent write agents cannot share one checkout unless you opt in, and read-only agents (recon, analyst) ship with no shell. See Safety model.
- Bounded, including cost. Every run is capped on count, time, and nesting depth; maxCostUsd/maxTokens cap total spend across the whole flow tree.
- Inspectable. Structured errors that name the fix, an offline test suite, and OpenInference-shaped trace export you can read with jq or any OpenTelemetry backend.
What it looks like
Load the extension in pi, then delegate a single read-only task:
Use flow with {"agent":"recon","task":"Find the API routes for billing"}
recon runs in its own subprocess and hands back just the findings. When you need a verified result instead of a single pass, reach for another mode:
{
  "task": "Add a /health endpoint that returns 200 and a JSON status, with a test",
  "evaluate": { "checkCommand": "npm test", "maxIterations": 3 }
}
The operator builds the change, a separate redteam critic judges the result, and npm test must exit 0 — the loop revises until both pass or it hits maxIterations. → Quickstart
Install
pi-flows runs inside the pi coding agent, so you install it as a pi package — no clone required.
Prerequisites: Node.js >=24, npm >=11, and the pi CLI >=0.78.0 on your PATH. Don't have pi? It ships in @earendil-works/pi-coding-agent:
npm i -g @earendil-works/pi-coding-agent
Install it with the pi CLI — from npm for the published release, or from GitHub to track main:
# From npm (recommended) — the published release
pi install npm:pi-flows
# Add -l to install into the current project only (.pi/settings.json)
pi install -l npm:pi-flows
# Or track the latest main straight from GitHub, no clone required
pi install git:github.com/Thulr/pi-flows
Reload pi with /reload (or restart it), then verify:
/flows version
Use flow with {"list":true}
Success looks like all nine bundled agents in the flow list output — recon, strategist, overwatch, operator, analyst, redteam, controller, commander, and debrief. If pi isn't found, see Troubleshooting → pi: command not found. → Quickstart
Run from a clone (development)
To hack on pi-flows or try unreleased main, work from a checkout:
git clone https://github.com/Thulr/pi-flows
cd pi-flows
npm ci
npm run preflight # verify the pi CLI is installed and on PATH
pi -e ./extensions/pi-flows/index.ts # load the local extension in pi
Inside pi, smoke-test with no model call:
/flows help
/flows status
Use flow with {"list":true}
Use flow with {"showConfig":true}
Or install your working copy as a package by running pi install -l ./ from the checkout. See Development for the build/test loop and Contributing.
What it adds
- flow tool: runs isolated pi subprocesses for single, parallel, chain, evaluate (generator-evaluator), vote, route, and orchestrate delegation.
- /flows command: lists available flow agents and shows help/status/version output.
- Bundled agents in agents/: recon, strategist, overwatch, operator, analyst, redteam, controller, commander, and debrief.
- User agents: ~/.pi/agent/flow-agents/*.md.
- Project agents: .pi/flow-agents/*.md when agentScope: "project" or "all" is used.
Safety model
Project-local agents are repo-controlled prompts. In interactive pi sessions, pi-flows asks before running them. In headless (non-UI) runs, pi-flows fails closed by default and refuses project-local agents unless you explicitly pass confirmProjectAgents:false after reviewing the files.
pi-flows also redacts secret-shaped content and home paths from returned content/details by default. Inter-agent handoffs — where one child's output becomes another child's prompt ({previous} in chain, the evaluate artifact, vote ballots, orchestrate findings) — are an indirect prompt-injection surface, so pi-flows strips invisible/bidi characters and flags instruction-override markers in that content before reuse, surfacing a warning rather than silently trusting it. See Privacy & telemetry.
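The handoff scan described above can be pictured roughly as follows. This is a minimal sketch, not pi-flows' actual implementation: the function name scanHandoff, the HandoffScan shape, and the marker patterns are all hypothetical, and the real extension may check a different set of characters and phrases.

```typescript
// Hypothetical sketch: names and marker patterns are illustrative, not pi-flows' API.
interface HandoffScan {
  text: string;       // content with invisible/bidi characters removed
  warnings: string[]; // findings to surface to the caller instead of trusting silently
}

// Zero-width characters, bidi embedding/override controls, bidi isolates, and the BOM.
const INVISIBLE_OR_BIDI =
  /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u2069\uFEFF]/g;

// A small, assumed sample of instruction-override phrasings.
const OVERRIDE_MARKERS: RegExp[] = [
  /ignore (all )?(previous|prior|above) instructions/i,
  /disregard (the )?(system|previous) prompt/i,
  /you are now\b/i,
];

function scanHandoff(raw: string): HandoffScan {
  const warnings: string[] = [];
  const text = raw.replace(INVISIBLE_OR_BIDI, "");
  if (text.length !== raw.length) {
    warnings.push(`stripped ${raw.length - text.length} invisible/bidi character(s)`);
  }
  for (const marker of OVERRIDE_MARKERS) {
    if (marker.test(text)) {
      warnings.push(`possible instruction override: ${marker.source}`);
    }
  }
  return { text, warnings };
}
```

The sketch flags suspicious content rather than deleting it, which mirrors the "surface a warning" behavior described above; whether the real scan also blocks the handoff is not stated here.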
Cost is bounded as well as count and time: pass maxCostUsd / maxTokens to cap cumulative spend across the whole flow tree (BUDGET_EXCEEDED once reached). Concurrent fan-out also refuses multiple write-capable agents in the same cwd (SHARED_WRITE_CWD) unless allowSharedWriteCwd:true is explicit. Read-only agents (recon, analyst) ship without a shell, so their read-only boundary is enforced by the toolset, not by prompt instructions alone.
flow tool quick reference
List
{ "list": true }
Show effective config
{ "showConfig": true }
Single
{ "agent": "recon", "task": "Find the API routes for billing" }
Parallel
{
  "tasks": [
    { "agent": "recon", "task": "Find frontend auth code" },
    { "agent": "recon", "task": "Find backend auth code" }
  ],
  "concurrency": 2
}
Defaults: concurrency=4 (per-call). maxParallelTasks is a fixed hard cap of 8, not a per-call input.
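To make the two limits concrete, here is a rough sketch of how a bounded fan-out could apply them. It assumes the hard cap of 8 applies to the number of tasks in one call, which the text above does not spell out; planConcurrency and runBounded are hypothetical names, not part of pi-flows.

```typescript
// Hypothetical sketch of a bounded fan-out; not pi-flows' actual scheduler.
const DEFAULT_CONCURRENCY = 4; // per-call default stated above
const MAX_PARALLEL_TASKS = 8;  // fixed hard cap stated above (assumed to limit task count)

function planConcurrency(taskCount: number, requested?: number): number {
  if (taskCount > MAX_PARALLEL_TASKS) {
    throw new Error(`too many parallel tasks: ${taskCount} > ${MAX_PARALLEL_TASKS}`);
  }
  const wanted = Math.floor(requested ?? DEFAULT_CONCURRENCY);
  // Never run more lanes than there are tasks, and always at least one.
  return Math.max(1, Math.min(wanted, taskCount));
}

async function runBounded<T, R>(
  items: readonly T[],
  requested: number | undefined,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const lanes = planConcurrency(items.length, requested);
  const results = new Array<R>(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until the queue is empty.
  const runLane = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  };
  await Promise.all(Array.from({ length: lanes }, runLane));
  return results; // results keep the input order regardless of finish order
}
```

In this sketch, a call with two tasks and no explicit concurrency runs both at once, while a call with eight tasks runs four at a time.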
Chain
{
  "task": "Add Redis caching to the session store",
  "chain": [
    { "agent": "recon", "task": "Research this task: {task}" },
    { "agent": "strategist", "task": "Plan using this context:\n\n{previous}" }
  ]
}
Chain {previous} handoffs are capped, redacted, and scanned for injection before they become the next prompt.
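As an illustration of the placeholder handling, the sketch below expands {task} and {previous} in a step template and caps the handoff first. The function name renderChainStep and the 8000-character cap are assumptions made for the example; the real limit and the redaction and injection scan are not reproduced here.

```typescript
// Hypothetical sketch of chain template expansion; the cap value is an assumption.
const MAX_HANDOFF_CHARS = 8000;

function renderChainStep(template: string, task: string, previous: string): string {
  const capped =
    previous.length > MAX_HANDOFF_CHARS
      ? previous.slice(0, MAX_HANDOFF_CHARS) + "\n[truncated]"
      : previous;
  // A single pass with a replacer function: substituted values are inserted
  // literally and are never re-scanned for placeholders.
  return template.replace(/\{(task|previous)\}/g, (_match: string, key: string) =>
    key === "task" ? task : capped,
  );
}
```

Substituting in one pass matters for safety: a child output that happens to contain the literal text {task} is passed through as-is rather than being expanded a second time.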
Evaluate (generator-evaluator loop)
{
  "task": "Add a /health endpoint that returns 200 and a JSON status, with a test",
  "evaluate": {
    "operator": { "agent": "operator" },
    "redteam": { "agent": "redteam" },
    "checkCommand": "npm test",
    "maxIterations": 3
  }
}
The operator builds against task; a separate redteam judges the artifact (not the builder's trace) and returns VERDICT: PASS or VERDICT: REVISE with critique. On REVISE the operator is re-shown its prior artifact plus the critique and revises in place. The loop revises until it passes or hits maxIterations (default 3, cap 8).
Two optional reliability levers are available. checkCommand is a deterministic gate: a shell command that must exit 0, giving you code-level assertions alongside the LLM critic; a non-zero exit is an automatic REVISE. redteam may also be an array of critics (a decomposed panel, for example one critic per dimension), in which case PASS requires all of them to pass. See Flow reference.
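The control flow of the loop can be sketched as below. This is a simplified, synchronous model under stated assumptions: the real tool runs subprocesses asynchronously, EvaluateDeps and evaluateLoop are hypothetical names, and treating a missing or malformed verdict as REVISE is an assumption of the sketch.

```typescript
// Hypothetical, synchronous model of the generator-evaluator loop.
type Verdict = "PASS" | "REVISE";

interface EvaluateDeps {
  build(task: string, priorArtifact: string | null, critique: string | null): string;
  critics: Array<(artifact: string) => string>; // each returns text containing a VERDICT line
  check?: () => number; // exit code of the deterministic gate, if configured
}

interface EvaluateResult {
  passed: boolean;
  iterations: number;
  artifact: string | null;
}

function parseVerdict(criticOutput: string): Verdict {
  const match = /^\s*VERDICT:\s*(PASS|REVISE)\b/m.exec(criticOutput);
  // Assumption: anything other than an explicit PASS counts as REVISE.
  return match !== null && match[1] === "PASS" ? "PASS" : "REVISE";
}

function evaluateLoop(task: string, deps: EvaluateDeps, maxIterations = 3): EvaluateResult {
  const limit = Math.max(1, Math.min(maxIterations, 8)); // default 3, cap 8
  let artifact: string | null = null;
  let critique: string | null = null;
  for (let iteration = 1; iteration <= limit; iteration++) {
    const current = deps.build(task, artifact, critique);
    artifact = current;
    const outputs = deps.critics.map((critic) => critic(current));
    const gateOk = deps.check === undefined || deps.check() === 0;
    const panelOk = outputs.every((output) => parseVerdict(output) === "PASS");
    if (gateOk && panelOk) {
      return { passed: true, iterations: iteration, artifact: current };
    }
    critique = outputs.join("\n\n"); // shown to the builder on the next pass
  }
  return { passed: false, iterations: limit, artifact };
}
```

Note that the critics only ever see the artifact, not the builder's working trace, which matches the separation described above.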
Vote (parallelization / voting)
{
  "task": "Is /^(a+)+$/ vulnerable to catastrophic backtracking?",
  "vote": {
    "voters": [{ "agent": "recon" }, { "agent": "overwatch" }],
    "debrief": { "agent": "debrief" }
  }
}
Runs the same task across ≥2 voters (use different models to break correlated errors) and synthesizes one answer via the optional debrief aggregator. Without it, all answers are returned.
Route (classify → dispatch)
{ "task": "The billing webhook returns 500s in prod", "route": { "candidates": ["recon", "strategist", "overwatch"], "fallback": "recon" } }
The controller picks one candidate (ROUTE: <agent>) and runs it — or emits ROUTE: none when nothing fits, falling back instead of forcing a guess.
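A rough sketch of how the controller's reply might be turned into a dispatch decision follows. The function name resolveRoute is hypothetical, and treating a pick outside the candidate list the same as ROUTE: none is an assumption of this example, not documented behavior.

```typescript
// Hypothetical sketch of route resolution; not pi-flows' actual parser.
function resolveRoute(
  controllerOutput: string,
  candidates: readonly string[],
  fallback?: string,
): string | null {
  const match = /^\s*ROUTE:\s*(\S+)/m.exec(controllerOutput);
  const picked = match?.[1];
  if (picked !== undefined && picked !== "none" && candidates.includes(picked)) {
    return picked;
  }
  // Assumption: "none", a missing ROUTE line, and an unknown agent all fall back.
  return fallback ?? null;
}
```

With the request above, a reply of ROUTE: strategist dispatches to strategist, while ROUTE: none dispatches to the recon fallback.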
Orchestrate (decompose → fan out → synthesize)
{
  "task": "Document how auth works across the codebase",
  "returnContract": "Return sections for login, token refresh, session storage, and gaps.",
  "requireEvidence": true,
  "orchestrate": {
    "recon": { "agent": "recon" },
    "verify": { "agent": "overwatch" },
    "verifyPolicy": "revise",
    "maxSubtasks": 5
  }
}
The commander splits the task into a JSON list of subtasks, recon workers run them in parallel, and the debrief agent merges the findings. An optional verify critic checks the merged answer against the goal in the same call. verifyPolicy controls what happens to its verdict: "note" keeps the verdict advisory, "fail" hard-fails on REVISE, and "revise" reruns debrief with the critique until the answer passes or verifyMaxIterations is reached.
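The three verifyPolicy values can be read as a small decision table. The sketch below is one way to model it; nextVerifyAction and the action names are hypothetical, and what the tool actually returns once verifyMaxIterations is exhausted is not stated here, so the sketch only reports that the limit was hit.

```typescript
// Hypothetical decision table for verifyPolicy; action names are illustrative.
type VerifyPolicy = "note" | "fail" | "revise";
type Verdict = "PASS" | "REVISE";
type VerifyAction =
  | "accept"
  | "accept-with-note"
  | "fail"
  | "rerun-debrief"
  | "stop-exhausted";

function nextVerifyAction(
  policy: VerifyPolicy,
  verdict: Verdict,
  iteration: number,          // 1-based count of verify passes so far
  verifyMaxIterations: number,
): VerifyAction {
  if (verdict === "PASS") return "accept";
  switch (policy) {
    case "note":
      return "accept-with-note"; // verdict is advisory only
    case "fail":
      return "fail";             // REVISE is a hard failure
    case "revise":
      return iteration < verifyMaxIterations ? "rerun-debrief" : "stop-exhausted";
  }
}
```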
Cost budget and tracing
Any mode accepts a cumulative spend ceiling and a trace sink:
{ "task": "...", "orchestrate": {}, "maxCostUsd": 0.50, "traceFile": "flow-trace.jsonl", "traceLabel": "release-gate" }
maxCostUsd / maxTokens cap total spend across the whole flow tree (BUDGET_EXCEEDED once reached). traceFile (or PI_FLOWS_TRACE_FILE) appends one OpenInference-shaped JSON span per child plus a root span — JSONL any OpenTelemetry backend, or a coding agent, can read. Summarize local traces with /flows report flow-trace.jsonl or npm run trace:report -- flow-trace.jsonl from a checkout.
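One way to picture a cumulative budget shared across the flow tree is a single tracker that every child reports into. This is a minimal sketch under that assumption; createBudget, Usage, and canSpawn are hypothetical names, and only the BUDGET_EXCEEDED code comes from the text above.

```typescript
// Hypothetical sketch of a shared spend tracker; not pi-flows' accounting code.
interface Usage {
  costUsd: number;
  tokens: number;
}

interface BudgetLimits {
  maxCostUsd?: number;
  maxTokens?: number;
}

type BudgetStatus = "OK" | "BUDGET_EXCEEDED";

function createBudget(limits: BudgetLimits) {
  const spent: Usage = { costUsd: 0, tokens: 0 };
  // "Once reached": hitting the ceiling exactly already counts as exceeded.
  const exceeded = (): boolean =>
    (limits.maxCostUsd !== undefined && spent.costUsd >= limits.maxCostUsd) ||
    (limits.maxTokens !== undefined && spent.tokens >= limits.maxTokens);
  return {
    // Called after each child finishes, with that child's usage.
    record(usage: Usage): BudgetStatus {
      spent.costUsd += usage.costUsd;
      spent.tokens += usage.tokens;
      return exceeded() ? "BUDGET_EXCEEDED" : "OK";
    },
    // Checked before starting another child anywhere in the tree.
    canSpawn(): boolean {
      return !exceeded();
    },
  };
}
```

Because the tracker is shared rather than per-child, a nested flow cannot spend past the ceiling by fanning out further.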
Agent definition format
Create markdown files with YAML frontmatter:
---
name: my-agent
description: What this agent does
tools: read,grep,find,ls
tier: capable
---
System prompt for the delegated agent.
tier keeps agents portable: no vendor model is hard-coded. capable runs on your pi default model; fast runs on PI_FLOWS_FAST_MODEL if you set one (e.g. a cheaper model for your provider, like openai-codex/gpt-5.4-mini) and otherwise falls back to your default as well. Flows therefore use whatever model pi is configured with, and the extension does not need updating as providers ship new models. Pin an explicit model: to override the tier (a flow-call model overrides too).
tools: none disables built-in tools; omitting tools uses pi defaults. Invalid agent files are reported in /flows status and flow showConfig:true.
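The model selection rules above could be modeled roughly as follows. The function and field names are hypothetical, and the precedence of a flow-call model over an agent's pinned model: is an assumption of this sketch; the text only says that both override the tier.

```typescript
// Hypothetical sketch of tier-based model resolution; precedence is assumed.
type Tier = "capable" | "fast";

interface ModelInputs {
  callModel?: string;     // model passed on the flow call
  agentModel?: string;    // model: pinned in the agent's frontmatter
  tier?: Tier;            // tier: from the agent's frontmatter
  fastModelEnv?: string;  // value of PI_FLOWS_FAST_MODEL, if set
  defaultModel: string;   // the model pi is configured with
}

function resolveModel(inputs: ModelInputs): string {
  if (inputs.callModel) return inputs.callModel;   // assumed to win over the agent pin
  if (inputs.agentModel) return inputs.agentModel; // explicit pin overrides the tier
  if (inputs.tier === "fast" && inputs.fastModelEnv) return inputs.fastModelEnv;
  return inputs.defaultModel; // capable, or fast with no fast model configured
}
```

Under these assumptions, an agent with tier: fast and no PI_FLOWS_FAST_MODEL set resolves to the same model as a capable agent.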
Documentation ladder
- Quickstart
- Flow reference
- Patterns
- Troubleshooting
- Privacy & telemetry
- Examples
- Contributing
- Agent instructions
- Changelog
Development
npm ci
npm run check
Useful individual checks:
npm run typecheck
npm test
npm run validate:agents
npm run pack:dry-run