pi-captain

Captain — multi-agent pipeline orchestrator for pi. Define specialized agents, wire them into sequential/parallel/pool pipelines with quality gates, and run complex workflows.

Install pi-captain from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-captain

Package: pi-captain
Version: 0.1.1
Published: Mar 13, 2026
Downloads: 31/mo · 8/wk
Author: pierre-mike
License: MIT
Types: extension, skill
Size: 1.1 MB
Dependencies: 1 dependency · 6 peers

Pi manifest JSON

{
  "extensions": [
    "./extensions/captain"
  ],
  "skills": [
    "./skills"
  ],
  "image": "https://raw.githubusercontent.com/Pierre-Mike/pi-captain/main/assets/preview.png"
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-captain

⚠️ This is not production-ready; it is just an experiment.

Pipeline orchestrator for pi. Wire steps into sequential/parallel/pool pipelines with quality gates and run complex workflows — each step declares its own model, tools, and temperature inline.

Install

# Project-local (recommended — auto-installs for teammates)
pi install -l git:github.com/Pierre-Mike/pi-captain

# Global
pi install git:github.com/Pierre-Mike/pi-captain

Package Contents

Extensions

Path	Description
`extensions/captain/`	Captain pipeline orchestrator — all tools & commands
`extensions/native-web-search/`	`web_search` tool — live internet search via Anthropic's web search beta
`extensions/agent-loop/`	`/loop` command — repeat agent turns by goal, count, or pipeline stages
`extensions/clear/`	`/clear` and `/reset` commands — wipe session and reload
`extensions/terminal/`	`/terminal` command — open a terminal split in your editor
`extensions/zellij-tab-namer/`	Auto-renames the Zellij tab after each agent turn
`extensions/refactor-loop/`	`/refactor` command — iterative analyze→refactor→verify cycles with `refactor_pass` tool
`extensions/safety-destructive-commands/`	Blocks/confirms dangerous bash commands (`rm -rf`, `dd`, fork bombs…)
`extensions/safety-git-operations/`	Confirms risky git ops (`push --force`, `reset --hard`, `clean -f`…)
`extensions/safety-network-exfiltration/`	Blocks curl-pipe-to-shell, secret leaks, sensitive file transfers
`extensions/safety-path-protection/`	Protects `.git/`, `node_modules/`, `.env`, SSH keys from writes
`extensions/freecad/`	`freecad_*` tools — drive FreeCAD to create/export 3D models

Skills

Path	Description
`skills/captain/`	Captain skill — guides the LLM on pipeline authoring
`skills/research-swarm/`	Parallel 5-agent research with democratic scoring
`extensions/refactor-loop/refactor-loop/` (bundled in extension)	Refactor loop workflow instructions
`skills/json-canvas/`	JSON Canvas format for Obsidian `.canvas` files
`skills/extension-generator/`	Build and debug pi extensions
`skills/skill-generator/`	Generate new pi skills
`extensions/freecad/skill/`	FreeCAD agent wrapper and shell runner

Selective Install

Don't want everything? Use the object form in your settings.json to load only what you need.

Extension only (no skill):

{
  "packages": [
    {
      "source": "git:github.com/Pierre-Mike/pi-captain",
      "skills": []
    }
  ]
}

Skill only (no extension tools):

{
  "packages": [
    {
      "source": "git:github.com/Pierre-Mike/pi-captain",
      "extensions": []
    }
  ]
}

Or run `pi config` after installing to interactively toggle extensions and skills on or off.

What You Get

Tools

Tool	Description
`captain_load`	Load a builtin pipeline preset or `.ts` pipeline file
`captain_run`	Execute a pipeline with input
`captain_status`	Check pipeline progress, tokens, cost, and gate results
`captain_list`	List all defined pipelines
`captain_generate`	Generate a TypeScript pipeline file on-the-fly using LLM
`captain_validate`	Validate a pipeline specification for structural correctness

Builtin Pipeline Presets

Preset	Description
`githubPrReview`	GitHub PR review pipeline
`reqDecompose`	Requirements decomposition
`reqDecomposeAi`	AI-powered requirements decomposition
`requirementsGathering`	Requirements gathering workflow
`researchSwarm`	Research swarm coordination
`showcase`	Self-contained demo exercising all features
`shredder`	Document shredding/analysis
`specTdd`	Specification-driven TDD

Pipelines as TypeScript Files

The preferred way to write pipelines is as `.ts` files that export a `pipeline` const of type `Runnable`. Gates, OnFail handlers, and Transforms are plain functions, so no JSON encoding is needed.

// my-pipeline.ts
import { retry, skip } from "<captain>/gates/on-fail.js";
import { bunTest, user } from "<captain>/gates/presets.js";
import { full, summarize } from "<captain>/transforms/presets.js";
import type { Runnable, Step } from "<captain>/types.js";

const research: Step = {
  kind: "step",
  label: "Research",
  model: "sonnet",
  tools: ["read", "bash"],
  prompt: "Research the following topic thoroughly:\n$ORIGINAL",
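  gate: undefined,         // no gate: the output is not validated
  onFail: skip,            // if the step errors, skip it and continue with empty output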
  transform: full,
};

const implement: Step = {
  kind: "step",
  label: "Implement",
  model: "sonnet",
  tools: ["read", "bash", "edit", "write"],
  prompt: "Based on this research:\n$INPUT\n\nImplement: $ORIGINAL",
  gate: bunTest,           // runs `bun test`, passes on exit 0
  onFail: retry(3),
  transform: full,
};

const review: Step = {
  kind: "step",
  label: "Review",
  model: "flash",
  tools: ["read", "bash"],
  temperature: 0.3,
  prompt: "Review this implementation:\n$INPUT\n\nOriginal: $ORIGINAL",
  gate: user,              // human approval in interactive UI
  onFail: skip,
  transform: summarize(),
};

export const pipeline: Runnable = {
  kind: "sequential",
  steps: [research, implement, review],
};

Load and run:

captain_load: action="load", name="./my-pipeline.ts"
captain_run: name="my-pipeline", input="Build a REST API for user management"

.pi/pipelines/ convention: User pipeline files placed in .pi/pipelines/ should start with two header comments so the name and description are discoverable without importing the module (this is required for captain_generate output):

// @name: my-pipeline-name
// @description: One-line description of what this pipeline does

These files are auto-discovered by captain_load (action: "list") and /captain-load.
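
As a rough illustration of why the header comments help, a loader could read only the leading comment lines of a file instead of importing it. The sketch below is an assumption about how such discovery might work, not pi-captain's actual loader; `parsePipelineHeader` and `PipelineHeader` are hypothetical names.

```typescript
// Hypothetical helper (not part of pi-captain): reads "// @name:" and
// "// @description:" from the leading comment lines of a pipeline file.
interface PipelineHeader {
  name?: string;
  description?: string;
}

function parsePipelineHeader(source: string): PipelineHeader {
  const header: PipelineHeader = {};
  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    // Stop at the first line that is not a "//" comment.
    if (!trimmed.startsWith("//")) break;
    const match = /^\/\/\s*@(name|description):\s*(.+)$/.exec(trimmed);
    if (match === null) continue;
    if (match[1] === "name") header.name = match[2].trim();
    else header.description = match[2].trim();
  }
  return header;
}

const exampleSource = [
  "// @name: my-pipeline-name",
  "// @description: One-line description of what this pipeline does",
  'import type { Runnable } from "<captain>/types.js";',
].join("\n");

console.log(parsePipelineHeader(exampleSource));
```

Because the scan stops at the first non-comment line, headers placed after an import would not be picked up in this sketch, which is one reason to keep them at the very top of the file.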

Type Reference

`Runnable` (union)

A Runnable is anything that can be placed inside a pipeline. Step is the atomic leaf; Sequential, Parallel, and Pool each contain other Runnables, so pipelines can nest to any depth.

Runnable = Step | Sequential | Pool | Parallel
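
To make the nesting concrete, here is a minimal sketch that walks a nested tree using the `kind` discriminant. The types are simplified local copies of the shapes in this README (the real ones carry more fields), and `countStepRuns` is a hypothetical helper, not a pi-captain export.

```typescript
// Simplified local copies of the README's shapes (assumption: only the
// fields needed for traversal are kept).
type Step = { kind: "step"; label: string };
type Sequential = { kind: "sequential"; steps: Runnable[] };
type Parallel = { kind: "parallel"; steps: Runnable[] };
type Pool = { kind: "pool"; step: Runnable; count: number };
type Runnable = Step | Sequential | Pool | Parallel;

// Hypothetical helper: how many step invocations a tree expands to,
// ignoring retries.
function countStepRuns(node: Runnable): number {
  switch (node.kind) {
    case "step":
      return 1;
    case "sequential":
    case "parallel":
      return node.steps.reduce((sum, child) => sum + countStepRuns(child), 0);
    case "pool":
      return node.count * countStepRuns(node.step);
  }
}

const tree: Runnable = {
  kind: "sequential",
  steps: [
    { kind: "step", label: "Plan" },
    {
      kind: "parallel",
      steps: [
        { kind: "step", label: "Tests" },
        { kind: "pool", count: 3, step: { kind: "step", label: "Draft" } },
      ],
    },
  ],
};

console.log(countStepRuns(tree)); // 5
```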

`Step` — atomic LLM invocation

Each step runs as an in-process pi SDK session. All config is declared inline on the step.

{
  kind: "step",                    // required — literal "step"
  label: string,                   // required — human-readable name shown in UI
  prompt: string,                  // required — instructions for the step
                                   //   $INPUT    → output of the previous step (or user input on step 1)
                                   //   $ORIGINAL → the original user request, always unchanged

  // ── Step config ───────────────────────────────────────────────────────
  model?: string,                  // optional — model identifier; default: current session model
                                   //   Examples: "sonnet", "flash", "claude-opus-4-5"
  tools?: string[],                // optional — tool names to enable
                                   //   Default: ["read","bash","edit","write"]
                                   //   Available: "read" | "bash" | "edit" | "write" | "grep" | "find" | "ls"
  temperature?: number,            // optional — sampling temperature (0–1)
  systemPrompt?: string,           // optional — system prompt for the LLM session
  skills?: string[],               // optional — absolute paths to .md skill files to inject
  extensions?: string[],           // optional — absolute paths to .ts extension files to load
  jsonOutput?: boolean,            // optional — if true, instructs step to return structured JSON; default: false

  // ── Step metadata ─────────────────────────────────────────────────────
  description?: string,            // optional — longer description (defaults to label)

  // ── Lifecycle ─────────────────────────────────────────────────────────
  gate?: Gate,                     // optional — validation after this step runs
  onFail?: OnFail,                 // optional — what to do if gate fails or step errors
  transform: Transform,            // required — how to pass output to the next step
}

Example step (TypeScript):

import { bunTest } from "<captain>/gates/presets.js";
import { retry } from "<captain>/gates/on-fail.js";
import { full } from "<captain>/transforms/presets.js";

const buildStep: Step = {
  kind: "step",
  label: "Build & Test",
  model: "sonnet",
  tools: ["read", "bash", "edit", "write"],
  prompt: "Implement $ORIGINAL. Make all tests pass.",
  gate: bunTest,
  onFail: retry(3),
  transform: full,
};
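
The `$INPUT` and `$ORIGINAL` placeholders can be pictured as plain string substitution. The following is a sketch under that assumption; `renderPrompt` is a hypothetical name and the real runner may resolve placeholders differently.

```typescript
// Hypothetical sketch of placeholder substitution (not pi-captain's code).
// A single pass with a replacer function means text inserted for $INPUT is
// never re-scanned, and "$" sequences in the values are taken literally.
function renderPrompt(template: string, input: string, original: string): string {
  return template.replace(/\$(INPUT|ORIGINAL)/g, (_match, name: string) =>
    name === "INPUT" ? input : original,
  );
}

const request = "Add a /health endpoint";

// On the first step, $INPUT is the user input itself.
const firstPrompt = renderPrompt("Research:\n$ORIGINAL", request, request);

// On later steps, $INPUT is the previous step's (transformed) output.
const secondPrompt = renderPrompt(
  "Based on this research:\n$INPUT\n\nImplement: $ORIGINAL",
  "notes from step 1",
  request,
);

console.log(firstPrompt);
console.log(secondPrompt);
```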

`Gate` — plain validation function

A gate is a plain function that receives the step output and optional side-effect context.
Return true to pass, or a string describing why it failed. Throwing is also treated as a failure.

type Gate = (params: {
  output: string;
  ctx?: GateCtx;
}) => true | string | Promise<true | string>;

Inline gates:

// Simple content check
gate: ({ output }) => output.includes("DONE") ? true : 'Output must contain "DONE"'

// JSON validity check
gate: ({ output }) => {
  try { JSON.parse(output.trim()); return true; }
  catch { return "Output is not valid JSON"; }
}

// Shell command via ctx
gate: async ({ ctx }) => {
  const { code, stderr } = await ctx!.exec("bash", ["-c", "bun test"]);
  return code === 0 ? true : `Tests failed: ${stderr.slice(0, 200)}`;
}

// Stateful gate using closure
let attempts = 0;
gate: ({ output }) => {
  attempts++;
  return attempts >= 3 ? true : `Need 3 attempts, got ${attempts}`;
}
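
The three outcomes described above (return `true`, return a reason string, or throw) can be normalised into one result shape. This is a minimal sketch of how a runner might do that; `evaluateGate` is a hypothetical name, and the optional `ctx` parameter is left out to keep the block self-contained.

```typescript
type GateResult = true | string;
// Simplified local copy of the Gate type (assumption: ctx omitted).
type Gate = (params: { output: string }) => GateResult | Promise<GateResult>;

// Hypothetical runner-side helper: a thrown error is treated the same way
// as a returned failure reason.
async function evaluateGate(
  gate: Gate,
  output: string,
): Promise<{ passed: boolean; reason?: string }> {
  try {
    const result = await gate({ output });
    return result === true ? { passed: true } : { passed: false, reason: result };
  } catch (err) {
    return { passed: false, reason: err instanceof Error ? err.message : String(err) };
  }
}

const isJson: Gate = ({ output }): GateResult => {
  JSON.parse(output); // throws on invalid JSON, which counts as a failure
  return true;
};

evaluateGate(isJson, '{"ok":true}').then((result) => console.log(result));
evaluateGate(isJson, "not json").then((result) => console.log(result));
```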

Gate presets (import from gates/presets.js):

Export	Description
`command(cmd)`	Run shell command - exit 0 passes, non-zero fails
`file(path)`	File must exist
`regexCI(pattern)`	Output must match regex (case-insensitive)
`allOf(...gates)`	All gates must pass
`user`	Require human confirmation via interactive UI
`bunTest`	Preset: run `bun test`
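
Since gates are plain functions, presets like these can be thought of as small factories and combinators. The sketch below shows one plausible way a case-insensitive regex gate and an all-must-pass combinator could be written; `regexGate` and `allGates` are hypothetical stand-ins, and the actual `regexCI` and `allOf` implementations may differ (for example in the failure message or in whether gates run in order).

```typescript
type GateResult = true | string;
// Simplified local copy of the Gate type (assumption: ctx omitted).
type Gate = (params: { output: string }) => GateResult | Promise<GateResult>;

// Sketch of a case-insensitive regex gate.
function regexGate(pattern: string): Gate {
  const re = new RegExp(pattern, "i");
  return ({ output }): GateResult =>
    re.test(output) ? true : `Output must match /${pattern}/i`;
}

// Sketch of an all-must-pass combinator: runs gates in order and returns
// the first failure reason, or true when every gate passes.
function allGates(...gates: Gate[]): Gate {
  return async (params): Promise<GateResult> => {
    for (const gate of gates) {
      const result = await gate(params);
      if (result !== true) return result;
    }
    return true;
  };
}

const shortAndDone = allGates(
  regexGate("done"),
  ({ output }): GateResult => (output.length < 20 ? true : "Output is too long"),
);

Promise.resolve(shortAndDone({ output: "All DONE" })).then((r) => console.log(r));
```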

LLM gate (import from gates/llm.js):

Export	Description
`llmFast(prompt, threshold?)`	LLM evaluates output quality (0–1 threshold, default 0.7)

import { llmFast } from "<captain>/gates/llm.js";
gate: llmFast("Is this implementation production-ready?", 0.8)
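
Conceptually, a threshold gate scores the output and compares the score with a cut-off. The sketch below illustrates that idea with a pluggable scorer so it runs without an LLM. `thresholdGate` and `Scorer` are hypothetical names, and the `>=` comparison is an assumption, since this README does not say how `llmFast` treats a score equal to the threshold.

```typescript
type GateResult = true | string;
// Simplified local copy of the Gate type (assumption: ctx omitted).
type Gate = (params: { output: string }) => GateResult | Promise<GateResult>;

// Hypothetical scorer: returns a quality score in [0, 1]. In a real gate
// this would be an LLM call.
type Scorer = (question: string, output: string) => Promise<number>;

function thresholdGate(score: Scorer, question: string, threshold = 0.7): Gate {
  return async ({ output }): Promise<GateResult> => {
    const value = await score(question, output);
    return value >= threshold
      ? true
      : `Score ${value.toFixed(2)} is below threshold ${threshold}`;
  };
}

// Stub scorer for demonstration only: longer answers score higher.
const stubScorer: Scorer = async (_question, output) => Math.min(output.length / 100, 1);

const readyGate = thresholdGate(stubScorer, "Is this implementation production-ready?", 0.8);
Promise.resolve(readyGate({ output: "too short" })).then((r) => console.log(r));
```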

`OnFail` — plain failure-handling function

An OnFail is a plain function that receives failure context and returns what to do next.

type OnFail = (ctx: OnFailCtx) => OnFailResult | Promise<OnFailResult>;

interface OnFailCtx {
  reason: string;      // Gate failure reason
  retryCount: number;  // Retries already attempted (0 on first failure)
  stepCount: number;   // Total times step has run (retryCount + 1)
  output: string;      // Last output before failure
}

type OnFailResult =
  | { action: "retry" }
  | { action: "fail" }
  | { action: "skip" }
  | { action: "warn" }
  | { action: "fallback"; step: Step };

OnFail presets (import from gates/on-fail.js):

Export	Description
`retry(max?)`	Re-run up to N times (default 3), then fail
`retryWithDelay(max, delayMs)`	Retry with pause between attempts
`fallback(step)`	Run an alternative step instead
`skip`	Skip scope - mark as skipped, continue with empty output
`warn`	Log warning but treat as passed and continue

Custom inline:

// Retry twice, then warn
onFail: ({ retryCount }) => retryCount < 2 ? { action: "retry" } : { action: "warn" }
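
To show how `retryCount` drives the decision, here is a sketch of a retry-style handler and a small simulation of a gate that keeps failing. `retrySketch` and `simulate` are hypothetical names, the types are trimmed local copies (the `fallback` variant is omitted), and the real `retry` preset may differ in detail.

```typescript
interface OnFailCtx {
  reason: string;
  retryCount: number;
  stepCount: number;
  output: string;
}
type OnFailResult =
  | { action: "retry" }
  | { action: "fail" }
  | { action: "skip" }
  | { action: "warn" };
type OnFail = (ctx: OnFailCtx) => OnFailResult | Promise<OnFailResult>;

// Sketch of a retry handler: retry until `max` retries are used, then fail.
function retrySketch(max = 3): OnFail {
  return ({ retryCount }): OnFailResult =>
    retryCount < max ? { action: "retry" } : { action: "fail" };
}

// Simulate a step whose gate fails every time and record each decision.
async function simulate(onFail: OnFail, failures: number): Promise<string[]> {
  const actions: string[] = [];
  for (let retryCount = 0; retryCount < failures; retryCount++) {
    const { action } = await onFail({
      reason: "gate failed",
      retryCount,
      stepCount: retryCount + 1,
      output: "",
    });
    actions.push(action);
    if (action !== "retry") break;
  }
  return actions;
}

simulate(retrySketch(2), 5).then((actions) => console.log(actions.join(" -> ")));
// retry -> retry -> fail
```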

When to use warn vs skip:

warn: Gate failed but output is still useful — pass it through. Good for advisory gates.
skip: Gate failed and output is unreliable — discard it. Good for mandatory validation.

`Transform` — plain output-shaping function

A transform is a plain function that maps one step's output to the next step's input.

type Transform = (params: {
  output: string;    // Raw output produced by the step
  original: string;  // The very first pipeline input ($ORIGINAL)
  ctx: TransformCtx; // Side-effect helpers (shell, LLM, …)
}) => string | Promise<string>;

Transform presets (import from transforms/presets.js):

Export	Description
`full`	Pass entire output unchanged (default)
`extract(key)`	Parse JSON and extract a top-level key
`summarize()`	Ask LLM to summarize in 2–3 sentences
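
A preset such as `extract(key)` is a function that returns a transform. The following sketch shows one way such a factory could look. `extractKey` is a hypothetical name, the `Transform` type is a simplified synchronous copy without `ctx`, and falling back to the raw output when the key is missing is a design choice made here, not documented behaviour of the real preset.

```typescript
// Simplified local copy of the Transform type (assumption: sync, no ctx).
type Transform = (params: { output: string; original: string }) => string;

// Sketch of an extract-style transform factory.
function extractKey(key: string): Transform {
  return ({ output }) => {
    try {
      const parsed: unknown = JSON.parse(output);
      if (typeof parsed === "object" && parsed !== null && key in parsed) {
        const value = (parsed as Record<string, unknown>)[key];
        // Strings pass through as-is; other values are re-serialised.
        return typeof value === "string" ? value : JSON.stringify(value);
      }
    } catch {
      // Not valid JSON: fall through to the raw output.
    }
    return output;
  };
}

const pickSummary = extractKey("summary");
console.log(pickSummary({ output: '{"summary":"All good","score":9}', original: "" }));
```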

Inline transforms:

// Trim whitespace
transform: ({ output }) => output.trim()

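// Pull JSON key with fallback
transform: ({ output }) => {
  try {
    const { result } = JSON.parse(output);
    // Fall back to the raw output when the key is missing or not a string
    return typeof result === "string" ? result : output;
  } catch { return output; }
}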

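// Shell post-processing
transform: async ({ output, ctx }) => {
  // Pass the step output to jq on stdin; `$1` is the extra argument after "_"
  const script = "printf '%s' \"$1\" | jq -r '.items[]'";
  const { stdout } = await ctx.exec("bash", ["-c", script, "_", output]);
  return stdout || output;
}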

`Sequential` — chain steps via `$INPUT`

{
  kind: "sequential",
  steps: Runnable[],      // ordered list of steps/sub-pipelines
  gate?: Gate,            // validates final output of the sequence
  onFail?: OnFail,        // retry = re-run entire sequence from scratch
  transform?: Transform,  // applied to final output after gate passes
}
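
The chaining behaviour can be sketched as a simple loop: each step's transformed output becomes the next step's `$INPUT`, while `$ORIGINAL` stays fixed. The block below is an illustration under simplifying assumptions (synchronous steps, no gates or retries); `MiniStep` and `runSequential` are hypothetical names and `run` stands in for the LLM call.

```typescript
// Simplified local copy of the Transform type (assumption: sync, no ctx).
type Transform = (params: { output: string; original: string }) => string;

interface MiniStep {
  label: string;
  run: (input: string, original: string) => string; // stand-in for the LLM call
  transform: Transform;
}

// Hypothetical sketch of sequential execution.
function runSequential(steps: MiniStep[], original: string): string {
  let input = original; // the first step receives the user input
  for (const step of steps) {
    const output = step.run(input, original);
    input = step.transform({ output, original });
  }
  return input;
}

const miniSteps: MiniStep[] = [
  {
    label: "Research",
    run: (input) => `notes on "${input}"`,
    transform: ({ output }) => output,
  },
  {
    label: "Implement",
    run: (input, original) => `code for "${original}" using ${input}`,
    transform: ({ output }) => output.trim(),
  },
];

console.log(runSequential(miniSteps, "add login"));
```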

`Parallel` — different steps concurrently

{
  kind: "parallel",
  steps: Runnable[],                 // each runs concurrently (own git worktree)
  merge: MergeFn,                    // how to combine branch outputs
  gate?: Gate,
  onFail?: OnFail,
  transform?: Transform,
}

`Pool` — same step × N

{
  kind: "pool",
  step: Runnable,                    // replicated N times
  count: number,
  merge: MergeFn,                    // how to combine branch outputs
  gate?: Gate,
  onFail?: OnFail,
  transform?: Transform,
}
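
Both Parallel and Pool follow a fan-out, then merge pattern. The sketch below models the Pool case with `Promise.all`. `runPool` is a hypothetical name, the `MergeFn` copy omits the `ctx` argument, and the git worktree isolation mentioned for Parallel is not modelled.

```typescript
// Simplified local copy of MergeFn (assumption: ctx omitted).
type MergeFn = (outputs: string[]) => string | Promise<string>;

// Hypothetical sketch of a pool: run the same async task `count` times
// concurrently, then hand all outputs to the merge function.
async function runPool(
  task: (index: number) => Promise<string>,
  count: number,
  merge: MergeFn,
): Promise<string> {
  const runs = Array.from({ length: count }, (_unused, index) => task(index));
  const outputs = await Promise.all(runs); // results keep branch order
  return merge(outputs);
}

const joinLines: MergeFn = (outputs) => outputs.join("\n");

runPool(async (i) => `draft ${i + 1}`, 3, joinLines).then((merged) => console.log(merged));
```

`Promise.all` rejects as soon as one branch rejects. A real orchestrator would presumably route such a failure through the node's `gate` and `onFail` instead.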

`MergeFn` — combining parallel/pool outputs

MergeFn is a plain function: (outputs: string[], ctx: MergeCtx) => string | Promise<string>.

Import named presets from merge.js:

import { concat, awaitAll, firstPass, vote, rank } from "<captain>/merge.js";

Preset	Behaviour
`concat`	Concatenate all outputs in order
`awaitAll`	Wait for all, return concatenated (alias for `concat`)
`firstPass`	Return the first non-empty output
`vote`	LLM picks the single best output
`rank`	LLM ranks all outputs and synthesizes the top one

You can also write inline merge functions:

merge: (outputs) => outputs.join("\n---\n")
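
As a further illustration, two small merge functions written from scratch: one that mimics the described behaviour of `firstPass`, and one that labels each branch. Both names (`firstNonEmpty`, `labelled`) are hypothetical, and the `MergeFn` copy is simplified to a synchronous function without `ctx`.

```typescript
// Simplified local copy of MergeFn (assumption: sync, ctx omitted).
type MergeFn = (outputs: string[]) => string;

// Sketch of a firstPass-style merge: return the first non-empty output.
const firstNonEmpty: MergeFn = (outputs) =>
  outputs.find((output) => output.trim() !== "") ?? "";

// Custom merge: label each branch so the next step can tell them apart.
const labelled: MergeFn = (outputs) =>
  outputs.map((output, i) => `## Branch ${i + 1}\n${output.trim()}`).join("\n\n");

console.log(firstNonEmpty(["", "  ", "second branch", "third branch"]));
console.log(labelled(["tests written ", "docs drafted"]));
```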

Complete Pipeline Example (TypeScript)

// research-and-build.ts
import { retry, skip, warn } from "<captain>/gates/on-fail.js";
import { bunTest } from "<captain>/gates/presets.js";
import { llmFast } from "<captain>/gates/llm.js";
import { concat } from "<captain>/merge.js";
import { full, summarize } from "<captain>/transforms/presets.js";
import type { Runnable } from "<captain>/types.js";

export const pipeline: Runnable = {
  kind: "sequential",
  steps: [
    {
      kind: "step",
      label: "Explore codebase",
      model: "flash",
      tools: ["read", "bash"],
      prompt: "Explore the codebase and understand how to implement: $ORIGINAL. Identify relevant files, patterns, and constraints.",
      onFail: skip,
      transform: full,
    },
    {
      kind: "parallel",
      steps: [
        {
          kind: "step",
          label: "Write tests",
          model: "sonnet",
          tools: ["read", "bash", "edit", "write"],
          temperature: 0.2,
          prompt: "Based on this analysis:\n$INPUT\n\nWrite failing tests for: $ORIGINAL",
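          gate: async ({ ctx }) => {
            // Red phase: pass only when `bun test` exits non-zero. Grepping for "fail"
            // is unreliable because the summary can print "0 fail" on a green run.
            const { code } = await ctx!.exec("bash", ["-c", "! bun test"]);
            return code === 0 ? true : "Tests should fail (red phase)";
          },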
          onFail: retry(2),
          transform: full,
        },
        {
          kind: "step",
          label: "Write docs",
          model: "sonnet",
          tools: ["read", "bash", "edit", "write"],
          prompt: "Based on this analysis:\n$INPUT\n\nDraft documentation for: $ORIGINAL",
          onFail: warn,
          transform: full,
        },
      ],
      merge: concat,
    },
    {
      kind: "step",
      label: "Implement",
      model: "sonnet",
      tools: ["read", "bash", "edit", "write"],
      temperature: 0.2,
      prompt: "Context:\n$INPUT\n\nImplement: $ORIGINAL\nMake all tests pass.",
      gate: bunTest,
      onFail: retry(3),
      transform: full,
    },
    {
      kind: "step",
      label: "Review",
      model: "flash",
      tools: ["read", "bash"],
      temperature: 0.3,
      prompt: "Review the implementation for $ORIGINAL. Focus on correctness, security, and maintainability.",
      gate: llmFast("Does the review indicate the implementation is ready for production?", 0.8),
      onFail: retry(1),
      transform: summarize(),
    },
  ],
};

Quick Start

# Load and run a builtin preset
> Use captain to review my PR

# Load a custom TypeScript pipeline
> captain_load: name="./my-pipeline.ts"
> captain_run: name="my-pipeline", input="refactor the auth module"

# Single-step ad-hoc via /captain-step
> /captain-step "analyze this codebase" --model flash --tools read,bash

Development

git clone https://github.com/Pierre-Mike/pi-captain.git
cd pi-captain
npm install

Scripts

Script	Description
`npm run check`	Lint & format check
`npm run fix`	Auto-fix lint & format issues
`npm test`	Run all tests

License

MIT
