ultimate-pi
Ultimate AI coding harness for pi.dev — extensible skills, Obsidian wiki knowledge layer, compressed context, deterministic output
Package details
Install ultimate-pi from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:ultimate-pi
- Package: ultimate-pi
- Version: 0.2.2
- Published: May 15, 2026
- Downloads: 823/mo · 563/wk
- Author: aryaniyaps
- License: MIT
- Types: extension, skill, prompt
- Size: 2.9 MB
- Dependencies: 7 dependencies · 0 peers
Pi manifest JSON
{
  "extensions": [
    "./.pi/extensions",
    "./.pi/providers"
  ],
  "skills": [
    "./.agents/skills"
  ],
  "prompts": [
    "./.pi/prompts"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README

The ultimate AI coding harness on top of pi.dev.
What this project is
ultimate-pi is a production-oriented harness for AI-assisted coding with strict safety and governance built in.
It gives you:
- A phase-based workflow (plan -> execute -> evaluate -> adversary -> merge)
- Enforcement that blocks unsafe behavior (for example, mutating code before planning)
- Structured artifacts in .pi/harness/ for auditability and replay
- A practical bootstrap command that sets up tools, graph, and runtime integrations
If you are new: start with the Quick Start section and run one task through the full pipeline.
5-minute quickstart
If you just want to get started fast:
- Install into your current project:
pi install npm:ultimate-pi
/reload
- Bootstrap the harness:
/harness-setup
- Run your first task:
/harness-auto "implement feature X safely"
That command runs the strict pipeline:
plan -> execute -> evaluate -> adversary -> policy decision.
If it blocks, inspect with:
/harness-trace-last
/harness-policy-status
Table of Contents
- 5-minute quickstart
- How the harness works
- Prerequisites
- Quick Start (new users)
- Run your first harness task
- Command reference
- Harness artifacts and file layout
- Safety and governance defaults
- Router tuning flow
- Troubleshooting
- Contributing
How the harness works
The harness enforces a deterministic execution lifecycle:
- Plan: Create a PlanPacket before any mutating work.
- Execute: Implement only within the approved plan scope.
- Evaluate: Run independent evaluation and produce an EvalVerdict.
- Adversary: Run adversarial review and produce an AdversaryReport.
- Policy / Merge decision: Debate consensus + severity policy decides pass, conditional_pass, block, or human_required.
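The ordering rule can be pictured as a small state machine. The sketch below is illustrative only and is not taken from the harness source: the phase names follow the lifecycle listed here (with "policy" standing in for the policy / merge decision), while createLifecycle, advance, and canMutate are hypothetical names.

```javascript
// Hypothetical sketch: phase names follow the README; all function names are assumptions.
const PHASES = ["plan", "execute", "evaluate", "adversary", "policy"];

function createLifecycle() {
  let index = 0; // every run starts in the plan phase

  return {
    // Current phase name.
    phase() {
      return PHASES[index];
    },
    // Move to the next phase; skipping ahead or moving past the end is rejected.
    advance(next) {
      if (index + 1 >= PHASES.length || PHASES[index + 1] !== next) {
        throw new Error(`cannot move from ${PHASES[index]} to ${next}`);
      }
      index += 1;
      return next;
    },
    // Mutating tools are only permitted during the execute phase.
    canMutate() {
      return PHASES[index] === "execute";
    },
  };
}
```

In this model, a lifecycle that is still in plan reports canMutate() as false, and calling advance("evaluate") before advance("execute") throws.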
Why this matters
- You get fewer silent mistakes.
- Reviews are reproducible, not opinion-only.
- Incidents and overrides are recorded in structured, machine-readable artifacts.
Prerequisites
Minimum recommended environment:
- node >= 18
- npm >= 9
- git
- python >= 3.10 (for Graphify workflow)
Optional but commonly used:
- gh CLI for GitHub workflow
- Docker (only if you want self-hosted Firecrawl)
Quick Start (new users)
From your project folder:
pi install npm:ultimate-pi
/reload
Run the full bootstrap:
/harness-setup
/harness-setup is idempotent and designed as the one-command initializer for:
- Graphify knowledge graph setup
- CLI tool installation and checks
- Harness/runtime directory scaffolding
- Extension package verification
- Model-router bootstrap configuration
Run your first harness task
Fastest path
Use the one-command pipeline:
/harness-auto "implement feature X safely"
This runs:
plan -> execute -> evaluate -> adversary -> policy decision -> commit/PR (no auto-merge)
Manual path (recommended for learning)
- Plan:
/harness-plan "implement feature X safely"
- Execute with approved plan:
/harness-run --plan <path-to-plan-packet.json>
- Evaluate:
/harness-eval --run <run-id>
/harness-review --run <run-id>
- Adversarial review:
/harness-critic --run <run-id>
- If blocked or ambiguous, record incident:
/harness-incident --run <run-id> --trigger "<reason>"
- Trace/debug:
/harness-trace --run <run-id>
Command reference
Core workflow commands
- /harness-setup - bootstrap complete environment and harness scaffolding
- /harness-auto "<task>" - run strict end-to-end pipeline
- /harness-plan "<task>" - generate read-only PlanPacket
- /harness-run --plan <file> - execute approved scope only
- /harness-eval --run <run-id> - benchmark/evaluation summary
- /harness-review --run <run-id> - independent evaluator verdict
- /harness-critic --run <run-id> - adversarial findings and merge-block signal
- /harness-incident --run <run-id> --trigger "<reason>" - incident record
- /harness-trace --run <run-id> - replay and artifact completeness
- /harness-abort [reason] - reset safely to plan phase and lock mutation until new plan
Operational/status commands
- /harness-policy-status
- /harness-budget-status
- /harness-review-integrity-status
- /harness-test-integrity-last
- /harness-trace-last
- /harness-debate-open
- /harness-debate-round
- /harness-debate-consensus
Harness artifacts and file layout
Primary harness directories:
- .pi/harness/specs/ - JSON schemas for core contracts
- .pi/harness/runs/ - per-run trace summaries + event indexes
- .pi/harness/incidents/ - incident and policy override records
- .pi/harness/debates/ - debate rounds, consensus packets, budget events
- .pi/harness/router/ - router tuning proposals and apply flow scripts
Core contract schemas in .pi/harness/specs/:
- PlanPacket
- RunTrace
- EvalVerdict
- AdversaryReport
- RoundResult
- ConsensusPacket
- BudgetExhausted
- IncidentRecord
- RouterTuningProposal
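As a rough illustration of what an "artifact completeness" check could look like, the sketch below reports which contract types are missing from a run. The contract names come from the schema list above; which of them a run must contain is an assumption made for this example, and missingArtifacts is a hypothetical helper, not part of the harness.

```javascript
// Assumption: a complete strict-pipeline run carries these four contracts.
// The README lists the schema names but does not say which are mandatory per run.
const REQUIRED_ARTIFACTS = ["PlanPacket", "RunTrace", "EvalVerdict", "AdversaryReport"];

// Returns the required contract names that are absent from the given list.
function missingArtifacts(presentNames) {
  const present = new Set(presentNames);
  return REQUIRED_ARTIFACTS.filter((name) => !present.has(name));
}
```

For example, missingArtifacts(["PlanPacket", "RunTrace"]) returns ["EvalVerdict", "AdversaryReport"], and an empty result means nothing in the assumed set is missing.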
Safety and governance defaults
The harness intentionally locks in these behaviors:
- Plan-before-mutate: write/edit/mutating shell commands blocked outside execute phase
- Mandatory adversarial review in the strict pipeline
- Review isolation: evaluator/adversary cannot share executor session context
- Budget hard-stops with structured budget_exhausted events
- Test-diff integrity checks for suspicious test weakening patterns
- Severity policy thresholds:
  - block if security >= 0.70 or correctness >= 0.70
  - block if architecture >= 0.80 or test_integrity >= 0.80
- Override policy: single human approver with explicit justification
- Never auto-merge
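The severity thresholds translate naturally into a small predicate. The following is a minimal sketch, assuming severity scores arrive as a plain object keyed by category with values between 0 and 1; that input shape and the function names are assumptions, and only the threshold values are taken from the list above.

```javascript
// Threshold values are from the README; the scores object shape is an assumption.
const BLOCK_THRESHOLDS = {
  security: 0.7,
  correctness: 0.7,
  architecture: 0.8,
  test_integrity: 0.8,
};

// Returns the categories whose score meets or exceeds its block threshold.
// Categories that are missing or non-numeric are ignored.
function blockingCategories(scores) {
  return Object.keys(BLOCK_THRESHOLDS).filter(
    (name) => typeof scores[name] === "number" && scores[name] >= BLOCK_THRESHOLDS[name]
  );
}

// True when at least one category reaches its block threshold.
function shouldBlock(scores) {
  return blockingCategories(scores).length > 0;
}
```

With these definitions, shouldBlock({ security: 0.72, correctness: 0.1 }) is true, while shouldBlock({ architecture: 0.75 }) is false because 0.75 is below the 0.80 threshold. The sketch covers only the block outcome; how pass, conditional_pass, and human_required are chosen is not described here.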
Router tuning flow
Router changes are two-step and approval-gated:
- Propose (no live mutation):
node .pi/harness/router/propose-router-tuning.mjs \
--evidence /path/to/evidence.json \
--candidate /path/to/candidate-router.json \
--proposal-out .pi/harness/router/proposals/proposal-001.json
- Apply (explicit human approval + justification + --write):
node .pi/harness/router/apply-router-proposal.mjs \
--proposal .pi/harness/router/proposals/proposal-001.json \
--approve-by "human.name" \
--justification "why this is safe" \
--write
Blind writes to .pi/model-router.json are intentionally disallowed.
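To make the approval gate concrete, here is a hedged sketch of a pre-flight check over the apply options. It does not reflect the internals of apply-router-proposal.mjs; checkApplyRequest and its option names are hypothetical and simply mirror the CLI flags shown above.

```javascript
// Hypothetical pre-flight check; option names mirror the documented CLI flags.
function checkApplyRequest({ proposal, approveBy, justification, write }) {
  const problems = [];
  if (!proposal) problems.push("missing --proposal");
  if (!approveBy || !approveBy.trim()) problems.push("missing --approve-by");
  if (!justification || !justification.trim()) problems.push("missing --justification");
  if (write !== true) problems.push("missing --write");
  // The request is acceptable only when every required input is present.
  return { ok: problems.length === 0, problems };
}
```

A request that omits the justification, for instance, comes back with ok set to false and "missing --justification" in problems, which matches the intent that no router change lands without an explicit human rationale.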
Troubleshooting
/harness-setup fails early
- Check node --version, npm --version, git --version
- Ensure Node is at least 18
Graphify not available
- Install Python 3.10+
- Then install Graphify and build/update graph
Review/integrity blocks in evaluate/adversary phase
- This means review is not isolated from execute context
- Fork/switch session, then rerun review commands
Budget hard-stop triggers
- Use /harness-budget-status
- Reduce scope, split task, or restart with a narrower plan
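For intuition about why a hard-stop fires, the sketch below models a budget as a simple counter. Only the event name budget_exhausted comes from this README; the unit of spend, the event fields, and the createBudget helper are assumptions made for illustration.

```javascript
// Hypothetical budget tracker; only the budget_exhausted event name is from the README.
function createBudget(limit) {
  let used = 0;

  return {
    // Records usage. Returns null while under the limit, or an event object
    // once the limit is reached. After that, further spend is refused.
    spend(amount) {
      if (used >= limit) {
        return { type: "budget_exhausted", limit, used };
      }
      used += amount;
      return used >= limit ? { type: "budget_exhausted", limit, used } : null;
    },
    // Units left before the hard stop, never negative.
    remaining() {
      return Math.max(0, limit - used);
    },
  };
}
```

Under this model, splitting a task into narrower plans helps because each plan starts with its own counter instead of pushing a single run past its limit.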
Suspicious test diff warning
- Use /harness-test-integrity-last
- Restore or justify test changes; expect adversarial scrutiny
Contributing
For local dev setup, lint/test commands, Firecrawl notes, extension details, and architectural quality gate workflow, see: