@architectit/pi-guardrails
Four Laws guardrails enforcement for pi coding agent — standalone + MCP bridge
Package details
Install @architectit/pi-guardrails from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:@architectit/pi-guardrails- Package
@architectit/pi-guardrails- Version
0.2.0- Published
- May 20, 2026
- Downloads
- 48/mo · 48/wk
- Author
- architectit
- License
- MIT
- Types
- extension, skill
- Size
- 225.2 KB
- Dependencies
- 1 dependency · 2 peers
Pi manifest JSON
{
"extensions": [
"./index.ts"
],
"skills": [
"./skills"
]
}Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
@architectit/pi-guardrails
Four Laws guardrails enforcement for the pi coding agent. Works standalone (no MCP server required) and can bridge to the existing Go MCP server when available.
Installation
pi install npm:@architectit/pi-guardrails
Or manually:
npx @architectit/pi-guardrails
Architecture
Hybrid design: The extension operates in two modes:
- Standalone (default): All enforcement runs locally within the pi extension. No external server needed.
- MCP Bridge: When the Go MCP server is available, tools proxy to it for enhanced enforcement.
The standalone mode is the primary value proposition — teams don't need to run the MCP server to get guardrails protection.
The Four Laws of Agent Safety
- Read Before Editing — The agent must read a file before editing it
- Stay in Scope — The agent only operates on authorized file paths
- Verify Before Committing — Changes must be verified before committing
- Halt When Uncertain — The Three Strikes rule: 3 consecutive failures triggers a halt
Tools (28 registered)
Core Enforcement
| Tool | Purpose |
|---|---|
guardrail_init |
Initialize a guardrails session |
guardrail_record_read |
Mark a file as read (Law 1) |
guardrail_verify_read |
Check if a file was read before editing |
guardrail_set_scope |
Define authorized file paths (Law 2) |
guardrail_check_scope |
Check if a path is in scope |
guardrail_record_attempt |
Record a task attempt result (Law 4) |
guardrail_check_strikes |
Check strike count for a task |
guardrail_reset_strikes |
Reset strikes after resolution |
guardrail_check_halt |
Evaluate halt conditions (includes uncertainty score) |
guardrail_log_violation |
Log a guardrail violation |
guardrail_status |
Get current session status |
guardrail_acknowledge_halt |
Acknowledge a halt condition to resume |
Language & Pattern Rules
| Tool | Purpose |
|---|---|
guardrail_detect_language |
Auto-detect project languages |
guardrail_get_language_profile |
Get language profile with available rules |
guardrail_check_pattern |
Check code against prevention pattern rules |
guardrail_list_languages |
List available language rule sets |
guardrail_list_skills |
List all guardrails skills |
guardrail_read_skill |
Read a skill's documentation |
Regression & Validation
| Tool | Purpose |
|---|---|
guardrail_check_regression |
Check if file edits risk regressing past failures |
guardrail_verify_fixes |
Verify that past fixes are still intact |
guardrail_register_failure |
Register a failure in the cross-session registry |
guardrail_validate_replacement |
Validate edit old_content matches actual file |
guardrail_validate_git |
Validate git operations (branch protection, force-push) |
Planning & Scope
| Tool | Purpose |
|---|---|
guardrail_pre_work_check |
Generate pre-work risk checklist |
guardrail_detect_creep |
Detect feature creep against authorized scope |
guardrail_mcp |
Proxy to MCP server (when connected) |
Automatic Enforcement
The extension registers event handlers that enforce the Four Laws automatically:
- Read tracking: File reads are tracked via
tool_resultevents - Pre-edit enforcement: Edits to unread files are blocked (Law 1)
- Scope enforcement: Edits outside the authorized scope are blocked (Law 2)
- Bash safety: Dangerous commands (
rm -rf /,git push --force,sudo, etc.) are blocked - Injection defense: Scans tool inputs for prompt injection patterns
- Output validation: Detects secrets and PII in tool output
- Content filtering: Detects denied topics in output (warn-only)
- Canary tokens: Detects data exfiltration via embedded tokens (warn-only)
- Permission system: Per-tool permission levels (auto/ask/blocked)
- Halt lifecycle: Blocked operations record halt state; requires acknowledgment to resume
Language-Specific Rules
Auto-detects project languages and loads prevention rules from .guardrails/prevention-rules/languages/:
| Language | Rules | Examples |
|---|---|---|
| Python | 8 | eval/exec, subprocess shell=True, bare except, pickle, SQL injection |
| TypeScript | 7 | any type, non-null assertion, eval, innerHTML, hardcoded secrets |
| Go | 6 | ignored errors, panic, SQL concat, goroutine without context |
| Rust | 6 | unsafe blocks, unwrap, panic!, todo!, raw pointer deref |
Add new languages by creating a JSON file in .guardrails/prevention-rules/languages/.
Configuration
Config file: ~/.pi/agent/extensions/pi-guardrails/config.json
{
"mcpBinaryPath": "",
"enabledRules": ["four-laws", "three-strikes", "scope-validator"],
"autoRegister": true,
"defaultScope": [],
"maxStrikes": 3,
"statusBarEnabled": true,
"panelAutoOpen": false,
"toolPermissions": {
"defaultLevel": "auto",
"tools": {
"bash": "ask",
"write": "auto",
"edit": "auto",
"read": "auto"
}
},
"injectionDefense": {
"blockThreshold": 0.8,
"warnThreshold": 0.5,
"heuristicEnabled": true
},
"outputValidation": {
"enablePII": false,
"autoRedact": false,
"redactionText": "[REDACTED]",
"contentFilter": {
"deniedTopics": ["malicious code"],
"allowedTopics": [],
"strictMode": false
}
},
"canary": {
"prefix": "CATALOG:",
"tokenLength": 32
},
"gitPolicy": {
"protectedBranches": ["main", "master"],
"commitFormat": "conventional",
"requireAIAttribution": true
}
}
Environment variables:
PI_GUARDRAILS_MCP_API_KEY— API key for the MCP server
Halt Lifecycle
When a handler blocks an operation, a halt is recorded:
- active → operation attempted
- halted → handler blocked, reason recorded
- acknowledged →
guardrail_acknowledge_haltcalled after review
Uncertainty Scoring
guardrail_check_halt returns an uncertaintyScore (0-1):
| Score | Level | Meaning |
|---|---|---|
| 0-0.2 | Certain | No concerns |
| 0.2-0.5 | Probably | Mild uncertainty (e.g. edit without details) |
| 0.5-0.8 | Uncertain | Significant concern (e.g. delete without details) |
| 0.8-1.0 | Guessing | High risk (e.g. production-affected operations) |
Status Bar
When enabled, shows a compact status in the pi status bar:
g: ok— no issuesg: !!2/3 src/ mcp:*— 2/3 strikes, scope set tosrc/, MCP connectedg: src/ !3v mcp:.— scope set, 3 violations, MCP disconnected
TUI Dashboard
Open the guardrails overlay with:
/guardrails
The panel shows safety score, Four Laws status, strike tracker, scope, and MCP connection status.
Close with Esc or q. Scroll with j/k.
MCP Bridge
When the Go MCP server is available, the extension can proxy calls to it for enhanced enforcement:
- Configure the server endpoint in
config.jsonundermcpBinaryPath(URL for SSE, command for stdio) - Initialize a session with
guardrail_init— the extension auto-connects - Use
guardrail_mcpwith anactionparameter to call any MCP server tool - Reconnection uses exponential backoff (1s base, 30s max, 5 attempts)
Storage
All state is stored under ~/.pi/agent/extensions/pi-guardrails/:
sessions/— session state JSON filesviolations.jsonl— append-only violation log.guardrails/regression/failure-registry.jsonl— cross-session failure registryconfig.json— user configuration
22 Code Modules
FileReadStore, ScopeValidator, StrikeCounter, HaltChecker, ViolationLog, SessionStore, InjectionDetector, OutputValidator, ContentFilter, CanaryTokenManager, PermissionManager, PolicyLoader, MCPClient, PreWorkChecker, FeatureCreepDetector, PatternRuleEngine, GitValidator, LanguageDetector, RegressionGuard, ExactReplacementValidator, SandboxRunner, GuardrailsPanel
CI/CD Integration
Pre-commit Hook
cp guardrails/pre-commit.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
GitHub Actions
The .github/workflows/pi-guardrails-ci.yml workflow runs on PRs:
- Unit tests for pi-extension and guardrails modules
- Secret scanning on changed files
- Scope compliance validation