pi-atelier: Making AI Coding Assistants Professional
Who Is This Book For?
This book is for you if you’re doing any of the following:
- Writing code with AI coding assistants (pi, Cursor, Copilot, etc.)
- Feeling like your AI assistant is “almost” good enough
- Wanting to evolve AI from a “Q&A tool” into a “project partner”
What Is pi-atelier?
pi-atelier is a set of pi extensions that give AI coding assistants project management capabilities.
A regular AI assistant can write code, but:
- It forgets everything between sessions
- It tends to go off-track on large tasks
- It follows no rules, so it easily makes silly mistakes
- It gets dumber as the conversation grows longer
pi-atelier extensions fill these gaps:
- 🧠 Memory: Let AI retain knowledge across sessions
- 📋 Planning: Manage three-tier roadmaps: Epic → Story → Task
- 🛡️ Shepherd: Set rules for AI to prevent mistakes
- 🔍 Diagnostics: Control context quality + token consumption analysis
- 📊 Analysis: Search and revisit historical sessions
- 🗜️ Compression: Keep AI sharp in long sessions
For a detailed comparison, see the table below:
| Capability | Extension | One-Line Description |
|---|---|---|
| Memory | pi-memory | Let AI retain knowledge across sessions |
| Planning | pi-roadmap | Let AI manage Epic → Story → Task |
| Shepherd | pi-shepherd | Set rules for AI to prevent mistakes |
| Context & Diagnostics | pi-context-manager | Control what AI sees + token consumption diagnostics |
| Journal | pi-journal | Generate log reports (git activity + session events + memory changes) |
| Analysis | pi-session-analyzer | Search and revisit historical sessions |
| Compression | pi-smart-compact | Keep AI smart in long sessions |
| Scheduling | pi-scheduler | Timed reminders and recurring tasks |
| Workflow | pi-workflow | Sub-agent orchestration, parallel execution |
| Tool Library | pi-shared-utils | Common utility functions for extension development |
Reading Path
Quick Start Path (1 hour)
- Chapter 1: An AI’s Memory → Install pi-memory in 5 minutes
- Chapter 2: From Memory to Planning → Learn to manage tasks with roadmaps
- Chapter 7: Build Your Own Extension → Understand the extension mechanism
Comprehensive Path (3 hours)
Read all chapters in order. Each chapter includes:
- Pain Point: Real problems you will definitely encounter
- How It Works: How the extension works internally
- Use Cases: Real-world scenarios
- Best Practices: How to use it better
On-Demand Reference
When facing a specific problem, jump directly to the relevant chapter. Each chapter is self-contained.
Quick Install
Add the extensions you need to pi’s settings.json:
{
"packages": [
"pi-memory",
"pi-roadmap",
"pi-shepherd",
"pi-context-manager",
"pi-session-analyzer",
"pi-smart-compact",
"pi-scheduler"
]
}
Or install everything (pi-workflow and pi-shared-utils are development libraries; regular users don’t need to install them directly):
{
"packages": [
"pi-memory",
"pi-roadmap",
"pi-shepherd",
"pi-context-manager",
"pi-session-analyzer",
"pi-smart-compact",
"pi-scheduler",
"pi-workflow",
"pi-shared-utils"
]
}
Most extensions are ready to use out of the box — no additional configuration needed after installation (though you can customize as needed).
💡 Tip: pi-workflow and pi-shared-utils are development libraries used by other extensions; regular users generally don’t need to install them directly.
Important File Paths
Before you start, here are the key pi files you need to know:
| File | Path | Description |
|---|---|---|
| Global Config | ~/.pi/agent/settings.json | Install extensions, configure providers |
| Project Config | .pi/settings.json (project root) | Project-level custom configuration (overrides global) |
| Project Instructions | AGENTS.md (project root or .pi/agent/) | Project rules injected into the AI |
| Extension Install Dir | ~/.pi/agent/npm/node_modules/ | npm package installation location |
| Memory Directory | .pi/memory/ (project-level) | Project-level persistent memory |
| Global Memory | ~/.pi/agent/memory/ | Cross-project general memory |
💡 Newcomer Tip: ~ refers to your home directory: /home/your-username/ on Linux, /Users/your-username/ on macOS, and C:\Users\your-username\ on Windows.
Conventions
Examples in this book follow these conventions:
- Code blocks: Commands, file paths, code snippets
- Bold: Important concepts
- 💡 Tip: Practical tips and notes
- Tables: Quick comparisons and reference
Ready to get started? Flip to Chapter 1, and let’s begin with “memory.”
An AI’s Memory
You’ve Probably Seen This Before
You’re using an AI coding assistant to develop a project. On the first day, you spent half an hour explaining to the AI:
- The project uses Rust + Axum tech stack
- The database is DuckDB, not PostgreSQL
- The auth module uses JWT, not Session
- The deployment target is an ARM-based embedded device
The AI understood, and helped you write perfect code.
The next day, you start a new session and the AI has forgotten everything. It starts suggesting Express.js, connecting to PostgreSQL, using Session auth…
💡 This is the AI’s “goldfish memory” problem: every new session is a blank slate.
Memory: Giving AI Cross-Session Knowledge
pi-memory is the solution. It gives the AI a “notebook”:
- Auto-record: Architectural decisions, pitfalls encountered, and consensus reached during the session
- Auto-load: At the start of each new session, the AI automatically reads key knowledge from before
- Per-project isolation: Memories from different projects don’t interfere with each other
How It Works
┌─────────────────────────────────────────┐
│ AI Session │
│ │
│ ┌──────────┐ ┌──────────────────┐ │
│ │ User │ ──→ │ AI extracts │ │
│ │ Dialogue │ │ knowledge points │ │
│ └──────────┘ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ memory_update │ │
│ │ Write memory │ │
│ │ file │ │
│ └─────────────────┘ │
│ │
│ ┌──────────┐ ┌──────────────────┐ │
│ │ New │ ──→ │ before_agent_start│ │
│ │ Session │ │ Auto-load memory │ │
│ │ Start │ │ index │ │
│ └──────────┘ └──────────────────┘ │
└─────────────────────────────────────────┘
Auto-Injection: Memory Index Loaded Every Turn
At the start of each session (the before_agent_start event), pi-memory automatically does the following:
- Reads memory-prompt.md: Instructions for using the memory system (tells the AI there’s a memory feature and where the files are)
- Reads the MEMORY.md index: Global ~/.pi/agent/memory/MEMORY.md + project .pi/memory/MEMORY.md
- Injects into the system prompt: The AI can see all memory titles and keywords right at the start of every turn
This means the AI doesn’t need to actively “look up” memories: the memory index is already in its context. When the AI sees a title like JS_replace_$陷阱 (“JS replace $ pitfall”), it knows that memory exists and can use the read tool to get the full content when needed.
⚠️ Only the index is injected, not the full content. MEMORY.md only contains titles and keywords, not complete memory content. The AI needs to read a specific file to get the details.
The core structure of memory:
| Component | Role |
|---|---|
| MEMORY.md | Index file, lists all memory titles and keywords (auto-injected every turn) |
| Memory files (.md) | One topic per file, contains specific knowledge (read on demand) |
| memory_update tool | AI uses to write/update memory files + auto-update MEMORY.md index |
| memory_index tool | AI uses to view existing memories (manual query) |
| before_agent_start hook | Auto-injects memory index into system prompt at session start |
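The injection step can be pictured as plain file concatenation. The sketch below is only an illustration of that idea, not pi-memory's actual code: `build_memory_injection` is a hypothetical helper, and the throwaway files stand in for memory-prompt.md and the two MEMORY.md indexes.

```python
import tempfile
from pathlib import Path


def build_memory_injection(prompt_file: Path, index_files: list[Path]) -> str:
    """Concatenate the usage instructions and every index file that exists."""
    parts = []
    for path in [prompt_file, *index_files]:
        if path.is_file():  # a project may not have any memory yet
            parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)


# Demo with throwaway files standing in for the real locations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "memory-prompt.md").write_text(
        "Memory files live in .pi/memory/.\n", encoding="utf-8"
    )
    (root / "MEMORY.md").write_text(
        "- DuckDB Timezone Issue: DuckDB, timezone\n", encoding="utf-8"
    )
    injected = build_memory_injection(
        root / "memory-prompt.md",
        [root / "global-MEMORY.md", root / "MEMORY.md"],  # the first is missing
    )

print(injected)
```

Only index text is concatenated here, which mirrors the point above: full memory files stay on disk until the AI reads them.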
Memory File Format
Each memory file follows a consistent format:
# Title
Keywords: `kw1` `kw2` `kw3` ...
## Content
- Knowledge point 1
- Knowledge point 2
- Decision record: why option A was chosen over option B
File naming convention: topic--keyword1,keyword2,keyword3.md
For example: database-choice--DuckDB,embedded,ARM,column-store,analytical-queries.md
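The format and naming convention above can be sketched as two small helpers. This is a minimal illustration under assumptions, not how pi-memory itself builds files: `memory_filename` and `render_memory` are hypothetical names.

```python
def memory_filename(topic: str, keywords: list[str]) -> str:
    """Build 'topic--kw1,kw2,kw3.md' from a topic and its keywords."""
    return f"{topic}--{','.join(keywords)}.md"


def render_memory(title: str, keywords: list[str], points: list[str]) -> str:
    """Render a memory file: title, keyword line, then one bullet per point."""
    keyword_line = "Keywords: " + " ".join(f"`{kw}`" for kw in keywords)
    bullets = "\n".join(f"- {point}" for point in points)
    return f"# {title}\n{keyword_line}\n## Content\n{bullets}\n"


name = memory_filename(
    "database-choice",
    ["DuckDB", "embedded", "ARM", "column-store", "analytical-queries"],
)
print(name)
print(render_memory("Database Choice", ["DuckDB", "ARM"], ["DuckDB runs embedded"]))
```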
Real-World Case: The First Week of a New Project
Let’s look at a real scenario. Say you’re developing a data analysis tool:
Day 1: Project Initialization
You and the AI discussed tech choices and decided on Python + FastAPI + DuckDB. At the end of the session, the AI automatically wrote a memory:
# Tech Stack Decision
Keywords: `Python` `FastAPI` `DuckDB` `technology-choice`
## Rationale
- FastAPI: Good async support, auto-generates API documentation
- DuckDB: Embedded analytical database, no separate deployment needed, suitable for single-machine analysis
- Not PostgreSQL: the project doesn't need concurrent writes; embedded is simpler
- Python 3.12+: uses new type syntax
Day 3: Hitting a Pitfall
You ran into an issue with DuckDB’s date handling — the default timezone is UTC, but your users are in China. After the AI helped you solve it, it wrote a memory:
# DuckDB Timezone Issue
Keywords: `DuckDB` `timezone` `date` `UTC` `Asia/Shanghai`
## Problem and Solution
DuckDB uses UTC timezone by default. Running `SELECT NOW()` returns UTC time.
Solution: Set `SET timezone = 'Asia/Shanghai'` when connecting.
Note: Don't do timezone conversion at the SQL level — handle it in the Python layer with datetime for reliability.
Day 7: New Session, No Repetition Needed
You start a new session and say “add a CSV export feature to the analysis API.” The AI already knows:
- The project uses FastAPI + DuckDB
- Timezone is Asia/Shanghai
- Timezone conversion is handled in the Python layer, not in SQL
You don’t need to explain it all again.
✨ This is the value of memory: it saves the 30 minutes you’d otherwise spend re-explaining every time.
Best Practices
✅ What to Remember
- Technical decisions: Why you chose A over B
- Pitfalls encountered: The problem and the solution
- Project conventions: Naming conventions, directory structure, deployment method
- Architecture knowledge: Module relationships, data flow, key interfaces
❌ What Not to Remember
- Temporary debugging info (“this variable’s value is 42”)
- Outdated conclusions (remember to clean up periodically)
- General programming knowledge (the AI already knows how to write a for loop)
Memory File Management
More memory isn’t always better. The system has multi-layer protection:
| Threshold | Behavior |
|---|---|
| 20 files | Prompt to watch the count |
| 25 files | Warning: approaching limit, suggest cleanup/merge |
| 40 files | Hard rejection on writes — must clean up first |
Cleanup methods:
- Merge: Combine multiple files on the same topic into one
- Archive: Outdated conclusions replaced by new ones — delete the old
- Split: When a single file exceeds 200 lines, split by subtopic
Additionally, the memory system has conflict detection — if a new file has the same topic or overlaps on 3+ keywords with an existing memory, the write is rejected outright, forcing the conflict to be resolved first (merge or overwrite).
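The thresholds and the conflict check can be expressed in a few lines. The sketch below is a hedged reading of the rules above, not the extension's real logic: the function names are hypothetical, and treating the limits as "at or above" and comparing keywords case-insensitively are assumptions.

```python
def count_status(file_count: int) -> str:
    """Map the number of memory files to the behavior in the threshold table."""
    if file_count >= 40:
        return "reject"  # hard rejection: clean up first
    if file_count >= 25:
        return "warn"    # approaching the limit
    if file_count >= 20:
        return "hint"    # prompt to watch the count
    return "ok"


def parse_name(filename: str) -> tuple[str, set[str]]:
    """Split 'topic--kw1,kw2.md' into the topic and a lowercase keyword set."""
    topic, _, keywords = filename.removesuffix(".md").partition("--")
    return topic, {kw.lower() for kw in keywords.split(",") if kw}


def find_conflicts(new_name: str, existing: list[str]) -> list[str]:
    """Return existing files with the same topic or 3+ shared keywords."""
    topic, keywords = parse_name(new_name)
    hits = []
    for name in existing:
        other_topic, other_keywords = parse_name(name)
        if other_topic == topic or len(keywords & other_keywords) >= 3:
            hits.append(name)
    return hits


existing = [
    "deployment-issue--Docker,ARM,memory.md",
    "auth-bug--JWT,expiry,refresh.md",
]
print(count_status(26))
print(find_conflicts("deploy-oom--Docker,memory,ARM,OOM.md", existing))
```

A non-empty result from `find_conflicts` corresponds to the case where the write is rejected and the memories must be merged or overwritten.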
Configuration
Install pi-memory via pi’s settings.json:
{
"packages": [
"pi-memory"
]
}
No additional configuration needed — ready to use on install. Memory files are stored in the project’s .pi/memory/ directory.
Memory Scope
| Scope | Path | Usage |
|---|---|---|
| Project-level | .pi/memory/ | Project-specific knowledge (architecture, decisions, pitfalls) |
| Global | ~/.pi/agent/memory/ | Cross-project general knowledge (toolchain, coding discipline) |
Advanced Scenarios: Memory Cleanup and Knowledge Evolution
Scenario: Memory Fragmentation
After a month of use, memory files pile up:
.pi/memory/
├── database-choice--DuckDB,embedded,ARM.md
├── db-choice--database,DuckDB,performance.md ← Duplicate of above!
├── deployment-issue--Docker,ARM,memory.md
├── deployment-issue2--Docker,memory,OOM.md ← Also duplicate!
├── auth-bug--JWT,expiry,refresh.md
├── fastapi-cors--CORS,FastAPI,cross-origin.md
├── test-tricks--vitest,mock,testing.md
├── ... (20 more files)
Before each write, the AI automatically runs memory_index to check. If it finds existing memories on the same topic, it merges first before writing. But if fragmentation has already happened, manual cleanup is needed.
Cleanup steps:
- Ask the AI to run memory_index to view all current memories
- Mark files on the same topic (3+ overlapping keywords)
- Ask the AI to read the marked files and merge them into one
- Use memory_update to overwrite an existing filename (not a new one, or conflict detection will reject the write)
- Delete the old fragmented files
Scenario: Old Conclusions Overturned
A memory written last month says “use Express.js for the backend,” but this month the project decided to migrate to FastAPI. Every time the AI reads the old memory, it thinks in Express terms and gives bad suggestions.
Solution: Use memory_update to overwrite the old file with the new conclusion:
# Backend Framework Decision
Keywords: `FastAPI` `Python` `migration`
## Current Decision
2026-05: Migrated from Express.js to FastAPI.
## Migration Reasons
- Needed better async support
- Python's data analysis ecosystem is richer
- Express.js version has been archived and is no longer updated
> ⚠️ Old conclusion (deprecated): Use Express.js + TypeScript
Key point: Don’t just delete the old file — clearly document both the new conclusion and how it relates to the old one, so the AI doesn’t “reinvent” the old approach in other contexts.
Scenario: Cross-Project Knowledge Reuse
You hit a pitfall in project A and want to make sure project B doesn’t repeat the same mistake.
Solution: Write the general knowledge to global memory:
~/.pi/agent/memory/
└── npm-file-ref-traps--npm,file-ref,node_modules,cache.md
Global memory is visible to all projects. This way, no matter which project you’re in, the AI will know “npm file: references have cache traps.”
Principle:
- Project-specific knowledge (this project’s architecture, conventions) → project-level .pi/memory/
- General experience (toolchain pitfalls, coding discipline) → global ~/.pi/agent/memory/
Next Steps
Now the AI has memory and can retain knowledge across sessions. But when faced with a large project, does it know what to do first, and what to do next?
In the next chapter, we’ll look at how to teach the AI planning.
From Memory to Planning
You’ve Probably Encountered This
You ask an AI to do a large task like “migrate the project from JavaScript to TypeScript.”
Everything goes well for the first 30 minutes — the AI migrates configuration files, type definitions, and core modules as planned. But by the 5th file, the AI starts to “drift”:
- It forgets earlier conventions and starts using a different naming style
- It skips migrating test files
- It begins a “refactor while you’re at it” that you never asked for
- When you remind it to get back on track, it can’t remember what the first 3 steps of the original plan were
💡 Goldfish memory is solved, but there’s still an “attention deficit” problem: The AI can remember knowledge, but it can’t manage tasks.
Roadmap: Teaching AI to Manage Complex Tasks
pi-roadmap gives the AI a “project management brain”:
- Structured breakdown: Decomposes large goals into Epic → Story → Task three layers
- Progress tracking: Each task has a clear status (todo / doing / done / blocked)
- Persistence: Roadmaps are saved in files, so new sessions can continue where you left off
- Priority sorting: Automatically recommends what to do next
Why a Three-Layer Structure?
Epic (Big direction)
└── Story (Deliverable work chunk)
└── Task (Smallest executable unit)
This structure comes from agile development practices, with some adaptations:
| Concept | Traditional Agile | pi-roadmap | Rationale |
|---|---|---|---|
| Epic | 2-8 weeks | A complete project direction | AI sessions don’t span weeks |
| Story | 1-3 days | Completable in 1-3 sessions | Adapted to AI’s working rhythm |
| Task | 0.5-1 day | 30 min - 2 hours | Granularity AI can focus on at once |
How It Works
User describes goal
│
▼
┌──────────────────────────────────┐
│ roadmap_plan │
│ AI analyzes goal → breaks into │
│ three-layer structure │
│ compares with existing roadmap │
│ → incremental update │
└──────────────┬───────────────────┘
│
▼
┌──────────────────────────────────┐
│ ~/.pi/roadmap/<id>.roadmap.json │
│ Global storage, cross-session │
│ and cross-project access │
│ + Project-level │
│ .pi/roadmap/roadmap.json │
│ (auto-synced derivation) │
└──────────────┬───────────────────┘
│
┌─────────┼─────────┐
▼ ▼ ▼
roadmap_list roadmap_show roadmap_next
List all Show detail Recommend next
Real-World Example: Migrating a Multi-Package Project
Let’s look at a real scenario — upgrading documentation across 12 npm packages simultaneously:
Step 1: Create a Roadmap
You say: “Help me plan documentation work for all packages.”
The AI calls roadmap_plan, which automatically decomposes:
{
  "roadmapId": "package-docs",
  "title": "Package Documentation Upgrade",
  "epics": [
    {
      "id": "E0",
      "title": "Define template and validate",
      "stories": [
        {
          "id": "E0.S0",
          "title": "Analyze best practices, distill template",
          "tasks": [
            { "id": "E0.S0.T0", "title": "Research GitHub documentation standards" },
            { "id": "E0.S0.T1", "title": "Distill README template" },
            { "id": "E0.S0.T2", "title": "Validate template with first package" }
          ]
        }
      ]
    },
    {
      "id": "E1",
      "title": "Batch upgrade all packages",
      "stories": [
        { "id": "E1.S0", "title": "Core extensions (4 packages)" },
        { "id": "E1.S1", "title": "Tool extensions (4 packages)" },
        { "id": "E1.S2", "title": "Utility extensions (4 packages)" }
      ]
    }
  ]
}
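To see how the three layers nest, it can help to walk the structure in code. The sketch below flattens an abbreviated copy of the roadmap above into task IDs; `iter_tasks` is a hypothetical helper, and it assumes that a Story without a "tasks" key simply has no tasks yet.

```python
import json

roadmap = json.loads("""
{
  "roadmapId": "package-docs",
  "epics": [
    {"id": "E0", "title": "Define template and validate", "stories": [
      {"id": "E0.S0", "title": "Analyze best practices, distill template", "tasks": [
        {"id": "E0.S0.T0", "title": "Research GitHub documentation standards"},
        {"id": "E0.S0.T1", "title": "Distill README template"}
      ]}
    ]},
    {"id": "E1", "title": "Batch upgrade all packages", "stories": [
      {"id": "E1.S0", "title": "Core extensions (4 packages)"}
    ]}
  ]
}
""")


def iter_tasks(roadmap: dict):
    """Yield (epic_id, story_id, task) for every task in the roadmap."""
    for epic in roadmap.get("epics", []):
        for story in epic.get("stories", []):
            for task in story.get("tasks", []):  # stories may have no tasks yet
                yield epic["id"], story["id"], task


task_ids = [task["id"] for _, _, task in iter_tasks(roadmap)]
print(task_ids)
```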
Step 2: Advance According to Plan
In each new session, you say “continue.” The AI calls roadmap_next:
📊 Recommended next task:
E0.S0.T1 — Distill README template (high priority)
Part of: E0 Define template and validate > S0 Analyze best practices
Start?
Step 3: Mark Completion
After completing a task, the AI calls roadmap_done:
✅ E0.S0.T1 Completed
Output: templates/README-template.md
Step 4: Encountering Blockers
If source code for a package is missing, the AI can mark the task as blocked:
⚠️ E1.S1.T3 Blocked
Reason: pi-journal's API documentation is incomplete; source code comments need to be added first
Roadmap vs Memory: What’s the Relationship?
| Dimension | Memory (pi-memory) | Roadmap (pi-roadmap) |
|---|---|---|
| What it stores | Knowledge, decisions, pitfalls | Tasks, progress, plans |
| Granularity | Free-form text | Structured JSON |
| Query method | Keywords | Status / priority |
| Lifecycle | Long-term retention | Can be archived when project ends |
Simply put:
- Memory is “what I know”
- Roadmap is “what I need to do and how far along I am”
The two complement each other: memory helps the AI remember knowledge, the roadmap helps the AI remember tasks.
Best Practices
✅ Good Epic Breakdown
Epic: Publish npm package
  Story: Prepare release environment
    Task: Configure package.json exports field
    Task: Add bundledDependencies configuration
    Task: Configure tsconfig declaration output
  Story: Write documentation
    Task: Complete README.md
    Task: Add CHANGELOG.md
❌ Bad Breakdown
Epic: Do everything ← Too vague, no direction
  Story: Do the first step ← Doesn't say what to do
    Task: Start working ← Not actionable
Golden Rules of Breakdown
- Epic titles should be verb phrases: “Publish npm package” instead of “npm”
- Stories should have clear deliverables: “Complete README” instead of “Write docs”
- Tasks should be executable within 30 minutes: “Configure package.json name field” instead of “Configure build”
- Items at the same level should have the same granularity: Don’t have one Story with 2 tasks and another with 20
Advanced Scenarios: Plan Adjustment & Progress Tracking
Scenario: When Direction Needs to Change
Plans change. The roadmap you laid out yesterday may no longer fit today’s requirements. You don’t need to start over — just update with roadmap_plan:
You: Yesterday's refactoring plan is too big. I want to start with just the auth module.
AI calls roadmap_plan(action="update"):
→ Compares current roadmap with your new requirements
→ Keeps completed tasks untouched
→ Marks unnecessary tasks as dropped
→ Adds new tasks
Key principle: roadmap_plan is incremental, not overwriting. Tasks already marked done are never rolled back.
Scenario: Tracking Who Did What
In multi-session collaboration, you often wonder “which session completed this task?” The roadmap tracks this automatically:
roadmap_show(roadmapId="package-docs")
Result:
E0.S0.T0 Research GitHub documentation standards ✅ by: 8740-8fce3e7af232
E0.S0.T1 Distill README template ✅ by: b8b5-85516ead6253
E0.S0.T2 Validate template with first package ✅ by: b8b5-85516ead6253
E1.S0.T0 Core extension - pi-shepherd 🔄 doing by: aa55-a4860e851afb
The by: xxxx-xxxxxxxxxxxx suffix after each completed task is the short form of the session ID (last two segments of the UUID). You can use this ID to search for the specific session:
session_search(action="grep", query="8740-8fce3e7af232")
→ Find the session, then use session_analyze(action="summary") to view details
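Deriving the short form from a full session ID is a one-liner. The sketch below assumes the session ID is a standard five-segment UUID string; `short_session_id` is a hypothetical helper, and the full ID is made up so that it ends in the suffix shown above.

```python
def short_session_id(session_id: str) -> str:
    """Return the last two dash-separated segments of a UUID string."""
    return "-".join(session_id.split("-")[-2:])


full_id = "3f2a9c1e-7b4d-4e21-8740-8fce3e7af232"  # hypothetical full session ID
print(short_session_id(full_id))
```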
Scenario: Archiving Completed Epics
When a project is finished, you don’t want completed Epics cluttering your view:
roadmap_archive(roadmapId="package-docs")
→ Auto-archives all completed Epics
→ Hidden by default, view with show_archived=true
Scenario: Not Sure What to Do Next
When you open pi and have no idea what to continue:
roadmap_next()
Result:
📊 Recommended next task (sorted by priority):
1. E1.S0.T3 — Configure package.json files whitelist (high, todo)
2. E1.S1.T0 — Tool extension - pi-roadmap (medium, todo)
3. E2.S0.T0 — Research mdBook theme customization (low, todo)
roadmap_next automatically sorts by doing → todo, high → medium → low, telling you exactly what deserves your attention.
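The ordering described above amounts to a two-key sort. The following sketch is an assumption-laden illustration rather than pi-roadmap's implementation: `recommend` is a hypothetical name, status is assumed to rank before priority, and tasks that are neither doing nor todo are assumed to be excluded.

```python
STATUS_ORDER = {"doing": 0, "todo": 1}
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def recommend(tasks: list[dict]) -> list[dict]:
    """Keep actionable tasks and sort by status first, then by priority."""
    candidates = [t for t in tasks if t["status"] in STATUS_ORDER]
    return sorted(
        candidates,
        key=lambda t: (STATUS_ORDER[t["status"]], PRIORITY_ORDER[t["priority"]]),
    )


tasks = [
    {"id": "E2.S0.T0", "status": "todo", "priority": "low"},
    {"id": "E1.S0.T3", "status": "todo", "priority": "high"},
    {"id": "E1.S0.T0", "status": "doing", "priority": "medium"},
    {"id": "E0.S0.T0", "status": "done", "priority": "high"},
]
ordered_ids = [t["id"] for t in recommend(tasks)]
print(ordered_ids)
```

Work already in progress comes first here, so an unfinished task is recommended before a new one is started.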
Searching Roadmaps: roadmap_search
When you have many roadmaps, roadmap_search lets you quickly find tasks by keyword, searching across title + description + note:
You: Have we planned anything related to CI?
AI calls roadmap_search(query="CI")
→ Found matches across roadmaps, with full hierarchical context
Task Dependencies: dependsOn
Complex projects often have sequential dependencies — B can only start after A is done. dependsOn lets you declare dependencies on Stories and Tasks:
{
"title": "Configure CI pipeline",
"status": "todo",
"dependsOn": ["E1.S1.T3"]
}
roadmap_show automatically displays dependency relationships:
E1.S2 Publish workflow
E1.S2.T1 Configure CI pipeline 📋 depends: [E1.S1.T3]
E1.S2.T2 Publish to npm registry 📋 depends: [E1.S2.T1]
📖 For adding and updating dependencies, see pi-roadmap README.
Protection Mechanisms
- Cannot add child items to archived/completed Epics/Stories: prevents appending tasks to already completed work
- Dependency ID checking: roadmap_update verifies that dependsOn IDs actually exist
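The dependency check and the "B only after A" rule can be sketched as follows. This is an illustration under assumptions, not the extension's code: both function names are hypothetical, and items are assumed to live in a flat dictionary keyed by ID.

```python
def missing_dependencies(items: dict) -> dict:
    """Return {item_id: [unknown ids]} for dependsOn entries that do not exist."""
    missing = {}
    for item_id, item in items.items():
        unknown = [dep for dep in item.get("dependsOn", []) if dep not in items]
        if unknown:
            missing[item_id] = unknown
    return missing


def is_ready(item_id: str, items: dict) -> bool:
    """An item is ready once all of its dependencies are done.

    Call this only after missing_dependencies() reports no problems.
    """
    return all(items[dep]["status"] == "done" for dep in items[item_id].get("dependsOn", []))


items = {
    "E1.S1.T3": {"status": "done"},
    "E1.S2.T1": {"status": "todo", "dependsOn": ["E1.S1.T3"]},
    "E1.S2.T2": {"status": "todo", "dependsOn": ["E1.S2.T1"]},
}
print(missing_dependencies(items))
print(is_ready("E1.S2.T1", items), is_ready("E1.S2.T2", items))
```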
Next Steps
The AI now has memory (to remember knowledge) and a roadmap (to manage tasks). But sometimes, the AI still “makes mistakes” — modifying files it shouldn’t, using approaches it shouldn’t.
In the next chapter, we’ll look at how to set rules for the AI.
Setting Rules for AI
You’ve Probably Seen This Before
You ask the AI to “fix the login page styles.” 30 seconds later you check the code —
The AI didn’t just fix the styles. It also:
- “Conveniently” refactored the entire login component directory structure
- Switched CSS modules to Tailwind (your project doesn’t use Tailwind)
- Deleted 3 test files it deemed “unnecessary”
- Upgraded all dependencies in package.json to the latest versions
By the time you notice, the code has already been committed.
💡 The more capable the AI, the more it needs rules. Without boundaries, greater capability only causes greater damage.
Two Lines of Defense: Shepherd and Context
pi-atelier provides two layers of protection:
First Line: pi-shepherd — The Behavior Guard
Shepherd is a rule-driven event hook engine that checks AI actions before and after key moments — think of it as a security guard.
AI about to execute an action (tool call)
│
▼
┌──────────────────────────────────┐
│ Shepherd tool_call hook │
│ Check: Should it be done? │
│ How should it be done? │
└──────┬───────────────────────────┘
│
┌───┴────┐
│ │
Allow Rewrite/Block + Show Reason
... tool executes ...
┌──────────────────────────────────┐
│ Shepherd tool_result hook │
│ Check: Any follow-up needed? │
└──────┬───────────────────────────┘
│
Inject reminder / Append action
Supported hook timings:
| Hook | When It Triggers | Typical Use Case |
|---|---|---|
| tool_call | Before AI calls a tool | Rewrite commands, block dangerous operations |
| tool_result | After tool execution | Auto-remind to run tests, lint checks |
| message_end | After AI finishes replying | Match AI response text, intercept wrong guesses |
| agent_end | When AI finishes a conversation | Remind to commit code, update memory |
| session_shutdown | When a session closes | Clean up temporary data |
Shepherd’s four actions:
| Action | Effect | Typical Use Case |
|---|---|---|
| block | Prevents tool execution | Block dangerous operations |
| notify | Injects a reminder into AI context | “You edited a TS file, remember to run tests” |
| steer | Silently injects guidance (not visible to user) | Guide the AI to consult documentation |
| rewrite | Modifies tool call parameters | Auto-prepend prefix to commands |
Second Line: pi-context-manager — Information Quality & Diagnostics
Context Manager controls what information the AI sees, and also helps you diagnose token consumption issues.
Core capabilities:
- Distill: Automatically compresses large tool outputs, preserving key information
- Tool Result Processor: Formats and simplifies output from specific tools
- Aging: Automatically evicts old tool outputs that haven’t been referenced in a while
- Payload Analysis: Diagnoses where tokens are being spent with data
Tool returns large output (potentially 50KB)
│
▼
┌────────────────────────┐
│ Context Manager │
│ Distill + Processor │
│ Compress to ~5KB │
│ key information │
└────────────────────────┘
│
▼
AI sees refined information and makes better decisions
For detailed principles, see 3.3 Context Manager Deep Dive.
Real-World Examples: Preventing AI Mistakes
Scenario 1: Auto-Remind to Run Tests After Edit
{
"comment": "[TypeScript] Must run tests after editing",
"hook": "tool_result",
"tool": "edit",
"action": "notify",
"conditions": [
{ "field": "path", "pattern": "\\.ts$", "flags": "" }
],
"reason": "Edited a TypeScript file. You must run unit tests covering this code (add tests if none exist) and fix all test issues to ensure they pass.",
"enabled": true
}
When the AI edits a .ts file, Shepherd automatically reminds the AI to run tests.
Scenario 2: Session-End Reminder to Commit Code
{
"comment": "[Wrap-up] Remind to commit + update memory + summary after edits",
"hook": "agent_end",
"action": "notify",
"conditions": [{ "builtin": "has_edits" }],
"reason": "Detected file edits. Perform wrap-up:\n1️⃣ Git commit...\n2️⃣ Update memory...\n3️⃣ Session summary",
"stopReason": ["stop"],
"enabled": true
}
conditions: [{ builtin: "has_edits" }] means it only triggers when the session actually edited files. stopReason: ["stop"] means it only triggers when the AI ends normally (not when interrupted).
Scenario 3: Auto-Rewrite Commands
{
"comment": "[rtk] Auto-proxy frequent bash commands",
"tool": "bash",
"action": "rewrite",
"pattern": "^(git\\s+(status|log|diff)|cargo\\s+(test|build|clippy)|pytest)\\b",
"flags": "",
"reason": "rtk command rewrite: auto prepend rtk prefix to compress output",
"enabled": true
}
When the AI tries to run a command like git status, Shepherd automatically rewrites it as rtk git status (rtk is an output compression tool).
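To check which commands the rule's pattern would catch, the regular expression can be tried in isolation. The sketch below reuses the pattern from the rule with the JSON escaping removed. The `rewrite` function is hypothetical, and prepending "rtk " is an assumption based on the rule's description, not Shepherd's actual rewrite code.

```python
import re

# Same pattern as in the rule, with the JSON escaping removed.
PATTERN = re.compile(
    r"^(git\s+(status|log|diff)|cargo\s+(test|build|clippy)|pytest)\b"
)


def rewrite(command: str) -> str:
    """Prefix matching commands with 'rtk'; leave everything else untouched."""
    return f"rtk {command}" if PATTERN.search(command) else command


for cmd in ["git status", "git push", "cargo clippy --all", "pytest -q"]:
    print(f"{cmd!r} -> {rewrite(cmd)!r}")
```

The trailing `\b` keeps the rule from firing on commands that merely start with a listed word, such as a hypothetical `git statusx`.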
Scenario 4: Code Style Check
{
"comment": "[TS] No space indentation - TS files must use Tab",
"hook": "tool_call",
"tool": "edit",
"action": "notify",
"conditions": [
{ "field": "path", "pattern": "\\.ts$", "flags": "" },
{ "field": "text", "pattern": "\\n [\\S ]", "flags": "" }
],
"reason": "❌ TS files require Tab indentation, not spaces. Please rewrite the code using Tab indentation.",
"enabled": true
}
Both conditions must be met to trigger: the file is .ts and the code contains space indentation.
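The "all conditions must match" behavior can be sketched as a loop over regex checks. The code below is an illustration only: `condition_matches` and `rule_matches` are hypothetical names, the flag mapping assumes JavaScript-style flag letters, and the patterns are copied from the rule above with the JSON escaping removed.

```python
import re

# Assumed mapping from JavaScript-style flag letters to Python regex flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def condition_matches(condition: dict, params: dict) -> bool:
    """Test one condition's pattern against the named tool parameter."""
    flags = 0
    for letter in condition.get("flags", ""):
        flags |= FLAG_MAP.get(letter, 0)
    value = params.get(condition["field"], "")
    return re.search(condition["pattern"], value, flags) is not None


def rule_matches(conditions: list[dict], params: dict) -> bool:
    """A rule fires only when every condition matches."""
    return all(condition_matches(c, params) for c in conditions)


conditions = [
    {"field": "path", "pattern": r"\.ts$", "flags": ""},
    {"field": "text", "pattern": r"\n [\S ]", "flags": ""},
]
spaces = {"path": "src/app.ts", "text": "function f() {\n  return 1;\n}"}
tabs = {"path": "src/app.ts", "text": "function f() {\n\treturn 1;\n}"}
print(rule_matches(conditions, spaces), rule_matches(conditions, tabs))
```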
Scenario 5: Remind to Check Memory After Repeated Errors
{
"comment": "[debug] Remind to check memory when tools repeatedly fail",
"hook": "tool_result",
"action": "steer",
"state": { "countKind": "errors", "gte": 5 },
"reason": "🔍 **Tools repeatedly failing**: Multiple consecutive failures. Check memory files under .pi/memory/ to see if there are existing records of this pitfall.",
"enabled": true,
"subagent": false
}
state implements state tracking — Shepherd remembers the error count and only triggers when it reaches the threshold. subagent: false means this rule does not trigger in sub-agents.
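A threshold rule like this boils down to a counter plus a comparison. The sketch below is a minimal model of that idea, not Shepherd's internals: `StateTracker` is a hypothetical class, and it counts cumulatively without ever resetting, which is an assumption.

```python
from collections import Counter


class StateTracker:
    """Counts tool calls and errors, then checks them against a rule's state."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, is_error: bool) -> None:
        """Record one finished tool call."""
        self.counts["calls"] += 1
        if is_error:
            self.counts["errors"] += 1

    def triggered(self, state: dict) -> bool:
        """True once the chosen counter reaches the rule's 'gte' threshold."""
        return self.counts[state["countKind"]] >= state["gte"]


state = {"countKind": "errors", "gte": 5}
tracker = StateTracker()
for _ in range(4):
    tracker.record(is_error=True)
before = tracker.triggered(state)  # four errors so far
tracker.record(is_error=True)
after = tracker.triggered(state)   # fifth error reaches the threshold
print(before, after)
```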
Shepherd Rule Configuration Reference
Rule File Locations
| Level | Path | Description |
|---|---|---|
| Global default | rules.json inside the extension package | Built-in rule set for Shepherd |
| Global custom | ~/.pi/agent/extensions/shepherd/rules.json | Global custom rules (managed via the shepherd_rules tool) |
| Project-level | .pi/shepherd-rules-*.json (project root) | Custom project rules, can create multiple files |
Rule file changes take effect immediately — no action needed.
Shepherd Rule Management
Rules can be managed in two ways:
- shepherd_rules tool: Have the AI safely add, update, or delete rules, with write validation and rollback
- Direct JSON editing: Global rules at ~/.pi/agent/extensions/shepherd/rules.json, project rules at .pi/extensions/shepherd-rules.json
Rule files take effect immediately after modification, no restart needed.
📖 For complete field reference and parameter docs, see pi-shepherd README.
Typical usage:
# Ask AI to add a rule for you
You: Add a rule to run cargo clippy after editing .rs files
AI calls shepherd_rules(action="add", rule={...})
# View project-level rules
AI calls shepherd_rules(action="list", scope="project")
Configuring Context
pi-context-manager provides the following commands:
| Command | Purpose |
|---|---|
| /record [on\|off] | Toggle payload recording |
| /context | TUI panel: visualize context usage |
| /distill-config [N] | View/set distill token threshold |
| /distill-config --cap [N] | View/set first-seen full text cap (firstSeenCap, 0 = no cap) |
| /processor-config [N\|off] | View/set tool-result-processor threshold |
| /aging-config [N\|off] | View/set aging eviction rounds |
| /context-clean [sessionId] | Clean up persistent data |
💡 All commands show current config and usage when called without arguments. For example, entering /distill-config directly displays the current threshold and usage instructions.
For detailed principles, see 3.3 Context Manager Deep Dive.
Best Practices
✅ Good Rule Design
- Precise conditions: Use conditions to narrow the trigger scope; don’t use a sledgehammer
- Clear messaging: Tell the AI “why it’s not allowed” and “what to do instead”
- Layered protection: Use block (enforced) for important matters, notify (advisory) for minor ones, steer (silent) for internal guidance
- Make use of state tracking: Reminding after 3 consecutive errors is more effective than reminding every single time
❌ Bad Rule Design
- Too frequent: notify on every tool call; the AI would be flooded with reminders
- Too draconian: block all bash commands; the AI can’t even run ls
- Vague messaging: "reason": "Caution" leaves the AI asking: caution about what?
- Ignoring sub-agents: Some rules should use "subagent": false to exclude sub-agent scenarios and avoid interfering with independent tasks
Rule Priority
When multiple rules match simultaneously:
- block > notify > steer (block > remind > silent guidance)
- At the same priority, rules execute in definition order within the rule file
- In the agent_end hook, rules whose check condition is not met are skipped
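The first two priority rules map onto a stable sort. The sketch below illustrates that ordering and is not Shepherd's implementation: `order_matches` is a hypothetical name, and placing actions that the list does not rank (such as rewrite) last is an assumption.

```python
ACTION_PRIORITY = {"block": 0, "notify": 1, "steer": 2}


def order_matches(matched_rules: list[dict]) -> list[dict]:
    """Sort matched rules by action priority; ties keep definition order.

    Python's sort is stable, so rules with the same action stay in the
    order they were defined. Unranked actions go last (an assumption).
    """
    return sorted(
        matched_rules,
        key=lambda rule: ACTION_PRIORITY.get(rule["action"], len(ACTION_PRIORITY)),
    )


matched = [
    {"comment": "remind tests", "action": "notify"},
    {"comment": "check docs", "action": "steer"},
    {"comment": "no force push", "action": "block"},
    {"comment": "remind lint", "action": "notify"},
]
ordered = [rule["comment"] for rule in order_matches(matched)]
print(ordered)
```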
Next Up
With memory, planning, and rules in place, the AI is already a reliable assistant. But after a session accomplishes many things — how do you know exactly what it did? Which files were changed? What decisions were made?
In the next chapter, we’ll look at how to teach the AI to review its own work.
3.2 How pi-shepherd Works: A Rule-Driven Hook System
Shepherd is the “nervous system” of pi-atelier — it doesn’t provide tools or commands directly, but connects all other extensions through event hooks.
Architecture Overview
pi event bus
│
├─ before_provider_request ← Shepherd injects ephemeral hints here
│
├─ tool_call ← Shepherd intercepts/rewrites tool calls
│ │
│ ▼
│ Tool executes
│ │
│ ▼
├─ tool_result ← Shepherd checks results, triggers follow-up actions
│
├─ agent_end ← Shepherd triggers wrap-up actions
│
└─ session_shutdown ← Shepherd cleans up ephemeral state
Core Concepts
Rule
Each rule is a JSON object that defines “when to trigger, under what conditions, and what action to take”:
Rule = Hook timing (hook) + Match conditions (conditions/pattern) + Action (action) + Prompt (reason)
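To make the formula concrete, here is a rough sketch of how a rule could be checked against an incoming event. The `rule_matches` function and the event shape (`hook`, `tool`, `params`) are illustrative assumptions, not pi-shepherd's API; only the rule fields themselves come from the rules shown in this book.

```python
import re

def rule_matches(rule, event):
    """Return True when the event satisfies the rule's hook, tool and conditions."""
    if rule.get("hook") != event["hook"]:
        return False
    # Assumption: a rule without "tool" applies to every tool.
    if "tool" in rule and rule["tool"] != event["tool"]:
        return False
    for cond in rule.get("conditions", []):
        if "pattern" not in cond:
            continue  # other condition kinds are out of scope for this sketch
        # Assumption: flags follow the JavaScript style; only "i" is handled here.
        flags = re.IGNORECASE if "i" in cond.get("flags", "") else 0
        value = event["params"].get(cond["field"], "")
        if not re.search(cond["pattern"], value, flags):
            return False
    return True

rule = {
    "hook": "tool_result",
    "tool": "edit",
    "action": "notify",
    "conditions": [{"field": "path", "pattern": r"\.ts$", "flags": ""}],
    "reason": "You edited a TypeScript file. Run the tests.",
}
ts_edit = {"hook": "tool_result", "tool": "edit", "params": {"path": "src/auth/login.ts"}}
md_edit = {"hook": "tool_result", "tool": "edit", "params": {"path": "README.md"}}
print(rule_matches(rule, ts_edit), rule_matches(rule, md_edit))
```

When a rule matches, its action decides what happens next and its reason is the text the AI receives.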
Action Types in Detail
| Action | Injection Method | User Visible | Typical Use Case |
|---|---|---|---|
| notify | Injects into AI context | ✅ Yes | Remind AI to run tests, lint |
| steer | Silent injection | ❌ No | Guide AI to consult documentation |
| rewrite | Modifies tool parameters | ✅ Yes | Auto-prepend prefix to commands |
| block | Prevents execution | ✅ Yes | Block dangerous operations |
State Tracking
Shepherd maintains internal state counters for tool calls:
"state": { "countKind": "errors", "gte": 5 }
This means “trigger when cumulative errors ≥ 5 times.” countKind supports:
"errors": Counts when a tool returns an error"calls": Counts when a tool is called
message_end Hook
message_end is a special hook timing: it triggers after the AI finishes its reply, matching the AI’s output text rather than tool parameters. This lets Shepherd “listen” to what the AI says and inject corrections when it spots problems.
AI reply completed
│
▼
┌──────────────────────────────────────┐
│ Shepherd message_end hook │
│ Regex match on AI reply text │
└──────┬───────────────────────────────┘
│
Matched?
├── Yes → steer: silently inject correction (next turn)
└── No → no-op
Difference from tool_call/tool_result:
| Dimension | tool_call / tool_result | message_end |
|---|---|---|
| Match target | Tool parameters (commands, file paths) | AI’s reply text |
| conditions[].field | path or text (tool params) | text (AI reply) |
| Typical actions | block/notify/rewrite/steer | Usually steer only |
| Typical use | Block dangerous commands, post-edit reminders | Intercept wrong guesses, guide corrections |
Cross-Extension Communication
Shepherd receives “hints” from other extensions via the pi.events event bus:
Other extension emits hint → pi.events.emit("ephemeral:hint") → Shepherd collects
│
At before_provider_request → Shepherd injects collected hints into AI context
This mechanism allows extensions to collaborate without direct dependencies on each other.
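The collect-then-inject flow can be sketched with a tiny publish/subscribe bus. Everything here is a stand-in: `EventBus` only imitates the role of pi.events, and `HintCollector` is a hypothetical name for the part of Shepherd that gathers hints.

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe bus standing in for pi.events (assumption)."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, name, handler):
        self.handlers[name].append(handler)

    def emit(self, name, payload):
        for handler in self.handlers[name]:
            handler(payload)

class HintCollector:
    """Collects ephemeral hints and hands them over once per request."""

    def __init__(self, bus):
        self.hints = []
        bus.on("ephemeral:hint", self.collect)

    def collect(self, hint):
        self.hints.append(hint)

    def before_provider_request(self, context):
        # Append the collected hints to the outgoing context, then forget them.
        merged = context + self.hints
        self.hints = []
        return merged

bus = EventBus()
collector = HintCollector(bus)
bus.emit("ephemeral:hint", "roadmap: one task is still open")
first = collector.before_provider_request(["system prompt"])
second = collector.before_provider_request(["system prompt"])
print(first, second)
```

The emitting extension never imports the collector; both only know the event name, which is what keeps them decoupled.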
Rule Loading Flow
1. Load rules.json inside the extension package (global default rules)
│
▼
2. Scan project directory for .pi/shepherd-rules-*.json (project rules)
│
▼
3. Rules stack and take effect (project rules override global rules with the same name)
Rule file changes take effect immediately — no action needed.
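The stacking step can be sketched as a keyed merge. The text does not say which field serves as a rule's "name", so this sketch assumes the comment field; `merge_rules` is a hypothetical helper, not the extension's loader.

```python
def merge_rules(global_rules, project_rules, key="comment"):
    """Stack both rule sets; a project rule replaces a global rule with the same key."""
    merged = {rule[key]: rule for rule in global_rules}
    for rule in project_rules:
        # Same key: the project rule takes the global rule's slot. New key: appended.
        merged[rule[key]] = rule
    return list(merged.values())

global_rules = [
    {"comment": "[Safety] no force push", "action": "notify"},
    {"comment": "[TypeScript] run tests", "action": "notify"},
]
project_rules = [
    {"comment": "[Safety] no force push", "action": "block"},
    {"comment": "[arch] architecture check", "action": "notify"},
]
merged = merge_rules(global_rules, project_rules)
print([rule["action"] for rule in merged])
```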
Editing Rules with shepherd_rules
The shepherd_rules tool provides safe rule editing with built-in validation:
# List all rules (global + project merged)
shepherd_rules(action="list")
# Add a global rule
shepherd_rules(action="add", rule={...})
# Add a project-level rule
shepherd_rules(action="add", scope="project", rule={...})
# Update rule #2 in global rules
shepherd_rules(action="update", index=2, changes={"action": "block"})
# Delete rule #0 in project rules
shepherd_rules(action="delete", scope="project", index=0)
scope parameter:
- global (default): Operate on ~/.pi/agent/extensions/shepherd/rules.json
- project: Operate on <cwd>/.pi/extensions/shepherd-rules.json
- list without scope: Returns merged view from both levels
Safety features:
- Validates required fields and regex patterns before writing
- Reads back after write to verify
- Auto-restore from backup on failure
- Same-signature rules (tool + hook + pattern/check + action) auto-overwrite instead of appending
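The last point, overwriting instead of appending, can be illustrated like this. The `signature` and `upsert` helpers are hypothetical, and reading pattern/check from the top level of the rule is a simplifying assumption.

```python
def signature(rule):
    """Identity of a rule: tool + hook + pattern/check + action."""
    matcher = rule.get("pattern") or rule.get("check")
    return (rule.get("tool"), rule.get("hook"), matcher, rule.get("action"))

def upsert(rules, new_rule):
    """Overwrite the rule with the same signature, or append when none exists."""
    for i, existing in enumerate(rules):
        if signature(existing) == signature(new_rule):
            return rules[:i] + [new_rule] + rules[i + 1:]
    return rules + [new_rule]

rules = [{"tool": "bash", "hook": "tool_call", "pattern": "rm -rf",
          "action": "notify", "reason": "Careful"}]
same = {"tool": "bash", "hook": "tool_call", "pattern": "rm -rf",
        "action": "notify", "reason": "Explain why rm -rf is needed first"}
different = dict(same, action="block")

updated = upsert(rules, same)        # same signature: replaced in place
appended = upsert(rules, different)  # different action: a new rule
print(len(updated), len(appended))
```

Note that the reason is not part of the signature, so rewording a rule's message updates it rather than creating a near-duplicate.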
Configuration
Shepherd supports three-layer config merging (defaults → global settings → project settings). Override in .pi/settings.json as needed.
📖 For complete configuration parameters, see pi-shepherd README.
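As a rough picture of three-layer merging, the sketch below applies each layer on top of the previous one. The setting names (`enabled`, `maxHints`) are made up for illustration, and whether the real merge is deep or shallow is an assumption; check the README for the actual parameters.

```python
def deep_merge(base, override):
    """Return a new dict where override wins; nested dicts merge key by key."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result

# Hypothetical keys, for illustration only.
defaults = {"shepherd": {"enabled": True, "maxHints": 5}}
global_settings = {"shepherd": {"maxHints": 3}}
project_settings = {"shepherd": {"enabled": False}}

# defaults -> global settings -> project settings
config = deep_merge(deep_merge(defaults, global_settings), project_settings)
print(config)
```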
Next Up
Now that you understand how Shepherd guards AI behavior, the next section covers how Context Manager controls information quality.
3.3 pi-context-manager: Information Quality Control & Token Diagnostics
pi-context-manager is the merger of the original pi-context and pi-payload-analyzer, providing unified management of context quality and token diagnostics.
Four Core Capabilities
1. Distill: Compressing the Flood of Tool Output
When a tool returns a large amount of content (e.g., reading a 1000-line file), pi-context-manager automatically compresses it, keeping only essential information:
Raw tool output (50KB)
│
▼
┌────────────────────────┐
│ Distill Processor │
│ Extract key lines + │
│ summary │
└────────────────────────┘
│
▼
Compressed output (~5KB)
│
▼
AI sees refined information
Distill is enabled by default. Two key parameters:
| Config | Command | Description |
|---|---|---|
| distillThreshold | /distill-config | Tool outputs exceeding this token count will be compressed |
| firstSeenCap | /distill-config --cap | Maximum token cap for first-encountered tool output (0 = no limit) |
💡 Purpose of firstSeenCap: Some tools return massive results on first use (e.g., ls listing a large directory), but you don’t need all of it. firstSeenCap limits the initial output size; subsequent requests may further compress the result through distill.
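To get a feel for what a threshold-based distill step does, here is a deliberately naive sketch. It is not the extension's algorithm: the four-characters-per-token estimate is an assumption, and keeping the head and tail of the output stands in for the real "key lines + summary" extraction.

```python
def estimate_tokens(text):
    """Rough token estimate: about 4 characters per token (assumption)."""
    return len(text) // 4

def distill(output, threshold, keep=3):
    """Keep the first and last `keep` lines when the output exceeds the threshold."""
    if estimate_tokens(output) <= threshold:
        return output  # small outputs pass through untouched
    lines = output.splitlines()
    if len(lines) <= 2 * keep:
        return output
    omitted = len(lines) - 2 * keep
    marker = f"... {omitted} lines omitted ..."
    return "\n".join(lines[:keep] + [marker] + lines[-keep:])

long_output = "\n".join(f"line {i}" for i in range(1000))
distilled = distill(long_output, threshold=500)
print(distilled)
```

Lowering the threshold makes more outputs eligible for compression; raising it lets more raw content through.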
2. Tool Result Processor: Smart Formatting & Trimming
The Tool Result Processor performs structured trimming on specific tool outputs, more precise than distill:
- Code Graph output trimming: Auto-compresses AST search results, preserving only key signatures and locations
- MCP JSON output trimming: Compresses verbose JSON returned by MCP tools
- Error output trimming: Truncates overly long error stack traces
- Web search output trimming: Keeps only key information from search results
Use the /processor-config command to view or adjust processing thresholds.
3. Aging: Phasing Out Stale Content
In long sessions, early tool outputs may no longer be relevant. The Aging mechanism automatically phases out “outdated” content:
Round 1: Tool Output A (fresh 🟢)
Round 5: Tool Output A (a bit old 🟡)
Round 10: Tool Output A (too old 🔴 → auto-deleted)
Aging Smart Exemptions: Certain content types are protected from aging:
- Skill files (SKILL.md) content
- User-flagged content
- Content most recently referenced by the AI
Use /aging-config to set the eviction round count, or /aging-config off to disable.
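The eviction logic can be sketched in a few lines. The `age_out` function and the record shape are hypothetical, and only one of the exemptions listed above (skill files) is modeled here.

```python
def age_out(outputs, current_round, max_age, exempt=lambda item: False):
    """Drop tool outputs that are max_age rounds old or older, unless exempt."""
    return [
        item for item in outputs
        if exempt(item) or current_round - item["round"] < max_age
    ]

outputs = [
    {"name": "read(SKILL.md)", "round": 1},
    {"name": "read(schema.ts)", "round": 1},
    {"name": "grep(TODO)", "round": 8},
]
kept = age_out(
    outputs,
    current_round=10,
    max_age=9,
    exempt=lambda item: "SKILL.md" in item["name"],  # skill files never age out
)
print([item["name"] for item in kept])
```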
4. Payload Analysis: Diagnosing Context Issues with Data
Is the AI getting dumber as the session grows long? Use payload_analyze to find out.
⚠️ Important:
payload_analyzeis an AI tool, not a terminal command. You ask the AI to run it in your pi chat. For example:Help me check the current token usage with payload_analyzeOr more precisely:
Run payload_analyze action="budget"
| Analysis Mode | How to Ask the AI | What It Shows |
|---|---|---|
| budget | “Analyze token budget distribution” | Token ratio of system/tools/history sections |
| growth | “Show token growth trend” | How tokens expand over the course of a session |
| expensive | “Find the most token-hungry tool calls” | Top N most expensive tool calls |
| overview | “Detailed payload analysis” | Per-message token breakdown |
| messages | “View message #5” | Pinpoint messages by index/range/keywords |
| chain | “Trace this tool call” | Track a single tool call across payloads |
| diff | “Compare two payloads” | Find differences between two requests |
| stats | “Show distill/processor hit rate” | Aggregate compression efficiency statistics |
💡 Start with budget, then dive deeper: When facing context issues, first use budget for an overview, then expensive to pinpoint the heavy hitters, and finally messages to examine a specific message.
/context TUI Panel
pi-context-manager also provides a TUI (Terminal User Interface) panel for visually browsing context content:
/context command
│
▼
┌─────────────────────────────────────┐
│ 📊 Context Panel │
│ │
│ [Categories] [Tool Details] │
│ [Mark for Deletion] │
│ │
│ ├─ System Prompt 4.2K tokens │
│ ├─ Tool Definitions 8.1K tokens │
│ ├─ Memory 2.3K tokens │
│ ├─ History 52K tokens │
│ │ ├─ Rounds 1-10 (marked delete) │
│ │ ├─ Rounds 11-20 │
│ │ └─ Rounds 21-30 │
│ └─ Tool Results 64K tokens │
│ ├─ read(schema.ts) 8.2K 🔴 │
│ └─ grep("TODO") 4.1K 🟡 │
└─────────────────────────────────────┘
In the panel you can:
- Browse by category: View context content by type
- Tool details: See full content returned by each tool
- Mark for deletion: Manually flag unwanted content for exclusion in the next request
Complete Command Reference
| Command | Purpose | Behavior without args |
|---|---|---|
| /record [on\|off] | Toggle payload recording | Toggle on/off |
| /context | Open TUI visualization panel | N/A |
| /distill-config [N] | View/set distill threshold | Show current config + usage |
| /distill-config --cap [N] | View/set firstSeenCap | Show current config + usage |
| /processor-config [N\|off] | View/set processor threshold | Show current config + usage |
| /aging-config [N\|off] | View/set aging round count | Show current config + usage |
| /context-clean [sessionId] | Clean persistent data | Clean all data |
Best Practices
| Issue You’re Facing | First Step | Next Step | Solution |
|---|---|---|---|
| AI gets dumber after 30 rounds | payload_analyze(action="growth") | Check which phase tokens spike | Lower distill threshold / install smart compact |
| AI ignores certain file content | Check distill config | May be over-compressed by distill | Adjust distillThreshold |
| Every tool call is painfully slow | payload_analyze(action="expensive") | Find the most expensive calls | Limit large file reads or split files |
| Old tool outputs consume space | Run /aging-config | Set appropriate eviction rounds | Aging auto-eviction + manual /context panel cleanup |
| First tool output is too large | Set /distill-config --cap | Limit initial full-text output | firstSeenCap limits first output size |
Next Steps
In the next chapter, we’ll explore how to teach the AI to review — automatically record session events and revisit history at any time.
3.5 Shepherd in Practice: Real-World Scenarios
This section demonstrates how to use Shepherd rules to solve common problems in AI-assisted coding through real-world scenarios.
Scenario 1: Auto-Prompt for Running Tests After Code Edits
Problem
The AI modifies TypeScript code but forgets to run tests. You have to manually say “run the tests” every time.
Rule
{
"comment": "[TypeScript] Must run tests after edits",
"hook": "tool_result",
"tool": "edit",
"action": "notify",
"conditions": [
{ "field": "path", "pattern": "\\.ts$", "flags": "" }
],
"reason": "You edited a TypeScript file. You must run unit tests covering the code (add tests first if none exist) and fix all issues to ensure they pass.",
"enabled": true
}
Effect
AI: I modified the null-check logic in src/auth/login.ts.
🛡️ Shepherd reminds: You edited a TypeScript file. You must run unit tests covering the code.
AI: Got it, let me run the tests... ✅ All 3 tests pass.
Scenario 2: Preventing the AI from Messing with Others’ Code
Problem
You’re working in a team project. A colleague has uncommitted changes in the workspace. The AI sees “something wrong here” and casually runs git checkout to restore their files.
Rule
{
"comment": "[Safety] Block git checkout -- to restore files",
"hook": "tool_call",
"tool": "bash",
"action": "block",
"conditions": [
{ "field": "text", "pattern": "git\\s+checkout\\s+--", "flags": "" }
],
"reason": "❌ Blocked: git checkout -- to restore files! There are uncommitted changes from others in the workspace — you don't have the authority to decide which changes are 'unrelated'.",
"enabled": true
}
Effect
AI prepares to run: git checkout -- src/config.ts
🛡️ Shepherd blocks: git checkout -- to restore files is not allowed!
AI: Sorry, I won't restore other people's files. Let me find another approach...
Scenario 3: Auto-Commit Code at Session End
Problem
The AI modified a bunch of files, the session ends, but the code isn’t committed. The next day, the workspace is a mess.
Rule
{
"comment": "[Wrap-up] Prompt for commit + memory update + summary after edits",
"hook": "agent_end",
"action": "notify",
"conditions": [{ "builtin": "has_edits" }],
"reason": "File edits detected. Perform wrap-up tasks:\n1️⃣ Git commit...\n2️⃣ Update memories...\n3️⃣ Session summary",
"stopReason": ["stop"],
"enabled": true
}
The has_edits condition ensures the notification only triggers when files were actually modified, avoiding interference in pure chat sessions. stopReason: ["stop"] ensures it only fires on normal termination, not interruptions.
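The two gates can be sketched as a single check. The `should_fire` helper, the session shape (`stopReason`, `editedFiles`) and the "aborted" value are assumptions made for illustration; only the rule fields come from the rule above.

```python
def should_fire(rule, session):
    """Apply the stopReason filter and the has_edits builtin of an agent_end rule."""
    allowed = rule.get("stopReason")
    if allowed is not None and session["stopReason"] not in allowed:
        return False  # e.g. the user interrupted the run
    for cond in rule.get("conditions", []):
        if cond.get("builtin") == "has_edits" and not session["editedFiles"]:
            return False  # pure chat session, nothing to wrap up
    return True

rule = {
    "hook": "agent_end",
    "action": "notify",
    "conditions": [{"builtin": "has_edits"}],
    "stopReason": ["stop"],
}
edited = {"stopReason": "stop", "editedFiles": ["src/auth/login.ts"]}
chat_only = {"stopReason": "stop", "editedFiles": []}
interrupted = {"stopReason": "aborted", "editedFiles": ["src/auth/login.ts"]}
print(should_fire(rule, edited), should_fire(rule, chat_only), should_fire(rule, interrupted))
```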
Scenario 4: Auto-Prompt for Architecture Check After .gd File Edits
Problem
You’re working on a Godot game project. After the AI edits .gd files, it should run architecture checks and formatting checks — but you have to remind it manually every time.
Rule (multiple rules can be defined for the same file, executed in order)
{
"comment": "[arch] Prompt for compilation validation after .gd edits",
"hook": "tool_result",
"tool": "edit",
"action": "notify",
"conditions": [
{ "field": "path", "pattern": "\\.gd$", "flags": "" }
],
"reason": "You edited a .gd file. Please run check_arch to verify architecture compliance.",
"enabled": true
},
{
"comment": "[format] Prompt for formatting check after .gd edits",
"hook": "tool_result",
"tool": "edit",
"action": "notify",
"conditions": [
{ "field": "path", "pattern": "\\.gd$", "flags": "" }
],
"reason": "You edited a .gd file. Please run gdformat for formatting checks.",
"enabled": true
}
Both rules will fire, and the AI will run the architecture check followed by the formatting check.
Scenario 5: Auto-Prompt to Check Memory on Repeated Tool Errors
Problem
The AI keeps hitting errors — edit match failures, missing bash commands, tests failing repeatedly. It’s circling in the same dead end.
Rule
{
"comment": "[debug] Prompt to check memory on repeated tool errors",
"hook": "tool_result",
"action": "steer",
"state": { "countKind": "errors", "gte": 5 },
"reason": "🔍 **Repeated tool errors**: Failed multiple times consecutively. Check the memory files under .pi/memory/ directory for existing troubleshooting records.",
"enabled": true,
"subagent": false
}
Key points:
- state: { "countKind": "errors", "gte": 5 } only triggers after 5 consecutive errors, so it won’t bother you every time
- action: "steer" silently injects guidance that is invisible in the user interface
- subagent: false keeps the rule from firing in sub-agents, avoiding interference with independent tasks
Effect
AI tries edit, fails...
AI tries edit, fails...
AI tries bash sed, fails...
AI tries edit, fails...
AI tries edit, fails...
🛡️ Shepherd silently guides: Check the memory files.
AI: Let me check the memories... Found it! The memory file says "when edit match fails, first check for CRLF".
AI: Running audit_format.py to check format... It is indeed a CRLF issue.
Scenario 6: Auto-Rewriting High-Frequency Commands
Problem
The AI frequently runs commands like git status, git log, npm test, etc. Their output can be lengthy, wasting tokens.
Rule
{
"comment": "[rtk] Auto-proxy frequent bash commands",
"tool": "bash",
"action": "rewrite",
"pattern": "^(git\\s+(status|log|diff)|cargo\\s+(test|build|clippy)|pytest)\\b",
"flags": "",
"reason": "rtk command rewrite: auto-prepend rtk prefix to compress output",
"enabled": true
}
When the AI runs git status, Shepherd automatically rewrites it to rtk git status (rtk is an output compression tool), reducing token consumption. The AI doesn’t need to know about this rewrite — to it, the result just looks cleaner.
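The rewrite itself is a small regex operation. This sketch reuses the pattern from the rule above; the `rewrite` function is a hypothetical stand-in for what Shepherd does to the bash tool's command parameter.

```python
import re

# Same pattern as in the rule, written as a Python raw string.
RTK_PATTERN = re.compile(
    r"^(git\s+(status|log|diff)|cargo\s+(test|build|clippy)|pytest)\b"
)

def rewrite(command):
    """Prepend the rtk prefix when the command matches; otherwise leave it alone."""
    return f"rtk {command}" if RTK_PATTERN.search(command) else command

for command in ["git status", "pytest -q", "git push", "rtk git status"]:
    print(command, "->", rewrite(command))
```

Because the pattern is anchored with ^, a command that already starts with rtk no longer matches, so the rewrite cannot stack prefixes.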
Scenario 7: Intercept AI’s Wrong Attribution Guesses (message_end)
Problem: When encountering errors, the AI sometimes attributes failures to toolchain issues (“jiti cache problem”, “vitest proxy issue”) instead of its own code bugs. This wastes debugging time and misleads the investigation.
Shepherd Solution: Use message_end hook to match patterns in AI’s reply and inject corrections.
{
"comment": "[message_end] Intercept wrong toolchain blame",
"hook": "message_end",
"action": "steer",
"conditions": [{
"field": "text",
"operator": "matches",
"value": "jiti.*缓存|缓存.*jiti|vitest.*proxy"
}],
"reason": "Don't blame toolchain — check your own code first. 90% of the time it's a code bug, not a cache/proxy/module resolution issue.",
"enabled": true
}
When the AI mentions “jiti cache” or “vitest proxy” in its reply, Shepherd silently injects guidance to redirect it toward checking its own code.
Rule Design Pattern Summary
| Pattern | Action | Use Case |
|---|---|---|
| Post-edit reminder | notify + conditions | Run tests, lint, formatting after code changes |
| Dangerous operation block | block + conditions | Block git checkout --, prevent file deletion |
| Session wrap-up automation | agent_end + check | Auto-commit + memory update at session end |
| Repeated error guidance | steer + state | Guide the AI to check memories when it keeps failing |
| Wrong guess interception | message_end + steer | Intercept AI’s wrong attribution guesses |
| Command rewriting | rewrite + pattern | Auto-prepend prefix to compress command output |
📖 Return to 3.1 Setting Rules for AI for the complete rule field reference.
Teaching the AI to Review
You’ve Probably Experienced This
You had a 3-hour session where the AI helped you do a lot:
- Fixed two bugs
- Refactored a module
- Set up the CI pipeline
- Wrote a bunch of tests
The next day you want to review: “How exactly did I fix that login bug yesterday?” But all you have is a vague memory — was it auth.ts or middleware.ts? Did you add a null check or change a type assertion?
You dig through the git log, and the commit message reads “fix: update auth” — which is as good as nothing.
💡 The AI did a lot, but nobody recorded the “why”. Git only tracks what changed, not the thought process behind it.
Core Tool: pi-session-analyzer
pi-atelier provides Session Analyzer to search and analyze historical sessions:
| Feature | Description |
|---|---|
| Cross-session search | Search all historical sessions by keyword |
| Search by file | Find all sessions that modified a specific file |
| Timeline view | View the complete flow of a session chronologically |
| Summary generation | Auto-summarize what a session accomplished |
| Branch analysis | Analyze parallel branches created by /tree |
| Takeover report | 5-dimensional context to help the AI quickly resume work |
| Audit check | Check for rule violations in a session |
💡 About pi-journal: pi-journal can generate daily/weekly reports via the /journal command or journal tool, aggregating git activity, memory changes, and session activity from three data sources. See 4.2 pi-journal Principles for details.
Real-World Example: Finding Last Week’s Bug Fix
Scenario: Cross-Session Search
You say: “Help me find the session where I fixed the DuckDB timezone issue.”
The AI invokes session_search:
🔍 Searching for "DuckDB timezone" — found 2 matching sessions:
1. 05-22 19:36 — DuckDB timezone config fix
Modified src/db/connection.ts
2. 05-20 14:30 — Database initialization discussion
Discussed timezone strategy but didn't change code
Scenario: Viewing Session Details
You say: “Show me the detailed process of the first session.”
The AI invokes session_analyze, generating a timeline:
📅 Session Timeline (05-22 19:36)
[19:36] User: DuckDB queries return UTC time, need to switch to China timezone
[19:37] AI: Read db/connection.ts
[19:38] AI: Found no timezone parameter set in the connection
[19:39] AI: Modified connection.ts, added SET timezone = 'Asia/Shanghai'
[19:40] AI: Ran tests, 2 tests failed
[19:41] AI: Found test mocks missing timezone setting
[19:42] AI: Modified test mocks
[19:43] AI: All tests passed ✅
[19:44] AI: Committed git commit "fix: set DuckDB timezone to Asia/Shanghai"
Scenario: Tracking by File
You say: “Who modified connection.ts recently?”
The AI invokes session_search in file mode:
📄 Modification history for src/db/connection.ts:
1. 05-22 19:36 — DuckDB timezone fix (you)
2. 05-20 14:30 — DB connection pool config (colleague)
3. 05-18 09:15 — Initial creation (you)
Session Analyzer’s Compact Mode
The entries action supports a compact parameter for streamlined output, ideal for quickly browsing large sessions:
# Standard mode
session_analyze(action="entries")
→ Complete records (with timestamps, type, full content)
# Compact mode
session_analyze(action="entries", compact=true)
→ Removes type column, time shows only HH:MM, previews 60 chars
→ Best for large sessions with 100+ entries
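The difference between the two modes comes down to formatting. Here is a minimal sketch, assuming a hypothetical entry shape with `timestamp`, `type` and `content` fields; the real output layout may differ.

```python
from datetime import datetime

def format_entry(entry, compact=False):
    """Render one session entry in standard or compact form."""
    ts = datetime.fromisoformat(entry["timestamp"])
    if compact:
        # Compact: HH:MM only, no type column, 60-character preview.
        return f"{ts:%H:%M} {entry['content'][:60]}"
    return f"{ts:%Y-%m-%d %H:%M:%S} [{entry['type']}] {entry['content']}"

entry = {
    "timestamp": "2025-05-22T19:39:12",
    "type": "tool_call",
    "content": "edit src/db/connection.ts: add SET timezone = 'Asia/Shanghai' "
               "and update the test mocks to match",
}
compact_line = format_entry(entry, compact=True)
print(format_entry(entry))
print(compact_line)
```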
Session Analyzer Analysis Dimensions
| Mode | Purpose | Example Command |
|---|---|---|
| summary | Session overview | “What did this session do?” |
| entries | Per-event listing | “List all file modifications” |
| timeline | Chronological flow | “What order did the AI operate in?” |
| chain | Sub-agent tracking | “What did the sub-agent do?” |
| audit | Compliance check | “Were there any rule violations?” |
| digest | Conversation sequence | “What did I and the AI discuss?” |
| takeover | Handover report | “Help me pick up where I left off” |
The Most Useful Mode: takeover
takeover generates a handover report with 5 dimensions:
📋 Session Takeover Report
1. User intent: Fix DuckDB timezone issue
2. Modified files: connection.ts, connection.test.ts
3. Recent steps: Modified test mocks, tests pass
4. Next steps: Consider documenting timezone behavior
5. Key decisions: Chose to handle timezone at the connection layer, not the SQL layer
When you want to “continue where you left off,” this report helps you (or another AI) quickly restore context.
Best Practices
✅ Efficient Use of Session Analyzer
- grep mode: Search keywords across all sessions (much faster than digging through git log)
- file mode: Find all sessions touching a specific file (a code review essential)
- takeover mode: When taking over someone else’s work, generate a handover report first
- compact mode: Quickly browse large sessions with streamlined output
- audit mode: Periodically check for AI rule violations
❌ Common Pitfalls
- Don’t use session_search as a replacement for memory: search is about looking back (what was done), memory is about knowledge (what was learned)
- Don’t expect to find full source code: session records are summaries, not complete backups
Advanced Scenarios
Scenario: Auditing AI Compliance
You’ve set many rules for the AI (don’t write settings.json directly, don’t overwrite large files), but you’re not sure if it actually follows them. Use audit mode to check:
You: Audit the last session for rule violations
AI:
🛠 session_analyze(action="audit", sessionId="...")
⚠️ Found 2 issues:
1. [Violation] Direct writeFileSync(settings.json) — should use patchSettingsSectionWithBackup
2. [Warning] Large file overwrite (312 lines) — should split or use edit
The audit mode checks for: prohibited operations, whether file modification rules were followed, and any unsafe actions.
Scenario: Understanding the AI’s Exploration Process
The AI used /tree to create exploration branches — tried Plan A (failed), then Plan B (succeeded). You want to know why Plan A failed:
You: The AI tried two approaches last time. Show me the results for each branch.
AI:
🛠 session_analyze(action="branches", sessionId="...")
🌿 Branch Analysis
[Main branch] Discussed refactoring approach
[B1] Plan A — Modified function signature + compatibility layer
Result: Tests failed, compatibility layer introduced circular dependency
[B2] Plan B — New interface + incremental migration
Result: All tests passed ✅, merged back to main
This is more useful than digging through git log — git only records the final result, but branches shows the AI’s trial-and-error process.
Scenario: Tracking Sub-Agent Activities
Multiple sub-agents were spawned via subagent in a session. You want to know what each sub-agent produced:
You: What did the sub-agents do in the last session?
AI:
🛠 session_analyze(action="chain", sessionId="...")
🔗 Sub-Agent Chain
Main Agent
├──→ pv-explorer
│ Task: Analyze architecture of src/auth/ directory
│ Result: 5 modules, 2 design patterns, dependency directions correct
│
└──→ pv-reviewer
Task: Review JWT → session migration plan
Result: Found 1 data model violation, 2 test gaps
The chain mode traces the call relationships between the main agent and sub-agents, clearly showing what task each sub-agent received and what result it returned.
Next Steps
With the ability to review, both you and the AI can look back at past work. But there’s still one problem: the longer a session goes on, the more the AI tends to “get dumber” — repeating itself, forgetting previous agreements.
In the next chapter, we’ll look at how to keep the AI smart in long-running sessions.
4.2 pi-journal: Automated Log Reports
Why Do You Need Logs?
You’ve been working with pi all day and accomplished a lot:
- Fixed two bugs in the morning, along with refactoring a module
- Set up the CI pipeline in the afternoon, wrote a bunch of tests
- Researched a new approach in the evening, updated memory files
At night, you want to review: “What exactly did I do today?” You check git log — it’s all fragmented commits. You check memory files — only key conclusions are recorded. You check session records — there are a dozen sessions and you don’t know where to start.
💡 You need an “auto daily report” — something that aggregates your activities scattered across git, memory, and sessions into a readable report.
What pi-journal Does
pi-journal collects data from three sources and automatically generates Markdown daily/weekly reports:
| Data Source | Collected Content |
|---|---|
| Git Activity | Scans all repos under ~/.pi/agent/git/, counts commits, file changes, lines added/deleted |
| Memory Changes | Scans global and project memory directories, identifies file changes within the time range |
| Session Activity | Scans pi session records, counts sessions, tool calls, edit operations, active duration |
Usage
Command: /journal
# Today's daily report (default)
/journal
# Specify a time range
/journal yesterday
/journal this_week
/journal 3d # Last 3 days
/journal 2025-05-27 # Specific date
Tool: journal
AI can also proactively call the journal tool to generate a report. When you say “write a log”, “what did I do today”, or “write a weekly report”, the AI will automatically trigger it.
Generated Report Format
# 📓 Daily Report — 2025-05-27
## Git Activity
- **pi-shepherd** (2 commits, +45/-12)
- Added tool_result rule support
- Fixed priority sorting bug
- **pi-context-manager** (1 commit, +30/-5)
- Added aging auto-eviction feature
## Memory Changes
- Added: debug_anti_pattern.md
- Updated: coding_standards.md
## Session Activity
- 019e6494 (45m) — Shepherd rule engine refactoring
- Tool calls: 23, Edits: 8
- 019e6203 (30m) — Context aging feature implementation
- Tool calls: 15, Edits: 5
## Summary
- Total commits: 3
- Total sessions: 12
- Total edits: 43
Best Practices
✅ Recommended Usage
- At the end of each day: /journal to generate a daily report and review what you did
- Friday wrap-up: /journal this_week to generate a weekly report
- Let AI do it: Just say “write today’s log” and the AI will invoke the tool to generate it
⚠️ Notes
- The “AI Summary” section is not filled in automatically; the AI has to supplement it after the report is generated
- Git activity only scans repos under ~/.pi/agent/git/, not other git repos elsewhere on the system
- Session activity depends on pi’s session record storage
How It Works
User inputs a time range
↓
parseTimeRange() → Parses into since/until timestamps
↓
┌─────────────┬──────────────┬──────────────────┐
│ Git Activity │ Memory Changes│ Session Activity │
│ Auto-discover│ Scan memory/ │ Get session list │
│ repos │ file timestamps│from session- │
│ git log stats│ │ analyzer │
└──────┬──────┴──────┬───────┴────────┬─────────┘
↓ ↓ ↓
renderReport() → Aggregate and render as Markdown
↓
Output report
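The first step of this pipeline, turning the /journal argument into a time window, might look roughly like the sketch below. It is not the extension's parseTimeRange(): the Python name `parse_time_range`, the Monday week start, the exclusive upper bound, and counting today as one of the N days are all assumptions.

```python
import re
from datetime import date, datetime, time, timedelta

def parse_time_range(spec, today):
    """Turn a /journal argument into (since, until) datetimes; until is exclusive."""
    def day_start(d):
        return datetime.combine(d, time.min)

    tomorrow = today + timedelta(days=1)
    if not spec:  # no argument: today's report
        return day_start(today), day_start(tomorrow)
    if spec == "yesterday":
        return day_start(today - timedelta(days=1)), day_start(today)
    if spec == "this_week":
        monday = today - timedelta(days=today.weekday())
        return day_start(monday), day_start(tomorrow)
    match = re.fullmatch(r"(\d+)d", spec)
    if match:  # "3d": the last 3 days, counting today
        first_day = today - timedelta(days=int(match.group(1)) - 1)
        return day_start(first_day), day_start(tomorrow)
    day = date.fromisoformat(spec)  # raises ValueError for anything else
    return day_start(day), day_start(day + timedelta(days=1))

today = date(2025, 5, 27)
for spec in ["", "yesterday", "this_week", "3d", "2025-05-20"]:
    print(repr(spec), parse_time_range(spec, today))
```

Passing today in as a parameter, instead of reading the clock inside the function, keeps the parser easy to test.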
Next Steps
pi-journal solves the “look back at the past” need. But sometimes you need more than just a review — you need to find a specific session and see what the AI was thinking at the time. In the next section, we’ll look at the detailed usage of pi-session-analyzer.
4.3 pi-session-analyzer: Cross-Session Search
pi-session-analyzer is the “time machine” of pi-atelier — it can search and analyze all historical sessions, helping you and the AI look back at what happened.
Why Session Analysis?
Every pi conversation is recorded in JSONL files (under ~/.pi/agent/distill/), but the raw data is not human-readable. Session Analyzer transforms this data into searchable, analyzable structured information.
Three common needs:
| Need | Solution | Example |
|---|---|---|
| Find a specific session | session_search cross-session search | “Which session was the one where I fixed DuckDB?” |
| Understand session content | session_analyze single-session analysis | “What exactly happened in that session?” |
| Take over someone else’s work | takeover handover report | “Continue from where I left off last time” |
session_search: Cross-Session Search
Three search modes:
grep Mode — Keyword Search
Search the content of all sessions (including user messages and AI responses):
session_search(action="grep", query="DuckDB timezone")
Result:
3 sessions matched:
1. 05-22 19:36 — DuckDB timezone config fix
2. 05-20 14:30 — Database initialization discussion
3. 05-18 09:15 — Tech stack discussion
Search results include context snippets, so you can tell if a session is relevant without opening it.
Advanced usage: editOnly=true only searches sessions that contain file editing operations, filtering out pure discussion:
session_search(action="grep", query="settings.json", editOnly=true)
Result:
2 sessions edited settings.json
This is useful for tracking down the sessions that actually changed something, as opposed to sessions that only discussed it.
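Conceptually, grep mode is a scan over JSONL records. The sketch below works on in-memory text and assumes a made-up entry schema (`text` and `tool` fields); the real session format and the `grep_sessions` helper are not taken from pi-session-analyzer.

```python
import json

def grep_sessions(sessions, query, edit_only=False):
    """sessions maps a session id to its JSONL text; returns (id, snippet) pairs."""
    hits = []
    needle = query.lower()
    for session_id, jsonl in sessions.items():
        entries = [json.loads(line) for line in jsonl.splitlines() if line.strip()]
        # edit_only skips sessions that never used the edit tool.
        if edit_only and not any(e.get("tool") == "edit" for e in entries):
            continue
        for entry in entries:
            text = entry.get("text", "")
            pos = text.lower().find(needle)
            if pos != -1:
                snippet = text[max(0, pos - 20):pos + len(needle) + 20]
                hits.append((session_id, snippet))
                break  # one snippet per session is enough for an overview
    return hits

sessions = {
    "fix": '{"text": "DuckDB timezone is wrong"}\n'
           '{"tool": "edit", "text": "edit src/db/connection.ts"}',
    "talk": '{"text": "Should we set the DuckDB timezone at all?"}',
}
all_hits = grep_sessions(sessions, "duckdb timezone")
edit_hits = grep_sessions(sessions, "duckdb timezone", edit_only=True)
print(all_hits)
print(edit_hits)
```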
file Mode — Track by File
Find all sessions that modified a specific file:
session_search(action="file", query="src/auth/login.ts")
Result:
3 sessions modified this file:
1. 05-22 19:36 — Login bug fix (changed null check)
2. 05-20 14:30 — Auth module refactoring (changed function signature)
3. 05-18 09:15 — Initial creation
Use case: During code review, you want to know “why is this file the way it is” — each session represents a modification intent.
list Mode — Browse Recent Sessions
List all recent sessions:
session_search(action="list", limit=10)
Result:
Recent 10 sessions:
1. 05-27 11:24 — Check what's left to do
2. 05-27 11:08 — Payload analysis script enhancement
3. 05-27 11:06 — Roadmap session ID display fix
...
session_analyze: Single-Session Analysis
Session Analyzer offers multiple analysis dimensions for different needs:
⚠️ Note: The action parameter of session_analyze only accepts the following values; do not pass grep/file/list (those are session_search actions).
summary — Quick Overview of a Session
session_analyze(action="summary", sessionId="019e6765")
Result:
Session Summary (31 exchanges)
User intent: Fix roadmap session ID display bug
Key operations: Discovered formatTimestamps slice(0,8) truncation error
Output: 2 bug fixes, 145 tests passing
When to use: When you don’t know what a session is about, start with summary.
entries — Browse Events One by One
Supports precise filtering, pagination, and multi-dimensional positioning:
# View the last 10 entries
session_analyze(action="entries", limit=10)
# Start from entry 20 (pagination)
session_analyze(action="entries", offset=20, limit=10)
# Range extraction — 'last:50' for the end, '100-150' for a specific range
session_analyze(action="entries", range="last:50")
# By index — view entry #5 with 3 surrounding context entries
session_analyze(action="entries", index=5)
# Filter by keyword
session_analyze(action="entries", grep="edit|write")
# Filter by tool name (supports wildcards)
session_analyze(action="entries", toolName="read|edit")
# Filter by file path (matches tool parameter paths, supports wildcards)
session_analyze(action="entries", file="*.test.ts")
# Compact mode — quick browse of large sessions
session_analyze(action="entries", compact=true)
Parameter Quick Reference:
| Parameter | Description | Priority |
|---|---|---|
| range | Range extraction: "5-10", "last:50" | Highest |
| index | View entry #N with context (0-based) | High (mutually exclusive with offset) |
| offset + limit | Pagination: start from #N, show M entries | Medium |
| grep | Keyword/regex filter | Composable |
| toolName | Filter by tool name (supports * wildcard, \| for multiple) | |
| file | Filter by file path (supports * wildcard, \| for multiple) | |
| rawIndex | Navigate back to original context position (after grep/toolName filter) | Navigation only |
rawIndex usage: After filtering with grep or toolName, use rawIndex to jump back to the original context (view surrounding entries of a filtered item):
# First find an edit record (shown as [42]) with toolName filter
session_analyze(action="entries", toolName="edit")
# Then jump back to original position to see context
session_analyze(action="entries", rawIndex=42)
When to use:
- You want to see what specific operations the AI performed
- Search for specific types of operations in a session (e.g., all file edits)
- Find all operations that touched a specific file
- Quick browse of large sessions
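The range values shown above ("5-10", "last:50") read like a small slice grammar. The following is a minimal sketch of how such a spec could be resolved to entry indices. It assumes 0-based, inclusive bounds, and `resolve_range` is a hypothetical helper for illustration, not a function exposed by pi-session-analyzer.

```python
def resolve_range(spec: str, total: int) -> range:
    """Resolve a range spec to entry indices (assumed 0-based and inclusive)."""
    if spec.startswith("last:"):
        count = int(spec[len("last:"):])
        # Never reach back before the first entry.
        return range(max(0, total - count), total)
    start_text, end_text = spec.split("-", 1)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    # Clamp both bounds to the entries that actually exist.
    return range(min(start, total), min(end + 1, total))


print(list(resolve_range("5-10", 200)))
print(len(resolve_range("last:50", 200)))
```

Clamping rather than raising on an out-of-bounds spec is a design choice in this sketch: it keeps "last:50" usable on a session with fewer than 50 entries.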
timeline — Timeline View
Display operations in chronological order:
session_analyze(action="timeline", sessionId="...")
Result:
📅 Timeline
[19:36] 👤 DuckDB query returns UTC time
[19:37] 🤖 Read db/connection.ts
[19:38] 🤖 Discovered no timezone parameter set
[19:39] 🤖 Modified connection.ts
[19:40] 🤖 Ran tests — 2 failures
[19:42] 🤖 Modified test mock
[19:43] 🤖 All tests passed ✅
When to use: When you want to understand the AI’s step-by-step operations and decision process.
chain — Sub-Agent Tracking
Track sub-agent call chains:
session_analyze(action="chain", sessionId="...")
Result:
🔗 Sub-agent chain
Main agent → pv-explorer (code exploration)
Main agent → pv-reviewer (plan review)
Main agent → pv-executor (execute changes)
When to use: When a session used sub-agents and you want to know what each one did.
audit — Audit Checks
Check for rule violations in a session:
session_analyze(action="audit", sessionId="...")
Result:
⚠️ Found 2 issues:
1. [Violation] Directly wrote settings.json instead of using patchSettingsSectionWithBackup
2. [Warning] Large file write overwrite (>200 lines), should split
When to use:
- Check if the AI followed project conventions
- Review someone else’s session for issues
- Regular quality checks
digest — Conversation Sequence
Extract user/assistant conversation from a session, stripping tool call details and keeping only human-readable dialogue:
session_analyze(action="digest", sessionId="...")
Result:
👤 Help me fix the roadmap display bug
🤖 Sure, let me take a look at the code first...
👤 Tests aren't passing, take a look
🤖 Found that formatTimestamps has a truncation logic error...
When to use: When you want a quick understanding of the conversational thread between user and AI without seeing tool details.
raw — Raw Data
View the raw JSONL records directly (10 entries max by default):
session_analyze(action="raw", sessionId="...", limit=5)
When to use: When none of the analysis modes above meet your needs, look at the raw data directly. Generally used for debugging or data format verification.
branches — Branch Analysis
Analyze parallel branches created by the /tree command:
session_analyze(action="branches", sessionId="...")
Result:
🌿 Branch Analysis
[Main branch] Normal workflow
[B1] Tried approach A for refactoring → Failed, returned to main branch
[B2] Tried approach B for refactoring → Succeeded
When to use: When a session used /tree to create exploration branches and you want to understand the results of each branch.
Data Storage
Session Analyzer’s data sources:
~/.pi/agent/sessions/
├── --home-lain-.pi--/ ← Session directories grouped by CWD
│ ├── 2026-05-27T..._sessionId.jsonl ← Complete record for each session
│ └── ...
└── --other-project-path--/ ← Sessions from different project directories
└── ...
Session records are in JSONL format (one JSON object per line), containing:
- User messages
- AI responses (including tool calls and results)
- Timestamps
- Branch markers
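Because each line is an independent JSON object, a session file can be scanned one line at a time without loading it whole. The sketch below counts records by kind. The `type` field name and the sample records are assumptions made for illustration; the real session schema may use different keys.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def count_record_types(path: Path) -> Counter:
    """Count records per 'type' field in a JSONL file (field name is assumed)."""
    counts: Counter = Counter()
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            counts[record.get("type", "unknown")] += 1
    return counts


# Demo on a throwaway file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text(
        '{"type": "user", "text": "hi"}\n'
        '{"type": "assistant", "text": "hello"}\n'
        '{"type": "user", "text": "thanks"}\n',
        encoding="utf-8",
    )
    sample_counts = count_record_types(sample)

print(sample_counts)
```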
Next Steps
📖 Back to 4.1 Let AI Learn to Review for a complete usage example.
A Survival Guide for Long Sessions
You’ve Probably Been Here
You start a long session with pi, and the AI helps you through a lot of work. By the 50th turn, you notice:
- The AI starts asking questions you’ve already answered before
- It re-proposes a solution that was already rejected
- Its code quality noticeably declines — missing error handling, missing type definitions
- Sometimes it even starts “hallucinating” — inventing functions and files that don’t exist
The worst case: The AI throws an error — “context window exceeded”, and the entire session crashes.
💡 This is the “context bloat” problem: The AI’s “working memory” has a fixed capacity, and when it’s overloaded, things spill over.
Root Cause
The AI’s context window is a fixed-size “workbench”:
Context Window (e.g., 128K tokens)
┌──────────────────────────────────────┐
│ System Prompt ≈ 5K │
│ Tool Definitions ≈ 8K │
│ Memory Injection ≈ 2K │
│ ───────────────────────────── │
│ Conversation History ≈ 80K │ ← Main source of bloat
│ (previous 50 turns) │
│ Tool Results ≈ 30K │ ← Tool returns can be large
│ ───────────────────────────── │
│ Remaining Space ≈ 3K │ ← Almost full!
└──────────────────────────────────────┘
The problem:
- Conversation history only grows: Each turn adds content to the context and never removes anything
- Tool results can be massive: read on a 1000-line file can take up 5K tokens
- Duplicate information accumulates: The AI reads the same file multiple times, each time consuming space
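The arithmetic behind the workbench diagram can be checked in a few lines. This sketch uses the approximate figures from the diagram; the dictionary keys are illustrative names, not identifiers from pi.

```python
CONTEXT_WINDOW_K = 128  # window size in thousands of tokens, from the diagram

# Approximate usage per category, in thousands of tokens.
usage_k = {
    "system_prompt": 5,
    "tool_definitions": 8,
    "memory_injection": 2,
    "conversation_history": 80,
    "tool_results": 30,
}

used_k = sum(usage_k.values())
remaining_k = CONTEXT_WINDOW_K - used_k
largest = max(usage_k, key=usage_k.get)

print(f"used {used_k}K, remaining {remaining_k}K, largest consumer: {largest}")
```

With only about 3K left, a single read of a 1000-line file (roughly 5K tokens) would already overflow the window.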
Two Tools: Smart Compact and Context Manager
You might ask: do I need to install both packages? The answer is: yes, both are recommended — they solve different problems:
| Dimension | pi-smart-compact | pi-context-manager |
|---|---|---|
| What it does | Compresses conversation history | Diagnoses token consumption + compresses tool results |
| Active/Passive | Fully automatic | Diagnosis needs you to ask AI, distill is automatic |
| What it solves | “Conversation history is too long” | “Too many tool results” + “Why is it so slow” |
| Can they replace each other? | ❌ No | ❌ No |
💡 One-sentence summary: context-manager helps you find the problem (where are the tokens going), smart-compact helps you automatically fix it (compress history). Best used together.
pi-smart-compact — Smart Compression
Smart Compact automatically “compresses” conversation history when the context is nearly full:
Before compression (80K tokens of conversation history):
┌─────────────────────────────────┐
│ User: Help me look at auth.ts │
│ AI: I read auth.ts... (500 words)│
│ User: Add a null check │
│ AI: OK, I modified... (300 words)│
│ User: Run the tests │
│ AI: Test results... (200 words) │
│ ... 50 turns repeated ... │
└─────────────────────────────────┘
After compression (15K tokens summary):
┌─────────────────────────────────┐
│ Summary: │
│ - Added null check in auth.ts │
│ - Modified corresponding test │
│ - All tests passed │
│ - Used JWT authentication scheme│
│ ... Key information preserved ...│
└─────────────────────────────────┘
Two-phase compression strategy:
| Phase | Method | Description |
|---|---|---|
| Phase 1 | Extract key information (decisions, file changes, conclusions) | Traverse conversation, generate structured intent summary |
| Phase 2 | Discard low-value information (repeated file reads, intermediate debug output) | Let LLM judge tool results batch by batch |
pi-context-manager — Diagnostic Tool
pi-context-manager provides the payload_analyze tool to help you see exactly where your tokens are going:
📊 Token Budget Analysis
System Prompt: 4,200 tokens ( 3.2%)
Tool Definitions: 8,100 tokens ( 6.2%)
Memory Injection: 2,300 tokens ( 1.8%)
Conversation: 52,400 tokens (40.0%)
Tool Results: 64,800 tokens (49.5%) ← The big one!
──────────────────────────────────────
Total: 131,800 / 128,000 ← Over budget!
Top 3 most expensive tool calls:
1. read(src/database/schema.ts) — 8,200 tokens
2. code_graph_module_overview — 6,400 tokens
3. grep("TODO|FIXME") — 4,100 tokens
Real-World Case: Diagnosing a Context Crash
Real Scenario Review
Once, a session crashed at only 34.8% context usage. It seemed unlikely — only a third used?
Using pi-context-manager’s budget mode for analysis, the root cause was found:
Root cause:
34.8% of tool results were error output
→ Lots of repeated "Command not found" error messages
→ Each error consumed tokens without providing valuable information
→ Accumulated and prematurely exhausted the context
Solution: Added an after_bash hook in shepherd rules to automatically truncate error output from failed commands, preventing wasteful token consumption.
Using growth Mode for Trends
📈 Context Growth Trend
Request #1: 15K ████
Request #5: 28K ███████
Request #10: 45K ████████████
Request #15: 72K ████████████████████
Request #20: 98K ██████████████████████████ ← Approaching limit
Request #23: 💥 Crash!
Key finding: The fastest growth was between turns 10-15, when extensive file searching was happening.
Optimization: Using code_graph_semantic_code_search (compact mode, returns only signatures and locations) instead of grep (returns full matching lines) reduced token consumption by 70%.
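The key finding can be reproduced from the data points in the growth chart: divide the token increase in each interval by the number of requests it spans and pick the steepest one. A minimal sketch, with `fastest_interval` as an illustrative helper name:

```python
# (request number, context size in K tokens), taken from the growth chart
samples = [(1, 15), (5, 28), (10, 45), (15, 72), (20, 98)]


def fastest_interval(points):
    """Return (start, end, K tokens per request) for the steepest interval."""
    best = None
    for (r0, t0), (r1, t1) in zip(points, points[1:]):
        rate = (t1 - t0) / (r1 - r0)
        if best is None or rate > best[2]:
            best = (r0, r1, rate)
    return best


start, end, rate = fastest_interval(samples)
print(f"fastest growth: requests {start}-{end}, about {rate:.1f}K tokens per request")
```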
Configuring Smart Compact
Install via settings.json:
{
"packages": ["pi-smart-compact"]
}
It takes effect automatically after installation — no additional configuration needed.
Optional Advanced Configuration
In .pi/settings.json:
{
"smart-compact": {
"auto": "auto"
}
}
- auto: Automatic trigger (default, recommended)
- manual: Only responds to the /smart-compact command
Manual trigger: Type /smart-compact in the conversation.
View configuration: Type /smart-compact-config.
Payload Analyzer Common Commands
| Command | Purpose | When to Use |
|---|---|---|
| budget | Token budget analysis (system/tools/history composition) | “Where are my tokens going?” |
| growth | Context growth trend (token curve over requests) | “Why is it getting slower?” |
| expensive | Most expensive tool calls (Top N sorted) | “Which tool consumes the most tokens?” |
| overview | Per-message detailed analysis (includes distill events) | “Pinpoint a specific point in time” |
| messages | Locate messages by index/range/keyword | “What did message 10 say?” |
| chain | Track the same tool call across payloads | “What happened to this call later?” |
| chain-tcid | Track the same toolCallId across payloads | “Verify distill behavior” |
| diff | Compare differences between two payloads | “What’s different between these two requests?” |
| stats | Aggregate statistics on distill/processor hit rate | “How efficient is the compression?” |
| single | Analyze a single payload file | “Deep dive into one recording file” |
| list | List all recording files | “What’s available for analysis?” |
💡 Diagnosis workflow: Start with list to see available recordings → budget for overall distribution → expensive to find the heavy hitters → messages for precise targeting.
Context Manager’s Aging and Processor
Beyond Distill, pi-context-manager offers two additional helper mechanisms:
Aging
Automatically evicts old tool outputs that haven’t been referenced for a long time. Use /aging-config to set the eviction rounds.
Special exemption: Skill file (SKILL.md) content is never evicted by aging, ensuring the AI always sees the currently loaded skills.
Tool Result Processor
Formats and trims specific types of tool output (e.g., code-graph AST search results, MCP JSON output). Use /processor-config to set thresholds.
/context TUI Panel
Type /context to open a visual panel for browsing context content by category and manually marking content for deletion.
Best Practices
✅ Habits for Healthy Long Sessions
- Use compact mode for search: code_graph_semantic_code_search(compact: true) saves 70% tokens over grep
- Compress early: Don’t wait until it crashes; trigger compression when the context approaches its limit
- Avoid repeated reads: Use memory to remember file content instead of repeatedly reading the same file
- Read large files in chunks: Use offset/limit to read only the parts you need, not the whole file
- Configure aging wisely: Set 8-12 rounds for eviction to automatically clean up stale content
- Run payload_analyze checkups regularly: Run budget once during a long session to catch problems early
✅ Diagnostic Priority
When context has issues:
1. payload_analyze budget → Check total distribution
2. payload_analyze expensive → Find the most expensive calls
3. payload_analyze growth → Look at growth trends
4. Targeted optimization (switch tools, add filters, adjust strategy)
❌ Common Misconceptions
- “I still have 50% context left, nothing to worry about” → Wrong, tool results can suddenly balloon
- “Compression will lose important information” → Smart Compact prioritizes preserving decisions and conclusions
- “Just restart the session” → Treats the symptom, not the root cause — you’ll run into the same problem again
Next Steps
Now the AI has memory, planning, rules, review, and compression — it’s already quite a capable assistant. But it’s still “passive” — it only acts when you ask.
Can the AI work proactively? For example, automatically check code quality every day, or automatically run a research analysis?
In the next chapter, we’ll look at how to make the AI automate work.
5.2 pi-smart-compact Principles: Two-Phase Enhanced Compaction
Smart Compact is an enhanced version of pi’s built-in Compaction mechanism — instead of simply truncating history, it “intelligently” decides what to keep and what to discard.
Why Enhanced Compaction?
pi’s built-in Compaction automatically compresses old conversations when the context approaches its limit, but it isn’t “smart” enough:
Built-in Compaction:
100 rounds of conversation before compression → a generic summary after compression
Problem: the summary is too coarse, critical details are lost, and tool call results are indiscriminately truncated.
Smart Compact’s improvement — intercepts pi’s compaction event and performs two-phase enhanced compaction:
| Phase | What It Does | How It Works |
|---|---|---|
| Phase 1: Intent Summary | Extract user intent, key decisions, current state | Traverse conversation, extract non-tool text from AI replies, generate structured intent summary |
| Phase 2: Tool Filtering | Determine which tool call results can be safely discarded | Pair all tool calls (call + result), let LLM decide keep/discard in batches |
pi triggers compact event
→ Smart Compact takes over (if auto mode is on)
→ Phase 1: Extract intent summary (keep decisions, agreements, conclusions)
→ Phase 2: Evaluate tool results for keep/discard in batches
→ Output a refined conversation history, replacing pi's default rough summary.
The two phases are executed sequentially in one go — Smart Compact takes over the compaction event, first performs the intent summary, then filters tools, and finally outputs the refined result. They are not triggered in stages based on context usage rate.
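The sequential flow can be sketched as two small functions plus a batching loop. This is an illustration under stated assumptions, not the extension's actual code: the message shape (`kind`, `id`, `tool`, `content`), all function names, and the stub judge standing in for the LLM are hypothetical. For simplicity, Phase 1 here collects all non-tool text, from both user and assistant.

```python
from typing import Callable

Message = dict  # assumed shape: {"kind": ..., "content": ..., optional "id"/"tool"}


def phase1_intent_texts(history: list[Message]) -> list[str]:
    """Phase 1 input: plain conversation text, skipping tool traffic."""
    return [m["content"] for m in history if m["kind"] == "text"]


def pair_tool_calls(history: list[Message]) -> list[tuple[Message, Message]]:
    """Pair each tool call with the result that shares its id."""
    results = {m["id"]: m for m in history if m["kind"] == "tool_result"}
    return [(m, results[m["id"]]) for m in history
            if m["kind"] == "tool_call" and m["id"] in results]


def phase2_filter(pairs, keep_batch: Callable, batch_size: int = 10):
    """Ask the judge about one batch at a time and keep the approved pairs."""
    kept = []
    for i in range(0, len(pairs), batch_size):
        batch = pairs[i:i + batch_size]
        decisions = keep_batch(batch)  # one bool per pair
        kept.extend(p for p, keep in zip(batch, decisions) if keep)
    return kept


history = [
    {"kind": "text", "content": "Add a null check to auth.ts"},
    {"kind": "tool_call", "id": "t1", "tool": "read", "content": "auth.ts"},
    {"kind": "tool_result", "id": "t1", "content": "...file body..."},
    {"kind": "text", "content": "Decision: guard user before access"},
    {"kind": "tool_call", "id": "t2", "tool": "edit", "content": "auth.ts"},
    {"kind": "tool_result", "id": "t2", "content": "ok"},
]


def stub_judge(batch):
    # Stand-in for the LLM: keep edits, drop reads that can be re-run.
    return [call["tool"] == "edit" for call, _ in batch]


summary_inputs = phase1_intent_texts(history)
kept_pairs = phase2_filter(pair_tool_calls(history), stub_judge, batch_size=10)
print(len(summary_inputs), [call["tool"] for call, _ in kept_pairs])
```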
Configuration
Installation
{
"packages": ["pi-smart-compact"]
}
Commands
| Command | Usage |
|---|---|
| /smart-compact | Manually trigger two-phase compaction |
| /smart-compact-config [auto\|manual] | View or switch between auto/manual mode |
Auto/Manual Mode
- auto (default): Automatically takes over when pi triggers a compact event, performing enhanced compaction
- manual: Only triggers when the user executes /smart-compact
Advanced Configuration
Configure in ~/.pi/agent/settings.json:
{
"smart-compact": {
"enabled": true,
"intentModel": "",
"filterModel": "",
"thinkingTruncateChars": 500,
"toolCallTruncateChars": 2000,
"toolResultTruncateChars": 5000,
"filterBatchSize": 10
}
}
| Parameter | Default | Description |
|---|---|---|
| intentModel | Empty (uses session default model) | Model used for Phase 1 intent summary |
| filterModel | Empty (uses session default model) | Model used for Phase 2 tool filtering |
| thinkingTruncateChars | 500 | Character limit for truncating thinking blocks |
| toolCallTruncateChars | 2000 | Character limit for truncating toolCall arguments |
| toolResultTruncateChars | 5000 | Character limit for truncating toolResult content |
| filterBatchSize | 10 | Number of tools evaluated per batch in Phase 2 |
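To make the truncation and batching parameters concrete, here is a small sketch of what character-limit truncation and batch counting could look like with the default values from the table. The `truncate` and `batch_count` helpers and the truncation marker text are assumptions for illustration; the extension may cut or annotate content differently.

```python
import math

# Default limits from the parameter table, keyed by content kind.
LIMITS = {
    "thinking": 500,
    "toolCall": 2000,
    "toolResult": 5000,
}


def truncate(kind: str, text: str, limits: dict = LIMITS) -> str:
    """Cut text to the limit for its kind and note how much was dropped."""
    limit = limits[kind]
    if len(text) <= limit:
        return text
    dropped = len(text) - limit
    return text[:limit] + f"... [{dropped} chars truncated]"


def batch_count(tool_pairs: int, batch_size: int = 10) -> int:
    """Number of Phase 2 judge calls needed for the given number of tool pairs."""
    return math.ceil(tool_pairs / batch_size)


print(len(truncate("thinking", "x" * 600)))
print(batch_count(37))
```

A larger filterBatchSize means fewer model calls per compaction but a longer prompt per call, which is the trade-off to weigh when tuning it.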
What Does Compaction Preserve?
Smart Compact’s Phase 2 evaluates tool results based on the following priority:
| Priority | Content Type | Why Preserve |
|---|---|---|
| 🔴 Highest | User’s explicit requirements and constraints | These are the task objectives |
| 🟠 High | Key decisions and reasoning for choices | Prevents AI from re-debating already rejected solutions |
| 🟡 Medium | File modification records (edit/write) | Lets AI know which files have been modified |
| 🟢 Low | File reads and search results | Can be re-executed |
| ⚪ Lowest | Failed attempts and debugging process | Lessons have already been learned |
Best Practices
- Enable auto mode for long sessions: Smart Compact automatically takes over when pi is about to compact, preserving more critical information than the default compaction
- Manual trigger is useful before critical operations: Run /smart-compact manually to clean up context before starting an important refactoring
- Use with context-manager: Smart Compact compresses conversation history, while Context Manager’s distill compresses tool outputs, so they complement each other
- Use cheaper models for compaction: If you don’t want to waste the main model’s tokens, specify filterModel in the configuration to use a cheaper model
📖 Back to 5.1 Long Session Survival Guide for complete diagnosis and optimization cases.
5.3 Diagnosing Token Consumption with pi-context-manager
The functionality of pi-payload-analyzer has been merged into pi-context-manager. This section introduces how to use the unified payload_analyze tool to diagnose context issues.
Where Did All the Tokens Go?
In long sessions, AI becomes less intelligent often because the context is filled with “junk.” But what exactly is consuming tokens? Guessing won’t help.
pi-context-manager provides the payload_analyze tool, which uses data to tell you where your tokens are going.
Recording Must Be Enabled First
payload_analyze requires recorded payload data before it can analyze. In the conversation, type:
/record on
Recordings are saved to ~/.pi/agent/distill/recordings/. There is a slight performance overhead while recording; remember to turn it off with /record off when done.
Analysis Mode Quick Reference
Global Overview
| Mode | Usage | Output |
|---|---|---|
| list | List all recording files | File list + sizes |
| budget | Token budget analysis | Breakdown of system/tools/history |
| growth | Growth trend | Token usage curve over requests |
| stats | Aggregate statistics | Distill/processor hit rate, compression efficiency |
Deep Diagnosis
| Mode | Usage | Output |
|---|---|---|
| expensive | Most expensive tool calls | Top N sorted by token count |
| overview | Per-message detailed analysis | Token breakdown per message + distill events |
| messages | Precise message targeting | Filter by index/range/keyword |
Tracking & Comparison
| Mode | Usage | Output |
|---|---|---|
| chain | Track tool call fate | Cross-payload changes for the same argsSig |
| chain-tcid | Track toolCallId | Verify distill behavior |
| diff | Compare two payloads | Identify differences between two requests |
| single | Analyze a single file | Full analysis of one recording file |
Messages Mode — Precise Targeting
messages is the most flexible diagnostic tool, supporting multiple filtering methods:
# View message #5 (0-based)
payload_analyze(action="messages", msgIndex=5)
# View messages 5-10
payload_analyze(action="messages", msgRange="5-10")
# View the last 5 messages
payload_analyze(action="messages", msgRange="last:5")
# Filter by keyword
payload_analyze(action="messages", grep="error|fail")
# Filter by tool name
payload_analyze(action="messages", toolName="read")
# Filter by file path
payload_analyze(action="messages", file="*.ts")
Practical Cases
Case 1: Find the Root Cause of Context Bloat
Step 1: Use budget mode to see totals
You: "Use payload_analyze to analyze token budget"
Result: Tool Results account for 49.5%
Step 2: Use expensive mode to find the biggest consumers
You: "Find the Top 10 most token-consuming tool calls"
Result: read(schema.ts) consumes 8.2K tokens
Step 3: Optimize
→ Use offset/limit to read large files in chunks
→ Or enable distill for automatic compression
Case 2: Diagnose Compression Efficiency
Step 1: Use stats mode to check hit rate
You: "Check distill and processor compression efficiency"
Result: distill hit rate 75%, processor hit rate 60%
Step 2: Use chain mode to track
You: "Track distill behavior for read(schema.ts)"
Result: Distilled on the 3rd request, compressed from 8.2K to 1.5K
Case 3: Compare Differences Between Two Requests
You: "Compare these two payloads for differences"
AI calls payload_analyze(action="diff", payloadPath="...", payloadPath2="...")
Result: The second request has 3 additional tool calls, but 2 were compressed by distill
📖 For complete long session diagnosis cases, see 5.1 Long Session Survival Guide
5.4 Long Session Real-World Scenarios
This section walks through real-world use cases, demonstrating how to combine pi-atelier’s tools to solve common problems in long sessions.
Scenario 1: AI Gets Dumber — Compress with Smart Compact
Symptoms
You’ve been chatting with AI for 2 hours and done a lot. Suddenly you notice the AI starts to:
- Ask questions you’ve already answered
- Re-propose solutions that have already been rejected
- Produce noticeably lower-quality code, such as missing error handling
Traditional Compaction vs Smart Compact
pi’s built-in Compaction triggers automatically when the context approaches its limit, but it simply compresses old conversations into a generic summary. Smart Compact is smarter:
Traditional Compaction:
100 rounds of conversation → one generic 500-word summary
Problem: critical details are lost, AI doesn't know what was decided
Smart Compact (two phases):
Phase 1 (Intent Summary):
→ Extracts: decisions, agreements, file modifications, conclusions
→ Preserves all critical information, discards redundant processes
Phase 2 (Tool Filtering):
→ Evaluates each tool result batch by batch to decide whether to keep it
→ Discards: repeated reads, failed attempts, debugging processes
Steps
1. AI has already done a lot for you, and you feel the context is nearly full
2. Type /smart-compact
3. Smart Compact analyzes the conversation history and generates an enhanced summary
4. AI continues working, but "remembers" all key decisions
Or do nothing — if configured in auto mode (default), Smart Compact will trigger automatically at the right time.
Works Better with Context Manager
Smart Compact compresses conversation history, while Context Manager’s Distill compresses tool results. Using them together:
Context Window
├── Conversation History ←── Compressed by Smart Compact (preserves decisions and conclusions)
├── Tool Results ←── Compressed by Distill (preserves key info, discards redundancies)
└── Memory Injection ←── Fixed size, unchanged
Scenario 2: Context Already Exploded — Handoff with Takeover
Symptoms
A more extreme case: AI reports an error — “context window exceeded.” The entire session can no longer continue.
At this point, mere compression is too late — the session has crashed outright.
Solution: Start a New Session + Takeover
1. Open a new session
2. Tell AI: "Help me take over the last session's work"
3. AI calls session_analyze(action="takeover")
4. Generates a 5-dimension takeover report:
📋 Session Takeover Report
1. User Intent: Refactor auth module, switch from JWT to session-based auth
2. Modified Files:
- src/auth/middleware.ts (done)
- src/auth/login.ts (in progress, 80% complete)
- src/auth/__tests__/login.test.ts (to be written)
3. Recent Steps:
- Modified type signatures in middleware.ts
- Started modifying login.ts but not finished
- Tests not yet written
4. Next Steps:
- Complete login.ts modifications
- Write login.test.ts tests
- Run full test suite
5. Key Decisions:
- Chose session-based over refresh token approach
- Reason: project doesn't need cross-domain SSO
Takeover’s 5 Dimensions
| Dimension | What It Contains | Why It Matters |
|---|---|---|
| User Intent | Original requirements and goals | New AI knows “what to do” |
| Modified Files | List of files changed and pending | Knows “what was changed” |
| Recent Steps | Last 3-5 operations | Knows “how far we got” |
| Next Steps | What still needs to be done | Knows “what to do next” |
| Key Decisions | Important technical choices and rationale | Knows “why it was done this way” |
With these 5 dimensions, a new session’s AI can fully recover working context within 1-2 conversation rounds.
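The five dimensions map naturally onto a small record type. The sketch below models the report as a dataclass and fills it with the values from the sample report above. The class, field names, and `pending_files` helper are hypothetical, chosen for illustration rather than taken from pi-session-analyzer.

```python
from dataclasses import dataclass, field


@dataclass
class TakeoverReport:
    """One field per takeover dimension."""
    user_intent: str
    modified_files: dict[str, str] = field(default_factory=dict)  # path -> status
    recent_steps: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)
    key_decisions: list[str] = field(default_factory=list)

    def pending_files(self) -> list[str]:
        """Files the new session still has to finish or create."""
        return [path for path, status in self.modified_files.items()
                if status != "done"]


report = TakeoverReport(
    user_intent="Refactor auth module, switch from JWT to session-based auth",
    modified_files={
        "src/auth/middleware.ts": "done",
        "src/auth/login.ts": "in progress",
        "src/auth/__tests__/login.test.ts": "to be written",
    },
    recent_steps=["Modified type signatures in middleware.ts"],
    next_steps=["Complete login.ts modifications", "Write login.test.ts tests"],
    key_decisions=["Chose session-based over refresh token approach"],
)

print(report.pending_files())
```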
Example
You: Help me take over the Godot MCP Bridge work
AI:
🛠 session_analyze(action="takeover", sessionId="019e620f-...")
📋 Takeover report generated. You were debugging the MCP Bridge WebSocket
connection issue and had identified that nohup was swallowing the listen()
return value. The next step is to start the TCP port directly without nohup.
Shall I continue?
Scenario 3: Want to Know What Happened — Session Analysis
Symptoms
Yesterday you had AI do a ton of things. Today you want to review: “How exactly was that DuckDB timezone bug fixed?”
Steps
Step 1: Cross-session search
You: Search for the session where DuckDB timezone was fixed
AI:
🛠 session_search(action="grep", query="DuckDB timezone")
Found 2 matching sessions:
1. 05-22 19:36 — DuckDB timezone configuration fix
2. 05-20 14:30 — Database initialization discussion
Step 2: View timeline
You: Show the detailed process of the first one
AI:
🛠 session_analyze(action="timeline", sessionId="...")
📅 Timeline:
[19:36] User: DuckDB query returns UTC time
[19:37] AI: Reads db/connection.ts
[19:38] AI: Discovers no timezone parameter set
[19:39] AI: Adds SET timezone = 'Asia/Shanghai'
[19:43] AI: All tests pass ✅
Step 3: View original conversation (if more detail needed)
You: Show the conversation around 19:39
AI:
🛠 session_analyze(action="entries", range="5-10", sessionId="...")
Common Analysis Pattern Combinations
Quick overview of a session: summary → see what was done
Trace operation order: timeline → see the steps
View raw conversation: entries → see the details
Take over someone's work: takeover → get the context
Check for violations: audit → see if there are problems
Scenario 4: Tool Output Fills the Context — Aging Automatic Eviction
Symptoms
AI has used many tools in one session: read 20 files, ran 10 searches, executed 5 bash commands. Every tool’s returned result stays in the context.
The problem is: you only need the most recent results. That file you read 20 minutes ago is no longer needed now.
Solution: Aging Automatic Eviction
Aging automatically evicts tool outputs that haven’t been referenced again after a specified number of rounds:
Timeline (eviction after 8 rounds):
Round 1: read(auth.ts) → 5K tokens
Round 2: read(middleware.ts) → 4K tokens
Round 3: grep("TODO") → 3K tokens
...
Round 8: edit(auth.ts) ← auth.ts is referenced again, "life extended"
Round 2+8=10: middleware.ts → not referenced again, auto-evicted ✅
Round 3+8=11: grep("TODO") → not referenced for 8 rounds, auto-evicted ✅
auth.ts → referenced by edit in round 8, preserved!
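The eviction rule can be simulated in a few lines. This sketch assumes an output is evicted once 8 rounds pass without a new reference to the same key, and that the edit of auth.ts lands in round 8, before its window closes. `simulate_aging` is an illustrative helper; the real aging logic may define "referenced" differently.

```python
def simulate_aging(references, ttl, last_round):
    """references: list of (round, key); returns {key: round it was evicted}."""
    by_round = {}
    for rnd, key in references:
        by_round.setdefault(rnd, []).append(key)

    last_seen = {}   # key -> round of most recent reference
    evicted_at = {}  # key -> round in which it was evicted
    for rnd in range(1, last_round + 1):
        for key in by_round.get(rnd, []):
            last_seen[key] = rnd        # a new reference resets the clock
            evicted_at.pop(key, None)   # re-reading brings the output back
        for key, seen in last_seen.items():
            if key not in evicted_at and rnd - seen >= ttl:
                evicted_at[key] = rnd
    return evicted_at


references = [
    (1, "auth.ts"),
    (2, "middleware.ts"),
    (3, "grep TODO"),
    (8, "auth.ts"),  # the edit that extends auth.ts's life
]
evictions = simulate_aging(references, ttl=8, last_round=12)
print(evictions)
```

In this run middleware.ts and the grep output are evicted, while auth.ts survives because its last reference is recent.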
Configuring Aging
/aging-config 8 # Evict after 8 rounds (recommended: 8-12)
/aging-config off # Disable auto-eviction
💡 Skill file exemption: SKILL.md content is not evicted by aging — AI always retains access to currently loaded skills.
Manual Intervention: /context TUI Panel
If you don’t want to wait for auto-eviction, you can manually mark items for deletion:
1. Type /context to open the TUI panel
2. Browse by category: view all context content by tool type or chronological order
3. Select unwanted content, mark it for deletion
4. Marked content won't be included in the next AI request
This is especially useful when:
- AI read a huge config file but you only need one line from it
- A search returned 50 results but you only used 3
- Error messages from earlier debugging are no longer needed
Scenario 5: Why Is AI Getting Slower — Token Budget Diagnosis
Symptoms
AI’s response keeps getting slower, and the wait time per conversation round is noticeably longer. You suspect the context is too large, but you don’t know exactly what’s taking up space.
Steps
Step 1: Enable recording
You: /record on
... do a few more rounds ...
Step 2: Check token budget
You: Analyze the token budget
AI:
🛠 payload_analyze(action="budget")
📊 Token Budget Analysis
System Prompt: 4,200 (3.2%)
Tool Definitions: 8,100 (6.2%)
Memory Injection: 2,300 (1.8%)
Conversation: 52,400 (40.0%)
Tool Results: 64,800 (49.5%) ← This is the big one!
Step 3: Find the most expensive calls
You: Find the most token-consuming tool calls
AI:
🛠 payload_analyze(action="expensive", topN=5)
Top 5 Most Expensive Tool Calls:
1. read(src/database/schema.ts) — 8,200 tokens
2. code_graph_module_overview — 6,400 tokens
3. grep("TODO|FIXME") — 4,100 tokens
4. read(src/config/settings.ts) — 3,800 tokens
5. bash("npm test") — 3,200 tokens
Step 4: Targeted optimization
→ schema.ts is too large, use offset/limit to read only the needed parts
→ Use compact=true mode for code_graph
→ Add --include to grep to limit file types
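The ranking in Step 3 is a plain top-N selection by token count. A minimal sketch using the figures from the sample output; `top_expensive` is an illustrative name, and the list is deliberately unsorted to show that the selection does the ordering.

```python
import heapq

# (tool call, tokens) from the sample output, in arbitrary order
tool_calls = [
    ('grep("TODO|FIXME")', 4100),
    ('bash("npm test")', 3200),
    ("read(src/database/schema.ts)", 8200),
    ("read(src/config/settings.ts)", 3800),
    ("code_graph_module_overview", 6400),
]

TOOL_RESULTS_TOTAL = 64800  # Tool Results total from the budget output


def top_expensive(calls, n):
    """Return the n calls with the highest token counts, largest first."""
    return heapq.nlargest(n, calls, key=lambda call: call[1])


top3 = top_expensive(tool_calls, 3)
share = sum(tokens for _, tokens in top3) / TOOL_RESULTS_TOTAL

for name, tokens in top3:
    print(f"{name}: {tokens} tokens")
print(f"top 3 share of tool results: {share:.1%}")
```

Knowing what share the top few calls take helps decide whether fixing them individually is enough or whether a broader measure such as distill is needed.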
Diagnosis Flow Quick Reference
1. budget → See overall distribution (which part has the highest share?)
2. expensive → Find the big consumers (which specific calls use the most tokens?)
3. growth → See the trend (which period had the fastest growth?)
4. messages → Pinpoint (take a look at that specific message content)
5. Targeted optimization (switch tools, add filters, enable distill)
Combined Scenario: Complete Survival Strategy for Ultra-Long Sessions
Session starts
│
├── Rounds 1-20: Working normally, no concerns
│
├── Rounds 20-40: Context usage ~40%
│ → Enable /record on (optional, to prepare for later diagnosis)
│ → Avoid repeatedly reading large files
│
├── Rounds 40-60: Context starting to get tight
│ → Smart Compact takes over pi's compaction event (if auto mode is on)
│ → Or manually trigger /smart-compact
│ → Aging starts evicting old tool outputs
│
├── Rounds 60-80: Approaching the limit
│ → Smart Compact has completed two-phase compaction
│ → Consider whether to start a new session
│ → If continuing: check with payload_analyze budget
│
├── 💥 Crashed!
│ → Start a new session
│ → session_analyze(action="takeover") to take over
│ → Continue working
│
└── Before wrapping up:
→ /record off
→ Have AI generate a daily report with /journal
→ agent_end auto-reminds to commit + update memory
📖 Back to 5.1 Long Session Survival Guide for a complete tool introduction.
Automation & Workflows
You Might Have Encountered This
Every Friday afternoon, you do the same thing:
- Check all sessions from this week to see what files were changed
- Run tests to make sure there are no regressions
- Check for uncommitted code
- Write a weekly summary — recap the week’s progress
Each time you have to manually remind the AI to do these things. Sometimes you forget, and come back on Monday only to find that last Friday’s changes were never committed.
💡 An AI can do things, but it won’t “actively” do things. You need to tell it “what to do now.”
Two Tools: Scheduler and Workflow
pi-scheduler — Scheduled Tasks
The Scheduler lets the AI do specific things at specific times:
Scheduled trigger
        │
        ▼
┌────────────────────┐
│ Inject preset      │
│ message            │
│ "It's Friday PM,   │
│ run weekly check"  │
└────────────────────┘
        │
        ▼
AI executes automatically
Supported schedule types:
| Type | Description | Example |
|---|---|---|
| One-shot | Triggers once after a specified delay | “Remind me about the meeting in 30 min” |
| Recurring | Repeats at a fixed interval | “Check tests every 2 hours” |
pi-workflow — Sub-agent Orchestration
Workflow lets the AI break complex tasks into multiple sub-agents that execute in parallel:
Main agent: "Research best practices for XXX"
│
├──→ Sub-agent 1: Search online resources
├──→ Sub-agent 2: Search GitHub source code
└──→ Sub-agent 3: Search historical sessions
│
▼
Main agent: Synthesize results from all three sub-agents and give recommendations
Sub-agents are independent execution environments:
- They have their own context window (no pollution of the main session)
- They have their own tool set (permissions can be restricted)
- They return results to the main agent when done
Real-world Example: Automated Weekly Report
Configuring Automatic Friday Checks
{
  "action": "create",
  "interval_ms": 604800000,
  "recurring": true,
  "prompt": "It's Friday afternoon. Please perform the following checks:\n1. Use session_search to review all sessions from this week\n2. Use session_analyze summary to summarize each session\n3. Check git status for uncommitted changes\n4. Compile a weekly summary and write it to .pi/journal/weekly-summary.md\n5. Remind the user to commit uncommitted code"
}
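The task fires every 7 days, counted from the moment it is created, so create it on a Friday afternoon and keep the session open (or resume it). Each time it fires, the AI will automatically: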
- Search all sessions from the week
- Generate summaries for each one
- Check git status
- Generate a weekly report
- Remind you to commit
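Since the interval appears to be counted from the moment a task is created, it can help to know how far away the next Friday afternoon is. The sketch below computes that delay. msUntilNext is a hypothetical helper (not part of pi-scheduler), and it works in UTC so the result does not depend on the local time zone.

```typescript
// Hypothetical helper (not part of pi-scheduler): milliseconds from `now`
// until the next occurrence of a weekday/hour, computed in UTC.
// weekday follows Date.getUTCDay(): 0 = Sunday ... 5 = Friday.
function msUntilNext(now: Date, weekday: number, hour: number): number {
  const target = new Date(now.getTime());
  target.setUTCHours(hour, 0, 0, 0);
  let daysAhead = (weekday - now.getUTCDay() + 7) % 7;
  // Same weekday but the hour has already passed: wait for next week
  if (daysAhead === 0 && target.getTime() <= now.getTime()) {
    daysAhead = 7;
  }
  target.setUTCDate(target.getUTCDate() + daysAhead);
  return target.getTime() - now.getTime();
}

const wednesdayNoon = new Date(Date.UTC(2024, 0, 3, 12, 0, 0)); // a Wednesday
const delayMs = msUntilNext(wednesdayNoon, 5, 17);
console.log(delayMs); // 190800000 (2 days + 5 hours)
```

A value like this could serve as the interval_ms of a one-shot reminder that lands on Friday afternoon. Whether pi-scheduler can use a different delay for the first run of a recurring task is not covered here.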
Everyday Use of Scheduled Reminders
User: "Remind me to check CI results in 30 minutes"
AI: ✅ Scheduled task created, reminder in 30 minutes
... 30 minutes later ...
AI: ⏰ Reminder: Time to check CI results. Current time: 15:30.
Real-world Example: Sub-agent Research
Scenario: Researching Best Practices for a New Technology
You say: “Research the performance differences between Bun and Node.js and give me a recommendation.”
The AI launches a research workflow:
🔬 Research workflow started
Sub-agent 1 (Search Expert):
→ Search "Bun vs Node.js performance benchmark 2026"
→ Found 5 technical articles
→ Extract key data points
Sub-agent 2 (Source Expert):
→ Search Bun's GitHub repository
→ Browse the benchmark directory
→ Review performance discussions in issues
Sub-agent 3 (History Expert):
→ Search project's historical sessions
→ Check if similar evaluations were discussed before
─────────────────────────
Main agent comprehensive report:
📊 Bun vs Node.js Recommendation
1. Performance comparison:
- HTTP throughput: Bun is 3-4x faster
- Startup time: Bun is 5x faster
- npm compatibility: Node.js is better (Bun 95% compatible)
2. Recommendation for this project: Stick with Node.js
- Rationale: The project depends on multiple Node.js native modules
- Bun's compatibility issues could lead to extra maintenance costs
- The performance difference has little impact on this project (I/O bound)
Advantages of Sub-agents
| Feature | Single Agent (Normal Chat) | Multi-agent (Workflow) |
|---|---|---|
| Context isolation | All information mixed together | Each sub-agent is independent |
| Parallel execution | Sequential, one by one | Can search in parallel |
| Error isolation | One error affects everything | Sub-agent errors don’t affect others |
| Token efficiency | All information in main context | Only final results return to main context |
Scheduler Configuration
Install via settings.json:
{
"packages": ["pi-scheduler"]
}
The tool provides three operations:
| Operation | Description | Parameters |
|---|---|---|
| create | Create a scheduled task | interval_ms, prompt, recurring |
| list | View all tasks | None |
| cancel | Cancel a task | id |
Common Time Intervals
| Interval | interval_ms | Use case |
|---|---|---|
| 30 minutes | 1,800,000 | Short-term reminders |
| 2 hours | 7,200,000 | Periodic checks |
| Daily | 86,400,000 | Daily report / daily check |
| Weekly | 604,800,000 | Weekly report / weekly check |
Workflow Configuration
Install via settings.json:
{
"packages": ["pi-workflow"]
}
Workflow provides two core concepts:
- Factor Research: Multi-round search + evaluation + synthesis
- Factor Optimization: Initial screening + dissection + combination + iteration + validation
Usually you don’t need to interact with Workflow directly — the AI automatically decides whether to use sub-agents based on task complexity.
Best Practices
✅ Good Scheduled Task Design
- Clear instructions: Tell the AI exactly what to do, avoid vague “check it out”
- Reasonable intervals: Don’t check every 5 minutes (wastes resources)
- Meaningful triggers: A reminder should say “do X now,” not “are you there”
✅ Good Sub-agent Design
- Single responsibility: Each sub-agent should do only one thing
- Clear output: Sub-agents should return structured results, not free-form text
- Moderate parallelism: 3-5 sub-agents is the sweet spot — too many increases synthesis difficulty
❌ Common Pitfalls
- “Scheduled tasks can replace all manual operations” → No, complex decisions still require human involvement
- “More sub-agents is always better” → Too many sub-agents may cost more to synthesize than they save
- “Workflow can do anything” → It’s great for research and analysis, not suitable for decisions requiring human judgment
Next Steps
So far, we’ve covered all the core tools provided by pi-atelier. But what if these tools aren’t enough — what if you want to build something that doesn’t exist yet?
In the next chapter, we’ll look at how to develop your own extensions.
6.2 pi-scheduler: Scheduled Tasks
pi-scheduler is the “alarm clock” of pi-atelier — it can automatically inject messages to the AI at specified times, enabling the AI to proactively execute tasks.
Why Scheduled Tasks?
The AI is reactive — it only answers when you ask. But some things need to happen on time:
- “Remind me to check CI results in 30 minutes” — you might forget
- “Check tests every 2 hours” — manual reminders are tiring
- “Remind me to commit before leaving work every day” — afraid of forgetting
The Scheduler gives the AI “time awareness.”
How It Works
Create a scheduled task
│
▼
┌──────────────────┐
│ Scheduler Timer │
│ Countdown wait │
└────────┬─────────┘
│ Time's up!
▼
┌──────────────────┐
│ Inject preset │
│ message into │
│ AI's context │
└────────┬─────────┘
│
▼
AI reads message
Executes task automatically
Key points:
- Injecting a message does not start a new conversation — it inserts a “reminder” into the current session
- The AI decides for itself how to execute after seeing the message, no need for you to repeat it
- Scheduled tasks are only valid in the current session; they are automatically cleared when the session ends. If the session is resumed later, tasks from that session can be restored
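To make the flow above concrete, here is a minimal in-memory model of it. This is a sketch under assumptions: MiniScheduler, Task, and tick are hypothetical names, time is advanced by hand instead of using real timers, and the actual pi-scheduler internals may look quite different.

```typescript
// Illustrative in-memory model of create / list / cancel. All names here are
// hypothetical; the real pi-scheduler internals are not shown in this book.
type Task = {
  id: string;
  intervalMs: number;
  prompt: string;
  recurring: boolean;
  dueAt: number; // timestamp (ms) of the next trigger
};

class MiniScheduler {
  private tasks = new Map<string, Task>();
  private nextId = 1;
  private inject: (prompt: string) => void;

  constructor(inject: (prompt: string) => void) {
    this.inject = inject; // stands in for "insert a message into the session"
  }

  create(now: number, intervalMs: number, prompt: string, recurring = false): string {
    const id = `task-${this.nextId++}`;
    this.tasks.set(id, { id, intervalMs, prompt, recurring, dueAt: now + intervalMs });
    return id;
  }

  list(): Task[] {
    return Array.from(this.tasks.values());
  }

  cancel(id: string): boolean {
    return this.tasks.delete(id);
  }

  // Called with the current time; fires every task that is due.
  tick(now: number): void {
    for (const task of this.list()) {
      if (now < task.dueAt) continue;
      this.inject(task.prompt);
      if (task.recurring) {
        task.dueAt += task.intervalMs; // wait one more interval
      } else {
        this.tasks.delete(task.id); // one-shot tasks are removed after firing
      }
    }
  }
}

const injected: string[] = [];
const scheduler = new MiniScheduler((prompt) => injected.push(prompt));
scheduler.create(0, 1_800_000, "Check CI build results", false);
scheduler.create(0, 7_200_000, "Run npm test", true);
scheduler.tick(1_800_000); // the one-shot task fires and is removed
scheduler.tick(7_200_000); // the recurring task fires and stays scheduled
console.log(injected.length, scheduler.list().length); // 2 1
```

Passing the current time into create and tick keeps the sketch deterministic. A real implementation would more likely rely on timers.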
Three Operations
Creating a Task
schedule(
action="create",
interval_ms=1800000, // 30 minutes
prompt="Check CI build results, tell me if it fails",
recurring=false // One-shot
)
Parameter description:
| Parameter | Description | Required |
|---|---|---|
| action | Fixed as "create" | ✅ |
| interval_ms | Interval in milliseconds | ✅ |
| prompt | Message to inject to the AI | ✅ |
| recurring | Whether to repeat (default: false) | ❌ |
Listing Tasks
schedule(action="list")
Result:
📋 Current scheduled tasks:
1. [One-shot] Trigger at 14:30 — "Check CI results"
2. [Recurring] Every 2h — "Run tests for regressions"
Canceling a Task
schedule(action="cancel", id="task-123")
Common Time Intervals
| Scenario | Interval | interval_ms |
|---|---|---|
| Short-term reminder | 5 minutes | 300,000 |
| Tea break | 15 minutes | 900,000 |
| Waiting for build | 30 minutes | 1,800,000 |
| Periodic check | 2 hours | 7,200,000 |
| Daily reminder | 1 day | 86,400,000 |
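The values in this table are plain arithmetic, so a tiny conversion helper can replace memorizing them. The helper below is a hypothetical convenience for illustration, not something pi-scheduler ships.

```typescript
// Hypothetical helper for building interval_ms values; not part of pi-scheduler.
const MS_PER_UNIT = {
  minute: 60_000,
  hour: 3_600_000,
  day: 86_400_000,
  week: 604_800_000,
} as const;

type Unit = keyof typeof MS_PER_UNIT;

function toIntervalMs(amount: number, unit: Unit): number {
  if (!Number.isFinite(amount) || amount <= 0) {
    throw new RangeError("amount must be a positive number");
  }
  return amount * MS_PER_UNIT[unit];
}

console.log(toIntervalMs(30, "minute")); // 1800000
console.log(toIntervalMs(2, "hour"));    // 7200000
console.log(toIntervalMs(1, "week"));    // 604800000
```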
One-shot vs Recurring Tasks
One-shot (recurring=false)
Best for “reminder” scenarios:
User: "Remind me about the meeting in 30 minutes"
AI → schedule(action="create", interval_ms=1800000,
prompt="⏰ Reminder: Time for the meeting.", recurring=false)
... 30 minutes later ...
AI: ⏰ Reminder: Time for the meeting. Current time: 15:30.
Automatically deleted after being triggered once.
Recurring (recurring=true)
Best for “periodic check” scenarios:
User: "Run tests every 2 hours"
AI → schedule(action="create", interval_ms=7200000,
prompt="Please run npm test and report results. If it fails, list the failing tests.",
recurring=true)
... Every 2 hours ...
AI: 📋 Periodic test report: All 47 tests passed ✅
...
AI: ⚠️ Periodic test report: 2 tests failed!
- auth.test.ts: Login timeout
- api.test.ts: 404 error
Status Bar Integration
The Scheduler displays a countdown in pi’s status bar, so you always know when the next reminder will trigger.
Notes
- Scheduled tasks only work in the current session — tasks disappear when the session is closed (but can be restored if the same session is resumed)
- Recurring tasks shouldn’t have intervals that are too short (recommended ≥ 5 minutes), to avoid frequent triggers wasting tokens
- The prompt should be specific and clear — the AI sees this exact text; vague instructions lead to vague execution
Next Steps
📖 Return to 6.1 Automation & Workflows for complete usage examples.
6.3 pi-workflow: Sub-agent Orchestration
pi-workflow is the “task dispatcher” of pi-atelier — it lets the AI break complex tasks into multiple sub-agents that execute in parallel, then synthesizes the results.
Why Sub-agents?
A single AI session has a limited context window. When a task requires:
- Searching multiple information sources simultaneously
- Executing independent operations without interference
- Reviewing the same code from different perspectives
A single AI can only process sequentially, which is inefficient, and all the information gets crammed into one context.
Sub-agents solve this — each sub-agent has its own independent context window, and only returns the final result to the main agent.
How It Works
Main agent
│ "Research performance differences between Bun and Node.js"
│
├──→ Sub-agent 1 (Search Expert)
│ Independent context window
│ Search online resources
│ Return key data points
│
├──→ Sub-agent 2 (Source Expert)
│ Independent context window
│ Search GitHub repository
│ Return benchmark data
│
└──→ Sub-agent 3 (History Expert)
Independent context window
Search historical sessions
Return previous discussion records
│
▼
Main agent synthesizes three results → Output final recommendation
Sub-agent Characteristics
| Feature | Description |
|---|---|
| Independent context | Each sub-agent has its own context window, won’t pollute the main session |
| Independent tool set | Can restrict which tools each sub-agent can use |
| Error isolation | One sub-agent failing doesn’t affect others |
| Token efficiency | Only final results return to the main context, intermediate steps don’t occupy the main window |
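The fan-out and fan-in pattern, together with error isolation, can be sketched with plain promises. Everything below is illustrative: SubAgent, runSubAgents, and synthesize are hypothetical names, and the three "agents" are stubs that return fixed strings instead of calling a model.

```typescript
// Illustrative fan-out / fan-in. SubAgent and runSubAgents are hypothetical
// names; pi-workflow's real API is not shown in this book.
type SubAgent = { name: string; run: (task: string) => Promise<string> };
type SubAgentResult = { name: string; ok: boolean; output: string };

async function runSubAgents(task: string, agents: SubAgent[]): Promise<SubAgentResult[]> {
  // Start every sub-agent at once; each failure is caught individually,
  // so one failing sub-agent does not reject the whole batch.
  const runs = agents.map(async (agent): Promise<SubAgentResult> => {
    try {
      return { name: agent.name, ok: true, output: await agent.run(task) };
    } catch (err) {
      return { name: agent.name, ok: false, output: String(err) };
    }
  });
  return Promise.all(runs);
}

// Only the short final outputs reach the "main context"
function synthesize(results: SubAgentResult[]): string {
  return results
    .map((r) => `${r.ok ? "OK" : "FAILED"} ${r.name}: ${r.output}`)
    .join("\n");
}

const agents: SubAgent[] = [
  { name: "search", run: async (t) => `3 articles about "${t}"` },
  { name: "source", run: async () => { throw new Error("repository unreachable"); } },
  { name: "history", run: async () => "no earlier discussion found" },
];

const report = runSubAgents("Bun vs Node.js", agents).then(synthesize);
report.then((text) => console.log(text));
```

The second stub fails on purpose: the other two results still arrive, which is the behavior the "Error isolation" row describes.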
Available Sub-agents
Sub-agents are defined in ~/.pi/agent/agents/*.md — each .md file defines a sub-agent’s role and tool set:
| Sub-agent | Purpose | Tools |
|---|---|---|
| pv-explorer | Code exploration: analyze architecture, call chains, design patterns | read, grep, find, ls |
| pv-reviewer | Independent plan review: check architecture violations, dependency direction errors | read, grep, find, ls |
| pv-executor | Execute code changes: implement according to plan, make tests pass | All tools |
| pv-simplifier | Code simplification: identify duplication, inefficient patterns | read, grep, find, ls |
| fo-analyzer | Factor analysis: run analysis scripts + parse results | bash, read |
| fo-verifier | Factor verification: run final backtest + output report | bash, read |
| fr-searcher | Factor search: search literature and source code | read, grep, find, ls |
| fr-writer | Factor writing: synthesize research findings | read, grep, find, ls |
| security-auditor | Security audit: check for security vulnerabilities | read, grep, find, ls |
Use Cases
Scenario 1: Plan Review (Plan-Verify Flow)
Before making complex changes, use pv-reviewer to independently review the plan:
Main agent: I've designed a refactoring plan, let the reviewer check it.
│
└──→ pv-reviewer
"Review the following plan: Switch the auth module from JWT to session-based"
Review result:
✅ Dependency direction is correct
⚠️ Violates data model invariant #3 (user session should be immutable)
❌ Missing test coverage — new session storage needs integration tests
Value: The reviewer looks at the plan from an independent perspective and can catch issues that the main agent “can’t see from inside.”
Scenario 2: Code Exploration
Use pv-explorer for structured code analysis without polluting the main context:
Main agent: I need to understand the architecture of this module.
│
└──→ pv-explorer
"Analyze the architecture, call chains, and design patterns of src/auth/"
Returns:
- Module structure diagram
- Core function call chains
- Design patterns used (middleware chain, strategy pattern)
- Dependency direction
Value: The exploration process may read dozens of files — if all of them were in the main context, they’d fill it up. The sub-agent only returns the essence.
Scenario 3: Security Audit
Use security-auditor to check code for security vulnerabilities:
Main agent: This code handles user input, help me do a security audit.
│
└──→ security-auditor
"Audit input validation and injection risks in src/api/handlers/"
Findings:
⚠️ SQL injection risk: query parameters directly concatenated into SQL
⚠️ XSS risk: user input returned as HTML without escaping
✅ Auth checks: all endpoints have auth middleware
Sub-agents vs Normal Chat
| Dimension | Normal Chat | Sub-agent |
|---|---|---|
| Context | All information in one window | Each sub-agent has its own window |
| Parallelism | Sequential processing | Can be parallel |
| Error impact | Global impact | Isolated |
| Token consumption | Intermediate steps all occupy main window | Only results occupy main window |
| Best for | Simple tasks, Q&A | Complex research, multi-party review |
💡 Rule of thumb: If a task requires reading many files but ultimately needs just one conclusion, use sub-agents. If it requires multi-turn interactive discussion, use normal chat.
Notes
- Sub-agents are one-shot — they return results and terminate, they don’t maintain state
- Sub-agents have restricted tool sets: pv-explorer only has read-only tools and cannot modify files
- Sub-agents cannot see the full context of the main session; they only see the task description you pass to them
- Therefore, the task description passed to a sub-agent must be sufficiently detailed and include all necessary background information
Next Steps
📖 Return to 6.1 Automation & Workflows for complete usage examples.
Write Your Own Extensions
Why Write Your Own Extension?
pi-atelier provides 10 extensions covering core scenarios like memory, planning, rules, retrospective, compression, and automation. But every project has its own special needs:
- Your team uses Feishu instead of Slack, so you need a Feishu notification extension
- You’re doing game development and need an extension to automatically manage the assets directory
- You’re writing academic papers and need an extension for LaTeX compilation + citation checking
💡 At its core, an extension is about giving AI new tools and new knowledge.
Extension Architecture
What Makes Up an Extension?
pi-xxx/
├── package.json # Package metadata + pi extension configuration
├── index.ts # Entry point, registers tools and hooks
├── lib/ # Tool implementations
│ └── tools-xxx.ts
├── prompts/ # Prompt templates (descriptions visible to AI)
│ └── xxx-description.md
└── README.md # Documentation
Core Concepts
| Concept | Description | Analogy |
|---|---|---|
| Tool | A function AI can call | Giving AI a new hammer |
| Hook | Logic executed at specific moments | Giving AI an alarm clock |
| Prompt | Description of the tool (what AI sees) | Telling AI how to use this hammer |
| Config | User-configurable parameters | The hammer’s force adjustment |
Extension Lifecycle
1. pi starts
│
▼
2. Load packages from settings.json
│
▼
3. Install/update extensions (npm or git)
│
▼
4. Execute extension entry function `export default function(pi)`
│
├── Register tools (pi.registerTool)
├── Register commands (pi.registerCommand)
└── Listen to events (pi.on)
│
▼
5. AI session can now call the new tools
Hands-On: Writing a “Code Stats” Extension from Scratch
Let’s build a simple extension step by step — counting lines of code in a project.
Step 1: Create the Project
mkdir pi-code-stats
cd pi-code-stats
npm init -y
Modify package.json:
{
  "name": "pi-code-stats",
  "version": "0.1.0",
  "main": "index.ts",
  "piExtension": true
}
💡 "piExtension": true tells pi this is an extension package. "main" points to the entry file (TypeScript or JavaScript both work; pi uses jiti to load them).
Step 2: Write the Tool Implementation
lib/tools-stats.ts:
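import { execFileSync } from 'child_process';
import { readFileSync } from 'fs';
export function countLines(directory: string, extension: string): {
  total: number;
  files: { path: string; lines: number }[];
} {
  // Pass arguments as an array (no shell), so spaces or quotes in paths are safe
  const output = execFileSync('find', [
    directory, '-type', 'f', '-name', `*.${extension}`,
    '-not', '-path', '*/node_modules/*', '-not', '-path', '*/.git/*'
  ]).toString().trim();
  // No matches: avoid turning '' into a bogus one-element file list
  const files = output ? output.split('\n') : [];
  const results = files.map(file => ({
    path: file,
    // Count newline characters, which is what `wc -l` reports
    lines: (readFileSync(file, 'utf8').match(/\n/g) || []).length
  }));
  return {
    total: results.reduce((sum, r) => sum + r.lines, 0),
    files: results.sort((a, b) => b.lines - a.lines)
  };
}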
Step 3: Write the Entry File
index.ts:
import type { ExtensionAPI } from '@earendil-works/pi-coding-agent';
import { countLines } from './lib/tools-stats';
export default function (pi: ExtensionAPI) {
  pi.registerTool({
    name: 'code_stats',
    label: 'Code Stats',
    description: 'Count lines of code in a project. Use when the user says "count code" or "how many lines of code".',
    promptSnippet: 'Count lines of code in a project.',
    parameters: {
      type: 'object',
      properties: {
        directory: {
          type: 'string',
          description: 'The directory path to count'
        },
        extension: {
          type: 'string',
          description: 'File extension, e.g. ts, py, rs'
        }
      },
      required: ['directory']
    },
    async execute(_toolCallId: string, params: any): Promise<any> {
      const result = countLines(params.directory, params.extension || 'ts');
      const summary = {
        totalLines: result.total,
        fileCount: result.files.length,
        topFiles: result.files.slice(0, 10)
      };
      // Wrap the summary in the { content, details } shape listed for execute in the API quick reference
      return {
        content: [{ type: 'text', text: JSON.stringify(summary, null, 2) }],
        details: summary
      };
    }
  });
}
Step 4: Write the Tool Description
prompts/stats-description.md:
Count lines of code in a project.
Parameters:
- directory (required): The directory path to count
- extension (optional): File extension, defaults to ts
Returns:
- totalLines: Total line count
- fileCount: Number of files
- topFiles: Top 10 largest files
Example:
code_stats(directory="src", extension="ts")
→ { totalLines: 12340, fileCount: 45, topFiles: [...] }
Step 5: Install and Test
// settings.json
{
"packages": [
"./path/to/pi-code-stats"
]
}
Restart pi, and the AI will be able to use the code_stats tool.
pi-shared-utils: Your Toolbox
When writing extensions, you don’t have to start from scratch every time. pi-shared-utils provides a set of common utility functions:
| Module | Function | When to Use |
|---|---|---|
| logger | Unified logging format | When you need to print debug info |
| storage | Cross-session persistent storage | When you need to save configuration or state |
| paths | Unified path handling | When you need to find file locations |
| json | Safe JSON read/write | When you need to manipulate JSON files |
| validator | Parameter validation | When you need to validate tool parameters |
| settings-backup | settings.json backup and rollback | When you need to safely write config |
| file-lock | File locks (proper-lockfile wrapper) | When you need to prevent race conditions |
| config | Three-layer config merging (defaults → global → project) | When your extension needs configurable parameters |
Usage Example
import { logger, storage, paths } from '@pi-atelier/shared-utils';
// Logging
logger.info('Extension activated');
logger.warn('Missing configuration file, using defaults');
// Paths
const projectRoot = paths.getProjectRoot();
const memoryDir = paths.getMemoryDir();
Configuration API Example
If your extension needs user-configurable parameters:
import { getEffectiveConfig } from '@pi-atelier/shared-utils';
const defaults = { threshold: 1000, enabled: true };
const config = getEffectiveConfig('my-extension', defaults, cwd);
// config = final configuration after three-layer merge
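What "three-layer merge" means can be shown with object spread. The sketch below only illustrates the idea of defaults → global → project precedence. mergeLayers is a hypothetical function, the merge is shallow, and it is not the getEffectiveConfig implementation.

```typescript
// Illustrative three-layer merge (defaults → global → project). This is a
// sketch of the idea only, not the getEffectiveConfig implementation.
function mergeLayers<T extends object>(
  defaults: T,
  globalCfg: Partial<T>,
  projectCfg: Partial<T>
): T {
  // Later spreads win, so project overrides global, which overrides defaults
  return { ...defaults, ...globalCfg, ...projectCfg };
}

const baseConfig = { threshold: 1000, enabled: true };
const merged = mergeLayers(baseConfig, { threshold: 500 }, { enabled: false });
console.log(merged); // { threshold: 500, enabled: false }
```

Here the global layer lowers threshold and the project layer switches the extension off; keys that no layer touches keep their default values.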
Debugging Your Extension
Common issues during extension development: the tool is registered but AI doesn’t call it, errors occur in the handler without visible logs, or the returned result isn’t what was expected.
Viewing Log Output
Output from logger.info() and console.log() in your extension appears in pi’s terminal window (not the chat window). Debugging steps:
# Start pi in the terminal (not in the background) to see all log output
pi
# Then ask the AI to call your tool in the chat window
# The terminal will display the log output
Confirming Tool Registration
In the pi chat, directly ask the AI:
What tools do you have available? Can you see code_stats?
If the AI can’t see your tool, check:
- Does package.json have "piExtension": true?
- Is the package path in settings.json correct?
- Is the entry function exported correctly (export default function(pi))?
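Registration can also be checked without the AI in the loop by handing the entry function a fake pi object. The sketch below is a hypothetical test double: createFakePi mimics only registerTool, and the inline extension function is a stand-in with the same shape as index.ts rather than the real file.

```typescript
// Hypothetical test double: it mimics only pi.registerTool, just enough to
// check registration and call execute() without starting pi.
type ToolDefinition = {
  name: string;
  description: string;
  execute: (toolCallId: string, params: any) => Promise<unknown>;
};

function createFakePi() {
  const tools = new Map<string, ToolDefinition>();
  return {
    tools,
    registerTool(definition: ToolDefinition): void {
      tools.set(definition.name, definition);
    },
  };
}

// A stand-in extension entry function with the same shape as index.ts
function extension(pi: { registerTool(definition: ToolDefinition): void }): void {
  pi.registerTool({
    name: "code_stats",
    description: "Count lines of code in a project.",
    async execute(_toolCallId: string, params: any) {
      return { directory: params.directory, extension: params.extension || "ts" };
    },
  });
}

const fakePi = createFakePi();
extension(fakePi);
console.log(fakePi.tools.has("code_stats")); // true
```

If the tool shows up in the fake but the AI still cannot see it, the problem is more likely in package.json or settings.json than in the entry function.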
Common Issue Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| AI can’t see the tool | Missing piExtension field | Add "piExtension": true to package.json |
| Tool call errors | Exception in handler | Check the error stack in terminal logs |
| AI doesn’t call the tool | Description is too vague | Make the tool description more specific, include parameter details and examples |
| Empty return value | Async operation not awaited | Add async to handler, add await to calls |
| Path not found | Relative path issues | Use paths.getProjectRoot() to get absolute paths |
💡 Tip: During extension development, you can add console.log(JSON.stringify(params, null, 2)) at the beginning of your execute function to print the parameters and see what the AI is passing in.
Publishing Your Extension
Publishing to npm
# 1. Confirm package.json info is complete
npm version patch # 0.1.0 → 0.1.1
# 2. Publish
npm publish --access public
Installing After Publishing
Other users can add your package name to their settings.json:
{
"packages": [
"pi-code-stats"
]
}
Pre-Publishing Checklist
- package.json has a complete description and keywords
- README.md is written following the template (installation, usage, configuration, examples)
- Has a LICENSE file
- Has a CHANGELOG.md
- Code has basic error handling
- Tool descriptions (prompts/*.md) are clear and complete
Extension Development Best Practices
✅ Good Extension Design
- Single Responsibility: One extension does one thing — don’t cram all functionality into a single package
- Description as Documentation: Write tool descriptions clearly enough that the AI doesn’t have to guess
- Parameter Validation: Validate parameters in the handler and provide meaningful error messages
- Idempotent Operations: Same input should produce the same output — avoid side effects
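As an illustration of the parameter validation point, here is one way the code_stats parameters could be checked before any file system work starts. The Validation result shape and the function name are assumptions made for this sketch; they are not a pi or pi-shared-utils API.

```typescript
// Illustrative validation for the code_stats parameters. The result shape is
// an assumption made for this sketch, not a pi or pi-shared-utils API.
type StatsParams = { directory: string; extension: string };
type Validation =
  | { ok: true; value: StatsParams }
  | { ok: false; error: string };

function validateStatsParams(params: unknown): Validation {
  if (typeof params !== "object" || params === null) {
    return { ok: false, error: "params must be an object" };
  }
  const { directory, extension = "ts" } = params as Record<string, unknown>;
  if (typeof directory !== "string" || directory.trim() === "") {
    return { ok: false, error: "directory is required and must be a non-empty string" };
  }
  // Letters and digits only, so the extension cannot carry glob or shell syntax
  if (typeof extension !== "string" || !/^[A-Za-z0-9]+$/.test(extension)) {
    return { ok: false, error: 'extension must look like "ts" or "py", without a dot' };
  }
  return { ok: true, value: { directory, extension } };
}

const good = validateStatsParams({ directory: "src" });
const bad = validateStatsParams({ extension: ".ts" });
console.log(good.ok, bad.ok); // true false
```

Returning an error message instead of throwing gives the AI something readable to react to, for example by retrying with a corrected parameter.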
✅ Writing Good Tool Descriptions
# Good description
Count lines of code for a specific file type in a given directory.
Parameters:
- directory (required): directory path
- extension (optional): file suffix, defaults to "ts"
Returns: { totalLines, fileCount, topFiles }
# Bad description
Count code
❌ Common Mistakes
- Tool name too generic: analyze → should be code_stats
- Description too brief: AI doesn’t know how to use it and will pass wrong parameters
- Forgetting error handling: crashes when file doesn’t exist
- Return value too large: returning the entire file content → should return a summary
Appendix: pi Extension API Quick Reference
registerTool
pi.registerTool({
name: string, // Tool name (unique identifier, AI uses this to call it)
label: string, // Display name (shown in TUI)
description: string, // Tool description (what AI sees, determines when AI calls it)
promptSnippet?: string, // Short description (injected into AI system prompt, if empty won't appear in Available tools)
promptGuidelines?: string[], // AI usage guidelines
parameters: TypeBox.Object({...}), // Parameter definition (TypeBox schema)
renderShell?: "default" | "self", // Render mode (default "default")
executionMode?: "sequential" | "parallel", // Execution mode
async execute(
toolCallId: string, // Tool call ID
params: any, // Parameters passed by AI
signal: AbortSignal | undefined, // Cancel signal
onUpdate: Function | undefined, // Streaming update callback
ctx: ExtensionContext // Execution context
): Promise<{ content: [{ type: "text", text: string }], details: any }>
});
registerCommand
pi.registerCommand(name: string, {
description: string, // Command description
getArgumentCompletions?: (prefix: string) => AutocompleteItem[], // Argument autocomplete
handler: async (args: string, ctx: ExtensionCommandContext) => {
// args: text entered by user (text after /command)
// ctx.ui.notify(message, level): Show notification
// ctx.compact(): Trigger compaction
// ctx.switchModel(model): Switch model
}
});
registerShortcut
pi.registerShortcut(shortcut: string, {
description: string,
handler: async (ctx: ExtensionContext) => { ... }
});
Event Listeners
// Session lifecycle
pi.on('session_start', (event, ctx) => { ... });
pi.on('session_shutdown', (event, ctx) => { ... });
pi.on('session_before_switch', (event, ctx) => { ... });
pi.on('session_before_fork', (event, ctx) => { ... });
pi.on('session_before_tree', (event, ctx) => { ... });
pi.on('session_tree', (event, ctx) => { ... });
// Compaction
pi.on('session_before_compact', (event, ctx) => { ... });
pi.on('session_compact', (event, ctx) => { ... });
// AI interaction
pi.on('before_provider_request', (event, ctx) => { ... });
pi.on('after_provider_response', (event, ctx) => { ... });
pi.on('context', (event, ctx) => { ... });
// Agent lifecycle
pi.on('before_agent_start', (event, ctx) => { ... });
pi.on('agent_start', (event, ctx) => { ... });
pi.on('agent_end', (event, ctx) => { ... });
pi.on('turn_start', (event, ctx) => { ... });
pi.on('turn_end', (event, ctx) => { ... });
// Messages
pi.on('message_start', (event, ctx) => { ... });
pi.on('message_update', (event, ctx) => { ... });
pi.on('message_end', (event, ctx) => { ... });
// Tool execution
pi.on('tool_call', (event, ctx) => { ... });
pi.on('tool_result', (event, ctx) => { ... });
pi.on('tool_execution_start', (event, ctx) => { ... });
pi.on('tool_execution_update', (event, ctx) => { ... });
pi.on('tool_execution_end', (event, ctx) => { ... });
// Other
pi.on('model_select', (event, ctx) => { ... });
pi.on('thinking_level_select', (event, ctx) => { ... });
pi.on('input', (event, ctx) => { ... });
pi.on('user_bash', (event, ctx) => { ... });
pi.on('resources_discover', (event, ctx) => { ... });
Helper Methods
pi.sendMessage(message, options?); // Send custom message to session
pi.appendEntry(role, content); // Append a message to the session
pi.registerFlag(name, options); // Register CLI flag
pi.getFlag(name); // Get CLI flag value
pi.registerMessageRenderer(type, renderer); // Register custom message renderer
Congratulations, You’ve Made It!
Now you understand all the core concepts of pi-atelier:
- Memory (pi-memory) — Let AI remember knowledge
- Planning (pi-roadmap) — Let AI manage tasks
- Rules (pi-shepherd + pi-context-manager) — Let AI follow rules and control information quality
- Retrospective (pi-session-analyzer + pi-journal) — Let AI record and review work
- Compression & Diagnostics (pi-smart-compact + pi-context-manager) — Keep AI smart in long sessions
- Automation (pi-scheduler + pi-workflow) — Let AI work proactively
- Extensions (pi-shared-utils + your own extensions) — Make AI capable of anything
Feel free to submit Issues and PRs on GitHub — let’s make the AI coding assistant better together!
Appendix
A. Extension Quick Reference
| Extension | Install Command | Core Tools/Commands | One-Liner Purpose |
|---|---|---|---|
| pi-memory | "pi-memory" | memory_update, memory_index | Cross-session knowledge persistence |
| pi-roadmap | "pi-roadmap" | roadmap_plan, roadmap_next, roadmap_done, roadmap_search, roadmap_update, etc. | Task breakdown, progress tracking, dependency management |
| pi-shepherd | "pi-shepherd" | shepherd_rules, Rule-driven hook engine | AI behavior guard |
| pi-context-manager | "pi-context-manager" | payload_analyze, /record, /context, /distill-config, /aging-config, etc. | Context quality control + Token diagnostics |
| pi-session-analyzer | "pi-session-analyzer" | session_search, session_analyze | Historical session search and review |
| pi-smart-compact | "pi-smart-compact" | /smart-compact, /smart-compact-config | Intelligent long-session compression |
| pi-scheduler | "pi-scheduler" | schedule, /loop, /remind, /tasks | Scheduled tasks and reminders |
| pi-workflow | "pi-workflow" | registerWorkflowTool (called by other extensions) | Workflow framework library |
| pi-shared-utils | "pi-shared-utils" | logger, storage, paths, json, validator, settings-backup, file-lock, config | Extension development utility library |
| pi-journal | "pi-journal" | /journal, journal | Log report generation (git activity + session events + memory changes) |
B. Recommended Extension Combos
Personal Projects (Lightweight Combo)
{
"packages": [
"pi-memory",
"pi-roadmap",
"pi-smart-compact"
]
}
Core three: Remember knowledge + Manage tasks + Stay smart in long sessions.
Team Projects (Standard Combo)
{
"packages": [
"pi-memory",
"pi-roadmap",
"pi-shepherd",
"pi-session-analyzer",
"pi-smart-compact"
]
}
Adds rules and retrospective capabilities.
Large Refactors (Full Combo)
{
"packages": [
"pi-memory",
"pi-roadmap",
"pi-shepherd",
"pi-context-manager",
"pi-session-analyzer",
"pi-smart-compact",
"pi-scheduler"
]
}
Full installation, fully leveraging diagnostics and automation capabilities.
C. pi Internal Mechanics Overview
Compaction
pi has a built-in context compression mechanism. When the conversation history approaches the context window limit, pi automatically compresses older conversations. The Smart Compact extension enhances this mechanism — it identifies critical information (decisions, conventions, conclusions) and prioritizes preserving it.
Distill
Tool results can be very large (e.g., reading a 1000-line file). pi has a built-in distill mechanism to compress tool output. The pi-context-manager extension provides:
- Auto Distill: Automatically compresses tool output exceeding the threshold (/distill-config)
- First Full Content Cap: firstSeenCap (/distill-config --cap) limits the initial output size
- Tool Result Processor: Format-specific streamlining for certain tool types (/processor-config)
- Aging: Automatically evicts old tool output (/aging-config)
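To give a feel for threshold-based compression, the sketch below trims an oversized tool output to its first and last lines. It is only an illustration: distillOutput and the head/tail strategy are assumptions, and pi-context-manager may compress output in a different way.

```typescript
// Illustrative threshold-based trimming of a large tool output. The function
// name and the head/tail strategy are assumptions, not pi-context-manager code.
function distillOutput(output: string, maxLines: number): string {
  const lines = output.split("\n");
  if (lines.length <= maxLines) return output; // under the threshold: keep as-is
  const head = Math.ceil(maxLines / 2);
  const tail = maxLines - head;
  const omitted = lines.length - head - tail;
  const kept = lines.slice(0, head);
  kept.push(`... ${omitted} lines omitted ...`);
  if (tail > 0) kept.push(...lines.slice(lines.length - tail));
  return kept.join("\n");
}

const big = Array.from({ length: 1000 }, (_, i) => `line ${i + 1}`).join("\n");
const trimmed = distillOutput(big, 10);
console.log(trimmed.split("\n").length); // 11 (10 kept lines + 1 marker)
```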
Tool Call Lifecycle
1. AI decides to call a tool
│
▼
2. Shepherd tool_call hook (rewrite / block / notify / steer)
│
▼
3. Execute tool
│
▼
4. Context Manager distill + processor processes the return value
│
▼
5. Shepherd tool_result hook (notify / steer)
│
▼
6. Result returned to AI
│
▼
7. AI generates reply
│
▼
8. Shepherd message_end hook (steer — matches AI reply text)
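Steps 2 to 6 of this lifecycle can be modeled as a small pipeline. The sketch is a simplification under assumptions: all type and function names are hypothetical, the tool_result hook from step 5 is left out, and real Shepherd rules are configured rather than written as functions.

```typescript
// Illustrative model of the tool call lifecycle. Every type and function
// name is hypothetical; this is not how Shepherd is implemented.
type ToolCall = { tool: string; args: Record<string, unknown> };
type HookDecision =
  | { action: "allow" }
  | { action: "block"; reason: string }
  | { action: "rewrite"; call: ToolCall };

function runToolCall(
  call: ToolCall,
  beforeCall: (call: ToolCall) => HookDecision, // step 2: tool_call hook
  execute: (call: ToolCall) => string,          // step 3: run the tool
  postProcess: (result: string) => string       // step 4: distill + processor
): string {
  const decision = beforeCall(call);
  if (decision.action === "block") {
    return `Blocked: ${decision.reason}`; // the tool never runs
  }
  const finalCall = decision.action === "rewrite" ? decision.call : call;
  return postProcess(execute(finalCall)); // step 6: what the AI receives
}

// Example rule: block `rm -rf`, allow everything else
const guard = (call: ToolCall): HookDecision =>
  String(call.args.command ?? "").includes("rm -rf")
    ? { action: "block", reason: "destructive command" }
    : { action: "allow" };

const run = (call: ToolCall) => `ran ${call.tool}`;
const shorten = (text: string) => text.slice(0, 20);

console.log(runToolCall({ tool: "bash", args: { command: "rm -rf /tmp/x" } }, guard, run, shorten));
// Blocked: destructive command
console.log(runToolCall({ tool: "bash", args: { command: "ls" } }, guard, run, shorten));
// ran bash
```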
Session Storage
All session data is stored under the ~/.pi/ directory:
~/.pi/
├── roadmap/               # Global roadmaps
└── agent/
    ├── settings.json      # Global config (installed extensions, providers)
    ├── mcp.json           # MCP server configuration
    ├── memory/            # Global memory files (L1)
    ├── skills/            # Global skills
    ├── extensions/        # Inline extensions
    ├── agents/            # Sub-agent definitions
    ├── npm/node_modules/  # npm-installed extension packages
    ├── git/               # Git package installation location
    ├── sessions/          # Session history records (JSONL)
    └── distill/           # context-manager data
        └── recordings/    # Payload recordings
{project}/.pi/
├── settings.json          # Project-level config (overrides global)
├── memory/                # Project-level memory (L2)
└── roadmap/               # Project-level roadmaps
D. Frequently Asked Questions
Q: Extension not taking effect after installation?
Check:
- Whether the settings.json format is correct (valid JSON syntax)
- Whether the package name is spelled correctly
- Whether pi has been restarted (extensions need a restart to be loaded)
Q: Too many memory files?
pi-memory automatically checks the file count. It’s recommended to clean up when exceeding 25 files; writes are refused beyond 40. Cleanup methods:
- Merge multiple files on the same topic
- Delete outdated memories
- Split large files into smaller ones
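The two limits can be read as a simple three-state check. The sketch below is only an illustration of that rule: memoryStatus and its return values are hypothetical, and how pi-memory treats the exact boundary values (25 and 40) is an assumption.

```typescript
// Illustrative check using the two limits mentioned above (25 and 40).
// The function name and return values are hypothetical.
type MemoryStatus = "ok" | "cleanup-recommended" | "writes-refused";

function memoryStatus(fileCount: number): MemoryStatus {
  if (fileCount > 40) return "writes-refused";
  if (fileCount > 25) return "cleanup-recommended";
  return "ok";
}

console.log(memoryStatus(12)); // ok
console.log(memoryStatus(30)); // cleanup-recommended
console.log(memoryStatus(41)); // writes-refused
```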
Q: Shepherd rules not working?
Check:
- Global rules are in the pi-shepherd package’s rules.json
- Project rules go in .pi/shepherd-rules-*.json (note the file name prefix)
- Confirm "enabled": true in the rule
Q: Token consumption too fast?
- Use payload_analyze with budget and expensive modes to identify token hogs
- Use compact mode for searches (semantic_code_search(compact: true))
- Lower the distill threshold (/distill-config)
- Configure aging to auto-evict old content (/aging-config)
Q: payload_analyze reports “no recordings”?
You need to enable recording first: /record on. Use normally while recording, then /record off when done.