Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

pi-atelier: Making AI Coding Assistants Professional

Learn how to use the pi-atelier extension ecosystem to evolve your AI coding assistant from "can write code" to "can manage projects"

Who Is This Book For?

This book is for you if you’re doing any of the following:

  • Writing code with AI coding assistants (pi, Cursor, Copilot, etc.)
  • Feeling like your AI assistant is “almost” good enough
  • Wanting to evolve AI from a “Q&A tool” into a “project partner”

What Is pi-atelier?

pi-atelier is a set of pi extensions that give AI coding assistants project management capabilities.

A regular AI assistant can write code, but:

  • It forgets everything between sessions
  • It tends to go off-track on large tasks
  • It has no rules, making silly mistakes easily
  • It gets dumber as the conversation grows longer

pi-atelier extensions fill these gaps:

🧠 Memory

Let AI retain knowledge across sessions

📋 Planning

Manage three-tier roadmaps: Epic → Story → Task

🛡️ Shepherd

Set rules for AI to prevent mistakes

🔍 Diagnostics

Control context quality + token consumption analysis

📊 Analysis

Search and revisit historical sessions

🗜️ Compression

Keep AI sharp in long sessions

For a detailed comparison, see the table below:

CapabilityExtensionOne-Line Description
Memorypi-memoryLet AI retain knowledge across sessions
Planningpi-roadmapLet AI manage Epic → Story → Task
Shepherdpi-shepherdSet rules for AI to prevent mistakes
Context & Diagnosticspi-context-managerControl what AI sees + token consumption diagnostics
Journalpi-journalGenerate log reports (git activity + session events + memory changes)
Analysispi-session-analyzerSearch and revisit historical sessions
Compressionpi-smart-compactKeep AI smart in long sessions
Schedulingpi-schedulerTimed reminders and recurring tasks
Workflowpi-workflowSub-agent orchestration, parallel execution
Tool Librarypi-shared-utilsCommon utility functions for extension development

Reading Path

Quick Start Path (1 hour)

  1. Chapter 1: An AI’s Memory → Install pi-memory in 5 minutes
  2. Chapter 2: From Memory to Planning → Learn to manage tasks with roadmaps
  3. Chapter 7: Build Your Own Extension → Understand the extension mechanism

Comprehensive Path (3 hours)

Read all chapters in order. Each chapter includes:

  • Pain Point: Real problems you will definitely encounter
  • How It Works: How the extension works internally
  • Use Cases: Real-world scenarios
  • Best Practices: How to use it better

On-Demand Reference

When facing a specific problem, jump directly to the relevant chapter. Each chapter is self-contained.

Quick Install

Add the extensions you need to pi’s settings.json:

{
  "packages": [
    "pi-memory",
    "pi-roadmap",
    "pi-shepherd",
    "pi-context-manager",
    "pi-session-analyzer",
    "pi-smart-compact",
    "pi-scheduler"
  ]
}

Or install everything (pi-workflow and pi-shared-utils are development libraries; regular users don’t need to install them directly):

{
  "packages": [
    "pi-memory",
    "pi-roadmap",
    "pi-shepherd",
    "pi-context-manager",
    "pi-session-analyzer",
    "pi-smart-compact",
    "pi-scheduler",
    "pi-workflow",
    "pi-shared-utils"
  ]
}

Most extensions are ready to use out of the box — no additional configuration needed after installation (though you can customize as needed).

💡 Tip: pi-workflow and pi-shared-utils are development libraries used by other extensions; regular users generally don’t need to install them directly.

Important File Paths

Before you start, here are the key pi files you need to know:

FilePathDescription
Global Config~/.pi/agent/settings.jsonInstall extensions, configure providers
Project Config.pi/settings.json (project root)Project-level custom configuration (overrides global)
Project InstructionsAGENTS.md (project root or .pi/agent/)Project rules injected into the AI
Extension Install Dir~/.pi/agent/npm/node_modules/npm package installation location
Memory Directory.pi/memory/ (project-level)Project-level persistent memory
Global Memory~/.pi/agent/memory/Cross-project general memory

💡 Newcomer Tip: ~ refers to your home directory. On macOS/Linux it’s /home/your-username/, on Windows it’s C:\Users\your-username\.

Conventions

Examples in this book follow these conventions:

  • Code blocks: Commands, file paths, code snippets
  • Bold: Important concepts
  • 💡 Tip: Practical tips and notes

  • Tables: Quick comparisons and reference

Ready to get started? Flip to Chapter 1, and let’s begin with “memory.”

An AI’s Memory

You’ve Probably Seen This Before

You’re using an AI coding assistant to develop a project. On the first day, you spent half an hour explaining to the AI:

  • The project uses Rust + Axum tech stack
  • The database is DuckDB, not PostgreSQL
  • The auth module uses JWT, not Session
  • The deployment target is an ARM-based embedded device

The AI understood, and helped you write perfect code.

The next day, you start a new session and the AI has forgotten everything. It starts suggesting Express.js, connecting to PostgreSQL, using Session auth…

💡 This is the AI’s “goldfish memory” problem: every new session is a blank slate.

Memory: Giving AI Cross-Session Knowledge

pi-memory is the solution. It gives the AI a “notebook”:

  • Auto-record: Architectural decisions, pitfalls encountered, and consensus reached during the session
  • Auto-load: At the start of each new session, the AI automatically reads key knowledge from before
  • Per-project isolation: Memories from different projects don’t interfere with each other

How It Works

┌─────────────────────────────────────────┐
│           AI Session                      │
│                                          │
│  ┌──────────┐     ┌──────────────────┐  │
│  │ User      │ ──→ │ AI extracts      │  │
│  │ Dialogue  │     │ knowledge points │  │
│  └──────────┘     └────────┬─────────┘  │
│                            │             │
│                            ▼             │
│                   ┌─────────────────┐    │
│                   │  memory_update   │    │
│                   │  Write memory    │    │
│                   │  file            │    │
│                   └─────────────────┘    │
│                                          │
│  ┌──────────┐     ┌──────────────────┐  │
│  │ New      │ ──→ │ before_agent_start│  │
│  │ Session  │     │ Auto-load memory │  │
│  │ Start    │     │ index            │  │
│  └──────────┘     └──────────────────┘  │
└─────────────────────────────────────────┘

Auto-Injection: Memory Index Loaded Every Turn

At the start of each session (the before_agent_start event), pi-memory automatically does the following:

  1. Reads memory-prompt.md — Instructions for using the memory system (tells the AI there’s a memory feature and where the files are)
  2. Reads the MEMORY.md index — Global ~/.pi/agent/memory/MEMORY.md + project .pi/memory/MEMORY.md
  3. Injects into the system prompt — The AI can see all memory titles and keywords right at the start of every turn

This means the AI doesn’t need to actively “look up” memories — the memory index is already in its context. When the AI sees a title like JS_replace_$陷阱, it knows that memory exists and can use the read tool to get the full content when needed.

⚠️ Only the index is injected, not the full content. MEMORY.md only contains titles and keywords, not complete memory content. The AI needs to read a specific file to get the details.

The core structure of memory:

ComponentRole
MEMORY.mdIndex file, lists all memory titles and keywords (auto-injected every turn)
Memory files (.md)One topic per file, contains specific knowledge (read on demand)
memory_update toolAI uses to write/update memory files + auto-update MEMORY.md index
memory_index toolAI uses to view existing memories (manual query)
before_agent_start hookAuto-injects memory index into system prompt at session start

Memory File Format

Each memory file follows a consistent format:

# Title

Keywords: `kw1` `kw2` `kw3` ...

## Content

- Knowledge point 1
- Knowledge point 2
- Decision record: why option A was chosen over option B

File naming convention: topic--keyword1,keyword2,keyword3.md

For example: database-choice--DuckDB,embedded,ARM,column-store,analytical-queries.md

Real-World Case: The First Week of a New Project

Let’s look at a real scenario. Say you’re developing a data analysis tool:

Day 1: Project Initialization

You and the AI discussed tech choices and decided on Python + FastAPI + DuckDB. At the end of the session, the AI automatically wrote a memory:

# Tech Stack Decision

Keywords: `Python` `FastAPI` `DuckDB` `technology-choice`

## Rationale

- FastAPI: Good async support, auto-generates API documentation
- DuckDB: Embedded analytical database, no separate deployment needed, suitable for single-machine analysis
  - Not PostgreSQL: the project doesn't need concurrent writes; embedded is simpler
- Python 3.12+: uses new type syntax

Day 3: Hitting a Pitfall

You ran into an issue with DuckDB’s date handling — the default timezone is UTC, but your users are in China. After the AI helped you solve it, it wrote a memory:

# DuckDB Timezone Issue

Keywords: `DuckDB` `timezone` `date` `UTC` `Asia/Shanghai`

## Problem and Solution

DuckDB uses UTC timezone by default. Running `SELECT NOW()` returns UTC time.

Solution: Set `SET timezone = 'Asia/Shanghai'` when connecting.

Note: Don't do timezone conversion at the SQL level — handle it in the Python layer with datetime for reliability.

Day 7: New Session, No Repetition Needed

You start a new session and say “add a CSV export feature to the analysis API.” The AI already knows:

  • The project uses FastAPI + DuckDB
  • Timezone is Asia/Shanghai
  • Database queries are handled in the Python layer

You don’t need to explain it all again.

This is the value of memory: it saves the 30 minutes you’d otherwise spend re-explaining every time.

Best Practices

✅ What to Remember

  • Technical decisions: Why you chose A over B
  • Pitfalls encountered: The problem and the solution
  • Project conventions: Naming conventions, directory structure, deployment method
  • Architecture knowledge: Module relationships, data flow, key interfaces

❌ What Not to Remember

  • Temporary debugging info (“this variable’s value is 42”)
  • Outdated conclusions (remember to clean up periodically)
  • General programming knowledge (the AI already knows how to write a for loop)

Memory File Management

More memory isn’t always better. The system has multi-layer protection:

ThresholdBehavior
20 filesPrompt to watch the count
25 filesWarning: approaching limit, suggest cleanup/merge
40 filesHard rejection on writes — must clean up first

Cleanup methods:

  1. Merge: Combine multiple files on the same topic into one
  2. Archive: Outdated conclusions replaced by new ones — delete the old
  3. Split: When a single file exceeds 200 lines, split by subtopic

Additionally, the memory system has conflict detection — if a new file has the same topic or overlaps on 3+ keywords with an existing memory, the write is rejected outright, forcing the conflict to be resolved first (merge or overwrite).

Configuration

Install pi-memory via pi’s settings.json:

{
  "packages": [
    "pi-memory"
  ]
}

No additional configuration needed — ready to use on install. Memory files are stored in the project’s .pi/memory/ directory.

Memory Scope

ScopePathUsage
Project-level.pi/memory/Project-specific knowledge (architecture, decisions, pitfalls)
Global~/.pi/agent/memory/Cross-project general knowledge (toolchain, coding discipline)

Advanced Scenarios: Memory Cleanup and Knowledge Evolution

Scenario: Memory Fragmentation

After a month of use, memory files pile up:

.pi/memory/
├── database-choice--DuckDB,embedded,ARM.md
├── db-choice--database,DuckDB,performance.md       ← Duplicate of above!
├── deployment-issue--Docker,ARM,memory.md
├── deployment-issue2--Docker,memory,OOM.md         ← Also duplicate!
├── auth-bug--JWT,expiry,refresh.md
├── fastapi-cors--CORS,FastAPI,cross-origin.md
├── test-tricks--vitest,mock,testing.md
├── ... (20 more files)

Before each write, the AI automatically runs memory_index to check. If it finds existing memories on the same topic, it merges first before writing. But if fragmentation has already happened, manual cleanup is needed.

Cleanup steps:

  1. Ask the AI to run memory_index to view all current memories
  2. Mark files on the same topic (3+ overlapping keywords)
  3. Ask the AI to read the marked files and merge them into one
  4. Use memory_update to overwrite an existing filename (not a new one, or conflict detection will reject the write)
  5. Delete the old fragmented files

Scenario: Old Conclusions Overturned

A memory written last month says “use Express.js for the backend,” but this month the project decided to migrate to FastAPI. Every time the AI reads the old memory, it thinks in Express terms and gives bad suggestions.

Solution: Use memory_update to overwrite the old file with the new conclusion:

# Backend Framework Decision

Keywords: `FastAPI` `Python` `migration`

## Current Decision

2026-05: Migrated from Express.js to FastAPI.

## Migration Reasons

- Needed better async support
- Python's data analysis ecosystem is richer
- Express.js version has been archived and is no longer updated

> ⚠️ Old conclusion (deprecated): Use Express.js + TypeScript

Key point: Don’t just delete the old file — clearly document both the new conclusion and how it relates to the old one, so the AI doesn’t “reinvent” the old approach in other contexts.

Scenario: Cross-Project Knowledge Reuse

You hit a pitfall in project A and want to make sure project B doesn’t repeat the same mistake.

Solution: Write the general knowledge to global memory:

~/.pi/agent/memory/
└── npm-file-ref-traps--npm,file-ref,node_modules,cache.md

Global memory is visible to all projects. This way, no matter which project you’re in, the AI will know “npm file: references have cache traps.”

Principle:

  • Project-specific knowledge (this project’s architecture, conventions) → project-level .pi/memory/
  • General experience (toolchain pitfalls, coding discipline) → global ~/.pi/agent/memory/

Next Steps

Now the AI has memory and can retain knowledge across sessions. But when faced with a large project, does it know what to do first, and what to do next?

In the next chapter, we’ll look at how to teach the AI planning.

From Memory to Planning

You’ve Probably Encountered This

You ask an AI to do a large task like “migrate the project from JavaScript to TypeScript.”

Everything goes well for the first 30 minutes — the AI migrates configuration files, type definitions, and core modules as planned. But by the 5th file, the AI starts to “drift”:

  • It forgets earlier conventions and starts using a different naming style
  • It skips migrating test files
  • It begins a “refactor while you’re at it” that you never asked for
  • When you remind it to get back on track, it can’t remember what the first 3 steps of the original plan were

💡 Goldfish memory is solved, but there’s still an “attention deficit” problem: The AI can remember knowledge, but it can’t manage tasks.

Roadmap: Teaching AI to Manage Complex Tasks

pi-roadmap gives the AI a “project management brain”:

  • Structured breakdown: Decomposes large goals into Epic → Story → Task three layers
  • Progress tracking: Each task has a clear status (todo / doing / done / blocked)
  • Persistence: Roadmaps are saved in files, so new sessions can continue where you left off
  • Priority sorting: Automatically recommends what to do next

Why a Three-Layer Structure?

Epic (Big direction)
 └── Story (Deliverable work chunk)
      └── Task (Smallest executable unit)

This structure comes from agile development practices, with some adaptations:

ConceptTraditional Agilepi-roadmapRationale
Epic2-8 weeksA complete project directionAI sessions don’t span weeks
Story1-3 daysCompletable in 1-3 sessionsAdapted to AI’s working rhythm
Task0.5-1 day30 min - 2 hoursGranularity AI can focus on at once

How It Works

User describes goal
     │
     ▼
┌──────────────────────────────────┐
│         roadmap_plan             │
│  AI analyzes goal → breaks into  │
│  three-layer structure           │
│  compares with existing roadmap  │
│  → incremental update            │
└──────────────┬───────────────────┘
               │
               ▼
┌──────────────────────────────────┐
│  ~/.pi/roadmap/<id>.roadmap.json │
│  Global storage, cross-session   │
│  and cross-project access        │
│  + Project-level                 │
│    .pi/roadmap/roadmap.json      │
│    (auto-synced derivation)      │
└──────────────┬───────────────────┘
               │
     ┌─────────┼─────────┐
     ▼         ▼         ▼
 roadmap_list  roadmap_show  roadmap_next
  List all     Show detail   Recommend next

Real-World Example: Migrating a Multi-Package Project

Let’s look at a real scenario — upgrading documentation across 12 npm packages simultaneously:

Step 1: Create a Roadmap

You say: “Help me plan documentation work for all packages.”

The AI calls roadmap_plan, which automatically decomposes:

{
  "roadmapId": "package-docs",
  "title": "Package Documentation Upgrade",
  "epics": [
    {
      "id": "E0",
      "title": "Define template and validate",
      "stories": [
        {
          "id": "E0.S0",
          "title": "Analyze best practices, distill template",
          "tasks": [
            { "id": "E0.S0.T0", "title": "Research GitHub documentation standards" },
            { "id": "E0.S0.T1", "title": "Distill README template" },
            { "id": "E0.S0.T2", "title": "Validate template with first package" }
          ]
        }
      ]
    },
    {
      "id": "E1",
      "title": "Batch upgrade all packages",
      "stories": [
        { "id": "E1.S0", "title": "Core extensions (4 packages)" },
        { "id": "E1.S1", "title": "Tool extensions (4 packages)" },
        { "id": "E1.S2", "title": "Utility extensions (4 packages)" }
      ]
    }
  ]
}

Step 2: Advance According to Plan

In each new session, you say “continue.” The AI calls roadmap_next:

📊 Recommended next task:

E0.S0.T1 — Distill README template (high priority)
  Part of: E0 Define template and validate > S0 Analyze best practices

Start?

Step 3: Mark Completion

After completing a task, the AI calls roadmap_done:

✅ E0.S0.T1 Completed
    Output: templates/README-template.md

Step 4: Encountering Blockers

If source code for a package is missing, the AI can mark the task as blocked:

⚠️ E1.S1.T3 Blocked
    Reason: pi-journal's API documentation is incomplete; source code comments need to be added first

Roadmap vs Memory: What’s the Relationship?

DimensionMemory (pi-memory)Roadmap (pi-roadmap)
What it storesKnowledge, decisions, pitfallsTasks, progress, plans
GranularityFree-form textStructured JSON
Query methodKeywordsStatus / priority
LifecycleLong-term retentionCan be archived when project ends

Simply put:

  • Memory is “what I know”
  • Roadmap is “what I need to do and how far along I am”

The two complement each other: memory helps the AI remember knowledge, the roadmap helps the AI remember tasks.

Best Practices

✅ Good Epic Breakdown

Epic: Publish npm package
  Story: Prepare release environment
    Task: Configure package.json exports field
    Task: Add bundledDependencies configuration
    Task: Configure tsconfig declaration output
  Story: Write documentation
    Task: Complete README.md
    Task: Add CHANGELOG.md

❌ Bad Breakdown

Epic: Do everything                     ← Too vague, no direction
  Story: Do the first step              ← Doesn't say what to do
    Task: Start working                  ← Not actionable

Golden Rules of Breakdown

  1. Epic titles should be verb phrases: “Publish npm package” instead of “npm”
  2. Stories should have clear deliverables: “Complete README” instead of “Write docs”
  3. Tasks should be executable within 30 minutes: “Configure package.json name field” instead of “Configure build”
  4. Items at the same level should have the same granularity: Don’t have one Story with 2 tasks and another with 20

Advanced Scenarios: Plan Adjustment & Progress Tracking

Scenario: When Direction Needs to Change

Plans change. The roadmap you laid out yesterday may no longer fit today’s requirements. You don’t need to start over — just update with roadmap_plan:

You: Yesterday's refactoring plan is too big. I want to start with just the auth module.

AI calls roadmap_plan(action="update"):
  → Compares current roadmap with your new requirements
  → Keeps completed tasks untouched
  → Marks unnecessary tasks as dropped
  → Adds new tasks

Key principle: roadmap_plan is incremental, not overwriting. Tasks already marked done are never rolled back.

Scenario: Tracking Who Did What

In multi-session collaboration, you often wonder “which session completed this task?” The roadmap tracks this automatically:

roadmap_show(roadmapId="package-docs")

Result:
  E0.S0.T0 Research GitHub documentation standards ✅ by: 8740-8fce3e7af232
  E0.S0.T1 Distill README template ✅ by: b8b5-85516ead6253
  E0.S0.T2 Validate template with first package ✅ by: b8b5-85516ead6253
  E1.S0.T0 Core extension - pi-shepherd 🔄 doing by: aa55-a4860e851afb

The by: xxxx-xxxxxxxxxxxx suffix after each completed task is the short form of the session ID (last two segments of the UUID). You can use this ID to search for the specific session:

session_search(action="grep", query="8740-8fce3e7af232")

→ Find the session, then use session_analyze(action="summary") to view details

Scenario: Archiving Completed Epics

When a project is finished, you don’t want completed Epics cluttering your view:

roadmap_archive(roadmapId="package-docs")

→ Auto-archives all completed Epics
→ Hidden by default, view with show_archived=true

Scenario: Not Sure What to Do Next

When you open pi and have no idea what to continue:

roadmap_next()

Result:
  📊 Recommended next task (sorted by priority):
  
  1. E1.S0.T3 — Configure package.json files whitelist (high, todo)
  2. E1.S1.T0 — Tool extension - pi-roadmap (medium, todo)
  3. E2.S0.T0 — Research mdBook theme customization (low, todo)

roadmap_next automatically sorts by doing → todo, high → medium → low, telling you exactly what deserves your attention.

When you have many roadmaps, roadmap_search lets you quickly find tasks by keyword, searching across title + description + note:

You: Have we planned anything related to CI?

AI calls roadmap_search(query="CI")

→ Found matches across roadmaps, with full hierarchical context

Task Dependencies: dependsOn

Complex projects often have sequential dependencies — B can only start after A is done. dependsOn lets you declare dependencies on Stories and Tasks:

{
  "title": "Configure CI pipeline",
  "status": "todo",
  "dependsOn": ["E1.S1.T3"]
}

roadmap_show automatically displays dependency relationships:

E1.S2 Publish workflow
  E1.S2.T1 Configure CI pipeline 📋 depends: [E1.S1.T3]
  E1.S2.T2 Publish to npm registry 📋 depends: [E1.S2.T1]

📖 For adding and updating dependencies, see pi-roadmap README.

Protection Mechanisms

  • Cannot add child items to archived/completed Epics/Stories — prevents appending tasks to already completed work
  • Duplicate ID checkingroadmap_update verifies that dependsOn IDs actually exist

Next Steps

The AI now has memory (to remember knowledge) and a roadmap (to manage tasks). But sometimes, the AI still “makes mistakes” — modifying files it shouldn’t, using approaches it shouldn’t.

In the next chapter, we’ll look at how to set rules for the AI.

Setting Rules for AI

You’ve Probably Seen This Before

You ask the AI to “fix the login page styles.” 30 seconds later you check the code —

The AI didn’t just fix the styles. It also:

  • “Conveniently” refactored the entire login component directory structure
  • Switched CSS modules to Tailwind (your project doesn’t use Tailwind)
  • Deleted 3 test files it deemed “unnecessary”
  • Upgraded all dependencies in package.json to the latest versions

By the time you notice, the code has already been committed.

💡 The more capable the AI, the more it needs rules. Without boundaries, greater capability only causes greater damage.

Two Lines of Defense: Shepherd and Context

pi-atelier provides two layers of protection:

First Line: pi-shepherd — The Behavior Guard

Shepherd is a rule-driven event hook engine that checks AI actions before and after key moments — think of it as a security guard.

AI about to execute an action (tool call)
     │
     ▼
┌──────────────────────────────────┐
│     Shepherd tool_call hook      │
│   Check: Should it be done?      │
│          How should it be done?  │
└──────┬───────────────────────────┘
       │
   ┌───┴────┐
   │        │
  Allow   Rewrite/Block + Show Reason

... tool executes ...

┌──────────────────────────────────┐
│    Shepherd tool_result hook     │
│    Check: Any follow-up needed?  │
└──────┬───────────────────────────┘
       │
   Inject reminder / Append action

Supported hook timings:

HookWhen It TriggersTypical Use Case
tool_callBefore AI calls a toolRewrite commands, block dangerous operations
tool_resultAfter tool executionAuto-remind to run tests, lint checks
message_endAfter AI finishes replyingMatch AI response text, intercept wrong guesses
agent_endWhen AI finishes a conversationRemind to commit code, update memory
session_shutdownWhen a session closesClean up temporary data

Shepherd’s four actions:

ActionEffectTypical Use Case
blockPrevents tool executionBlock dangerous operations
notifyInjects a reminder into AI context“You edited a TS file, remember to run tests”
steerSilently injects guidance (not visible to user)Guide the AI to consult documentation
rewriteModifies tool call parametersAuto-prepend prefix to commands

Second Line: pi-context-manager — Information Quality & Diagnostics

Context Manager controls what information the AI sees, and also helps you diagnose token consumption issues.

Core capabilities:

  • Distill: Automatically compresses large tool outputs, preserving key information
  • Tool Result Processor: Formats and simplifies output from specific tools
  • Aging: Automatically evicts old tool outputs that haven’t been referenced in a while
  • Payload Analysis: Diagnoses where tokens are being spent with data
Tool returns large output (potentially 50KB)
     │
     ▼
┌────────────────────────┐
│   Context Manager       │
│   Distill + Processor   │
│   Compress to ~5KB      │
│   key information       │
└────────────────────────┘
     │
     ▼
AI sees refined information and makes better decisions

For detailed principles, see 3.3 Context Manager Deep Dive.

Real-World Examples: Preventing AI Mistakes

Scenario 1: Auto-Remind to Run Tests After Edit

{
	"comment": "[TypeScript] Must run tests after editing",
	"hook": "tool_result",
	"tool": "edit",
	"action": "notify",
	"conditions": [
		{ "field": "path", "pattern": "\\.ts$", "flags": "" }
	],
	"reason": "Edited a TypeScript file. You must run unit tests covering this code (add tests if none exist) and fix all test issues to ensure they pass.",
	"enabled": true
}

When the AI edits a .ts file, Shepherd automatically reminds the AI to run tests.

Scenario 2: Session-End Reminder to Commit Code

{
	"comment": "[Wrap-up] Remind to commit + update memory + summary after edits",
	"hook": "agent_end",
	"action": "notify",
	"conditions": [{ "builtin": "has_edits" }],
	"reason": "Detected file edits. Perform wrap-up:\n1️⃣ Git commit...\n2️⃣ Update memory...\n3️⃣ Session summary",
	"stopReason": ["stop"],
	"enabled": true
}

conditions: [{ builtin: "has_edits" }] means it only triggers when the session actually edited files. stopReason: ["stop"] means it only triggers when the AI ends normally (not when interrupted).

Scenario 3: Auto-Rewrite Commands

{
	"comment": "[rtk] Auto-proxy frequent bash commands",
	"tool": "bash",
	"action": "rewrite",
	"pattern": "^(git\\s+(status|log|diff)|cargo\\s+(test|build|clippy)|pytest)\\b",
	"flags": "",
	"reason": "rtk command rewrite: auto prepend rtk prefix to compress output",
	"enabled": true
}

When the AI tries to run commands like git status, Shepherd automatically rewrites it as rtk git status (rtk is an output compression tool).

Scenario 4: Code Style Check

{
	"comment": "[TS] No space indentation - TS files must use Tab",
	"hook": "tool_call",
	"tool": "edit",
	"action": "notify",
	"conditions": [
		{ "field": "path", "pattern": "\\.ts$", "flags": "" },
		{ "field": "text", "pattern": "\\n  [\\S ]", "flags": "" }
	],
	"reason": "❌ TS files require Tab indentation, not spaces. Please rewrite the code using Tab indentation.",
	"enabled": true
}

Both conditions must be met to trigger: the file is .ts and the code contains space indentation.

Scenario 5: Remind to Check Memory After Repeated Errors

{
	"comment": "[debug] Remind to check memory when tools repeatedly fail",
	"hook": "tool_result",
	"action": "steer",
	"state": { "countKind": "errors", "gte": 5 },
	"reason": "🔍 **Tools repeatedly failing**: Multiple consecutive failures. Check memory files under .pi/memory/ to see if there are existing records of this pitfall.",
	"enabled": true,
	"subagent": false
}

state implements state tracking — Shepherd remembers the error count and only triggers when it reaches the threshold. subagent: false means this rule does not trigger in sub-agents.

Shepherd Rule Configuration Reference

Rule File Locations

LevelPathDescription
Global defaultrules.json inside the extension packageBuilt-in rule set for Shepherd
Global custom~/.pi/agent/extensions/shepherd/rules.jsonGlobal custom rules (managed via the shepherd_rules tool)
Project-level.pi/shepherd-rules-*.json (project root)Custom project rules, can create multiple files

Rule file changes take effect immediately — no action needed.

Shepherd Rule Management

Rules can be managed in two ways:

  • shepherd_rules tool: Have AI safely add, update, or delete rules — includes write validation and rollback
  • Direct JSON editing: Global rules at ~/.pi/agent/extensions/shepherd/rules.json, project rules at .pi/extensions/shepherd-rules.json

Rule files take effect immediately after modification, no restart needed.

📖 For complete field reference and parameter docs, see pi-shepherd README.

Typical usage:

# Ask AI to add a rule for you
You: Add a rule to run cargo clippy after editing .rs files

AI calls shepherd_rules(action="add", rule={...})

# View project-level rules
AI calls shepherd_rules(action="list", scope="project")

Configuring Context

pi-context-manager provides the following commands:

CommandPurpose
/record [on|off]Toggle payload recording
/contextTUI panel: visualize context usage
/distill-config [N]View/set distill token threshold
/distill-config --cap [N]View/set first-seen full text cap (firstSeenCap, 0 = no cap)
/processor-config [N|off]View/set tool-result-processor threshold
/aging-config [N|off]View/set aging eviction rounds
/context-clean [sessionId]Clean up persistent data

💡 All commands show current config and usage when called without arguments. For example, entering /distill-config directly displays the current threshold and usage instructions.

For detailed principles, see 3.3 Context Manager Deep Dive.

Best Practices

✅ Good Rule Design

  • Precise conditions: Use conditions to narrow the trigger scope — don’t use a sledgehammer
  • Clear messaging: Tell the AI “why it’s not allowed” and “what to do instead”
  • Layered protection: Use block (enforced) for important matters, notify (advisory) for minor ones, steer (silent) for internal guidance
  • Make use of state tracking: Reminding after 3 consecutive errors is more effective than reminding every single time

❌ Bad Rule Design

  • Too frequent: notify on every tool call — the AI would be flooded with reminders
  • Too draconian: deny all bash commands — the AI can’t even run ls
  • Vague messaging: "reason": "Caution" — caution about what?
  • Ignoring sub-agents: Some rules should use "subagent": false to exclude sub-agent scenarios and avoid interfering with independent tasks

Rule Priority

When multiple rules match simultaneously:

  1. block > notify > steer (block > remind > silent guidance)
  2. At the same priority, rules execute in definition order within the rule file
  3. In the agent_end hook, rules whose check condition is not met are skipped

Next Up

With memory, planning, and rules in place, the AI is already a reliable assistant. But after a session accomplishes many things — how do you know exactly what it did? Which files were changed? What decisions were made?

In the next chapter, we’ll look at how to teach the AI to review its own work.

3.2 How pi-shepherd Works: A Rule-Driven Hook System

Shepherd is the “nervous system” of pi-atelier — it doesn’t provide tools or commands directly, but connects all other extensions through event hooks.

Architecture Overview

pi event bus
     │
     ├─ before_provider_request  ← Shepherd injects ephemeral hints here
     │
     ├─ tool_call                ← Shepherd intercepts/rewrites tool calls
     │      │
     │      ▼
     │   Tool executes
     │      │
     │      ▼
     ├─ tool_result              ← Shepherd checks results, triggers follow-up actions
     │
     ├─ agent_end                ← Shepherd triggers wrap-up actions
     │
     └─ session_shutdown         ← Shepherd cleans up ephemeral state

Core Concepts

Rule

Each rule is a JSON object that defines “when to trigger, under what conditions, and what action to take”:

Rule = Hook timing(hook) + Match conditions(conditions/pattern) + Action(action) + Prompt(reason)

Action Types in Detail

ActionInjection MethodUser VisibleTypical Use Case
notifyInjects into AI context✅ YesRemind AI to run tests, lint
steerSilent injection❌ NoGuide AI to consult documentation
rewriteModifies tool parameters✅ YesAuto-prepend prefix to commands
blockPrevents execution✅ YesBlock dangerous operations

State Tracking

Shepherd maintains internal state counters for tool calls:

"state": { "countKind": "errors", "gte": 5 }

This means “trigger when cumulative errors ≥ 5 times.” countKind supports:

  • "errors": Counts when a tool returns an error
  • "calls": Counts when a tool is called

message_end Hook

message_end is a special hook timing: it triggers after the AI finishes its reply, matching the AI’s output text rather than tool parameters. This lets Shepherd “listen” to what the AI says and inject corrections when it spots problems.

AI reply completed
     │
     ▼
┌──────────────────────────────────────┐
│  Shepherd message_end hook            │
│  Regex match on AI reply text         │
└──────┬───────────────────────────────┘
       │
   Matched?
   ├── Yes → steer: silently inject correction (next turn)
   └── No → no-op

Difference from tool_call/tool_result:

Dimensiontool_call / tool_resultmessage_end
Match targetTool parameters (commands, file paths)AI’s reply text
conditions[].fieldpath or text (tool params)text (AI reply)
Typical actionsblock/notify/rewrite/steerUsually steer only
Typical useBlock dangerous commands, post-edit remindersIntercept wrong guesses, guide corrections

Cross-Extension Communication

Shepherd receives “hints” from other extensions via the pi.events event bus:

Other extension emits hint → pi.events.emit("ephemeral:hint") → Shepherd collects
                                                                          │
At before_provider_request → Shepherd injects collected hints into AI context

This mechanism allows extensions to collaborate without direct dependencies on each other.

Rule Loading Flow

1. Load rules.json inside the extension package (global default rules)
     │
     ▼
2. Scan project directory for .pi/shepherd-rules-*.json (project rules)
     │
     ▼
3. Rules stack and take effect (project rules override global rules with the same name)

Rule file changes take effect immediately — no action needed.

Editing Rules with shepherd_rules

The shepherd_rules tool provides safe rule editing with built-in validation:

# List all rules (global + project merged)
shepherd_rules(action="list")

# Add a global rule
shepherd_rules(action="add", rule={...})

# Add a project-level rule
shepherd_rules(action="add", scope="project", rule={...})

# Update rule #2 in global rules
shepherd_rules(action="update", index=2, changes={"action": "block"})

# Delete rule #0 in project rules
shepherd_rules(action="delete", scope="project", index=0)

scope parameter:

  • global (default): Operate on ~/.pi/agent/extensions/shepherd/rules.json
  • project: Operate on <cwd>/.pi/extensions/shepherd-rules.json
  • list without scope: Returns merged view from both levels

Safety features:

  • Validates required fields and regex patterns before writing
  • Reads back after write to verify
  • Auto-restore from backup on failure
  • Same-signature rules (tool + hook + pattern/check + action) auto-overwrite instead of appending

Configuration

Shepherd supports three-layer config merging (defaults → global settings → project settings). Override in .pi/settings.json as needed.

📖 For complete configuration parameters, see pi-shepherd README.

Next Up

Now that you understand how Shepherd guards AI behavior, the next section covers how Context Manager controls information quality.

3.3 pi-context-manager: Information Quality Control & Token Diagnostics

pi-context-manager is the merger of the original pi-context and pi-payload-analyzer, providing unified management of context quality and token diagnostics.

Three Core Capabilities

1. Distill: Compressing the Flood of Tool Output

When a tool returns a large amount of content (e.g., reading a 1000-line file), pi-context-manager automatically compresses it, keeping only essential information:

Raw tool output (50KB)
     │
     ▼
┌────────────────────────┐
│   Distill Processor     │
│   Extract key lines +   │
│   summary               │
└────────────────────────┘
     │
     ▼
Compressed output (~5KB)
     │
     ▼
AI sees refined information

Distill is enabled by default. Two key parameters:

ConfigCommandDescription
distillThreshold/distill-configTool outputs exceeding this token count will be compressed
firstSeenCap/distill-config --capMaximum token cap for first-encountered tool output (0 = no limit)

💡 Purpose of firstSeenCap: Some tools return massive results on first use (e.g., ls listing a large directory), but you don’t need all of it. firstSeenCap limits the initial output size; subsequent requests may further compress the result through distill.

2. Tool Result Processor: Smart Formatting & Trimming

The Tool Result Processor performs structured trimming on specific tool outputs, more precise than distill:

  • Code Graph output trimming: Auto-compresses AST search results, preserving only key signatures and locations
  • MCP JSON output trimming: Compresses verbose JSON returned by MCP tools
  • Error output trimming: Truncates overly long error stack traces
  • Web search output trimming: Keeps only key information from search results

Use the /processor-config command to view or adjust processing thresholds.

3. Aging: Phasing Out Stale Content

In long sessions, early tool outputs may no longer be relevant. The Aging mechanism automatically phases out “outdated” content:

Round 1:  Tool Output A (fresh 🟢)
Round 5:  Tool Output A (a bit old 🟡)
Round 10: Tool Output A (too old 🔴 → auto-deleted)

Aging Smart Exemptions: Certain content types are protected from aging:

  • Skill files (SKILL.md) content
  • User-flagged content
  • Content most recently referenced by the AI

Use /aging-config to set the eviction round count, or /aging-config off to disable.

4. Payload Analysis: Diagnosing Context Issues with Data

Is the AI getting dumber as the session grows long? Use payload_analyze to find out.

⚠️ Important: payload_analyze is an AI tool, not a terminal command. You ask the AI to run it in your pi chat. For example:

Help me check the current token usage with payload_analyze

Or more precisely:

Run payload_analyze action="budget"
Analysis ModeHow to Ask the AIWhat It Shows
budget“Analyze token budget distribution”Token ratio of system/tools/history sections
growth“Show token growth trend”How tokens expand over the course of a session
expensive“Find the most token-hungry tool calls”Top N most expensive tool calls
overview“Detailed payload analysis”Per-message token breakdown
messages“View message #5”Pinpoint messages by index/range/keywords
chain“Trace this tool call”Track a single tool call across payloads
diff“Compare two payloads”Find differences between two requests
stats“Show distill/processor hit rate”Aggregate compression efficiency statistics

💡 Start with budget, then dive deeper: When facing context issues, first use budget for an overview, then expensive to pinpoint the heavy hitters, and finally messages to examine a specific message.

/context TUI Panel

pi-context-manager also provides a TUI (Terminal User Interface) panel for visually browsing context content:

/context command
     │
     ▼
┌─────────────────────────────────────┐
│  📊 Context Panel                    │
│                                      │
│  [Categories] [Tool Details]         │
│  [Mark for Deletion]                 │
│                                      │
│  ├─ System Prompt    4.2K tokens     │
│  ├─ Tool Definitions 8.1K tokens     │
│  ├─ Memory           2.3K tokens     │
│  ├─ History          52K tokens      │
│  │   ├─ Rounds 1-10  (marked delete) │
│  │   ├─ Rounds 11-20                 │
│  │   └─ Rounds 21-30                 │
│  └─ Tool Results     64K tokens      │
│      ├─ read(schema.ts)  8.2K 🔴     │
│      └─ grep("TODO")    4.1K 🟡     │
└─────────────────────────────────────┘

In the panel you can:

  • Browse by category: View context content by type
  • Tool details: See full content returned by each tool
  • Mark for deletion: Manually flag unwanted content for exclusion in the next request

Complete Command Reference

CommandPurposeBehavior without args
/record [on|off]Toggle payload recordingToggle on/off
/contextOpen TUI visualization panel
/distill-config [N]View/set distill thresholdShow current config + usage
/distill-config --cap [N]View/set firstSeenCapShow current config + usage
/processor-config [N|off]View/set processor thresholdShow current config + usage
/aging-config [N|off]View/set aging round countShow current config + usage
/context-clean [sessionId]Clean persistent dataClean all data

Best Practices

Issue You’re FacingFirst StepNext StepSolution
AI gets dumber after 30 roundspayload_analyze(action="growth")Check which phase tokens spikeLower distill threshold / install smart compact
AI ignores certain file contentCheck distill configMay be over-compressed by distillAdjust distillThreshold
Every tool call is painfully slowpayload_analyze(action="expensive")Find the most expensive callsLimit large file reads or split files
Old tool outputs consume spaceRun /aging-configSet appropriate eviction roundsAging auto-evicition + manual /context panel cleanup
First tool output is too largeSet /distill-config --capLimit initial full-text outputfirstSeenCap limits first output size

Next Steps

In the next chapter, we’ll explore how to teach the AI to review — automatically record session events and revisit history at any time.

Original: /home/lain/.pi/agent/distill/processor/read-b63ebc90-1779883939893.txt

3.5 Shepherd in Practice: Real-World Scenarios

This section demonstrates how to use Shepherd rules to solve common problems in AI-assisted coding through real-world scenarios.

Scenario 1: Auto-Prompt for Running Tests After Code Edits

Problem

The AI modifies TypeScript code but forgets to run tests. You have to manually say “run the tests” every time.

Rule

{
	"comment": "[TypeScript] Must run tests after edits",
	"hook": "tool_result",
	"tool": "edit",
	"action": "notify",
	"conditions": [
		{ "field": "path", "pattern": "\\.ts$", "flags": "" }
	],
	"reason": "You edited a TypeScript file. You must run unit tests covering the code (add tests first if none exist) and fix all issues to ensure they pass.",
	"enabled": true
}

Effect

AI: I modified the null-check logic in src/auth/login.ts.
🛡️ Shepherd reminds: You edited a TypeScript file. You must run unit tests covering the code.
AI: Got it, let me run the tests... ✅ All 3 tests pass.

Scenario 2: Preventing the AI from Messing with Others’ Code

Problem

You’re working in a team project. A colleague has uncommitted changes in the workspace. The AI sees “something wrong here” and casually runs git checkout to restore their files.

Rule

{
	"comment": "[Safety] Block git checkout -- to restore files",
	"hook": "tool_call",
	"tool": "bash",
	"action": "block",
	"conditions": [
		{ "field": "text", "pattern": "git\\s+checkout\\s+--", "flags": "" }
	],
	"reason": "❌ Blocked: git checkout -- to restore files! There are uncommitted changes from others in the workspace — you don't have the authority to decide which changes are 'unrelated'.",
	"enabled": true
}

Effect

AI prepares to run: git checkout -- src/config.ts
🛡️ Shepherd blocks: git checkout -- to restore files is not allowed!
AI: Sorry, I won't restore other people's files. Let me find another approach...

Scenario 3: Auto-Commit Code at Session End

Problem

The AI modified a bunch of files, the session ends, but the code isn’t committed. The next day, the workspace is a mess.

Rule

{
	"comment": "[Wrap-up] Prompt for commit + memory update + summary after edits",
	"hook": "agent_end",
	"action": "notify",
	"conditions": [{ "builtin": "has_edits" }],
	"reason": "File edits detected. Perform wrap-up tasks:\n1️⃣ Git commit...\n2️⃣ Update memories...\n3️⃣ Session summary",
	"stopReason": ["stop"],
	"enabled": true
}

check: "has_edits" ensures the notification only triggers when files were actually modified, avoiding interference in pure chat sessions. stopReason: ["stop"] ensures it only fires on normal termination, not interruptions.

Scenario 4: Auto-Prompt for Architecture Check After .gd File Edits

Problem

You’re working on a Godot game project. After the AI edits .gd files, it should run architecture checks and formatting checks — but you have to remind it manually every time.

Rule (multiple rules can be defined for the same file, executed in order)

{
	"comment": "[arch] Prompt for compilation validation after .gd edits",
	"hook": "tool_result",
	"tool": "edit",
	"action": "notify",
	"conditions": [
		{ "field": "path", "pattern": "\\.gd$", "flags": "" }
	],
	"reason": "You edited a .gd file. Please run check_arch to verify architecture compliance.",
	"enabled": true
},
{
	"comment": "[format] Prompt for formatting check after .gd edits",
	"hook": "tool_result",
	"tool": "edit",
	"action": "notify",
	"conditions": [
		{ "field": "path", "pattern": "\\.gd$", "flags": "" }
	],
	"reason": "You edited a .gd file. Please run gdformat for formatting checks.",
	"enabled": true
}

Both rules will fire, and the AI will run the architecture check followed by the formatting check.

Scenario 5: Auto-Prompt to Check Memory on Repeated Tool Errors

Problem

The AI keeps hitting errors — edit match failures, missing bash commands, tests failing repeatedly. It’s circling in the same dead end.

Rule

{
	"comment": "[debug] Prompt to check memory on repeated tool errors",
	"hook": "tool_result",
	"action": "steer",
	"state": { "countKind": "errors", "gte": 5 },
	"reason": "🔍 **Repeated tool errors**: Failed multiple times consecutively. Check the memory files under .pi/memory/ directory for existing troubleshooting records.",
	"enabled": true,
	"subagent": false
}

Key points:

  • state: { "countKind": "errors", "gte": 5 } — only triggers after 5 consecutive errors, won’t bother you every time
  • action: "steer" — silently injects guidance, invisible to the user interface
  • subagent: false — won’t fire in sub-agents, avoiding interference with independent tasks

Effect

AI tries edit, fails...
AI tries edit, fails...
AI tries bash sed, fails...
AI tries edit, fails...
AI tries edit, fails...
🛡️ Shepherd silently guides: Check the memory files.
AI: Let me check the memories... Found it! The memory file says "when edit match fails, first check for CRLF".
AI: Running audit_format.py to check format... It is indeed a CRLF issue.

Scenario 6: Auto-Rewriting High-Frequency Commands

Problem

The AI frequently runs commands like git status, git log, npm test, etc. Their output can be lengthy, wasting tokens.

Rule

{
	"comment": "[rtk] Auto-proxy frequent bash commands",
	"tool": "bash",
	"action": "rewrite",
	"pattern": "^(git\\s+(status|log|diff)|cargo\\s+(test|build|clippy)|pytest)\\b",
	"flags": "",
	"reason": "rtk command rewrite: auto-prepend rtk prefix to compress output",
	"enabled": true
}

When the AI runs git status, Shepherd automatically rewrites it to rtk git status (rtk is an output compression tool), reducing token consumption. The AI doesn’t need to know about this rewrite — to it, the result just looks cleaner.

Scenario 7: Intercept AI’s Wrong Attribution Guesses (message_end)

Problem: When encountering errors, the AI sometimes attributes failures to toolchain issues (“jiti cache problem”, “vitest proxy issue”) instead of its own code bugs. This wastes debugging time and misleads the investigation.

Shepherd Solution: Use message_end hook to match patterns in AI’s reply and inject corrections.

{
	"comment": "[message_end] Intercept wrong toolchain blame",
	"hook": "message_end",
	"action": "steer",
	"conditions": [{
		"field": "text",
		"operator": "matches",
		"value": "jiti.*缓存|缓存.*jiti|vitest.*proxy"
	}],
	"reason": "Don't blame toolchain — check your own code first. 90% of the time it's a code bug, not a cache/proxy/module resolution issue.",
	"enabled": true
}

When the AI mentions “jiti cache” or “vitest proxy” in its reply, Shepherd silently injects guidance to redirect it toward checking its own code.

Rule Design Pattern Summary

PatternActionUse Case
Post-edit remindernotify + conditionsRun tests, lint, formatting after code changes
Dangerous operation blockblock + conditionsBlock git checkout --, prevent file deletion
Session wrap-up automationagent_end + checkAuto-commit + memory update at session end
Repeated error guidancesteer + stateGuide the AI to check memories when it keeps failing
Wrong guess interceptionmessage_end + steerIntercept AI’s wrong attribution guesses
Command rewritingrewrite + patternAuto-prepend prefix to compress command output

📖 Return to 3.1 Setting Rules for AI for the complete rule field reference.

Original: /home/lain/.pi/agent/distill/processor/read-b63ebc90-1779883939894.txt

Teaching the AI to Review

You’ve Probably Experienced This

You had a 3-hour session where the AI helped you do a lot:

  • Fixed two bugs
  • Refactored a module
  • Set up the CI pipeline
  • Wrote a bunch of tests

The next day you want to review: “How exactly did I fix that login bug yesterday?” But all you have is a vague memory — was it auth.ts or middleware.ts? Did you add a null check or change a type assertion?

You dig through the git log, and the commit message reads “fix: update auth” — which is as good as nothing.

💡 The AI did a lot, but nobody recorded the “why”. Git only tracks what changed, not the thought process behind it.

Core Tool: pi-session-analyzer

pi-atelier provides Session Analyzer to search and analyze historical sessions:

FeatureDescription
Cross-session searchSearch all historical sessions by keyword
Search by fileFind all sessions that modified a specific file
Timeline viewView the complete flow of a session chronologically
Summary generationAuto-summarize what a session accomplished
Branch analysisAnalyze parallel branches created by /tree
Takeover report5-dimensional context to help the AI quickly resume work
Audit checkCheck for rule violations in a session

💡 About pi-journal: pi-journal can generate daily/weekly reports via the /journal command or journal tool, aggregating git activity, memory changes, and session activity from three data sources. See 4.2 pi-journal Principles for details.

Real-World Example: Finding Last Week’s Bug Fix

You say: “Help me find the session where I fixed the DuckDB timezone issue.”

The AI invokes session_search:

🔍 Searching for "DuckDB timezone" — found 2 matching sessions:

1. 05-22 19:36 — DuckDB timezone config fix
   Modified src/db/connection.ts
   
2. 05-20 14:30 — Database initialization discussion
   Discussed timezone strategy but didn't change code

Scenario: Viewing Session Details

You say: “Show me the detailed process of the first session.”

The AI invokes session_analyze, generating a timeline:

📅 Session Timeline (05-22 19:36)

[19:36] User: DuckDB queries return UTC time, need to switch to China timezone
[19:37] AI: Read db/connection.ts
[19:38] AI: Found no timezone parameter set in the connection
[19:39] AI: Modified connection.ts, added SET timezone = 'Asia/Shanghai'
[19:40] AI: Ran tests, 2 tests failed
[19:41] AI: Found test mocks missing timezone setting
[19:42] AI: Modified test mocks
[19:43] AI: All tests passed ✅
[19:44] AI: Committed git commit "fix: set DuckDB timezone to Asia/Shanghai"

Scenario: Tracking by File

You say: “Who modified connection.ts recently?”

The AI invokes session_search in file mode:

📄 Modification history for src/db/connection.ts:

1. 05-22 19:36 — DuckDB timezone fix (you)
2. 05-20 14:30 — DB connection pool config (colleague)
3. 05-18 09:15 — Initial creation (you)

Session Analyzer’s Compact Mode

The entries action supports a compact parameter for streamlined output, ideal for quickly browsing large sessions:

# Standard mode
session_analyze(action="entries")
→ Complete records (with timestamps, type, full content)

# Compact mode
session_analyze(action="entries", compact=true)
→ Removes type column, time shows only HH:MM, previews 60 chars
→ Best for large sessions with 100+ entries

Session Analyzer Analysis Dimensions

ModePurposeExample Command
summarySession overview“What did this session do?”
entriesPer-event listing“List all file modifications”
timelineChronological flow“What order did the AI operate in?”
chainSub-agent tracking“What did the sub-agent do?”
auditCompliance check“Were there any rule violations?”
digestConversation sequence“What did I and the AI discuss?”
takeoverHandover report“Help me pick up where I left off”

The Most Useful Mode: takeover

takeover generates a handover report with 5 dimensions:

📋 Session Takeover Report

1. User intent: Fix DuckDB timezone issue
2. Modified files: connection.ts, connection.test.ts
3. Recent steps: Modified test mocks, tests pass
4. Next steps: Consider documenting timezone behavior
5. Key decisions: Chose to handle timezone at the connection layer, not the SQL layer

When you want to “continue where you left off,” this report helps you (or another AI) quickly restore context.

Best Practices

✅ Efficient Use of Session Analyzer

  • grep mode: Search keywords across all sessions (much faster than digging through git log)
  • file mode: Find all sessions touching a specific file (a code review essential)
  • takeover mode: When taking over someone else’s work, generate a handover report first
  • compact mode: Quickly browse large sessions with streamlined output
  • audit mode: Periodically check for AI rule violations

❌ Common Pitfalls

  • Don’t use session_search as a replacement for memory — search is about looking back (what was done), memory is about knowledge (what was learned)
  • Don’t expect to find full source code — session records are summaries, not complete backups

Advanced Scenarios

Scenario: Auditing AI Compliance

You’ve set many rules for the AI (don’t write settings.json directly, don’t overwrite large files), but you’re not sure if it actually follows them. Use audit mode to check:

You: Audit the last session for rule violations

AI:
  🛠 session_analyze(action="audit", sessionId="...")
  
  ⚠️ Found 2 issues:
  1. [Violation] Direct writeFileSync(settings.json) — should use patchSettingsSectionWithBackup
  2. [Warning] Large file overwrite (312 lines) — should split or use edit

The audit mode checks for: prohibited operations, whether file modification rules were followed, and any unsafe actions.

Scenario: Understanding the AI’s Exploration Process

The AI used /tree to create exploration branches — tried Plan A (failed), then Plan B (succeeded). You want to know why Plan A failed:

You: The AI tried two approaches last time. Show me the results for each branch.

AI:
  🛠 session_analyze(action="branches", sessionId="...")
  
  🌿 Branch Analysis
  
  [Main branch] Discussed refactoring approach
  [B1] Plan A — Modified function signature + compatibility layer
      Result: Tests failed, compatibility layer introduced circular dependency
  [B2] Plan B — New interface + incremental migration
      Result: All tests passed ✅, merged back to main

This is more useful than digging through git log — git only records the final result, but branches shows the AI’s trial-and-error process.

Scenario: Tracking Sub-Agent Activities

Multiple sub-agents were spawned via subagent in a session. You want to know what each sub-agent produced:

You: What did the sub-agents do in the last session?

AI:
  🛠 session_analyze(action="chain", sessionId="...")
  
  🔗 Sub-Agent Chain
  
  Main Agent
    ├──→ pv-explorer
    │     Task: Analyze architecture of src/auth/ directory
    │     Result: 5 modules, 2 design patterns, dependency directions correct
    │
    └──→ pv-reviewer
           Task: Review JWT → session migration plan
           Result: Found 1 data model violation, 2 test gaps

The chain mode traces the call relationships between the main agent and sub-agents, clearly showing what task each sub-agent received and what result it returned.

Next Steps

With the ability to review, both you and the AI can look back at past work. But there’s still one problem: the longer a session goes on, the more the AI tends to “get dumber” — repeating itself, forgetting previous agreements.

In the next chapter, we’ll look at how to keep the AI smart in long-running sessions.

Original: /home/lain/.pi/agent/distill/processor/read-b63ebc90-1779883939894.txt

4.2 pi-journal: Automated Log Reports

Why Do You Need Logs?

You’ve been working with pi all day and accomplished a lot:

  • Fixed two bugs in the morning, along with refactoring a module
  • Set up the CI pipeline in the afternoon, wrote a bunch of tests
  • Researched a new approach in the evening, updated memory files

At night, you want to review: “What exactly did I do today?” You check git log — it’s all fragmented commits. You check memory files — only key conclusions are recorded. You check session records — there are a dozen sessions and you don’t know where to start.

💡 You need an “auto daily report” — something that aggregates your activities scattered across git, memory, and sessions into a readable report.

What pi-journal Does

pi-journal collects data from three sources and automatically generates Markdown daily/weekly reports:

Data SourceCollected Content
Git ActivityScans all repos under ~/.pi/agent/git/, counts commits, file changes, lines added/deleted
Memory ChangesScans global and project memory directories, identifies file changes within the time range
Session ActivityScans pi session records, counts sessions, tool calls, edit operations, active duration

Usage

Command: /journal

# Today's daily report (default)
/journal

# Specify a time range
/journal yesterday
/journal this_week
/journal 3d          # Last 3 days
/journal 2025-05-27  # Specific date

Tool: journal

AI can also proactively call the journal tool to generate a report. When you say “write a log”, “what did I do today”, or “write a weekly report”, the AI will automatically trigger it.

Generated Report Format

# 📓 Daily Report — 2025-05-27

## Git Activity
- **pi-shepherd** (2 commits, +45/-12)
  - Added tool_result rule support
  - Fixed priority sorting bug
- **pi-context-manager** (1 commit, +30/-5)
  - Added aging auto-eviction feature

## Memory Changes
- Added: debug_anti_pattern.md
- Updated: coding_standards.md

## Session Activity
- 019e6494 (45m) — Shepherd rule engine refactoring
  - Tool calls: 23, Edits: 8
- 019e6203 (30m) — Context aging feature implementation
  - Tool calls: 15, Edits: 5

## Summary
- Total commits: 3
- Total sessions: 12
- Total edits: 43

Best Practices

  • At the end of each day: /journal to generate a daily report and review what you did
  • Friday wrap-up: /journal this_week to generate a weekly report
  • Let AI do it: Just say “write today’s log” and the AI will invoke the tool to generate it

⚠️ Notes

  • The “AI Summary” section requires the AI to supplement after generation
  • Git activity only scans repos under ~/.pi/agent/git/, not other git repos elsewhere on the system
  • Session activity depends on pi’s session record storage

How It Works

User inputs a time range
      ↓
  parseTimeRange()  →  Parses into since/until timestamps
      ↓
  ┌─────────────┬──────────────┬──────────────────┐
  │ Git Activity │ Memory Changes│ Session Activity │
  │ Auto-discover│ Scan memory/ │ Get session list │
  │ repos        │ file timestamps│from session-    │
  │ git log stats│               │ analyzer         │
  └──────┬──────┴──────┬───────┴────────┬─────────┘
         ↓             ↓                ↓
      renderReport()  →  Aggregate and render as Markdown
         ↓
      Output report

Next Steps

pi-journal solves the “look back at the past” need. But sometimes you need more than just a review — you need to find a specific session and see what the AI was thinking at the time. In the next section, we’ll look at the detailed usage of pi-session-analyzer.

Original: /home/lain/.pi/agent/distill/processor/read-ed6e48fc-1779884015233.txt

4.3 pi-session-analyzer: Cross-Session Search

pi-session-analyzer is the “time machine” of pi-atelier — it can search and analyze all historical sessions, helping you and the AI look back at what happened.

Why Session Analysis?

Every pi conversation is recorded in JSONL files (under ~/.pi/agent/distill/), but the raw data is not human-readable. Session Analyzer transforms this data into searchable, analyzable structured information.

Three common needs:

NeedSolutionExample
Find a specific sessionsession_search cross-session search“Which session was the one where I fixed DuckDB?”
Understand session contentsession_analyze single-session analysis“What exactly happened in that session?”
Take over someone else’s worktakeover handover report“Continue from where I left off last time”

Three search modes:

Search the content of all sessions (including user messages and AI responses):

session_search(action="grep", query="DuckDB timezone")

Result:
  3 sessions matched:
  1. 05-22 19:36 — DuckDB timezone config fix
  2. 05-20 14:30 — Database initialization discussion
  3. 05-18 09:15 — Tech stack discussion

Search results include context snippets, so you can tell if a session is relevant without opening it.

Advanced usage: editOnly=true only searches sessions that contain file editing operations, filtering out pure discussion:

session_search(action="grep", query="settings.json", editOnly=true)

Result:
  2 sessions edited settings.json

This is useful for tracking.

file Mode — Track by File

Find all sessions that modified a specific file:

session_search(action="file", query="src/auth/login.ts")

Result:
  3 sessions modified this file:
  1. 05-22 19:36 — Login bug fix (changed null check)
  2. 05-20 14:30 — Auth module refactoring (changed function signature)
  3. 05-18 09:15 — Initial creation

Use case: During code review, you want to know “why is this file the way it is” — each session represents a modification intent.

list Mode — Browse Recent Sessions

List all recent sessions:

session_search(action="list", limit=10)

Result:
  Recent 10 sessions:
  1. 05-27 11:24 — Check what's left to do
  2. 05-27 11:08 — Payload analysis script enhancement
  3. 05-27 11:06 — Roadmap session ID display fix
  ...

session_analyze: Single-Session Analysis

Session Analyzer offers multiple analysis dimensions for different needs:

⚠️ Note: The action parameter of session_analyze only accepts the following values; do not pass grep/file/list (those are session_search actions).

summary — Quick Overview of a Session

session_analyze(action="summary", sessionId="019e6765")

Result:
  Session Summary (31 exchanges)
  User intent: Fix roadmap session ID display bug
  Key operations: Discovered formatTimestamps slice(0,8) truncation error
  Output: 2 bug fixes, 145 tests passing

When to use: When you don’t know what a session is about, start with summary.

entries — Browse Events One by One

Supports precise filtering, pagination, and multi-dimensional positioning:

# View the last 10 entries
session_analyze(action="entries", limit=10)

# Start from entry 20 (pagination)
session_analyze(action="entries", offset=20, limit=10)

# Range extraction — 'last:50' for the end, '100-150' for a specific range
session_analyze(action="entries", range="last:50")

# By index — view entry #5 with 3 surrounding context entries
session_analyze(action="entries", index=5)

# Filter by keyword
session_analyze(action="entries", grep="edit|write")

# Filter by tool name (supports wildcards)
session_analyze(action="entries", toolName="read|edit")

# Filter by file path (matches tool parameter paths, supports wildcards)
session_analyze(action="entries", file="*.test.ts")

# Compact mode — quick browse of large sessions
session_analyze(action="entries", compact=true)

Parameter Quick Reference:

ParameterDescriptionPriority
rangeRange extraction: "5-10", "last:50"Highest
indexView entry #N with context (0-based)High (mutually exclusive with offset)
offset + limitPagination: start from #N, show M entriesMedium
grepKeyword/regex filterComposable
toolNameFilter by tool name (supports * wildcard, `` for multiple)
fileFilter by file path (supports * wildcard, `` for multiple)
rawIndexNavigate back to original context position (after grep/toolName filter)Navigation only

rawIndex usage: After filtering with grep or toolName, use rawIndex to jump back to the original context (view surrounding entries of a filtered item):

# First find an edit record (shown as [42]) with toolName filter
session_analyze(action="entries", toolName="edit")

# Then jump back to original position to see context
session_analyze(action="entries", rawIndex=42)

When to use:

  • You want to see what specific operations the AI performed
  • Search for specific types of operations in a session (e.g., all file edits)
  • Find all operations that touched a specific file
  • Quick browse of large sessions

timeline — Timeline View

Display operations in chronological order:

session_analyze(action="timeline", sessionId="...")

Result:
  📅 Timeline
  [19:36] 👤 DuckDB query returns UTC time
  [19:37] 🤖 Read db/connection.ts
  [19:38] 🤖 Discovered no timezone parameter set
  [19:39] 🤖 Modified connection.ts
  [19:40] 🤖 Ran tests — 2 failures
  [19:42] 🤖 Modified test mock
  [19:43] 🤖 All tests passed ✅

When to use: When you want to understand the AI’s step-by-step operations and decision process.

chain — Sub-Agent Tracking

Track sub-agent call chains:

session_analyze(action="chain", sessionId="...")

Result:
  🔗 Sub-agent chain
  Main agent → pv-explorer (code exploration)
  Main agent → pv-reviewer (plan review)
  Main agent → pv-executor (execute changes)

When to use: When a session used sub-agents and you want to know what each one did.

audit — Audit Checks

Check for rule violations in a session:

session_analyze(action="audit", sessionId="...")

Result:
  ⚠️ Found 2 issues:
  1. [Violation] Directly wrote settings.json instead of using patchSettingsSectionWithBackup
  2. [Warning] Large file write overwrite (>200 lines), should split

When to use:

  • Check if the AI followed project conventions
  • Review someone else’s session for issues
  • Regular quality checks

digest — Conversation Sequence

Extract user/assistant conversation from a session, stripping tool call details and keeping only human-readable dialogue:

session_analyze(action="digest", sessionId="...")

Result:
  👤 Help me fix the roadmap display bug
  🤖 Sure, let me take a look at the code first...
  👤 Tests aren't passing, take a look
  🤖 Found that formatTimestamps has a truncation logic error...

When to use: When you want a quick understanding of the conversational thread between user and AI without seeing tool details.

raw — Raw Data

View the raw JSONL records directly (10 entries max by default):

session_analyze(action="raw", sessionId="...", limit=5)

When to use: When none of the analysis modes above meet your needs, look at the raw data directly. Generally used for debugging or data format verification.

branches — Branch Analysis

Analyze parallel branches created by the /tree command:

session_analyze(action="branches", sessionId="...")

Result:
  🌿 Branch Analysis
  [Main branch] Normal workflow
  [B1] Tried approach A for refactoring → Failed, returned to main branch
  [B2] Tried approach B for refactoring → Succeeded

When to use: When a session used /tree to create exploration branches and you want to understand the results of each branch.

Data Storage

Session Analyzer’s data sources:

~/.pi/agent/sessions/
├── --home-lain-.pi--/                ← Session directories grouped by CWD
│   ├── 2026-05-27T..._sessionId.jsonl  ← Complete record for each session
│   └── ...
└── --other-project-path--/          ← Sessions from different project directories
    └── ...

Session records are in JSONL format (one JSON object per line), containing:

  • User messages
  • AI responses (including tool calls and results)
  • Timestamps
  • Branch markers

Next Steps

📖 Back to 4.1 Let AI Learn to Review for a complete usage example.

Original: /home/lain/.pi/agent/distill/processor/read-ed6e48fc-1779884015234.txt

A Survival Guide for Long Sessions

You’ve Probably Been Here

You start a long session with pi, and the AI helps you through a lot of work. By the 50th turn, you notice:

  • The AI starts asking questions you’ve already answered before
  • It re-proposes a solution that was already rejected
  • Its code quality noticeably declines — missing error handling, missing type definitions
  • Sometimes it even starts “hallucinating” — inventing functions and files that don’t exist

The worst case: The AI throws an error — “context window exceeded”, and the entire session crashes.

💡 This is the “context bloat” problem: The AI’s “working memory” has a fixed capacity, and when it’s overloaded, things spill over.

Root Cause

The AI’s context window is a fixed-size “workbench”:

Context Window (e.g., 128K tokens)
┌──────────────────────────────────────┐
│ System Prompt              ≈ 5K      │
│ Tool Definitions           ≈ 8K      │
│ Memory Injection           ≈ 2K      │
│ ─────────────────────────────        │
│ Conversation History       ≈ 80K     │ ← Main source of bloat
│ (previous 50 turns)                  │
│ Tool Results               ≈ 30K     │ ← Tool returns can be large
│ ─────────────────────────────        │
│ Remaining Space            ≈ 3K      │ ← Almost full!
└──────────────────────────────────────┘

The problem:

  1. Conversation history only grows: Each turn adds content to the context and never removes anything
  2. Tool results can be massive: read on a 1000-line file can take up 5K tokens
  3. Duplicate information accumulates: The AI reads the same file multiple times, each time consuming space

Two Tools: Smart Compact and Context Manager

You might ask: do I need to install both packages? The answer is: yes, both are recommended — they solve different problems:

Dimensionpi-smart-compactpi-context-manager
What it doesCompresses conversation historyDiagnoses token consumption + compresses tool results
Active/PassiveFully automaticDiagnosis needs you to ask AI, distill is automatic
What it solves“Conversation history is too long”“Too many tool results” + “Why is it so slow”
Can they replace each other?❌ No❌ No

💡 One-sentence summary: context-manager helps you find the problem (where are the tokens going), smart-compact helps you automatically fix it (compress history). Best used together.

pi-smart-compact — Smart Compression

Smart Compact automatically “compresses” conversation history when the context is nearly full:

Before compression (80K tokens of conversation history):
┌─────────────────────────────────┐
│ User: Help me look at auth.ts   │
│ AI: I read auth.ts... (500 words)│
│ User: Add a null check          │
│ AI: OK, I modified... (300 words)│
│ User: Run the tests             │
│ AI: Test results... (200 words) │
│ ... 50 turns repeated ...        │
└─────────────────────────────────┘

After compression (15K tokens summary):
┌─────────────────────────────────┐
│ Summary:                        │
│ - Added null check in auth.ts   │
│ - Modified corresponding test   │
│ - All tests passed               │
│ - Used JWT authentication scheme│
│ ... Key information preserved ...│
└─────────────────────────────────┘

Two-phase compression strategy:

PhaseMethodDescription
Phase 1Extract key information (decisions, file changes, conclusions)Traverse conversation, generate structured intent summary
Phase 2Discard low-value information (repeated file reads, intermediate debug output)Let LLM judge tool results batch by batch

pi-context-manager — Diagnostic Tool

pi-context-manager provides the payload_analyze tool to help you see exactly where your tokens are going:

📊 Token Budget Analysis

System Prompt:    4,200 tokens ( 3.2%)
Tool Definitions: 8,100 tokens ( 6.2%)
Memory Injection: 2,300 tokens ( 1.8%)
Conversation:    52,400 tokens (40.0%)
Tool Results:    64,800 tokens (49.5%)  ← The big one!
──────────────────────────────────────
Total:          131,800 / 128,000      ← Over budget!

Top 3 most expensive tool calls:
1. read(src/database/schema.ts)  — 8,200 tokens
2. code_graph_module_overview    — 6,400 tokens
3. grep("TODO|FIXME")           — 4,100 tokens

Real-World Case: Diagnosing a Context Crash

Real Scenario Review

Once, a session crashed at only 34.8% context usage. It seemed unlikely — only a third used?

Using pi-context-manager’s budget mode for analysis, the root cause was found:

Root cause:
34.8% of tool results were error output
→ Lots of repeated "Command not found" error messages
→ Each error consumed tokens without providing valuable information
→ Accumulated and prematurely exhausted the context

Solution: Added an after_bash hook in shepherd rules to automatically truncate error output from failed commands, preventing wasteful token consumption.

📈 Context Growth Trend

Request #1:  15K  ████
Request #5:  28K  ███████
Request #10: 45K  ████████████
Request #15: 72K  ████████████████████
Request #20: 98K  ██████████████████████████  ← Approaching limit
Request #23: 💥 Crash!

Key finding: The fastest growth was between turns 10-15, when extensive file searching was happening.

Optimization: Using code_graph_semantic_code_search (compact mode, returns only signatures and locations) instead of grep (returns full matching lines) reduced token consumption by 70%.

Configuring Smart Compact

Install via settings.json:

{
  "packages": ["pi-smart-compact"]
}

It takes effect automatically after installation — no additional configuration needed.

Optional Advanced Configuration

In .pi/settings.json:

{
  "smart-compact": {
    "auto": "auto"
  }
}
  • auto: Automatic trigger (default, recommended)
  • manual: Only responds to the /smart-compact command

Manual trigger: Type /smart-compact in the conversation. View configuration: Type /smart-compact-config.

Payload Analyzer Common Commands

CommandPurposeWhen to Use
budgetToken budget analysis (system/tools/history composition)“Where are my tokens going?”
growthContext growth trend (token curve over requests)“Why is it getting slower?”
expensiveMost expensive tool calls (Top N sorted)“Which tool consumes the most tokens?”
overviewPer-message detailed analysis (includes distill events)“Pinpoint a specific point in time”
messagesLocate messages by index/range/keyword“What did message 10 say?”
chainTrack the same tool call across payloads“What happened to this call later?”
chain-tcidTrack the same toolCallId across payloads“Verify distill behavior”
diffCompare differences between two payloads“What’s different between these two requests?”
statsAggregate statistics on distill/processor hit rate“How efficient is the compression?”
singleAnalyze a single payload file“Deep dive into one recording file”
listList all recording files“What’s available for analysis?”

💡 Diagnosis workflow: Start with list to see available recordings → budget for overall distribution → expensive to find the heavy hitters → messages for precise targeting.

Context Manager’s Aging and Processor

Beyond Distill, pi-context-manager offers two additional helper mechanisms:

Aging

Automatically evicts old tool outputs that haven’t been referenced for a long time. Use /aging-config to set the eviction rounds.

Special exemption: Skill file (SKILL.md) content is never evicted by aging, ensuring the AI always sees the currently loaded skills.

Tool Result Processor

Formats and trims specific types of tool output (e.g., code-graph AST search results, MCP JSON output). Use /processor-config to set thresholds.

/context TUI Panel

Type /context to open a visual panel for browsing context content by category and manually marking content for deletion.

Best Practices

✅ Habits for Healthy Long Sessions

  1. Use compact mode for search: code_graph_semantic_code_search(compact: true) saves 70% tokens over grep
  2. Compress early: Don’t wait until it crashes — trigger compression when the context approaches its limit
  3. Avoid repeated reads: Use memory to remember file content instead of repeatedly reading the same file
  4. Read large files in chunks: Use offset/limit to read only the parts you need, not the whole file
  5. Configure aging wisely: Set 8-12 rounds for eviction to automatically clean up stale content
  6. Run payload_analyze checkups regularly: Run budget once during a long session to catch problems early

✅ Diagnostic Priority

When context has issues:
  1. payload_analyze budget → Check total distribution
  2. payload_analyze expensive → Find the most expensive calls
  3. payload_analyze growth → Look at growth trends
  4. Targeted optimization (switch tools, add filters, adjust strategy)

❌ Common Misconceptions

  • “I still have 50% context left, nothing to worry about” → Wrong, tool results can suddenly balloon
  • “Compression will lose important information” → Smart Compact prioritizes preserving decisions and conclusions
  • “Just restart the session” → Treats the symptom, not the root cause — you’ll run into the same problem again

Next Steps

Now the AI has memory, planning, rules, review, and compression — it’s already quite a capable assistant. But it’s still “passive” — it only acts when you ask.

Can the AI work proactively? For example, automatically check code quality every day, or automatically run a research analysis?

In the next chapter, we’ll look at how to make the AI automate work.

Original: /home/lain/.pi/agent/distill/processor/read-ed6e48fc-1779884015234.txt

5.2 pi-smart-compact Principles: Two-Phase Enhanced Compaction

Smart Compact is an enhanced version of pi’s built-in Compaction mechanism — instead of simply truncating history, it “intelligently” decides what to keep and what to discard.

Why Enhanced Compaction?

pi’s built-in Compaction automatically compresses old conversations when the context approaches its limit, but it isn’t “smart” enough:

Built-in Compaction:
  100 rounds of conversation before compression → a generic summary after compression
  Problem: the summary is too coarse, critical details are lost, and tool call results are indiscriminately truncated.

Smart Compact’s improvement — intercepts pi’s compaction event and performs two-phase enhanced compaction:

PhaseWhat It DoesHow It Works
Phase 1: Intent SummaryExtract user intent, key decisions, current stateTraverse conversation, extract non-tool text from AI replies, generate structured intent summary
Phase 2: Tool FilteringDetermine which tool call results can be safely discardedPair all tool calls (call + result), let LLM decide keep/discard in batches
pi triggers compact event
  → Smart Compact takes over (if auto mode is on)
    → Phase 1: Extract intent summary (keep decisions, agreements, conclusions)
    → Phase 2: Evaluate tool results for keep/discard in batches
  → Output a refined conversation history, replacing pi's default rough summary.

The two phases are executed sequentially in one go — Smart Compact takes over the compaction event, first performs the intent summary, then filters tools, and finally outputs the refined result. They are not triggered in stages based on context usage rate.

Configuration

Installation

{
  "packages": ["pi-smart-compact"]
}

Commands

CommandUsage
/smart-compactManually trigger two-phase compaction
/smart-compact-config [auto|manual]View or switch between auto/manual mode

Auto/Manual Mode

  • auto (default): Automatically takes over when pi triggers a compact event, performing enhanced compaction
  • manual: Only triggers when the user executes /smart-compact

Advanced Configuration

Configure in ~/.pi/agent/settings.json:

{
  "smart-compact": {
    "enabled": true,
    "intentModel": "",
    "filterModel": "",
    "thinkingTruncateChars": 500,
    "toolCallTruncateChars": 2000,
    "toolResultTruncateChars": 5000,
    "filterBatchSize": 10
  }
}
ParameterDefaultDescription
intentModelEmpty (uses session default model)Model used for Phase 1 intent summary
filterModelEmpty (uses session default model)Model used for Phase 2 tool filtering
thinkingTruncateChars500Character limit for truncating thinking blocks
toolCallTruncateChars2000Character limit for truncating toolCall arguments
toolResultTruncateChars5000Character limit for truncating toolResult content
filterBatchSize10Number of tools evaluated per batch in Phase 2

What Does Compaction Preserve?

Smart Compact’s Phase 2 evaluates tool results based on the following priority:

PriorityContent TypeWhy Preserve
🔴 HighestUser’s explicit requirements and constraintsThese are the task objectives
🟠 HighKey decisions and reasoning for choicesPrevents AI from re-debating already rejected solutions
🟡 MediumFile modification records (edit/write)Lets AI know which files have been modified
🟢 LowFile reads and search resultsCan be re-executed
⚪ LowestFailed attempts and debugging processLessons have already been learned

Best Practices

  • Enable auto mode for long sessions: Smart Compact automatically takes over when pi is about to compact, preserving more critical information than the default compaction
  • Manual trigger is useful before critical operations: Run /smart-compact manually to clean up context before starting an important refactoring
  • Use with context-manager: Smart Compact compresses conversation history, while Context Manager’s distill compresses tool outputs — they complement each other
  • Use cheaper models for compaction: If you don’t want to waste the main model’s tokens, specify filterModel in the configuration to use a cheaper model

📖 Back to 5.1 Long Session Survival Guide for complete diagnosis and optimization cases.

5.3 Diagnosing Token Consumption with pi-context-manager

The functionality of pi-payload-analyzer has been merged into pi-context-manager. This section introduces how to use the unified payload_analyze tool to diagnose context issues.

Where Did All the Tokens Go?

In long sessions, AI becomes less intelligent often because the context is filled with “junk.” But what exactly is consuming tokens? Guessing won’t help.

pi-context-manager provides the payload_analyze tool, which uses data to tell you where your tokens are going.

Recording Must Be Enabled First

payload_analyze requires recorded payload data before it can analyze. In the conversation, type:

/record on

Recordings are saved to ~/.pi/agent/distill/recordings/. There is a slight performance overhead while recording; remember to turn it off with /record off when done.

Analysis Mode Quick Reference

Global Overview

ModeUsageOutput
listList all recording filesFile list + sizes
budgetToken budget analysisBreakdown of system/tools/history
growthGrowth trendToken usage curve over requests
statsAggregate statisticsDistill/processor hit rate, compression efficiency

Deep Diagnosis

ModeUsageOutput
expensiveMost expensive tool callsTop N sorted by token count
overviewPer-message detailed analysisToken breakdown per message + distill events
messagesPrecise message targetingFilter by index/range/keyword

Tracking & Comparison

ModeUsageOutput
chainTrack tool call fateCross-payload changes for the same argsSig
chain-tcidTrack toolCallIdVerify distill behavior
diffCompare two payloadsIdentify differences between two requests
singleAnalyze a single fileFull analysis of one recording file

Messages Mode — Precise Targeting

messages is the most flexible diagnostic tool, supporting multiple filtering methods:

# View message #5 (0-based)
payload_analyze(action="messages", msgIndex=5)

# View messages 5-10
payload_analyze(action="messages", msgRange="5-10")

# View the last 5 messages
payload_analyze(action="messages", msgRange="last:5")

# Filter by keyword
payload_analyze(action="messages", grep="error|fail")

# Filter by tool name
payload_analyze(action="messages", toolName="read")

# Filter by file path
payload_analyze(action="messages", file="*.ts")

Practical Cases

Case 1: Find the Root Cause of Context Bloat

Step 1: Use budget mode to see totals
You: "Use payload_analyze to analyze token budget"
Result: Tool Results account for 49.5%

Step 2: Use expensive mode to find the biggest consumers
You: "Find the Top 10 most token-consuming tool calls"
Result: read(schema.ts) consumes 8.2K tokens

Step 3: Optimize
→ Use offset/limit to read large files in chunks
→ Or enable distill for automatic compression

Case 2: Diagnose Compression Efficiency

Step 1: Use stats mode to check hit rate
You: "Check distill and processor compression efficiency"
Result: distill hit rate 75%, processor hit rate 60%

Step 2: Use chain mode to track
You: "Track distill behavior for read(schema.ts)"
Result: Distilled on the 3rd request, compressed from 8.2K to 1.5K

Case 3: Compare Differences Between Two Requests

You: "Compare these two payloads for differences"
AI calls payload_analyze(action="diff", payloadPath="...", payloadPath2="...")
Result: The second request has 3 additional tool calls, but 2 were compressed by distill

📖 For complete long session diagnosis cases, see 5.1 Long Session Survival Guide

5.4 Long Session Real-World Scenarios

This section walks through real-world use cases, demonstrating how to combine pi-atelier’s tools to solve common problems in long sessions.

Scenario 1: AI Gets Dumber — Compress with Smart Compact

Symptoms

You’ve been chatting with AI for 2 hours and done a lot. Suddenly you notice the AI starts to:

  • Ask questions you’ve already answered
  • Re-propose solutions that have already been rejected
  • Code quality drops noticeably — missing error handling

Traditional Compaction vs Smart Compact

pi’s built-in Compaction triggers automatically when the context approaches its limit, but it simply compresses old conversations into a generic summary. Smart Compact is smarter:

Traditional Compaction:
  100 rounds of conversation → one generic 500-word summary
  Problem: critical details are lost, AI doesn't know what was decided

Smart Compact (two phases):
  Phase 1 (Intent Summary):
    → Extracts: decisions, agreements, file modifications, conclusions
    → Preserves all critical information, discards redundant processes
  
  Phase 2 (Tool Filtering):
    → Evaluates each tool result batch by batch to decide whether to keep it
    → Discards: repeated reads, failed attempts, debugging processes

Steps

1. AI has already done a lot for you, and you feel the context is nearly full
2. Type /smart-compact
3. Smart Compact analyzes the conversation history and generates an enhanced summary
4. AI continues working, but "remembers" all key decisions

Or do nothing — if configured in auto mode (default), Smart Compact will trigger automatically at the right time.

Works Better with Context Manager

Smart Compact compresses conversation history, while Context Manager’s Distill compresses tool results. Using them together:

Context Window
├── Conversation History ←── Compressed by Smart Compact (preserves decisions and conclusions)
├── Tool Results         ←── Compressed by Distill (preserves key info, discards redundancies)
└── Memory Injection     ←── Fixed size, unchanged

Scenario 2: Context Already Exploded — Handoff with Takeover

Symptoms

A more extreme case: AI reports an error — “context window exceeded.” The entire session can no longer continue.

At this point, mere compression is too late — the session has crashed outright.

Solution: Start a New Session + Takeover

1. Open a new session
2. Tell AI: "Help me take over the last session's work"
3. AI calls session_analyze(action="takeover")
4. Generates a 5-dimension takeover report:

📋 Session Takeover Report

1. User Intent: Refactor auth module, switch from JWT to session-based auth
2. Modified Files:
   - src/auth/middleware.ts (done)
   - src/auth/login.ts (in progress, 80% complete)
   - src/auth/__tests__/login.test.ts (to be written)
3. Recent Steps:
   - Modified type signatures in middleware.ts
   - Started modifying login.ts but not finished
   - Tests not yet written
4. Next Steps:
   - Complete login.ts modifications
   - Write login.test.ts tests
   - Run full test suite
5. Key Decisions:
   - Chose session-based over refresh token approach
   - Reason: project doesn't need cross-domain SSO

Takeover’s 5 Dimensions

DimensionWhat It ContainsWhy It Matters
User IntentOriginal requirements and goalsNew AI knows “what to do”
Modified FilesList of files changed and pendingKnows “what was changed”
Recent StepsLast 3-5 operationsKnows “how far we got”
Next StepsWhat still needs to be doneKnows “what to do next”
Key DecisionsImportant technical choices and rationaleKnows “why it was done this way”

With these 5 dimensions, a new session’s AI can fully recover working context within 1-2 conversation rounds.

Example

You: Help me take over the Godot MCP Bridge work

AI:
  🛠 session_analyze(action="takeover", sessionId="019e620f-...")
  
  📋 Takeover report generated. You were debugging the MCP Bridge WebSocket 
  connection issue and had identified that nohup was swallowing the listen() 
  return value. The next step is to start the TCP port directly without nohup. 
  Shall I continue?

Scenario 3: Want to Know What Happened — Session Analysis

Symptoms

Yesterday you had AI do a ton of things. Today you want to review: “How exactly was that DuckDB timezone bug fixed?”

Steps

Step 1: Cross-session search
You: Search for the session where DuckDB timezone was fixed

AI:
  🛠 session_search(action="grep", query="DuckDB timezone")
  
  Found 2 matching sessions:
  1. 05-22 19:36 — DuckDB timezone configuration fix
  2. 05-20 14:30 — Database initialization discussion

Step 2: View timeline
You: Show the detailed process of the first one

AI:
  🛠 session_analyze(action="timeline", sessionId="...")
  
  📅 Timeline:
  [19:36] User: DuckDB query returns UTC time
  [19:37] AI: Reads db/connection.ts
  [19:38] AI: Discovers no timezone parameter set
  [19:39] AI: Adds SET timezone = 'Asia/Shanghai'
  [19:43] AI: All tests pass ✅

Step 3: View original conversation (if more detail needed)
You: Show the conversation around 19:39

AI:
  🛠 session_analyze(action="entries", msgRange="5-10", sessionId="...")

Common Analysis Pattern Combinations

Quick overview of a session:       summary → see what was done
Trace operation order:             timeline → see the steps
View raw conversation:             entries → see the details
Take over someone's work:          takeover → get the context
Check for violations:              audit → see if there are problems

Scenario 4: Tool Output Fills the Context — Aging Automatic Eviction

Symptoms

AI has used many tools in one session: read 20 files, ran 10 searches, executed 5 bash commands. Every tool’s returned result stays in the context.

The problem is: you only need the most recent results. That file you read 20 minutes ago is no longer needed now.

Solution: Aging Automatic Eviction

Aging automatically evicts tool outputs that haven’t been referenced again after a specified number of rounds:

Timeline:
  Round 1:  read(auth.ts) → 5K tokens
  Round 2:  read(middleware.ts) → 4K tokens
  Round 3:  grep("TODO") → 3K tokens
  ...
  Round 10: edit(auth.ts)  ← auth.ts is referenced again, "life extended"
  
  Round 1+8=9:  grep("TODO") → not referenced for 8 rounds, auto-evicted ✅
  Round 2+8=10: middleware.ts not referenced again, auto-evicted ✅
  Round 1+8=9:  auth.ts → referenced by edit, preserved!

Configuring Aging

/aging-config 8    # Evict after 8 rounds (recommended: 8-12)
/aging-config off  # Disable auto-eviction

💡 Skill file exemption: SKILL.md content is not evicted by aging — AI always retains access to currently loaded skills.

Manual Intervention: /context TUI Panel

If you don’t want to wait for auto-eviction, you can manually mark items for deletion:

1. Type /context to open the TUI panel
2. Browse by category: view all context content by tool type or chronological order
3. Select unwanted content, mark it for deletion
4. Marked content won't be included in the next AI request

This is especially useful when:

  • AI read a huge config file but you only need one line from it
  • A search returned 50 results but you only used 3
  • Error messages from earlier debugging are no longer needed

Scenario 5: Why Is AI Getting Slower — Token Budget Diagnosis

Symptoms

AI’s response keeps getting slower, and the wait time per conversation round is noticeably longer. You suspect the context is too large, but you don’t know exactly what’s taking up space.

Steps

Step 1: Enable recording
You: /record on

... do a few more rounds ...

Step 2: Check token budget
You: Analyze the token budget

AI:
  🛠 payload_analyze(action="budget")
  
  📊 Token Budget Analysis
  System Prompt:    4,200 (3.2%)
  Tool Definitions: 8,100 (6.2%)
  Memory Injection: 2,300 (1.8%)
  Conversation:    52,400 (40.0%)
  Tool Results:    64,800 (49.5%)  ← This is the big one!
  
Step 3: Find the most expensive calls
You: Find the most token-consuming tool calls

AI:
  🛠 payload_analyze(action="expensive", topN=5)
  
  Top 5 Most Expensive Tool Calls:
  1. read(src/database/schema.ts)  — 8,200 tokens
  2. code_graph_module_overview    — 6,400 tokens
  3. grep("TODO|FIXME")           — 4,100 tokens
  4. read(src/config/settings.ts)  — 3,800 tokens
  5. bash("npm test")             — 3,200 tokens

Step 4: Targeted optimization
→ schema.ts is too large, use offset/limit to read only the needed parts
→ Use compact=true mode for code_graph
→ Add --include to grep to limit file types

Diagnosis Flow Quick Reference

1. budget     → See overall distribution (which part has the highest share?)
2. expensive  → Find the big consumers (which specific calls use the most tokens?)
3. growth     → See the trend (which period had the fastest growth?)
4. messages   → Pinpoint (take a look at that specific message content)
5. Targeted optimization (switch tools, add filters, enable distill)

Combined Scenario: Complete Survival Strategy for Ultra-Long Sessions

Session starts
│
├── Rounds 1-20: Working normally, no concerns
│
├── Rounds 20-40: Context usage ~40%
│   → Enable /record on (optional, to prepare for later diagnosis)
│   → Avoid repeatedly reading large files
│
├── Rounds 40-60: Context starting to get tight
│   → Smart Compact takes over pi's compaction event (if auto mode is on)
│   → Or manually trigger /smart-compact
│   → Aging starts evicting old tool outputs
│
├── Rounds 60-80: Approaching the limit
│   → Smart Compact has completed two-phase compaction
│   → Consider whether to start a new session
│   → If continuing: check with payload_analyze budget
│
├── 💥 Crashed!
│   → Start a new session
│   → session_analyze(action="takeover") to take over
│   → Continue working
│
└── Before wrapping up:
    → /record off
    → Have AI generate a daily report with /journal
    → agent_end auto-reminds to commit + update memory

📖 Back to 5.1 Long Session Survival Guide for a complete tool introduction.

Automation & Workflows

You Might Have Encountered This

Every Friday afternoon, you do the same thing:

  1. Check all sessions from this week to see what files were changed
  2. Run tests to make sure there are no regressions
  3. Check for uncommitted code
  4. Write a weekly summary — recap the week’s progress

Each time you have to manually remind the AI to do these things. Sometimes you forget, and come back on Monday only to find that last Friday’s changes were never committed.

💡 An AI can do things, but it won’t “actively” do things. You need to tell it “what to do now.”

Two Tools: Scheduler and Workflow

pi-scheduler — Scheduled Tasks

The Scheduler lets the AI do specific things at specific times:

Scheduled trigger
     │
     ▼
┌──────────────────┐
│  Inject preset    │
│  message          │
│  "It's Friday PM, │
│   run weekly check" │
└──────────────────┘
     │
     ▼
  AI executes automatically

Supported schedule types:

TypeDescriptionExample
One-shotTriggers once after a specified time“Remind me about the meeting in 30 min”
RecurringRepeats at a fixed interval“Check tests every 2 hours”

pi-workflow — Sub-agent Orchestration

Workflow lets the AI break complex tasks into multiple sub-agents that execute in parallel:

Main agent: "Research best practices for XXX"
     │
     ├──→ Sub-agent 1: Search online resources
     ├──→ Sub-agent 2: Search GitHub source code
     └──→ Sub-agent 3: Search historical sessions
          │
          ▼
     Main agent: Synthesize results from all three sub-agents and give recommendations

Sub-agents are independent execution environments:

  • They have their own context window (no pollution of the main session)
  • They have their own tool set (permissions can be restricted)
  • They return results to the main agent when done

Real-world Example: Automated Weekly Report

Configuring Automatic Friday Checks

{
  "action": "create",
  "interval_ms": 604800000,
  "recurring": true,
  "prompt": "It's Friday afternoon. Please perform the following checks:\n1. Use session_search to review all sessions from this week\n2. Use session_analyze summary to summarize each session\n3. Check git status for uncommitted changes\n4. Compile a weekly summary and write it to .pi/journal/weekly-summary.md\n5. Remind the user to commit uncommitted code"
}

Every Friday, the AI will automatically:

  1. Search all sessions from the week
  2. Generate summaries for each one
  3. Check git status
  4. Generate a weekly report
  5. Remind you to commit

Everyday Use of Scheduled Reminders

User: "Remind me to check CI results in 30 minutes"
AI: ✅ Scheduled task created, reminder in 30 minutes

... 30 minutes later ...

AI: ⏰ Reminder: Time to check CI results. Current time: 15:30.

Real-world Example: Sub-agent Research

Scenario: Researching Best Practices for a New Technology

You say: “Research the performance differences between Bun and Node.js and give me a recommendation.”

The AI launches a research workflow:

🔬 Research workflow started

Sub-agent 1 (Search Expert):
  → Search "Bun vs Node.js performance benchmark 2026"
  → Found 5 technical articles
  → Extract key data points

Sub-agent 2 (Source Expert):
  → Search Bun's GitHub repository
  → Browse the benchmark directory
  → Review performance discussions in issues

Sub-agent 3 (History Expert):
  → Search project's historical sessions
  → Check if similar evaluations were discussed before

─────────────────────────

Main agent comprehensive report:

📊 Bun vs Node.js Recommendation

1. Performance comparison:
   - HTTP throughput: Bun is 3-4x faster
   - Startup time: Bun is 5x faster
   - npm compatibility: Node.js is better (Bun 95% compatible)

2. Recommendation for this project: Stick with Node.js
   - Rationale: The project depends on multiple Node.js native modules
   - Bun's compatibility issues could lead to extra maintenance costs
   - The performance difference has little impact on this project (I/O bound)

Advantages of Sub-agents

FeatureSingle Agent (Normal Chat)Multi-agent (Workflow)
Context isolationAll information mixed togetherEach sub-agent is independent
Parallel executionSequential, one by oneCan search in parallel
Error isolationOne error affects everythingSub-agent errors don’t affect others
Token efficiencyAll information in main contextOnly final results return to main context

Scheduler Configuration

Install via settings.json:

{
  "packages": ["pi-scheduler"]
}

The tool provides three operations:

OperationDescriptionParameters
createCreate a scheduled taskinterval_ms, prompt, recurring
listView all tasksNone
cancelCancel a taskid

Common Time Intervals

Intervalinterval_msUse case
30 minutes1,800,000Short-term reminders
2 hours7,200,000Periodic checks
Daily86,400,000Daily report / daily check
Weekly604,800,000Weekly report / weekly check

Workflow Configuration

Install via settings.json:

{
  "packages": ["pi-workflow"]
}

Workflow provides two core concepts:

  1. Factor Research: Multi-round search + evaluation + synthesis
  2. Factor Optimization: Initial screening + dissection + combination + iteration + validation

Usually you don’t need to interact with Workflow directly — the AI automatically decides whether to use sub-agents based on task complexity.

Best Practices

✅ Good Scheduled Task Design

  • Clear instructions: Tell the AI exactly what to do, avoid vague “check it out”
  • Reasonable intervals: Don’t check every 5 minutes (wastes resources)
  • Meaningful triggers: A reminder should say “do X now,” not “are you there”

✅ Good Sub-agent Design

  • Single responsibility: Each sub-agent should do only one thing
  • Clear output: Sub-agents should return structured results, not free-form text
  • Moderate parallelism: 3-5 sub-agents is the sweet spot — too many increases synthesis difficulty

❌ Common Pitfalls

  • “Scheduled tasks can replace all manual operations” → No, complex decisions still require human involvement
  • “More sub-agents is always better” → Too many sub-agents may cost more to synthesize than they save
  • “Workflow can do anything” → It’s great for research and analysis, not suitable for decisions requiring human judgment

Next Steps

So far, we’ve covered all the core tools provided by pi-atelier. But what if these tools aren’t enough — what if you want to build something that doesn’t exist yet?

In the next chapter, we’ll look at how to develop your own extensions.

6.2 pi-scheduler: Scheduled Tasks

pi-scheduler is the “alarm clock” of pi-atelier — it can automatically inject messages to the AI at specified times, enabling the AI to proactively execute tasks.

Why Scheduled Tasks?

The AI is reactive — it only answers when you ask. But some things need to happen on time:

  • “Remind me to check CI results in 30 minutes” — you might forget
  • “Check tests every 2 hours” — manual reminders are tiring
  • “Remind me to commit before leaving work every day” — afraid of forgetting

The Scheduler gives the AI “time awareness.”

How It Works

Create a scheduled task
     │
     ▼
┌──────────────────┐
│  Scheduler Timer  │
│  Countdown wait   │
└────────┬─────────┘
         │ Time's up!
         ▼
┌──────────────────┐
│  Inject preset    │
│  message into     │
│  AI's context     │
└────────┬─────────┘
         │
         ▼
     AI reads message
     Executes task automatically

Key points:

  • Injecting a message does not start a new conversation — it inserts a “reminder” into the current session
  • The AI decides for itself how to execute after seeing the message, no need for you to repeat it
  • Scheduled tasks are only valid in the current session; they are automatically cleared when the session ends. If the session is resumed later, tasks from that session can be restored

Three Operations

Creating a Task

schedule(
  action="create",
  interval_ms=1800000,    // 30 minutes
  prompt="Check CI build results, tell me if it fails",
  recurring=false          // One-shot
)

Parameter description:

ParameterDescriptionRequired
actionFixed as "create"
interval_msInterval in milliseconds
promptMessage to inject to the AI
recurringWhether to repeat (default: false)

Listing Tasks

schedule(action="list")

Result:
  📋 Current scheduled tasks:
  1. [One-shot] Trigger at 14:30 — "Check CI results"
  2. [Recurring] Every 2h — "Run tests for regressions"

Canceling a Task

schedule(action="cancel", id="task-123")

Common Time Intervals

ScenarioIntervalinterval_ms
Short-term reminder5 minutes300,000
Tea break15 minutes900,000
Waiting for build30 minutes1,800,000
Periodic check2 hours7,200,000
Daily reminder1 day86,400,000

One-shot vs Recurring Tasks

One-shot (recurring=false)

Best for “reminder” scenarios:

User: "Remind me about the meeting in 30 minutes"

AI → schedule(action="create", interval_ms=1800000,
              prompt="⏰ Reminder: Time for the meeting.", recurring=false)

... 30 minutes later ...

AI: ⏰ Reminder: Time for the meeting. Current time: 15:30.

Automatically deleted after being triggered once.

Recurring (recurring=true)

Best for “periodic check” scenarios:

User: "Run tests every 2 hours"

AI → schedule(action="create", interval_ms=7200000,
              prompt="Please run npm test and report results. If it fails, list the failing tests.",
              recurring=true)

... Every 2 hours ...

AI: 📋 Periodic test report: All 47 tests passed ✅
...
AI: ⚠️ Periodic test report: 2 tests failed!
    - auth.test.ts: Login timeout
    - api.test.ts: 404 error

Status Bar Integration

The Scheduler displays a countdown in pi’s status bar, so you always know when the next reminder will trigger.

Notes

  • Scheduled tasks only work in the current session — tasks disappear when the session is closed (but can be restored if the same session is resumed)
  • Recurring tasks shouldn’t have intervals that are too short (recommended ≥ 5 minutes), to avoid frequent triggers wasting tokens
  • The prompt should be specific and clear — the AI sees this exact text; vague instructions lead to vague execution

Next Steps

📖 Return to 6.1 Automation & Workflows for complete usage examples.

6.3 pi-workflow: Sub-agent Orchestration

pi-workflow is the “task dispatcher” of pi-atelier — it lets the AI break complex tasks into multiple sub-agents that execute in parallel, then synthesizes the results.

Why Sub-agents?

A single AI session has a limited context window. When a task requires:

  • Searching multiple information sources simultaneously
  • Executing independent operations without interference
  • Reviewing the same code from different perspectives

A single AI can only process sequentially, which is inefficient, and all the information gets crammed into one context.

Sub-agents solve this — each sub-agent has its own independent context window, and only returns the final result to the main agent.

How It Works

Main agent
  │ "Research performance differences between Bun and Node.js"
  │
  ├──→ Sub-agent 1 (Search Expert)
  │     Independent context window
  │     Search online resources
  │     Return key data points
  │
  ├──→ Sub-agent 2 (Source Expert)
  │     Independent context window
  │     Search GitHub repository
  │     Return benchmark data
  │
  └──→ Sub-agent 3 (History Expert)
        Independent context window
        Search historical sessions
        Return previous discussion records
  │
  ▼
Main agent synthesizes three results → Output final recommendation

Sub-agent Characteristics

FeatureDescription
Independent contextEach sub-agent has its own context window, won’t pollute the main session
Independent tool setCan restrict which tools each sub-agent can use
Error isolationOne sub-agent failing doesn’t affect others
Token efficiencyOnly final results return to the main context, intermediate steps don’t occupy the main window

Available Sub-agents

Sub-agents are defined in ~/.pi/agent/agents/*.md — each .md file defines a sub-agent’s role and tool set:

Sub-agentPurposeTools
pv-explorerCode exploration — analyze architecture, call chains, design patternsread, grep, find, ls
pv-reviewerIndependent plan review — check architecture violations, dependency direction errorsread, grep, find, ls
pv-executorExecute code changes — implement according to plan, make tests passAll tools
pv-simplifierCode simplification — identify duplication, inefficient patternsread, grep, find, ls
fo-analyzerFactor analysis — run analysis scripts + parse resultsbash, read
fo-verifierFactor verification — run final backtest + output reportbash, read
fr-searcherFactor search — search literature and source coderead, grep, find, ls
fr-writerFactor writing — synthesize research findingsread, grep, find, ls
security-auditorSecurity audit — check for security vulnerabilitiesread, grep, find, ls

Use Cases

Scenario 1: Plan Review (Plan-Verify Flow)

Before making complex changes, use pv-reviewer to independently review the plan:

Main agent: I've designed a refactoring plan, let the reviewer check it.
  │
  └──→ pv-reviewer
        "Review the following plan: Switch the auth module from JWT to session-based"

        Review result:
        ✅ Dependency direction is correct
        ⚠️ Violates data model invariant #3 (user session should be immutable)
        ❌ Missing test coverage — new session storage needs integration tests

Value: The reviewer looks at the plan from an independent perspective and can catch issues that the main agent “can’t see from inside.”

Scenario 2: Code Exploration

Use pv-explorer for structured code analysis without polluting the main context:

Main agent: I need to understand the architecture of this module.
  │
  └──→ pv-explorer
        "Analyze the architecture, call chains, and design patterns of src/auth/"

        Returns:
        - Module structure diagram
        - Core function call chains
        - Design patterns used (middleware chain, strategy pattern)
        - Dependency direction

Value: The exploration process may read dozens of files — if all of them were in the main context, they’d fill it up. The sub-agent only returns the essence.

Scenario 3: Security Audit

Use security-auditor to check code for security vulnerabilities:

Main agent: This code handles user input, help me do a security audit.
  │
  └──→ security-auditor
        "Audit input validation and injection risks in src/api/handlers/"

        Findings:
        ⚠️ SQL injection risk: query parameters directly concatenated into SQL
        ⚠️ XSS risk: user input returned as HTML without escaping
        ✅ Auth checks: all endpoints have auth middleware

Sub-agents vs Normal Chat

DimensionNormal ChatSub-agent
ContextAll information in one windowEach sub-agent has its own window
ParallelismSequential processingCan be parallel
Error impactGlobal impactIsolated
Token consumptionIntermediate steps all occupy main windowOnly results occupy main window
Best forSimple tasks, Q&AComplex research, multi-party review

💡 Rule of thumb: If a task requires reading many files but ultimately needs just one conclusion, use sub-agents. If it requires multi-turn interactive discussion, use normal chat.

Notes

  • Sub-agents are one-shot — they return results and terminate, they don’t maintain state
  • Sub-agents have restricted tool setspv-explorer only has read-only tools and cannot modify files
  • Sub-agents cannot see the full context of the main session — they only see the task description you pass to them
  • Therefore, the task description passed to a sub-agent must be sufficiently detailed — include all necessary background information

Next Steps

📖 Return to 6.1 Automation & Workflows for complete usage examples.

Write Your Own Extensions

Why Write Your Own Extension?

pi-atelier provides 10 extensions covering core scenarios like memory, planning, rules, retrospective, compression, and automation. But every project has its own special needs:

  • Your team uses Feishu instead of Slack, so you need a Feishu notification extension
  • You’re doing game development and need an extension to automatically manage the assets directory
  • You’re writing academic papers and need an extension for LaTeX compilation + citation checking

💡 At its core, an extension is about giving AI new tools and new knowledge.

Extension Architecture

What Makes Up an Extension?

pi-xxx/
├── package.json        # Package metadata + pi extension configuration
├── index.ts            # Entry point, registers tools and hooks
├── lib/                # Tool implementations
│   └── tools-xxx.ts
├── prompts/            # Prompt templates (descriptions visible to AI)
│   └── xxx-description.md
└── README.md           # Documentation

Core Concepts

ConceptDescriptionAnalogy
ToolA function AI can callGiving AI a new hammer
HookLogic executed at specific momentsGiving AI an alarm clock
PromptDescription of the tool (what AI sees)Telling AI how to use this hammer
ConfigUser-configurable parametersThe hammer’s force adjustment

Extension Lifecycle

1. pi starts
     │
     ▼
2. Load packages from settings.json
     │
     ▼
3. Install/update extensions (npm or git)
     │
     ▼
4. Execute extension entry function `export default function(pi)`
     │
     ├── Register tools (pi.registerTool)
     ├── Register commands (pi.registerCommand)
     └── Listen to events (pi.on)
     │
     ▼
5. AI session can now call the new tools

Hands-On: Writing a “Code Stats” Extension from Scratch

Let’s build a simple extension step by step — counting lines of code in a project.

Step 1: Create the Project

mkdir pi-code-stats
cd pi-code-stats
npm init -y

Modify package.json:

{
  "name": "pi-code-stats",
  "version": "0.1.0",
  "main": "index.ts",
  "piExtension": true
}

💡 "piExtension": true tells pi this is an extension package. "main" points to the entry file (TypeScript or JavaScript — both work; pi uses jiti to load them).

Step 2: Write the Tool Implementation

lib/tools-stats.ts:

import { execSync } from 'child_process';

export function countLines(directory: string, extension: string): {
  total: number;
  files: { path: string; lines: number }[];
} {
  const cmd = `find ${directory} -name "*.${extension}" -not -path "*/node_modules/*" -not -path "*/.git/*"`;
  const files = execSync(cmd).toString().trim().split('\n');
  
  const results = files.map(file => ({
    path: file,
    lines: Number(execSync(`wc -l < ${file}`).toString().trim())
  }));
  
  return {
    total: results.reduce((sum, r) => sum + r.lines, 0),
    files: results.sort((a, b) => b.lines - a.lines)
  };
}

Step 3: Write the Entry File

index.ts:

import type { ExtensionAPI } from '@earendil-works/pi-coding-agent';
import { countLines } from './lib/tools-stats';

export default function (pi: ExtensionAPI) {
  pi.registerTool({
    name: 'code_stats',
    label: 'Code Stats',
    description: 'Count lines of code in a project. Use when the user says "count code" or "how many lines of code".',
    promptSnippet: 'Count lines of code in a project.',
    parameters: {
      type: 'object',
      properties: {
        directory: {
          type: 'string',
          description: 'The directory path to count'
        },
        extension: {
          type: 'string',
          description: 'File extension, e.g. ts, py, rs'
        }
      },
      required: ['directory']
    },
    async execute(_toolCallId: string, params: any): Promise<any> {
      const result = countLines(params.directory, params.extension || 'ts');
      return {
        totalLines: result.total,
        fileCount: result.files.length,
        topFiles: result.files.slice(0, 10)
      };
    }
  });
}

Step 4: Write the Tool Description

prompts/stats-description.md:

Count lines of code in a project.

Parameters:
- directory (required): The directory path to count
- extension (optional): File extension, defaults to ts

Returns:
- totalLines: Total line count
- fileCount: Number of files
- topFiles: Top 10 largest files

Example:
  code_stats(directory="src", extension="ts")
  → { totalLines: 12340, fileCount: 45, topFiles: [...] }

Step 5: Install and Test

// settings.json
{
  "packages": [
    "./path/to/pi-code-stats"
  ]
}

Restart pi, and the AI will be able to use the code_stats tool.

pi-shared-utils: Your Toolbox

When writing extensions, you don’t have to start from scratch every time. pi-shared-utils provides a set of common utility functions:

ModuleFunctionWhen to Use
loggerUnified logging formatWhen you need to print debug info
storageCross-session persistent storageWhen you need to save configuration or state
pathsUnified path handlingWhen you need to find file locations
jsonSafe JSON read/writeWhen you need to manipulate JSON files
validatorParameter validationWhen you need to validate tool parameters
settings-backupsettings.json backup and rollbackWhen you need to safely write config
file-lockFile locks (proper-lockfile wrapper)When you need to prevent race conditions
configThree-layer config merging (defaults → global → project)When your extension needs configurable parameters

Usage Example

import { logger, storage, paths } from '@pi-atelier/shared-utils';

// Logging
logger.info('Extension activated');
logger.warn('Missing configuration file, using defaults');

// Paths
const projectRoot = paths.getProjectRoot();
const memoryDir = paths.getMemoryDir();

Configuration API Example

If your extension needs user-configurable parameters:

import { getEffectiveConfig } from '@pi-atelier/shared-utils';

const defaults = { threshold: 1000, enabled: true };
const config = getEffectiveConfig('my-extension', defaults, cwd);
// config = final configuration after three-layer merge

Debugging Your Extension

Common issues during extension development: the tool is registered but AI doesn’t call it, errors occur in the handler without visible logs, or the returned result isn’t what was expected.

Viewing Log Output

Output from logger.info() and console.log() in your extension appears in pi’s terminal window (not the chat window). Debugging steps:

# Start pi in the terminal (not in the background) to see all log output
pi

# Then ask the AI to call your tool in the chat window
# The terminal will display the log output

Confirming Tool Registration

In the pi chat, directly ask the AI:

What tools do you have available? Can you see code_stats?

If the AI can’t see your tool, check:

  • Does package.json have "piExtension": true?
  • Is the package path in settings.json correct?
  • Is the entry function exported correctly (export default function(pi))?

Common Issue Troubleshooting

IssueCauseSolution
AI can’t see the toolMissing piExtension fieldAdd "piExtension": true to package.json
Tool call errorsException in handlerCheck the error stack in terminal logs
AI doesn’t call the toolDescription is too vagueMake the tool description more specific, include parameter details and examples
Empty return valueAsync operation not awaitedAdd async to handler, add await to calls
Path not foundRelative path issuesUse paths.getProjectRoot() to get absolute paths

💡 Tip: During extension development, you can add console.log(JSON.stringify(args, null, 2)) at the beginning of your handler to print the parameters and see what the AI is passing in.

Publishing Your Extension

Publishing to npm

# 1. Confirm package.json info is complete
npm version patch  # 0.1.0 → 0.1.1

# 2. Publish
npm publish --access public

Installing After Publishing

Other users can add your package name to their settings.json:

{
  "packages": [
    "pi-code-stats"
  ]
}

Pre-Publishing Checklist

  • package.json has a complete description and keywords
  • README.md is written following the template (installation, usage, configuration, examples)
  • Has a LICENSE file
  • Has a CHANGELOG.md
  • Code has basic error handling
  • Tool descriptions (prompts/*.md) are clear and complete

Extension Development Best Practices

✅ Good Extension Design

  1. Single Responsibility: One extension does one thing — don’t cram all functionality into a single package
  2. Description as Documentation: Write tool descriptions clearly enough that the AI doesn’t have to guess
  3. Parameter Validation: Validate parameters in the handler and provide meaningful error messages
  4. Idempotent Operations: Same input should produce the same output — avoid side effects

✅ Writing Good Tool Descriptions

# Good description
Count lines of code for a specific file type in a given directory.
Parameters:
- directory (required): directory path
- extension (optional): file suffix, defaults to "ts"
Returns: { totalLines, fileCount, topFiles }
# Bad description
Count code

❌ Common Mistakes

  • Tool name too generic: analyze → should be code_stats
  • Description too brief: AI doesn’t know how to use it and will pass wrong parameters
  • Forgetting error handling: crashes when file doesn’t exist
  • Return value too large: returning the entire file content → should return a summary

Appendix: pi Extension API Quick Reference

registerTool

pi.registerTool({
  name: string,           // Tool name (unique identifier, AI uses this to call it)
  label: string,          // Display name (shown in TUI)
  description: string,    // Tool description (what AI sees, determines when AI calls it)
  promptSnippet?: string, // Short description (injected into AI system prompt, if empty won't appear in Available tools)
  promptGuidelines?: string[], // AI usage guidelines
  parameters: TypeBox.Object({...}),  // Parameter definition (TypeBox schema)
  renderShell?: "default" | "self",  // Render mode (default "default")
  executionMode?: "sequential" | "parallel", // Execution mode
  async execute(
    toolCallId: string,    // Tool call ID
    params: any,           // Parameters passed by AI
    signal: AbortSignal | undefined,  // Cancel signal
    onUpdate: Function | undefined,   // Streaming update callback
    ctx: ExtensionContext  // Execution context
  ): Promise<{ content: [{ type: "text", text: string }], details: any }>
});

registerCommand

pi.registerCommand(name: string, {
  description: string,    // Command description
  getArgumentCompletions?: (prefix: string) => AutocompleteItem[],  // Argument autocomplete
  handler: async (args: string, ctx: ExtensionCommandContext) => {
    // args: text entered by user (text after /command)
    // ctx.ui.notify(message, level): Show notification
    // ctx.compact(): Trigger compaction
    // ctx.switchModel(model): Switch model
  }
});

registerShortcut

pi.registerShortcut(shortcut: string, {
  description: string,
  handler: async (ctx: ExtensionContext) => { ... }
});

Event Listeners

// Session lifecycle
pi.on('session_start', (event, ctx) => { ... });
pi.on('session_shutdown', (event, ctx) => { ... });
pi.on('session_before_switch', (event, ctx) => { ... });
pi.on('session_before_fork', (event, ctx) => { ... });
pi.on('session_before_tree', (event, ctx) => { ... });
pi.on('session_tree', (event, ctx) => { ... });

// Compaction
pi.on('session_before_compact', (event, ctx) => { ... });
pi.on('session_compact', (event, ctx) => { ... });

// AI interaction
pi.on('before_provider_request', (event, ctx) => { ... });
pi.on('after_provider_response', (event, ctx) => { ... });
pi.on('context', (event, ctx) => { ... });

// Agent lifecycle
pi.on('before_agent_start', (event, ctx) => { ... });
pi.on('agent_start', (event, ctx) => { ... });
pi.on('agent_end', (event, ctx) => { ... });
pi.on('turn_start', (event, ctx) => { ... });
pi.on('turn_end', (event, ctx) => { ... });

// Messages
pi.on('message_start', (event, ctx) => { ... });
pi.on('message_update', (event, ctx) => { ... });
pi.on('message_end', (event, ctx) => { ... });

// Tool execution
pi.on('tool_call', (event, ctx) => { ... });
pi.on('tool_result', (event, ctx) => { ... });
pi.on('tool_execution_start', (event, ctx) => { ... });
pi.on('tool_execution_update', (event, ctx) => { ... });
pi.on('tool_execution_end', (event, ctx) => { ... });

// Other
pi.on('model_select', (event, ctx) => { ... });
pi.on('thinking_level_select', (event, ctx) => { ... });
pi.on('input', (event, ctx) => { ... });
pi.on('user_bash', (event, ctx) => { ... });
pi.on('resources_discover', (event, ctx) => { ... });

Helper Methods

pi.sendMessage(message, options?);     // Send custom message to session
pi.appendEntry(role, content);           // Append a message to the session
pi.registerFlag(name, options);          // Register CLI flag
pi.getFlag(name);                        // Get CLI flag value
pi.registerMessageRenderer(type, renderer); // Register custom message renderer

Congratulations, You’ve Made It!

Now you understand all the core concepts of pi-atelier:

  1. Memory (pi-memory) — Let AI remember knowledge
  2. Planning (pi-roadmap) — Let AI manage tasks
  3. Rules (pi-shepherd + pi-context-manager) — Let AI follow rules and control information quality
  4. Retrospective (pi-session-analyzer + pi-journal) — Let AI record and review work
  5. Compression & Diagnostics (pi-smart-compact + pi-context-manager) — Keep AI smart in long sessions
  6. Automation (pi-scheduler + pi-workflow) — Let AI work proactively
  7. Extensions (pi-shared-utils + your own extensions) — Make AI capable of anything

Feel free to submit Issues and PRs on GitHub — let’s make the AI coding assistant better together!

Appendix

A. Extension Quick Reference

ExtensionInstall CommandCore Tools/CommandsOne-Liner Purpose
pi-memory"pi-memory"memory_update, memory_indexCross-session knowledge persistence
pi-roadmap"pi-roadmap"roadmap_plan, roadmap_next, roadmap_done, roadmap_search, roadmap_update, etc.Task breakdown, progress tracking, dependency management
pi-shepherd"pi-shepherd"shepherd_rules, Rule-driven hook engineAI behavior guard
pi-context-manager"pi-context-manager"payload_analyze, /record, /context, /distill-config, /aging-config, etc.Context quality control + Token diagnostics
pi-session-analyzer"pi-session-analyzer"session_search, session_analyzeHistorical session search and review
pi-smart-compact"pi-smart-compact"/smart-compact, /smart-compact-configIntelligent long-session compression
pi-scheduler"pi-scheduler"schedule, /loop, /remind, /tasksScheduled tasks and reminders
pi-workflow"pi-workflow"registerWorkflowTool (called by other extensions)Workflow framework library
pi-shared-utils"pi-shared-utils"logger, storage, paths, json, validator, settings-backup, file-lockExtension development utility library
pi-journal"pi-journal"/journal, journalLog report generation (git activity + session events + memory changes)

Personal Projects (Lightweight Combo)

{
  "packages": [
    "pi-memory",
    "pi-roadmap",
    "pi-smart-compact"
  ]
}

Core three: Remember knowledge + Manage tasks + Stay smart in long sessions.

Team Projects (Standard Combo)

{
  "packages": [
    "pi-memory",
    "pi-roadmap",
    "pi-shepherd",
    "pi-session-analyzer",
    "pi-smart-compact"
  ]
}

Adds rules and retrospective capabilities.

Large Refactors (Full Combo)

{
  "packages": [
    "pi-memory",
    "pi-roadmap",
    "pi-shepherd",
    "pi-context-manager",
    "pi-session-analyzer",
    "pi-smart-compact",
    "pi-scheduler"
  ]
}

Full installation, fully leveraging diagnostics and automation capabilities.

C. pi Internal Mechanics Overview

Compaction

pi has a built-in context compression mechanism. When the conversation history approaches the context window limit, pi automatically compresses older conversations. The Smart Compact extension enhances this mechanism — it identifies critical information (decisions, conventions, conclusions) and prioritizes preserving it.

Distill

Tool results can be very large (e.g., reading a 1000-line file). pi has a built-in distill mechanism to compress tool output. The pi-context-manager extension provides:

  • Auto Distill: Automatically compresses tool output exceeding the threshold (/distill-config)
  • First Full Content Cap: firstSeenCap (/distill-config --cap) limits the initial output size
  • Tool Result Processor: Format-specific streamlining for certain tool types (/processor-config)
  • Aging: Automatically evicts old tool output (/aging-config)

Tool Call Lifecycle

1. AI decides to call a tool
     │
     ▼
2. Shepherd tool_call hook (rewrite / block / notify / steer)
     │
     ▼
3. Execute tool
     │
     ▼
4. Context Manager distill + processor processes the return value
     │
     ▼
5. Shepherd tool_result hook (notify / steer)
     │
     ▼
6. Result returned to AI
     │
     ▼
7. AI generates reply
     │
     ▼
8. Shepherd message_end hook (steer — matches AI reply text)

Session Storage

All session data is stored under the ~/.pi/ directory:

~/.pi/
├── roadmap/              # Global roadmaps
└── agent/
    ├── settings.json         # Global config (installed extensions, providers)
    ├── mcp.json              # MCP server configuration
    ├── memory/               # Global memory files (L1)
    ├── skills/               # Global skills
    ├── extensions/           # Inline extensions
    ├── agents/               # Sub-agent definitions
    ├── npm/node_modules/     # npm-installed extension packages
    ├── git/                  # Git package installation location
    ├── sessions/             # Session history records (JSONL)
    ├── distill/              # context-manager data
    │   └── recordings/       # Payload recordings

{project}/.pi/
├── settings.json         # Project-level config (overrides global)
├── memory/               # Project-level memory (L2)
└── roadmap/              # Project-level roadmaps

D. Frequently Asked Questions

Q: Extension not taking effect after installation?

Check:

  1. Whether settings.json format is correct (JSON syntax)
  2. Whether the package name is spelled correctly
  3. Restart pi (extensions need a restart to be loaded)

Q: Too many memory files?

pi-memory automatically checks the file count. It’s recommended to clean up when exceeding 25 files; writes are refused beyond 40. Cleanup methods:

  1. Merge multiple files on the same topic
  2. Delete outdated memories
  3. Split large files into smaller ones

Q: Shepherd rules not working?

Check:

  1. Global rules are in the pi-shepherd package’s rules.json
  2. Project rules go in .pi/shepherd-rules-*.json (note the file name prefix)
  3. Confirm "enabled": true in the rule

Q: Token consumption too fast?

  1. Use payload_analyze with budget and expensive modes to identify token hogs
  2. Use compact mode for searches (semantic_code_search(compact: true))
  3. Lower the distill threshold (/distill-config)
  4. Configure aging to auto-evict old content (/aging-config)

Q: payload_analyze reports “no recordings”?

You need to enable recording first: /record on. Use normally while recording, then /record off when done.

中文