Rikugan Architecture - Agent Internals

High-Level Overview

Rikugan is a generator-based agentic loop embedded inside IDA Pro and Binary Ninja. The agent runs in a background thread and communicates with the Qt UI via a stream of TurnEvent objects. All host API calls (IDA/BN) are marshalled to the main thread via @idasync.

Key Files

File	Purpose
`rikugan/agent/loop.py`	AgentLoop + BackgroundAgentRunner
`rikugan/agent/turn.py`	TurnEvent / TurnEventType definitions
`rikugan/tools/base.py`	`@tool` decorator, ToolDefinition
`rikugan/tools/registry.py`	ToolRegistry
`rikugan/ui/panel_core.py`	RikuganPanelCore (Qt UI)
`rikugan/ui/session_controller_base.py`	SessionControllerBase

The Agentic Loop

File: rikugan/agent/loop.py

AgentLoop.run() Pipeline

The entry point is AgentLoop.run(), a Python generator. It yields TurnEvent objects that the UI consumes.

Command Detection

/plan, /modify, /explore, /undo, /memory, /mcp, /doctor

Skill Resolution

Match slug, inject prompt, filter tools

System Prompt

Host base + binary info + memory + skill

Turn Loop

Stream → Parse → Execute → Yield events

Compaction

Context window check at 80% threshold

Turn Loop Detail

while True:
    yield TURN_START
    stream = provider.chat_stream(messages, tools, system)
    text, tool_calls = _stream_llm_turn(stream)   # yields TEXT_DELTA events
    yield TEXT_DONE

    if not tool_calls:
        break   # LLM is done, no more tool calls

    results = _execute_tool_calls(tool_calls)      # yields TOOL_RESULT events
    append results to messages
    yield TURN_END

_stream_llm_turn()

Consumes the provider's chat_stream() generator. Each StreamChunk is processed:

Chunk Field	Action	Event Yielded
`chunk.text`	Accumulate text	`TEXT_DELTA`
`chunk.is_tool_call_start`	Start tool accumulation	`TOOL_CALL_START`
`chunk.tool_args_delta`	Accumulate JSON args	`TOOL_CALL_ARGS_DELTA`
`chunk.is_tool_call_end`	Finalize ToolCall	`TOOL_CALL_DONE`
`chunk.usage`	Update context manager	`USAGE_UPDATE`

_execute_tool_calls()

For each ToolCall in the batch:

Pseudo-tool check — Handled inline with continue
Approval gate — execute_python requires user approval
Pre-state capture — For mutating tools, capture_pre_state() records current state
Execution — ToolRegistry.execute(name, args) dispatches to the handler
Mutation recording — If defn.mutating, creates MutationRecord, yields MUTATION_RECORDED
Error handling — ToolError and Exception caught, returned as error results
Result — Each result becomes a ToolResult and yields TOOL_RESULT

BackgroundAgentRunner

class BackgroundAgentRunner:
    def start(self, user_message):
        self._thread = Thread(target=self._run, args=(user_message,))
        self._thread.start()

    def _run(self, message):
        for event in self.agent_loop.run(message):
            self._event_queue.put(event)

    def get_event(self, timeout=0):
        return self._event_queue.get(timeout=timeout)

TurnEvent System

File: rikugan/agent/turn.py — All communication from the agent loop to the UI goes through TurnEvent objects. Each event type has a static factory method for clean construction.

TEXT_DELTA

Streaming text token from LLM

text

TEXT_DONE

Full assistant text complete

text

TOOL_CALL_START

LLM requested a tool call

tool_call_id, tool_name

TOOL_CALL_ARGS_DELTA

Streaming tool arguments

tool_call_id, tool_args

TOOL_CALL_DONE

Tool call arguments finalized

tool_call_id, tool_name, tool_args

TOOL_RESULT

Tool execution result returned

tool_call_id, tool_name, tool_result

TURN_START

New turn begins

turn_number

TURN_END

Turn complete

turn_number

ERROR

Error occurred in the loop

error

CANCELLED

User cancelled the operation

—

USAGE_UPDATE

Token usage update

usage (TokenUsage)

USER_QUESTION

Agent asks user a question

text, metadata.options

PLAN_GENERATED

Plan mode: plan ready

plan_steps

PLAN_STEP_START

Executing plan step

plan_step_index, text

PLAN_STEP_DONE

Plan step complete

plan_step_index, text

TOOL_APPROVAL_REQUEST

Script approval needed

tool_call_id, tool_name, text

EXPLORATION_PHASE_CHANGE

Phase transition in explore mode

metadata.from_phase, to_phase

EXPLORATION_FINDING

Discovery logged

text, metadata.category, address

PATCH_APPLIED

Binary patch applied

metadata.address, original, new

PATCH_VERIFIED

Patch verified by decompiler

metadata.address, success

SAVE_APPROVAL_REQUEST

Save gate reached

metadata.patch_count, total_bytes

SAVE_COMPLETED

Patches saved to file

metadata.patch_count

SAVE_DISCARDED

Patches discarded

metadata.rolled_back

MUTATION_RECORDED

Mutation logged for undo

tool_name, metadata.reversible

Tool Framework

Files: rikugan/tools/base.py, rikugan/tools/registry.py

@tool Decorator

@tool(category="annotations", mutating=True)
def rename_function(
    old_name: Annotated[str, "Current function name"],
    new_name: Annotated[str, "New name to assign"],
) -> str:
    """Rename a function in the database."""
    # ... implementation

The decorator inspects the function signature, extracts parameter descriptions from Annotated metadata, generates a ToolDefinition with JSON schema, and wraps the handler with @idasync for IDA thread-safety.

ToolDefinition

@dataclass
class ToolDefinition:
    name: str
    description: str
    parameters: List[ParameterSchema]
    category: str = "general"
    requires_decompiler: bool = False
    mutating: bool = False              # marks tool as modifying the database
    timeout: Optional[float] = None     # per-tool timeout in seconds
    handler: Optional[Callable] = None

Argument Coercion

The registry automatically coerces arguments during execution:

Input	Target	Result
`"0x401000"`	`int`	`4198400`
`"true"` / `"false"`	`bool`	`True` / `False`
`0` / `1`	`bool`	`False` / `True`

Tool Categories

Category	Examples
Navigation	`get_cursor_position`, `jump_to`, `get_name_at`
Functions	`list_functions`, `search_functions`, `get_function_info`
Strings	`list_strings`, `search_strings`
Database	`list_segments`, `list_imports`, `list_exports`, `read_bytes`
Disassembly	`read_disassembly`, `read_function_disassembly`
Decompiler	`decompile_function`, `get_pseudocode`
Xrefs	`xrefs_to`, `xrefs_from`, `function_xrefs`
Annotations	`rename_function`, `set_comment`, `set_type`
Types	`create_struct`, `modify_struct`, `set_function_prototype`
Scripting	`execute_python` (requires approval)
Microcode (IDA)	`get_microcode`, `nop_microcode`
IL (BN)	`get_il`, `get_cfg`, `il_replace_expr`, `il_set_condition`, `nop_instructions`, `patch_branch`

Timeout Wrapping

future = _executor.submit(defn.handler, **arguments)
result = future.result(timeout=timeout)  # default 30s

Pseudo-Tools

Pseudo-tools are tool schemas injected into the LLM's tool list but handled directly in _execute_tool_calls() rather than dispatched through the registry. They use a continue statement to skip normal execution.

exploration_report

Logs structured findings during exploration. Categories: function_purpose, hypothesis, data_structure, constant, string_ref, import_usage, patch_result. When category="patch_result", also creates a PatchRecord.

phase_transition

Requests a phase change in exploration mode. Validates via ExplorationState.can_transition_to(). Denied if KnowledgeBase lacks minimum findings.

save_memory

Persists a fact to RIKUGAN.md in the IDB/BNDB directory. Categories: function_purpose, architecture, naming_convention, prior_analysis, general.

spawn_subagent

Creates an isolated SubagentRunner with its own SessionState. Accepts task and max_turns parameters.

exploration_report Schema

{
  "category": "function_purpose|hypothesis|patch_result|...",
  "summary": "Description of the finding",
  "address": 4198400,
  "function_name": "main",
  "relevance": "high|medium|low",
  "original_hex": "74 05",
  "new_hex": "75 05"
}

Skill System

Files: rikugan/skills/loader.py, rikugan/skills/registry.py

Skill Format

Skills are Markdown files with YAML frontmatter:

---
name: Malware Analysis
description: Windows PE malware analysis workflow
tags: [malware, windows]
allowed_tools: [decompile_function, list_imports, search_strings]
mode: exploration
---
Task: Analyze this binary as potential malware.

## Approach
1. Check imports for suspicious APIs...

Discovery

SkillRegistry.discover() scans:

Built-in skills: rikugan/skills/builtins/*/SKILL.md
User skills (IDA Pro): ~/.idapro/rikugan/skills/*/SKILL.md (Linux / macOS) · %APPDATA%\Hex-Rays\IDA Pro\rikugan\skills\*\SKILL.md (Windows)
User skills (Binary Ninja): ~/.binaryninja/rikugan/skills/*/SKILL.md (Linux) · ~/Library/Application Support/Binary Ninja/rikugan/skills/*/SKILL.md (macOS) · %APPDATA%\Binary Ninja\rikugan\skills\*\SKILL.md (Windows)

Reference files in references/*.md subdirectories are automatically appended to the skill body.

Built-in Skills

/malware-analysisWindows PE triage

/linux-malwareELF malware analysis

/deobfuscationCFF, opaque predicates

/vuln-auditBuffer overflows, fmt string

/driver-analysisWindows kernel drivers

/ctfCTF challenge solving

/generic-reGeneral reverse engineering

/ida-scriptingIDAPython API reference

/binja-scriptingBN Python API reference

/modifyAutonomous binary modification

/smart-patch-idaIDA-specific patching

/smart-patch-binjaBN-specific patching

Exploration Mode

Files: rikugan/agent/exploration_mode.py, rikugan/agent/loop.py

Exploration mode is a 4-phase autonomous agent flow for binary modification.

EXPLORE

Investigate binary, accumulate findings in KnowledgeBase

PLAN

Synthesize findings into concrete modification plan

EXECUTE

Apply patches in-memory for each planned change

SAVE

User approval gate before persisting changes

Phase 1: EXPLORE

The agent autonomously investigates the binary to understand the user's goal.

Triggered by /modify <goal>, /explore <goal>, or skills with mode: exploration
Uses all analysis tools + exploration_report + phase_transition pseudo-tools
Findings accumulated in KnowledgeBase: relevant functions, structured findings, hypotheses
Turn limit: 30 turns (max_explore_turns)
For /modify: runs as a subagent (isolated context window)

Phase Transition Gate

KnowledgeBase.has_minimum_for_planning

To move EXPLORE → PLAN, all of these must be true:

At least 1 relevant function discovered
At least 1 hypothesis formed
At least 1 hypothesis with relevance="high"

If the gate fails, the agent receives a gap description and continues exploring.

Phase 2: PLAN

Receives PLAN_SYNTHESIS_PROMPT with KnowledgeBase.to_summary(). Outputs a numbered list of changes, each with target address, current/proposed behavior, and patch strategy. User must approve before execution.

Phase 3: EXECUTE

Iterates over ModificationPlan.changes. Each step activates the platform-specific patching skill. After each patch, exploration_report(category="patch_result") creates a PatchRecord.

Phase 4: SAVE

Emits SAVE_APPROVAL_REQUEST with patch details. User responds "Save All" or "Discard All". Discard rolls back by writing PatchRecord.original_bytes back.

/explore vs /modify

Aspect	/explore	/modify
Phases	EXPLORE only	EXPLORE → PLAN → EXECUTE → SAVE
Subagent	No (inline)	Yes (Phase 1 in subagent)
Patching	No	Yes
Knowledge base	Accumulated, returned	Accumulated, passed to Phase 2

ExplorationState

@dataclass
class ExplorationState:
    phase: ExplorationPhase
    knowledge_base: KnowledgeBase
    modification_plan: Optional[ModificationPlan]
    patches_applied: List[PatchRecord]
    explore_turns: int
    execute_turns: int
    total_turns: int        # monotonic counter for UI
    max_explore_turns: int  # default 30
    max_execute_turns: int  # default 20
    explore_only: bool      # True for /explore (no patching)

Plan Mode

Files: rikugan/agent/plan_mode.py, rikugan/agent/loop.py

Plan mode is a simpler two-step workflow: plan first, then execute. Triggered by /plan <message>.

Plan Generation

Plan Parsing

User Approval

Step-by-Step Execution

Plan Generation — LLM receives _PLAN_GENERATION_PROMPT, outputs a numbered list
Plan Parsing — parse_plan() extracts numbered steps from text
User Approval — PLAN_GENERATED event; user approves or rejects
Step Execution — For each step: emit PLAN_STEP_START, run a full turn cycle, emit PLAN_STEP_DONE

Subagents

File: rikugan/agent/subagent.py

Subagents are isolated AgentLoop instances with their own SessionState. They keep the parent's context window clean from verbose tool output.

class SubagentRunner:
    def run_task(self, task, max_turns=20) -> Generator[TurnEvent, None, str]:
        # General-purpose: returns final text
        loop = AgentLoop(provider, tools, config, fresh_session)
        for event in loop.run(augmented_task):
            yield event
        return final_text

    def run_exploration(self, user_goal, max_turns=30) -> Generator[..., None, KnowledgeBase]:
        # Phase 1 specific: returns KnowledgeBase
        loop = AgentLoop(provider, tools, config, fresh_session)
        for event in loop.run(f"/explore {user_goal}"):
            yield event
        return loop.last_knowledge_base

Knowledge Base Transfer

_clear_exploration_state() saves the KnowledgeBase to _last_knowledge_base
The parent accesses it via the last_knowledge_base property
The parent populates its own ExplorationState.knowledge_base from the subagent's results
Phases 2-4 proceed in the parent with a clean context window

Mutation Tracking & Undo

File: rikugan/agent/mutation.py

Every mutating tool call (defn.mutating=True) is recorded in AgentLoop._mutation_log for undo support.

MutationRecord

@dataclass
class MutationRecord:
    tool_name: str              # e.g., "rename_function"
    arguments: Dict[str, Any]   # original arguments
    reverse_tool: str           # tool to call for undo
    reverse_arguments: Dict     # arguments for undo
    timestamp: float
    description: str            # human-readable
    reversible: bool            # False for execute_python, etc.

Reverse Strategies

Tool	Reverse Strategy
`rename_function`	Swap `old_name` ↔ `new_name`
`rename_variable`	Swap `variable_name` ↔ `new_name`
`set_comment`	Restore `old_comment` (from pre-state) or `delete_comment`
`set_function_comment`	Restore `old_comment` or `delete_function_comment`
`rename_data`	Restore `old_name` (from pre-state)
`set_function_prototype`	Restore `old_prototype` (from pre-state)
`retype_variable`	Restore `old_type` (from pre-state)
`execute_python`	Not reversible

/undo Flow

Context Window Management

File: rikugan/agent/context_window.py

class ContextWindowManager:
    max_tokens: int           # from config (default 128000)
    compaction_threshold: 0.8 # compact when usage > 80%

    def should_compact() -> bool
    def compact_messages(messages) -> List[Message]
    def estimate_tokens(text) -> int  # ~3.5 chars/token heuristic

Compaction Strategy

Keep First Message

Summarize Middle

Keep Last 4 Messages

When should_compact() returns True: keep the first message (system/initial), summarize all middle messages into one [Context summary] message, and keep the last 4 messages (recent context).

SessionState._truncate_results() also caps tool results with [...N chars omitted] markers to prevent individual messages from consuming too much context.

Persistent Memory

Files: rikugan/agent/system_prompt.py, rikugan/agent/loop.py

RIKUGAN.md

A per-binary Markdown file stored alongside the IDB/BNDB. It acts as cross-session memory.

Location: <idb_directory>/RIKUGAN.md
Loading: First 200 lines loaded into the system prompt
Writing: Via the save_memory pseudo-tool or plan persistence

save_memory Pseudo-Tool

{"fact": "sub_401230 is the snake initializer, length at +0x1A", "category": "function_purpose"}

Categories: function_purpose, architecture, naming_convention, prior_analysis, general.

Session Management

Files: rikugan/state/session.py, rikugan/state/history.py, rikugan/ui/session_controller_base.py

SessionState

@dataclass
class SessionState:
    id: str                        # unique hex ID
    created_at: float
    messages: List[Message]        # full conversation history
    total_usage: TokenUsage        # cumulative token usage
    last_prompt_tokens: int        # most recent prompt size
    current_turn: int
    is_running: bool
    provider_name: str
    model_name: str
    idb_path: str
    metadata: Dict[str, str]

Multi-Tab & Fork

class SessionControllerBase:
    _sessions: Dict[str, SessionState]  # tab_id -> session
    _active_tab_id: str

    def create_tab() -> str
    def close_tab(tab_id)
    def switch_tab(tab_id)
    def fork_session(source_tab_id) -> Optional[str]  # deep copy

fork_session() creates a deep copy of a session's messages and state into a new tab. The forked session gets metadata["forked_from"] set to the source session ID.

Persistence

Sessions are JSON-serialized to <config_dir>/rikugan/sessions/. Auto-saved after each agent turn if checkpoint_auto_save is enabled. Full round-trip: messages, token usage, tool calls, and tool results are all preserved.

MCP Integration

Files: rikugan/mcp/client.py, rikugan/mcp/bridge.py, rikugan/mcp/manager.py

mcp.json config

MCPManager

MCPClient (per server)

subprocess (stdio)

MCPClient

Communicates with an MCP server subprocess via JSON-RPC 2.0 + Content-Length framing.

Heartbeat: Background thread pings every 30s. Marks _healthy=False on failure
Per-request timeout: Configurable default
Tool discovery: tools/list RPC call at startup

MCPBridge

Converts MCP tool schemas to ToolDefinition objects and registers them in the ToolRegistry with the prefix mcp_<server>_<tool>.

Provider Layer

Files: rikugan/providers/base.py, rikugan/providers/*.py

Anthropic (Claude)

OAuth auto-detection, prompt caching

OpenAI

Standard OpenAI SDK

Gemini

google-genai SDK

Ollama

Local inference

OpenAI-Compatible

Custom API base URL

Prompt Caching (Anthropic)

cache_control: {"type": "ephemeral"} is set on the system prompt, last tool result message, and last user message. This enables 2-10x cost reduction on long conversations.

Retry Logic

for attempt in range(max_retries):
    try:
        yield from stream
        break
    except RateLimitError as e:
        wait = e.retry_after or (2 ** attempt)
        yield TEXT_DELTA(f"Rate limited, retrying in {wait}s...")
        time.sleep(wait)

System Prompt Architecture

Files: rikugan/agent/system_prompt.py, rikugan/agent/prompts/

Shared Prompt Sections

Defined in prompts/base.py:

DISCIPLINE_SECTION — "Do exactly what was asked"
RENAMING_SECTION — Renaming/retyping guidelines
ANALYSIS_SECTION — Analysis approach
SAFETY_SECTION — Safety guidelines
TOKEN_EFFICIENCY_SECTION — Prefer search over listing
CLOSING_SECTION — Final reminders

UI Layer

Files: rikugan/ui/panel_core.py, rikugan/ui/chat_view.py, rikugan/ui/message_widgets.py

Binary Ninja

IDA Pro

Event Polling

A QTimer fires every 50ms, calling _poll_events(). Dequeues up to 20 events per tick and routes each to the appropriate handler.

ChatView Widget Mapping

Event	Widget
`TEXT_DELTA` / `TEXT_DONE`	AssistantMessageWidget (Markdown rendered)
`TOOL_CALL_*`	ToolCallWidget (collapsible, syntax-highlighted)
`TURN_START`	ThinkingWidget (animated dots)
`ERROR`	ErrorMessageWidget
`PLAN_GENERATED`	PlanView (step list with status indicators)
`TOOL_APPROVAL_REQUEST`	ToolApprovalWidget (Allow/Deny)
`EXPLORATION_PHASE_CHANGE`	ExplorationPhaseWidget
`EXPLORATION_FINDING`	ExplorationFindingWidget

Thread Safety Model

Architecture

Main Thread (Qt + IDA)

queue.Queue

Background Thread (Agent)

IDA API Marshalling

IDA Pro requires all API calls on the main thread. The @idasync decorator marshalls calls via ida_kernwin.execute_sync(). Binary Ninja tools run directly — BN's API is thread-safe.

User Answer/Approval Queues

Two queue.Queue(maxsize=1) instances:

_user_answer_queue — For USER_QUESTION responses
_tool_approval_queue — For execute_python approval

The agent waits with queue.get(timeout=0.5) in a loop, checking for cancellation between attempts. The UI thread calls put(). No race condition possible.

Error Handling & Retry

Exception Hierarchy

RikuganError
├── AgentError — loop-level errors
├── CancellationError — user cancelled
├── ProviderError — LLM API errors
    └── RateLimitError — HTTP 429
├── ToolError — tool execution errors
├── ToolValidationError — argument validation
├── MCPError — MCP protocol errors
    ├── MCPConnectionError
    └── MCPTimeoutError
└── SkillError — skill loading errors

Consecutive Error Tracking

_consecutive_errors counts sequential tool failures. After 3 consecutive errors, tools are disabled for the current turn, forcing the LLM to respond with text instead of looping on broken calls.

Logging

File: rikugan/core/logging.py

IDA Output Window

IDAHandler, INFO level
[Rikugan] LEVEL: message

Debug File

FlushFileHandler, DEBUG level
flushed + fsynced after every write

Structured JSON

JSONFormatter, INFO level
Append-mode JSONL, machine-parseable

JSON Log Format

{"ts": 1709500000.123, "level": "INFO", "thread": "Thread-1", "msg": "Subagent started"}

Commands Reference

Command	Description
`/plan <msg>`	Enter plan mode: generate plan, then execute step-by-step
`/modify <msg>`	Enter exploration mode: EXPLORE → PLAN → EXECUTE → SAVE
`/explore <msg>`	Enter explore-only mode: autonomous read-only analysis
`/memory`	Show current RIKUGAN.md contents
`/undo [N]`	Undo last N mutations (default 1)
`/mcp`	Show MCP server health status
`/doctor`	Diagnose provider, API key, tools, skills, config issues
`/<skill-slug>`	Activate a skill (e.g., `/malware-analysis`, `/ctf`)

Data Flow Diagrams

Normal Turn

Normal Turn Flow

User "Explain main()"
  │
  ├─→ SessionState.add_message(USER)
  ├─→ build_system_prompt()
  ├─→ provider.chat_stream(messages, tools, system)
  │     ├─→ TEXT_DELTA "The main function..."
  │     ├─→ TOOL_CALL_START "decompile_function"
  │     ├─→ TOOL_CALL_DONE
  │     └─→ USAGE_UPDATE
  ├─→ ToolRegistry.execute("decompile_function", {name: "main"})
  │     └─→ TOOL_RESULT "int main() { ... }"
  ├─→ SessionState.add_message(TOOL)
  ├─→ provider.chat_stream(messages + tool_result)
  │     ├─→ TEXT_DELTA "This function initializes..."
  │     └─→ TEXT_DONE
  └─→ TURN_END

Exploration Mode (`/modify`)

Exploration Flow

User "/modify Change score from 100 to 999"
  │
  ├─→ Phase 1: EXPLORE (subagent)
  │     ├─→ SubagentRunner.run_exploration()
  │     │     ├─→ [subagent uses tools, logs findings]
  │     │     ├─→ exploration_report → KnowledgeBase
  │     │     └─→ phase_transition("plan") → KnowledgeBase returned
  │     └─→ Parent receives KnowledgeBase summary (~1-2KB)
  │
  ├─→ Phase 2: PLAN
  │     ├─→ PLAN_SYNTHESIS_PROMPT + KB summary → LLM
  │     ├─→ Parse plan → ModificationPlan
  │     └─→ User approves plan
  │
  ├─→ Phase 3: EXECUTE
  │     ├─→ For each PlannedChange:
  │     │     ├─→ EXECUTE_STEP_PROMPT → LLM
  │     │     ├─→ Smart patch skill activated
  │     │     ├─→ execute_python (with approval) → patch bytes
  │     │     ├─→ redecompile_function → verify
  │     │     └─→ exploration_report(category="patch_result") → PatchRecord
  │     └─→ All patches applied
  │
  └─→ Phase 4: SAVE
        ├─→ SAVE_APPROVAL_REQUEST → User
        ├─→ "Save All" → write to file → SAVE_COMPLETED
        └─→ "Discard All" → restore original bytes → SAVE_DISCARDED

Mutation Tracking

Mutation & Undo Flow

LLM calls rename_function(old="sub_401000", new="main")
  │
  ├─→ capture_pre_state() → {} (no pre-state needed for renames)
  ├─→ ToolRegistry.execute("rename_function", {...})
  ├─→ build_reverse_record() → MutationRecord(
  │     reverse_tool="rename_function",
  │     reverse_args={"old_name": "main", "new_name": "sub_401000"})
  ├─→ _mutation_log.append(record)
  └─→ MUTATION_RECORDED event → UI (MutationLogPanel)

User "/undo"
  │
  ├─→ Pop last MutationRecord from _mutation_log
  ├─→ ToolRegistry.execute("rename_function",
  │     {"old_name": "main", "new_name": "sub_401000"})
  └─→ TEXT_DONE "Undone: Rename function main → sub_401000"