Claude Code is an agentic harness: the model reasons through a gather → act → verify loop while tools read, edit, run, and search your project. Context starts filling before your first prompt — CLAUDE.md, auto memory, MCP tool names, and skill descriptions all load at startup. File reads dominate growth during the loop. Compaction, /clear, and subagents are the official levers for reclaiming space and controlling what survives.
Definition
Agentic harness — the runtime around a language model that provides tools, context management, and an execution environment. The model plans; the harness acts.
Agentic loop — the three phases Claude Code cycles through on every task: gather context, take action, verify results. Phases blend together; a bug fix may loop through all three dozens of times.
Startup tax — tokens loaded into the context window before you type your first message. You pay this on every new session.
The loop and built-in tools
When you give Claude a task, it works through gather → act → verify. A question about your codebase might only need gathering. A refactor cycles through all three repeatedly. Each tool result feeds the next decision — that feedback is the loop.
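The cycle can be pictured with a toy model. The sketch below is not Claude Code's implementation: `ToyHarness`, `run_loop`, and the `auth.ts` content are hypothetical names invented for illustration. It only shows how each tool result (here, a failed check) feeds the next pass through gather, act, and verify.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHarness:
    """Hypothetical stand-in for a harness; not Claude Code's real API."""
    files: dict[str, str]
    history: list[str] = field(default_factory=list)

    def gather(self, path: str) -> str:
        # Reading a file puts its content into the running history.
        content = self.files[path]
        self.history.append(f"read {path}: {content}")
        return content

    def act(self, path: str, old: str, new: str) -> None:
        # Apply an edit and record it.
        self.files[path] = self.files[path].replace(old, new)
        self.history.append(f"edit {path}: {old!r} -> {new!r}")

    def verify(self, path: str, expected: str) -> bool:
        # Stand-in for running tests: check the file for the expected text.
        ok = expected in self.files[path]
        self.history.append(f"verify {path}: {'pass' if ok else 'fail'}")
        return ok


def run_loop(harness: ToyHarness, path: str,
             fixes: list[tuple[str, str]], expected: str) -> int:
    """Cycle gather -> act -> verify until verification passes or fixes run out."""
    for attempt, (old, new) in enumerate(fixes, start=1):
        harness.gather(path)
        harness.act(path, old, new)
        if harness.verify(path, expected):
            return attempt
    return 0  # no candidate fix passed verification


harness = ToyHarness(files={"auth.ts": "if (user = null) return;"})
attempts = run_loop(
    harness,
    "auth.ts",
    fixes=[("= null", "== null"), ("== null", "=== null")],
    expected="=== null",
)
print(attempts, len(harness.history))  # 2 6
```

The first candidate fix fails verification, so the loop gathers again and tries the second. Every step appends to `history`, which is the toy equivalent of the context window filling as the loop runs.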
Without tools, Claude can only reply with text. With them, it can act. Anthropic groups the built-in tools into five categories:
| Category | What Claude can do |
|---|---|
| File operations | Read files, edit code, create files, rename and reorganize |
| Search | Find files by pattern, search content with regex, explore codebases |
| Execution | Run shell commands, start servers, run tests, use git |
| Web | Search the web, fetch documentation, look up error messages |
| Code intelligence | See type errors after edits, jump to definitions, find references |
Extensions layer on top without replacing the loop: skills package workflows, MCP connects external services, hooks automate tool events, and subagents delegate work to separate windows.
What loads before you type
A lot enters context before your first prompt. Your setup may vary, but Anthropic’s documentation names these startup surfaces:
| Surface | When loaded | Notes |
|---|---|---|
| System prompt | Session start | Core behavior, tool-use instructions, response formatting |
| Project-root CLAUDE.md | Session start | Re-injected from disk after compaction |
| Auto memory (MEMORY.md) | Session start | First 200 lines or 25KB, whichever comes first |
| MCP tool names | Session start | Full schemas deferred by default; only names load until a tool is used |
| Skill descriptions | Session start | One-line index only; bodies load on invocation |
| Output style / --append-system-prompt | Session start | Both enter the system prompt |
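The MEMORY.md limit in the table (200 lines or 25KB) can be pictured as a two-limit cut. The sketch below is an illustration only: `startup_slice` is a hypothetical helper, reading "25KB" as 25 × 1024 bytes is an assumption, and cutting at line boundaries is an assumption about behavior the table does not specify.

```python
MAX_LINES = 200
MAX_BYTES = 25 * 1024  # assumption: "25KB" read as 25 * 1024 bytes


def startup_slice(text: str, max_lines: int = MAX_LINES,
                  max_bytes: int = MAX_BYTES) -> str:
    """Return the leading part of text that fits both limits, cut at line boundaries."""
    kept: list[str] = []
    used = 0
    for line in text.splitlines(keepends=True)[:max_lines]:
        size = len(line.encode("utf-8"))
        if used + size > max_bytes:
            break  # the byte limit was reached before the line limit
        kept.append(line)
        used += size
    return "".join(kept)


# Many short lines: the line limit applies first.
short_lines = "".join(f"note {i}\n" for i in range(500))
# Fewer but long lines (1024 bytes each): the byte limit applies first.
long_lines = "".join("x" * 1023 + "\n" for _ in range(100))

by_lines = startup_slice(short_lines)
by_bytes = startup_slice(long_lines)
print(by_lines.count("\n"), by_bytes.count("\n"))  # 200 25
```

Either way, anything past the cut is not in context at startup, which is a reason to keep the most important notes near the top of the file.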
The startup tax is real and continuous: CLAUDE.md competes with your task on every turn. Keep it lean and universal; see “keep CLAUDE.md to universal instructions” and “why agents ignore your CLAUDE.md”.
For skills you invoke manually, set disable-model-invocation: true so their descriptions stay out of the startup index until you run /skill-name. Treat unused MCP servers the same way and disconnect them: only connected servers add tool names to context.
How context grows during the loop
Once the loop runs, file reads dominate. Each read stays in conversation history for every subsequent turn. A single large file keeps costing tokens for the rest of the session.
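A rough way to see why early reads are expensive: if a read stays in history, it is part of the input on the turn it happens and on every turn after. The sketch below is back-of-the-envelope arithmetic under that assumption. It ignores compaction and prompt caching, and the token counts and turn numbers are hypothetical.

```python
def cumulative_read_cost(file_tokens: int, read_turn: int, total_turns: int) -> int:
    """Input tokens attributable to one read, if it is carried on every later turn."""
    if not 1 <= read_turn <= total_turns:
        raise ValueError("read_turn must fall within the session")
    turns_carrying_read = total_turns - read_turn + 1
    return file_tokens * turns_carrying_read


# Same 8,000-token file, read early versus late in a 30-turn session.
early = cumulative_read_cost(file_tokens=8_000, read_turn=2, total_turns=30)
late = cumulative_read_cost(file_tokens=8_000, read_turn=25, total_turns=30)
print(early, late)  # 232000 48000
```

Under these assumptions the same file costs almost five times as much when it is read on turn 2 as when it is read on turn 25.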
Other growth paths:
- Path-scoped rules: rules in .claude/rules/ with a paths: frontmatter load automatically when Claude reads a matching file. You see a one-line “Loaded …” notice in the terminal; the rule content enters context silently.
- Nested CLAUDE.md: child-directory instructions load when Claude reads a file in that subtree.
- PostToolUse hooks: hooks can inject additionalContext via JSON output. That text enters Claude’s context; plain stdout on exit 0 does not.
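To make the hook path concrete, here is a minimal sketch of a PostToolUse hook body in Python. The payload and output field names (`tool_name`, `tool_input.file_path`, `hookSpecificOutput`, `hookEventName`) reflect my understanding of the hooks reference and should be checked against the current documentation. The `/db/` rule and the reminder text are hypothetical.

```python
import json
from typing import Optional


def build_hook_output(event: dict) -> Optional[dict]:
    """Return JSON-ready hook output, or None when there is nothing to add."""
    tool_input = event.get("tool_input", {})
    file_path = tool_input.get("file_path", "")
    # Hypothetical project rule: after edits under db/, remind Claude about migrations.
    if event.get("tool_name") in {"Edit", "Write"} and "/db/" in file_path:
        return {
            "hookSpecificOutput": {
                "hookEventName": "PostToolUse",
                "additionalContext": (
                    f"{file_path} changed: regenerate migrations before running tests."
                ),
            }
        }
    return None


def run_hook(raw_event: str) -> str:
    """Map the hook's stdin payload to the text it should print on stdout."""
    output = build_hook_output(json.loads(raw_event))
    return json.dumps(output) if output else ""


# In a real hook script this would be: print(run_hook(sys.stdin.read()))
sample = json.dumps({"tool_name": "Edit", "tool_input": {"file_path": "src/db/schema.sql"}})
printed = run_hook(sample)
print(printed)
```

Returning structured JSON is the point of the sketch: only the `additionalContext` string is meant to reach Claude, so keep it short, because it occupies the window like any other message.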
Be specific in prompts (“fix the bug in auth.ts”) so Claude reads fewer files. For research-heavy work, delegate to a subagent instead of letting reads accumulate in the main window.
Compaction survival map
As context fills, Claude Code clears older tool outputs first, then summarizes the conversation if needed. Your requests and key code snippets are preserved; detailed instructions from early in the conversation may be lost. Put persistent rules in CLAUDE.md, not in chat history.
What survives depends on how each mechanism was loaded:
| Mechanism | After /compact |
|---|---|
| System prompt / output style | Unchanged — not part of message history |
| Project-root CLAUDE.md, auto memory | Re-injected from disk |
| Rules with paths: / nested CLAUDE.md | Lost until a matching file is read again |
| Skill descriptions (startup index) | Not re-injected |
| Invoked skill bodies | Re-injected — 5K tokens per skill, 25K total; oldest dropped first |
Path-scoped rules and nested CLAUDE.md files load into message history when triggered, so compaction summarizes them away. They reload the next time Claude reads a matching file. If a rule must persist across compaction, move it to project-root CLAUDE.md or drop the paths: frontmatter.
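The skill-body row in the table (5K tokens per skill, 25K total, oldest dropped first) describes a small budgeting rule. The sketch below is one plausible reading of it, not Claude Code's implementation: `reinjected_skills` is a hypothetical function, and the choice to stop at the first skill that does not fit is an assumption.

```python
PER_SKILL_CAP = 5_000
TOTAL_CAP = 25_000


def reinjected_skills(invocations: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """invocations: (skill name, body tokens), oldest first.

    Returns the (name, tokens) pairs that survive, oldest first.
    """
    kept: list[tuple[str, int]] = []
    remaining = TOTAL_CAP
    # Walk newest to oldest so the oldest skills are the ones that fall off.
    for name, tokens in reversed(invocations):
        capped = min(tokens, PER_SKILL_CAP)  # each body is cut to the per-skill cap
        if capped > remaining:
            break  # assumption: stop at the first skill that no longer fits
        kept.append((name, capped))
        remaining -= capped
    kept.reverse()
    return kept


# Six hypothetical skills of 7,000 tokens each, invoked in order a..f.
session = [(name, 7_000) for name in "abcdef"]
survivors = reinjected_skills(session)
print([name for name, _ in survivors])  # ['b', 'c', 'd', 'e', 'f']
```

Under this reading, a long skill body loses everything past its first 5K tokens, and a sixth large skill pushes the oldest one out entirely.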
Steer what the summary keeps with a Compact Instructions section in CLAUDE.md:
## Compact Instructions
When compacting, preserve: current task goal, files being edited, test commands, and open decisions.
Drop: exploratory reads, superseded plans, verbose tool output.
Or steer a single compaction from the prompt:
/compact focus on the auth flow and files under src/auth/
For tactics on when to compact proactively and when to reset entirely, see “context engineering beats a bigger window”.
Subagents as context isolation
Subagents run in a separate context window — completely isolated from your main conversation. They load their own system prompt, CLAUDE.md, and MCP/skill setup, but not your conversation history or the main session’s auto memory. When done, only a summary and a small metadata trailer return to the parent.
The heavy intermediate tokens — large file reads, MCP documentation, search output — never enter your window. That is the core move of subagent context isolation.
Reach for a subagent when a task needs many files or a large doc read, but the main loop only needs the conclusion. For workflow guidance on delegation and tier selection, see the context-engineering series article linked above.
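The trade can be put in numbers. The sketch below compares how much the main window grows when the reads happen inline versus when a subagent does them and returns only a summary. All file names and token counts are hypothetical, and the subagent's own startup cost and the small metadata trailer are ignored.

```python
def run_subagent(files: dict[str, int], summary_tokens: int) -> tuple[int, int]:
    """Return (tokens consumed inside the subagent window, tokens returned to the parent)."""
    subagent_window = sum(files.values())  # every read stays in the subagent's own history
    return subagent_window, summary_tokens


# Hypothetical research task: one large doc plus two source files.
research_files = {
    "docs/api.md": 40_000,
    "src/auth/session.ts": 6_000,
    "src/auth/token.ts": 5_000,
}

inline_growth = sum(research_files.values())  # reads land in the main window
_, delegated_growth = run_subagent(research_files, summary_tokens=800)
saved = inline_growth - delegated_growth
print(inline_growth, delegated_growth, saved)  # 51000 800 50200
```

The subagent still spends the tokens, but in its own window. The main conversation pays only for the conclusion.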
Inspect and reset
Two commands show live state:
/context # breakdown by category with optimization suggestions
/memory # which CLAUDE.md and auto-memory files loaded at startup
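Run /clear between unrelated tasks to drop the accumulated conversation history instead of dragging old context along. The startup surfaces load again, so the window resets to the startup tax rather than to zero.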
Flagship models in Claude Code — Fable 5, Sonnet 5, Opus 4.6 and later, and Sonnet 4.6 — support a 1M-token context window. Compaction works the same at the larger limit; a bigger ceiling does not remove the need to engineer what enters the window. See the sibling article for paid-plan limits and credit gates.
Bottom line
Understand three things: the loop that drives every session, the startup tax you pay before typing, and the survival map that tells you what compaction keeps. Spend context deliberately — that is context engineering, and it matters more than window size.