Skip to content
AGENTIC_ENG
Claude

How Claude Code's agentic loop fills your context window

AUTHOR:
Bartłomiej Krupa
PUBLISHED:
2026.06.30
EST_READ:
8 min

Claude Code is an agentic harness: the model reasons through a gather → act → verify loop while tools read, edit, run, and search your project. Context starts filling before your first prompt — CLAUDE.md, auto memory, MCP tool names, and skill descriptions all load at startup. File reads dominate growth during the loop. Compaction, /clear, and subagents are the official levers for reclaiming space and controlling what survives.

Definition

Agentic harness — the runtime around a language model that provides tools, context management, and an execution environment. The model plans; the harness acts.

Agentic loop — the three phases Claude Code cycles through on every task: gather context, take action, verify results. Phases blend together; a bug fix may loop through all three dozens of times.

Startup tax — tokens loaded into the context window before you type your first message. You pay this on every new session.

The loop and built-in tools

When you give Claude a task, it works through gather → act → verify. A question about your codebase might only need gathering. A refactor cycles through all three repeatedly. Each tool result feeds the next decision — that feedback is the loop.

Without tools, Claude can only reply with text. With them, it can act. Anthropic groups the built-in tools into five categories:

Category What Claude can do
File operations Read files, edit code, create files, rename and reorganize
Search Find files by pattern, search content with regex, explore codebases
Execution Run shell commands, start servers, run tests, use git
Web Search the web, fetch documentation, look up error messages
Code intelligence See type errors after edits, jump to definitions, find references

Extensions layer on top without replacing the loop: skills package workflows, MCP connects external services, hooks automate tool events, and subagents delegate work to separate windows.

What loads before you type

A lot enters context before your first prompt. Your setup may vary, but Anthropic’s documentation names these startup surfaces:

Surface When loaded Notes
System prompt Session start Core behavior, tool-use instructions, response formatting
Project-root CLAUDE.md Session start Re-injected from disk after compaction
Auto memory (MEMORY.md) Session start First 200 lines or 25KB, whichever is smaller
MCP tool names Session start Full schemas deferred by default; only names load until a tool is used
Skill descriptions Session start One-line index only; bodies load on invocation
Output style / --append-system-prompt Session start Both enter the system prompt

The startup tax is real and continuous — CLAUDE.md competes with your task on every turn. Keep it lean and universal; see keep CLAUDE.md to universal instructions and why agents ignore your CLAUDE.md.

For skills you invoke manually, set disable-model-invocation: true so descriptions stay out of the startup index until you run /skill-name. Defer unused MCP servers the same way — only connected servers cost names in context.

How context grows during the loop

Once the loop runs, file reads dominate. Each read stays in conversation history for every subsequent turn. A single large file keeps costing tokens for the rest of the session.

Other growth paths:

  • Path-scoped rules — rules in .claude/rules/ with a paths: frontmatter load automatically when Claude reads a matching file. You see a one-line “Loaded …” notice in the terminal; the rule content enters context silently.
  • Nested CLAUDE.md — child-directory instructions load when Claude reads a file in that subtree.
  • PostToolUse hooks — hooks can inject additionalContext via JSON output. That text enters Claude’s context; plain stdout on exit 0 does not.

Be specific in prompts (“fix the bug in auth.ts”) so Claude reads fewer files. For research-heavy work, delegate to a subagent instead of letting reads accumulate in the main window.

Compaction survival map

As context fills, Claude Code clears older tool outputs first, then summarizes the conversation if needed. Your requests and key code snippets are preserved; detailed instructions from early in the conversation may be lost. Put persistent rules in CLAUDE.md, not in chat history.

What survives depends on how each mechanism was loaded:

Mechanism After /compact
System prompt / output style Unchanged — not part of message history
Project-root CLAUDE.md, auto memory Re-injected from disk
Rules with paths: / nested CLAUDE.md Lost until a matching file is read again
Skill descriptions (startup index) Not re-injected
Invoked skill bodies Re-injected — 5K tokens per skill, 25K total; oldest dropped first

Path-scoped rules and nested CLAUDE.md files load into message history when triggered, so compaction summarizes them away. They reload the next time Claude reads a matching file. If a rule must persist across compaction, move it to project-root CLAUDE.md or drop the paths: frontmatter.

Steer what the summary keeps with a Compact Instructions section in CLAUDE.md:

## Compact Instructions
When compacting, preserve: current task goal, files being edited, test commands, and open decisions.
Drop: exploratory reads, superseded plans, verbose tool output.

Or steer a single compaction from the prompt:

/compact focus on the auth flow and files under src/auth/

For tactics on when to compact proactively and when to reset entirely, see context engineering beats a bigger window.

Subagents as context isolation

Subagents run in a separate context window — completely isolated from your main conversation. They load their own system prompt, CLAUDE.md, and MCP/skill setup, but not your conversation history or the main session’s auto memory. When done, only a summary and a small metadata trailer return to the parent.

The heavy intermediate tokens — large file reads, MCP documentation, search output — never enter your window. That is the core move of subagent context isolation.

Reach for a subagent when a task needs many files or a large doc read, but the main loop only needs the conclusion. For workflow guidance on delegation and tier selection, see the context-engineering series article linked above.

Inspect and reset

Two commands show live state:

/context   # breakdown by category with optimization suggestions
/memory    # which CLAUDE.md and auto-memory files loaded at startup

Run /clear between unrelated tasks to reset to zero tokens instead of dragging old context along.

Flagship models in Claude Code — Fable 5, Sonnet 5, Opus 4.6 and later, and Sonnet 4.6 — support a 1M-token context window. Compaction works the same at the larger limit; a bigger ceiling does not remove the need to engineer what enters the window. See the sibling article for paid-plan limits and credit gates.

Bottom line

Understand three things: the loop that drives every session, the startup tax you pay before typing, and the survival map that tells you what compaction keeps. Spend context deliberately — that is context engineering, and it matters more than window size.