Context engineering is the practice of curating what enters a model’s context window each turn — instead of relying on a larger window to absorb clutter. Input quality drives output quality; a bigger window does not.
Why it matters
Every turn in an agent loop resends the full state: prior messages, model responses, tool results, file reads, and fetched docs — not just the latest prompt. Tokens accumulate, and answer quality degrades before the hard limit (see lost in the middle).
The moves
- Lean instructions — keep
CLAUDE.mdto universal facts only (see keep CLAUDE.md to universal instructions). /compact— summarize the conversation; append an instruction for what to preserve./clear— reset to zero tokens when switching to unrelated work.- Plan outside, paste in — do exploratory back-and-forth elsewhere; inject only the final plan.
- Subagents — push heavy reads into isolated context and return a summary (see subagent context isolation).