Context Engineering
Definition
Context engineering is the emerging discipline of structuring information, instructions, and tool access to maximize AI agent performance on domain-specific tasks. It encompasses what to include in prompts, how to decompose tasks across agents and subagents, what data to retrieve, how much compute to allocate, and how to evaluate results.
Levie frames the challenge: “To build AI agents, in theory, it should be as simple as having a super powerful model, giving it a set of tools, having a really good system prompt, and giving it access to data. But in practice, to make agents that work today, you’re dealing with a delicate balance.”
Why Context Engineering Is Hard
AI agents have a fundamental constraint that human workers do not: the context window. They can only process a fixed amount of information at once, and performance degrades as that window fills (“context rot”). This creates a series of non-obvious engineering tradeoffs:
| Dimension | Tradeoff |
|---|---|
| Global vs. subagent scope | What belongs in the master agent’s context vs. delegated to specialized subagents? |
| Agentic vs. deterministic | Which steps need AI reasoning vs. a hard-coded tool call? |
| Context window budget | How much of the limited window goes to instructions, data, history, and tools? |
| Speed vs. quality | When is fast inference worth the quality loss, and when is slow, thorough reasoning worth the latency? |
| Breadth vs. depth of retrieval | How much context to retrieve, given that broader coverage risks overwhelming the model? |
Levie observes: “So far there’s no one right answer for any of this, and there are meaningful tradeoffs for any given approach you take.”
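One of these tradeoffs, the context window budget, can be made concrete with a small sketch. Everything below is hypothetical: the 128k-token window, the category names, the fractional splits, and the crude character-based token estimate are all illustrative assumptions, not a real allocation policy.

```python
# Hypothetical sketch: dividing a fixed context window across the
# categories named in the table (instructions, tools, history, data).
# All proportions are made up for illustration.

WINDOW_TOKENS = 128_000  # assumed window size

BUDGET = {
    "instructions": 0.10,    # system prompt and task framing
    "tools": 0.05,           # tool schemas exposed to the model
    "history": 0.25,         # prior conversation turns
    "retrieved_data": 0.50,  # documents fetched for this task
    "headroom": 0.10,        # slack left for the model's own output
}

def allocate(window: int = WINDOW_TOKENS) -> dict[str, int]:
    """Translate fractional budgets into absolute token allowances."""
    return {name: int(window * share) for name, share in BUDGET.items()}

def trim_to_budget(chunks: list[str], limit: int,
                   tokens_per_char: float = 0.25) -> list[str]:
    """Greedily keep retrieved chunks until the token budget is spent.
    Token counts are estimated crudely from character length."""
    kept, used = [], 0
    for chunk in chunks:
        cost = int(len(chunk) * tokens_per_char)
        if used + cost > limit:
            break
        kept.append(chunk)
        used += cost
    return kept
```

The point of the sketch is the forcing function, not the numbers: every token given to retrieved data is a token taken from history or instructions, which is exactly the tradeoff the table describes.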
Domain Expertise as the Bottleneck
Getting context engineering right “requires a deep understanding of the domain you’re solving the problem for. Handling this problem in AI coding is different from law, which is different from healthcare.”
Context engineering is not purely a technical discipline. It requires someone who understands the domain workflow (every step a human expert would take), the information architecture (what data exists, where, and how to retrieve the right subset), and the failure modes (what goes wrong when the agent has too little context, too much, or the wrong kind).
Levie argues that AI agent product managers need a different skill set than traditional PMs: “Deeply understand the domain that you’re building agents for. In an ideal world you actually have worked in that space.”
The Subagent Paradigm
One of the key architectural patterns in context engineering is decomposition into subagents. Levie describes this as “super interesting” — rather than one monolithic agent with a bloated context window, work is distributed across specialized subagents, each with a focused context.
The analogy: “There was probably a time in the past where nearly all software tasks could be handled by a single large application. But as the complexity and breadth of tasks grew, there became need for applications to then be broken up across specialized functions.”
This creates a new coordination problem: if you have “100X more AI agents than people doing work in a company, it’s actually more critical than ever that the work is well orchestrated.” The management challenge shifts from managing humans to orchestrating agent workflows.
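The decomposition-plus-orchestration pattern can be sketched in a few lines. This is an illustrative skeleton, not a real framework: `Subagent`, `MasterAgent`, and the plain-function handlers are all assumed names, standing in for model calls with per-agent system prompts.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch of the subagent pattern: the master agent keeps
# only orchestration state, while each subagent carries a narrow,
# focused context of its own.

@dataclass
class Subagent:
    name: str
    handler: Callable[[str, list[str]], str]  # (subtask, context) -> result
    context: list[str] = field(default_factory=list)

    def run(self, subtask: str) -> str:
        # The subagent sees only its own context, not the full history.
        return self.handler(subtask, self.context)

class MasterAgent:
    """Routes subtasks to specialized subagents and collects results."""

    def __init__(self) -> None:
        self.subagents: dict[str, Subagent] = {}

    def register(self, agent: Subagent) -> None:
        self.subagents[agent.name] = agent

    def orchestrate(self, plan: list[tuple[str, str]]) -> list[str]:
        # plan: (subagent_name, subtask) pairs produced by some planner.
        return [self.subagents[name].run(subtask) for name, subtask in plan]
```

In practice each handler would invoke a model with that subagent's instructions and data; keeping it a plain function here makes the control flow visible, including the new coordination problem: someone (or something) has to produce and sequence the plan.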
Information Retrieval as the Foundation
Levie makes a reductive but clarifying claim: “Many AI agent problems are really just information retrieval problems. If the agent can find the right information for a given task, they’d nail the task.”
This frames context engineering as a variant of the classic knowledge management problem — getting the right information to the right agent at the right time — now operating at machine speed and machine scale.
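The retrieval framing can be illustrated with a toy ranker. Real systems use embeddings and learned relevance; the term-overlap scoring below is a deliberately simple stand-in, and both function names are assumptions for illustration.

```python
# Toy illustration of "agent problems are retrieval problems": rank
# documents by term overlap with the task and hand back the top k.
# Term overlap is a crude stand-in for real relevance scoring.

def score(task: str, doc: str) -> float:
    """Fraction of the task's terms that appear in the document."""
    task_terms = set(task.lower().split())
    doc_terms = set(doc.lower().split())
    if not task_terms:
        return 0.0
    return len(task_terms & doc_terms) / len(task_terms)

def retrieve(task: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the k documents most relevant to the task."""
    return sorted(corpus, key=lambda d: score(task, d), reverse=True)[:k]
```

Under this framing, the hard part of context engineering is exactly the hard part of knowledge management: defining `score` well enough that the right subset of the corpus reaches the agent.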
The Eval Problem
Context engineering produces non-deterministic systems. Shreya Shekhar identifies the core challenge: today’s agentic systems introduce “stochasticity, the action-intent gap, and subjectivity of results.” Traditional software has strict guarantees around consistency and fault tolerance. AI agents offer none of these by default.
Levie confirms this from practice: “We are barely scratching the surface on evals. A significant portion of knowledge work is going to move to AI agents, and we have essentially no infrastructure for understanding how they’re performing at a fine-grained level.”
The eval gap means organizations cannot yet reliably measure whether context engineering improvements actually improve outcomes — creating a build-measure-learn cycle with a broken “measure” step.
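The difference from traditional testing can be sketched directly: because the system is stochastic, a single pass/fail run tells you little, so an eval runs the same task repeatedly and reports a distribution. The harness below is a minimal illustration under that assumption; `evaluate` and its `grader` callback are hypothetical names, not an established eval API.

```python
import statistics

# Hypothetical eval harness for a non-deterministic agent: run the same
# task several times and report pass rate and score variability, rather
# than asserting a single deterministic result.

def evaluate(agent, task: str, grader, runs: int = 5) -> dict:
    """agent maps a task to an output; grader maps an output to [0, 1]."""
    scores = [grader(agent(task)) for _ in range(runs)]
    return {
        "pass_rate": sum(s >= 0.5 for s in scores) / runs,
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores) if runs > 1 else 0.0,
    }
```

Even this trivial harness surfaces the gap Levie describes: writing a trustworthy `grader` for subjective knowledge work is itself an unsolved problem, which is why the "measure" step stays broken.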
Connection to the Knowledge-Judgement Shift
Context engineering operationalizes the shift described in Judgement vs Knowledge in the AI Era. The agent supplies knowledge (retrievable, encodable); the human supplies judgment about which knowledge matters and how to structure the agent’s context. The human’s value shifts from knowing things to knowing how to frame problems for AI consumption.
Related
- Judgement vs Knowledge in the AI Era — Humans provide judgment; context engineering structures knowledge for agents
- The Expanding Work Frontier — Better context engineering unlocks more task categories
- Software Design for the Agent Era — Context engineering shapes how agent-facing software must be built
- Agentic Systems Infrastructure — The systems primitives that context engineering requires