How AI coding agents work and what to remember when using them

This context limit naturally constrains how much of a codebase an LLM can process at one time. Feeding an AI model many large code files (which the LLM must re-evaluate every time you send another message) can quickly burn through token or usage limits.

Tricks of the trade

To get around these limits, the creators of coding agents use several tricks. For example, AI models are fine-tuned to write code that outsources work to other software tools. An agent might write a Python script to extract data from images or files rather than feeding the whole file through an LLM, which saves tokens and avoids inaccurate results.
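As a rough illustration of that kind of outsourcing, the sketch below reduces a CSV file to a few summary facts that could be handed to a model in place of the raw contents. The function name `summarize_csv` and the sample data are hypothetical, chosen only for this example; it is not a description of how any particular agent does it.

```python
import csv
import io


def summarize_csv(stream):
    """Return a small summary of a CSV stream instead of its full contents."""
    reader = csv.DictReader(stream)
    row_count = 0
    ranges = {}  # column name -> (min, max) for cells that parse as numbers
    for row in reader:
        row_count += 1
        for name, value in row.items():
            try:
                number = float(value)
            except (TypeError, ValueError):
                continue  # skip non-numeric or missing cells
            low, high = ranges.get(name, (number, number))
            ranges[name] = (min(low, number), max(high, number))
    return {
        "rows": row_count,
        "columns": reader.fieldnames or [],
        "numeric_ranges": ranges,
    }


# Hypothetical sample data standing in for a much larger file.
sample = io.StringIO("name,size_kb\nmain.py,12\nutils.py,4.5\nREADME.md,2\n")
summary = summarize_csv(sample)
print(summary)
```

Only the short summary dictionary would need to enter the model's context, however many rows the file holds.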

Anthropic's documentation notes that Claude Code also uses this approach to perform complex data analysis over large databases, writing targeted queries and using Bash commands like “head” and “tail” to analyze large volumes of data without ever loading the full data objects into context.
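The idea behind “head” and “tail” can be sketched in Python as well. This is a minimal analogue, not the Bash commands themselves: the helper names `head` and `tail` and the temporary log file are assumptions made for the example. Both helpers stream the file and keep only a handful of lines in memory.

```python
import os
import tempfile
from collections import deque
from itertools import islice


def head(path, n=5):
    """Return the first n lines of a file without reading the rest."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def tail(path, n=5):
    """Return the last n lines; streams the file but holds only n lines in memory."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


# Build a hypothetical 10,000-line log file to peek into.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    for i in range(1, 10001):
        tmp.write(f"event {i}\n")
    log_path = tmp.name

first_lines = head(log_path, 2)
last_lines = tail(log_path, 2)
os.remove(log_path)  # clean up the temporary file

print(first_lines)
print(last_lines)
```

Four short lines are enough to show the file's format and extent, so the other 9,996 never need to reach the model.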

(In a sense, these AI agents are guided but semi-autonomous tool-using programs, a significant extension of a concept we first saw in early 2023.)

Another major advance in agents came from dynamic context management. Agents can do this in several ways that are not fully disclosed for proprietary coding models, but we do know the most important technique they use: context compression.

A command-line version of OpenAI Codex running in a macOS terminal window.

Credit: Benj Edwards

When a coding LLM nears its context limit, this technique compresses the context history by summarizing it, losing some detail in the process but reducing the history to its key points. Anthropic's documentation describes this “compaction” as distilling the context contents in a high-fidelity manner, preserving key details such as architectural decisions and unresolved bugs while discarding redundant tool outputs.
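A toy version of the idea might look like the sketch below. Everything in it is an assumption made for illustration: the `Message` class, the `important` flag, the four-characters-per-token estimate, and the `compact` function are hypothetical. Real agents have the model itself write the summary, and their exact methods are not public.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    text: str
    important: bool = False  # hypothetical flag for details worth preserving


def estimate_tokens(text):
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def compact(history, budget, keep_recent=2):
    """Replace older messages with one summary once the budget is exceeded."""
    total = sum(estimate_tokens(m.text) for m in history)
    if total <= budget:
        return history  # still under the limit, nothing to do
    split = max(0, len(history) - keep_recent)
    older, recent = history[:split], history[split:]
    # Stand-in for an LLM-written summary: keep only the flagged details
    # and drop everything else, including bulky tool output.
    notes = [m.text for m in older if m.important]
    summary = Message(
        "assistant", "Summary of earlier work: " + " | ".join(notes), important=True
    )
    return [summary] + recent


history = [
    Message("user", "Refactor the parser module."),
    Message("assistant", "Decision: keep the tokenizer as a separate class.", True),
    Message("tool", "test output line\n" * 200),
    Message("assistant", "Unresolved bug: parser fails on empty input.", True),
    Message("tool", "3 passed, 1 failed"),
    Message("assistant", "Next I will fix the empty-input case."),
]

compacted = compact(history, budget=100)
for message in compacted:
    print(message.role, "->", message.text)
```

The compacted history keeps the architectural decision and the open bug but drops the 200 lines of repeated test output, which mirrors the trade-off described above.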

This means that AI coding agents periodically “forget” a large portion of what they are doing every time this compaction happens, but unlike older LLM-based systems, they are not completely clueless about what has happened and can quickly re-orient themselves by reading existing code, notes left in files, change logs, and so on.
