As models become smarter and more capable, the "harness" around them must also evolve. This "harness engineering" is an extension of context engineering, says LangChain co-founder and CEO Harrison Chase in a new episode of the VentureBeat podcast Beyond the Pilot. While traditional AI harnesses have largely acted as guardrails around models running in loops and calling tools, harnesses built specifically for AI agents let them act more independently and perform long-running tasks effectively.
Chase also weighed in on OpenAI's acquisition of OpenClaw, arguing that its viral success came down to a willingness to "let it break" in a way no major lab would, and questioning whether the acquisition really brings OpenAI closer to a secure enterprise version of the product. "The trend in harnesses is to give the large language model (LLM) more control over context engineering, letting it decide what it sees and what it doesn't see," Chase says. "Now, this idea of a more autonomous, long-running assistant is viable."
While the concept of letting LLMs run in a loop and call tools sounds relatively simple, it is difficult to achieve reliably, Chase noted. For a time, models were "below the usefulness threshold" and simply couldn't run in a loop, so developers used graphs and chains to work around that problem. Chase pointed to AutoGPT, once the fastest-growing GitHub project in history, as a cautionary example: it had the same architecture as today's top agents, but the models still weren't good enough to run reliably in a loop, so it quickly faded away.

But as LLMs continue to improve, teams can build environments where models run in loops and plan over longer horizons, and they can continually improve these harnesses. Previously, "you couldn't really make improvements to the harness because you couldn't actually run the model with a harness," Chase said.

LangChain's answer to this is Deep Agents, a customizable, general-purpose harness. Built on top of LangChain and LangGraph, it offers scheduling capabilities, a virtual file system, context and token management, code execution, and skills and memory functions. It can also delegate tasks to subagents, which are specialized with different tools and configurations and can work in parallel. Their context is isolated, too, meaning a subagent's work doesn't saturate the main agent's context, and the work of large subtasks is compressed into a single output for token efficiency.

All of these agents have access to file systems, Chase explained, and can essentially create to-do lists that they execute and track over time. "When you move on to the next step, and you move on to step two, three or four of a 200-step process, you have a way of tracking your progress and maintaining that consistency," Chase said.
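The core pattern here — a loop around the model, plus a file system the agent uses to persist a to-do list across steps — can be sketched in a few lines. This is a toy illustration, not Deep Agents' actual implementation: the `Harness` class, file names, and the stub "model" step are all invented for the example.

```python
# Toy sketch of an agent harness: a loop that runs until the task is done,
# with a virtual file system where the agent tracks a to-do list so that
# progress survives across steps (and, in a real system, context compaction).
# All names here are illustrative; the "model turn" is a stub, not an LLM.

from dataclasses import dataclass, field


@dataclass
class Harness:
    fs: dict = field(default_factory=dict)  # virtual file system: path -> text

    def write_todos(self, items):
        """Plan step: the agent writes its to-do list to the file system."""
        self.fs["todo.md"] = "\n".join(f"[ ] {t}" for t in items)

    def complete_next(self):
        """Mark the first unfinished item done (stand-in for one model turn)."""
        lines = self.fs["todo.md"].splitlines()
        for i, line in enumerate(lines):
            if line.startswith("[ ]"):
                lines[i] = line.replace("[ ]", "[x]", 1)
                break
        self.fs["todo.md"] = "\n".join(lines)

    def done(self):
        return "[ ]" not in self.fs.get("todo.md", "[ ]")


def run(harness, plan, max_steps=10):
    """The harness loop: plan once, then iterate until done or out of budget."""
    harness.write_todos(plan)
    steps = 0
    while not harness.done() and steps < max_steps:
        harness.complete_next()
        steps += 1
    return steps


h = Harness()
run(h, ["research", "draft", "edit"])
print(h.fs["todo.md"])
```

The point of the file-backed list is exactly Chase's "step two, three or four of a 200-step process": the loop can be interrupted or its context compacted, and the plan plus progress still live outside the model's context window.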
"Basically, it comes down to letting the LLM write down its thoughts as it goes." He emphasized that harnesses should be designed so models can maintain consistency over longer tasks, with the model deciding when to compact context at points it determines are "advantageous." Additionally, giving agents access to code interpreters and bash tools increases flexibility. And providing agents with skills, rather than just tools loaded from the start, lets them pull in information when they need it. "So instead of encoding everything into one big system message," Chase explained, "you could have a smaller system message: 'This is the core basis, but if I need to do X, let me read the skill for X. If I need to do Y, let me read the skill for Y.'"
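The skills idea — a small core system message, with specialized instructions read in only when a task calls for them — can be sketched as follows. The skill names, their contents, and the trivial keyword routing are all made up for illustration; in a real harness the model itself would decide which skill file to read.

```python
# Hedged sketch of on-demand "skills": instead of one large system message,
# the agent starts from a small core prompt and loads skill text only when
# the task needs it. Skill names and routing logic here are illustrative.

CORE_PROMPT = "You are a helpful agent. Load a skill file before specialized work."

SKILLS = {  # stand-in for skill files on the agent's virtual file system
    "sql": "SKILL sql: write parameterized queries; never interpolate user input.",
    "plotting": "SKILL plotting: label axes; prefer one chart per question.",
}


def build_context(task: str) -> str:
    """Assemble the prompt: core message plus only the skills the task needs."""
    parts = [CORE_PROMPT]
    for name, text in SKILLS.items():
        if name in task.lower():  # trivial routing; an LLM would decide this
            parts.append(text)
    return "\n\n".join(parts)


print(build_context("Write a SQL report"))  # core prompt + sql skill only
```

The design choice mirrors Chase's framing: the context stays small by default, and detail is paid for only when a step actually requires it.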
Basically, context engineering is a "really fancy" way of asking: what is the LLM looking at? Because that's different from what developers see, he noted. When human developers analyze agent traces, they can put themselves in the "mindset" of the AI and answer questions like: What is the system message? How is it created? Is it static or dynamically populated? What tools does the agent have? When a tool is called and returns a response, how is that response presented? "When agents make mistakes, they do it because they don't have the right context; when they succeed, they do it because they have the right context," Chase said. "I think context engineering is getting the right information, in the right format, into the LLM at the right time." Listen to the podcast to learn more about:
How LangChain built its stack: LangGraph as the core pillar, LangChain in the middle, Deep Agents on top.
Why coding sandboxes will be the next big thing.
How a different type of UX will evolve as agents run over longer horizons (or continuously).
Why traces and observability are essential to creating an agent that really works.
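On the context-engineering point above — what exactly the LLM is looking at — a toy message-assembly sketch can make the questions concrete. The shapes below mirror common chat-API conventions, but the function names and truncation policy are invented for the example, not any specific SDK's behavior.

```python
# Toy sketch of "what the LLM sees": the prompt is an ordered list of
# messages (system prompt, user task, tool results), and context engineering
# decides what goes in and in what format. How a tool response is presented
# matters: here we compact the JSON and truncate it so one verbose result
# can't flood the context window. Names and limits are illustrative.

import json


def render_tool_result(name: str, raw: dict, max_chars: int = 200) -> str:
    """Present a tool response to the model: compact JSON, capped length."""
    text = json.dumps(raw, separators=(",", ":"))
    return f"[{name}] {text[:max_chars]}"


def build_messages(system: str, task: str, tool_results: list) -> list:
    msgs = [
        {"role": "system", "content": system},
        {"role": "user", "content": task},
    ]
    for name, raw in tool_results:
        msgs.append({"role": "tool", "content": render_tool_result(name, raw)})
    return msgs


msgs = build_messages(
    "You are a research agent.",
    "Summarize today's weather.",
    [("weather_api", {"temp_c": 12, "sky": "overcast", "wind_kph": 20})],
)
for m in msgs:
    print(m["role"], "->", m["content"])
```

Inspecting a structure like `msgs` is what reading an agent trace amounts to: it answers, message by message, whether the model had the right context when it made a call.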
You can also listen and subscribe to Beyond the Pilot on Spotify, Apple Podcasts, or wherever you get your podcasts.