AI Agents Will Never Have Full Context

The most entertaining outcome of this AI revolution is best described in a cheeky tweet from Joe Weisenthal, a popular journalist and podcast host for Bloomberg.

I know Joe was only half serious in this specific tweet, but he is not the only voice out there pushing the narrative that AI will automate software engineers out of a job. In a recent Dwarkesh Podcast with Dario Amodei, CEO of Anthropic, Dwarkesh claimed that AI agents have an inherent advantage in coding versus other domains due to the amount of context available in codebases. While I am a huge fan of Joe and Dwarkesh, anyone who has been involved in a software engineering project that went live will tell you that full context WILL NEVER be easily accessible to an AI agent.

What does “full context” mean?

“Full context” means that the AI agent can implement the described solution based on the set of information and tools at its disposal. Every decision the agent makes during the end-to-end build must have a clear, correct answer that matches the user’s intentions and the defined acceptance criteria.

ChatGPT Without Full Context

While most agentic AI systems are way more complex than your average ChatGPT interaction, we can start with ChatGPT as a simple example.

Enter the following text into ChatGPT (or a similar tool): “My stove is on fire.”

Here is a summary of ChatGPT’s response*:

  1. Turn off the heat
  2. Smother the fire
  3. Do not use water
  4. Leave it covered
  5. If the fire is spreading or you feel unsafe, get out immediately and call 911.

While ChatGPT’s response seems reasonable, it did not have any context on the following key items:

  • Was anyone else in my building?
  • What is the average response time for emergency services to arrive at my building?
  • Are fires prone to spreading quickly in my environment (e.g., California)?
  • Am I disabled and in need of assistance to vacate the building?
  • Where is the nearest fire extinguisher or fire blanket?
  • What is the most efficient way to alert everyone in my building that there is a fire?
  • Was 911 the correct emergency response phone number?
  • Was I using a VPN that obscures my current location?

ChatGPT did not automatically have the access needed to answer any of the questions above correctly. It immediately assumed that I was living in the USA and that my intention was to receive help because I was in danger. It then gave me the most likely correct information based on the incomplete context it received, my chat history, the data the model was trained on, and perhaps my current location. Having seen ChatGPT miss the full context in the fire scenario, let’s see how an AI agent would handle a moderately complex software engineering effort.

*The full response it gave me is here

Software Engineering Without Full Context

On a recent data engineering project, we were tasked with automating the calculation of 10 supply chain metrics to provide oversight and track the efficiency of a supply chain process. At the start of the project, we were given a lengthy document describing the business logic required to calculate the metrics, a data dictionary describing some of the data attributes involved, and a business process model describing where humans would interact with the process. If a business user had fed this information into an AI agent and given the agent access to every data source required, it would have built a fully functional data engineering pipeline. The only problem is that the metrics it calculated would have been garbage.

The context we received at the beginning of the project did not reveal that some of the SQL views feeding our pipeline were built incorrectly. None of the data sources had data model documentation, and the data dictionaries we received were out of date. Most of the key decisions we made during development depended on other teams’ timelines, which we could only learn by talking to the people involved. The file types of the data sources feeding our pipeline changed throughout the project. Some of the data sources tracked information that changed over time, but the SQL database stored only the latest values.
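That last pitfall, a source that overwrites history, is worth making concrete. Here is a minimal sketch in Python with hypothetical shipment data and column names (nothing from the actual project): a duration metric that is trivial to compute from the full event history becomes impossible to compute once the source keeps only the latest row per record, and nothing in the table itself tells an AI agent that history was ever lost.

```python
from datetime import date

# Full event history for one shipment (hypothetical data).
events = [
    {"shipment_id": "S1", "status": "ordered",   "updated_on": date(2024, 1, 1)},
    {"shipment_id": "S1", "status": "shipped",   "updated_on": date(2024, 1, 4)},
    {"shipment_id": "S1", "status": "delivered", "updated_on": date(2024, 1, 9)},
]

# A latest-state-only table keeps just the most recent row per shipment;
# each new status overwrites the previous one.
latest_only = {e["shipment_id"]: e for e in events}  # later rows win

def transit_days(history):
    """Metric: days from 'shipped' to 'delivered', from full history."""
    shipped = next(e["updated_on"] for e in history if e["status"] == "shipped")
    delivered = next(e["updated_on"] for e in history if e["status"] == "delivered")
    return (delivered - shipped).days

print(transit_days(events))  # 5 -- computable only because history survived

# In the latest-only table, the 'shipped' timestamp is gone entirely.
row = latest_only["S1"]
print(row["status"], row["updated_on"])  # delivered 2024-01-09
```

An agent handed only the latest-only table would either fail or, worse, quietly substitute a plausible but wrong proxy (say, the order date) and produce a metric that looks fine and is garbage.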

Almost every blocker we encountered on this project had nothing to do with our ability to write code. Acquiring the necessary context to decide which code to write was far more challenging because information is always biased, incomplete, and/or inaccessible.

Conclusion

AI agents cannot read minds. They cannot understand an inside joke. They do not know which of the 10 different “product name” columns is the correct one to use on your project. The nonverbal, omitted, or inaccessible information is often more important than what is available to an AI agent in the digital world. Closing the distance between the surface-level brief and the full context needed to ship code is less like stepping over a crack in the sidewalk and more like staring across the vast chasm of the Grand Canyon.

~ The Data Generalist
Data Science Career Advisor

