What are Agentic AI systems?
These are “autonomous software systems that perceive, reason, and act in digital environments to achieve goals on behalf of human principals, with capabilities for tool use, economic transactions, and strategic interaction. AI agents can employ standard building blocks, such as APIs, to communicate with other agents and humans, receive and send money, and access and interact with the internet” (as defined in an MIT article). With improvements in robotic systems built on a similar technology stack, agentic systems are envisioned to power a variety of use cases as software, hardware, and hybrid systems such as robots and autonomous vehicles.
Agentic AI, agent orchestration, and agents are linchpin areas of innovation building upon the adoption of basic LLMs as “purveyors” of intelligence. Agentic AI, as a systems engineering approach, aims to deliver a reliable working framework for building autonomous systems. In this brief essay, I put this “new” framework in the context of the history of building AI systems since the late 1950s.
AI as a field has always aimed to build “autonomous systems”, whether as standalone software or as embodied hardware entities. In the early years, the focus was on understanding various aspects of “mind”, “intelligence”, and “reasoning”, and on building computational models of the same. Two main themes emerged, namely symbolic AI and connectionist AI, each with a strong area of application. Symbolic AI focused on “reasoning”, whereas connectionist AI outperformed in “perception-driven” applications. It is important to note that between these two lay a whole range of models for learning, primarily driven by statistical reasoning, which led to advances in and applications of data science. Advances in vision and perception processing with neural nets led to many improved applications in robotics, including autonomous driving agents. By the late 1990s and early 2000s, with advances in computing power and the growth of the internet, the field of distributed agents had its first avatar. Building agents was still a cumbersome engineering activity: distributed software technologies were still evolving, in the early days of Java and big data, and the software technology for building large-scale data-driven systems was still immature. Many fundamental lessons in building agent systems were learnt that are still relevant today, and I discuss them below.
In the early 2000s and 2010s, with advances in deep learning and improvements in compute and big data, statistical and connectionist approaches performed well on many basic problems in NLP and vision. As the independent “skills” of an agent improved and LLM technologies performed adequately for “perception processing” (language, vision, and speech), it became possible to build standalone agents with reasonable performance, which is where we are today. Agentic AI system technology is in its second avatar. A few points about terminology need to be understood in this avatar. On “single agent” versus “multi-agent”, many notions exist: a single LLM with a single workflow is a single agent, whereas a single LLM with multiple workflows is often called a multi-agent system. The extant literature is unclear on what to call a single workflow that invokes multiple LLMs, each capturing a different type of expertise. This is quite different from the traditional notion of single versus multiple agents, which depends on how “knowledge and work” get decomposed. Many of these terminologies are still evolving.
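The terminological distinction above can be made concrete in a few lines of code. This is a minimal sketch with stubbed model calls; the names (`Agent`, `fake_llm`) and the wiring are illustrative assumptions, not any particular framework’s API:

```python
from dataclasses import dataclass

# A stubbed "LLM" call so the sketch is self-contained.
def fake_llm(prompt: str) -> str:
    return f"answer({prompt})"

@dataclass
class Agent:
    """One locus of control: a workflow of steps, possibly calling models."""
    name: str
    steps: list  # ordered callables forming the workflow

    def run(self, task: str) -> str:
        result = task
        for step in self.steps:
            result = step(result)
        return result

# A single agent: one workflow driving one model.
summarizer = Agent("summarizer", steps=[fake_llm])

# A "multi-agent" arrangement in the looser sense used above:
# independent workflows, each its own locus of control,
# coordinated by a top-level script.
extractor = Agent("extractor", steps=[lambda t: f"facts({t})"])
reviewer = Agent("reviewer", steps=[lambda t: f"checked({t})"])

def pipeline(doc: str) -> str:
    return reviewer.run(extractor.run(doc))
```

Under the stricter, decomposition-based definition, what matters is not the number of model calls but how many independent loci of control exist.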
What has changed in the current iteration of AI technology?
Deep learning/LLM-based AI systems of today have brought a few essential changes that help realize the vision of agentic AI systems:
1) Scalable processing of “perceptual inputs” and generation of lifelike outputs, for language, speech, and vision processing. This scales with data on the modern GPU-driven tech stack.
2) The ability to process “perceptual signals” aims to facilitate “adaptivity” more easily: adaptation to changes in the agent’s environment, whether physical or digital, and adaptation to changes in the agent’s interactions with other agents in its context.
3) This enhanced “perceptual” adaptivity promotes “protocol flexibility” and natural-language expressiveness when agents communicate with humans in the loop.
4) A faster data-driven learning loop, along with a continual learning loop, aims to make this adaptation near real time.
The belief is that the reasons above, along with recent efforts to integrate LLMs with “symbolic reasoning” systems of different kinds (via reinforcement learning), will facilitate the successful implementation of highly autonomous agents in this second avatar. The learning-driven architecture overcomes some of the engineering issues experienced in the first avatar. However, a few major hurdles remain even in this avatar, which we discuss below after a brief overview of why we want to build agentic systems.
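The adaptivity described in points 2) to 4) boils down to a perceive, decide, act, learn loop. The sketch below is a toy illustration only; the function names and the simple exponential-average update are placeholders I have chosen, not a real learning algorithm:

```python
# Toy perceive -> decide -> act -> learn loop.
def perceive(env: dict) -> float:
    return env["signal"]

def decide(obs: float, model: dict) -> str:
    return "act_a" if obs > model["threshold"] else "act_b"

def act(action: str, env: dict) -> None:
    env["last_action"] = action

def learn(model: dict, obs: float) -> None:
    # Drift the internal model toward recent observations
    # (a stand-in for the data-driven/continual learning loops above).
    model["threshold"] = 0.9 * model["threshold"] + 0.1 * obs

env = {"signal": 0.8}
model = {"threshold": 0.5}
for _ in range(3):
    obs = perceive(env)
    act(decide(obs, model), env)
    learn(model, obs)
```

The point of the skeleton is that perception, decision, action, and model update are separable concerns; each can be swapped for a learned component without changing the loop.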
Why do we need Agentic Systems?
Primary reasons for building agentic systems include:
1. Expertise is distributed across multiple areas of knowledge. As each area of knowledge evolves at its own rate, and rather independently, multi-agent systems model this scenario adequately. This also facilitates distributed learning when knowledge and learning events are distributed in space and time.
2. Even if expertise can be encapsulated in a single agent, scaling to meet demand and improve task throughput requires the ability to execute multiple instances of the same agent concurrently.
3. Finally, multi-agent systems aim to improve resource utilization by orders of magnitude.
On Implementing Agentic Systems
Building agentic systems relies on two key layers –
1. The infra and systems layer: distributed systems engineering, including distributing compute and storage across a network, scaling, and messaging (communications and protocols). Compute includes sensor processing, where external/perceptual signals such as audio and video are processed, as well as “actuator” control to interact with the environment.
2. The knowledge layer, which addresses the following: How is knowledge distributed across the network? Which nodes carry what knowledge? How is “work” carried out, and in what order? How is that order defined or planned? How are decisions taken to react to internal and external events/stimuli? What are the “semantics” of the messages between the knowledge subsystems in the network? How is failure handled at the knowledge level? How is this knowledge updated: continually? Periodically? How are multiple streams of work coordinated simultaneously (for example, processing visual and language input concurrently)? Design decisions here determine whether a system is a single agent or multi-agent.
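One way to see the two layers is in how an inter-agent message is structured: the transport envelope belongs to the systems layer, while the meaning of the payload belongs to the knowledge layer. The sketch below is illustrative; the `Envelope` and `Performative` names are my own, loosely echoing classic agent communication languages, and `send` stands in for a real transport:

```python
import json
from dataclasses import dataclass, asdict

# Systems-layer concern: getting bytes from A to B reliably.
@dataclass
class Envelope:
    sender: str
    receiver: str
    payload: str  # opaque to the transport

# Knowledge-layer concern: what the message *means* to the agents.
@dataclass
class Performative:
    act: str       # e.g. "inform", "request", "propose"
    content: dict  # domain-level content

def send(env: Envelope) -> str:
    # Stand-in for a real transport (message queue, RPC, ...).
    return json.dumps(asdict(env))

msg = Performative(act="request", content={"task": "locate missing hiker"})
wire = send(Envelope("search-agent", "rescue-agent", json.dumps(asdict(msg))))
```

Keeping the payload opaque to the transport is exactly what lets the two layers evolve independently.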
Engineering an agentic system involves making architectural choices in both layers to address the problem at hand in a given domain. Distributed systems engineering issues are well understood: the CAP theorem, the thundering herd problem, network protocol and layering issues, clock, timing, and communication-speed issues, and storage mirroring, replication, and update issues (such as ensuring ACID properties). However, design issues at the knowledge layer remain to be figured out. Many of these were identified by the multi-agent systems research community in the early 2000s. Knowledge-layer architectural problems to be solved include:
1. How should the domain knowledge be distributed? Which are the central nodes and which are the edge nodes? What “semantic” knowledge goes in each? How, and what, does knowledge propagate from the edge to the core and vice versa? How should procedural and declarative knowledge in a domain be modelled computationally, and to which node or set of nodes should it be linked? The architecture of the knowledge decomposition, and the responsibility for processing with that knowledge, defines a single agent or multi-agent system. A single agent does one “knowledge thingy”, though it may be implemented in a distributed manner in the systems layer. A multi-agent system has “single agents”, each with different areas of expertise, who collaborate and coordinate to achieve an objective. For clarity, a “search agent” (searching for missing people) is a single agent, whereas a “search and rescue” system is a multi-agent system. The search agent may run multiple concurrent queries against different databases, update multiple records, and so on, but it is responsible for one overall “locus of control”. In the search-and-rescue system, alongside the search agent, the rescue agent brings a different kind of expertise: how to reach the missing individual once located and identified, what resources are needed to access and rescue the individual, and so on. Two independent threads of control execute the whole search-and-rescue mission.
2. In a domain, there may be multiple problems to be solved. How do we decompose a problem into sub-steps and organize knowledge to solve them? What skills are common across different problem types? How should we decompose and distribute different chunks of expertise? What common problem-solving metaphors should we use? How should “static” problems, such as analyzing history to provide explanations, be solved? How should “dynamic” problems that require agent adaptivity be solved? How does an agent recover from failures, whether at the knowledge level or in the underlying systems layer?
3. How should “past” knowledge be retrieved and re-used efficiently in any “knowledge node”? What policies should be used to update internal knowledge? How should we process external knowledge? What should be kept? What should be processed? What should be discarded? How should internal memory be maintained? How should the agent forget things?
4. How do we link and integrate the perceptual and symbolic systems to facilitate seamless reasoning? How do we handle open-ended issues? How do we ensure that agent reasoning and actions are well bounded in time and space, so that an agent either completes a task or fails explicitly?
5. What are the performance requirements on each knowledge node that allow the nodes to interplay with each other?
6. How is adaptivity modelled? What does the agent recognize in the environment, interpret, and reconcile with its internal model? What actions are recommended based on this internal model? How many samples does the agent need to learn successfully (this varies by problem)? How does it know learning has succeeded versus failed (“I think I know but I don’t know” versus “I am not even aware that I do not know”)? Does it need new abstractions to learn? How does the agent know it needs a new abstraction? How does an agent know it has learnt the wrong thing? How does it delete and forget the wrong learning?
7. How do we handle and model issues of goals, beliefs, desires, and intentions in an agent? How do we address issues of bias and ethics, and model higher centers of human cognition in agents? Do we even need these? Do we need kill switches, and how would they work? If the agent is “robotic”, we also need to address issues of power supply, failure-mode detection and recovery, and more.
8. Once single-agent knowledge structures are defined, we need to design the collaboration and coordination process. Do agents work on a common “output” entity, shaping the solution together, or does one agent “assemble” a final solution from the inputs of independent agents? What global and local knowledge does an agent need to make its contribution? What happens when a specific knowledge agent fails? How does problem solving progress? How do we resolve conflicts between agents?
9. Interaction between the agent collective and human experts also needs to be designed. What should the inputs and outputs between agents and humans be? How should they be presented? How should human inputs be “interpreted”? What if some human inputs are erroneous? What if the agent is deliberately attacked?
10. Testing, evaluation, and validation of agent collectives is a complex task. How do we define test cases, identify all failure modes, and perform evals? How do we calibrate agents? What are the requisite production-deployment readiness criteria in mission-critical scenarios? What are the performance metrics? What happens if things fail in production, and how do we recover? How do we secure the agentic system? What are the “self-protection” and self-healing approaches?
11. Finally, a big area to design for is knowledge and system updates at both layers of the system. How do we discover new knowledge and adapt? How is relevance determined? How does the agent know that adapting to new knowledge will not harm it (like a bacterial invasion of a human body: can it self-recover)? How can the systems layer be modified incrementally to assimilate new technology and subsystems, without loss of performance or a full redo?
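The coordination choice in item 8, and the search-and-rescue decomposition in item 1, can be sketched with a toy “blackboard”: a shared structure that each agent reads and writes in turn. Everything here is illustrative; a real system would add concurrency control, real perception, and recovery logic rather than a bare exception:

```python
# Toy blackboard coordination for the search-and-rescue example:
# each agent reads the shared state, contributes its expertise, and
# fails explicitly when a precondition is not met.
blackboard = {"target": "hiker-42", "located_at": None, "rescued": False}

def search_agent(bb: dict) -> None:
    # Pretend sensor fusion / database queries produced a position fix.
    bb["located_at"] = ("47.60N", "122.33W")

def rescue_agent(bb: dict) -> None:
    if bb["located_at"] is None:
        # Explicit, bounded failure rather than open-ended retrying.
        raise RuntimeError("cannot rescue before the target is located")
    bb["rescued"] = True

for step in (search_agent, rescue_agent):
    step(blackboard)
```

Here the agents shape a common output entity; the alternative design, one assembler agent collecting independent results, would replace the shared blackboard with per-agent return values.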
Most online discussions of agent systems conflate the design decisions in these two layers, leading to much unwanted confusion. In a wide variety of agent systems, most of the decisions in both layers are made by software engineers; they are pre-configured and cannot be changed at runtime by the agent. The “level” of autonomy is highly variable. Measuring autonomy needs a performance metric, which is yet to be defined by the community. As things stand, one can safely say that current agentic systems are at the lower end of any such scale. The design space for building agentic systems is rather large, with different design choices leading to differing operational performance. A simple analogy is to look at Mother Nature and the number of “living” species across the environmental spectrum: niche, common, diverse, specialized entities at scale.
Coordination of “agents” at a simpler level of abstraction has been demonstrated in workflow systems built over the past couple of decades. Collaboration has been demonstrated in modern groupware tools such as meeting tools, document editors, and sharing and messaging systems. Current agentic orchestration platforms are at a similar level of capability, albeit with a different toolset. From a knowledge-modeling perspective, it is slowly being realized that connectionist learning has its own limitations. Activating “years” of human learning in an autonomous system in any given domain may require integration with human-centric symbolic systems. (Token “xyz” makes no sense in isolation: different LLMs have their own embeddings, so they cannot even communicate amongst themselves. We need some intermediary language to solve this agent babel, much as “PNG”, the portable network graphics format, did for images.) Current systems without humans in the loop are “symbol-weak”. Overall, the engineering maturity to build and deploy agentic systems is still developing, and development and production costs are rather high compared with a human-only system for many tasks. These autonomous systems are also extremely complex to maintain and carry large societal costs, though they promise numerous economic benefits that may take a while to realize. We do not know all the side effects, nor can we imagine all the cons, of building and deploying such systems at scale. The road ahead looks interesting, with its own perils and promises!
Concluding Remarks
There is no free lunch. Agentic systems made up of LLM-driven agents are a logical step in our collective effort to build autonomous systems. As this essay highlights, many core problems identified in the past still remain to be solved in a principled manner. Terminologies are still very fluid. Issues in distributed systems development, and how to “organize” and engineer intelligence, are still open problems. Many of these are being re-discovered by a newer generation of builders. We believe revisiting a bit of history, and how these issues were addressed or dealt with, may be beneficial for building modern agentic systems. Imagining agentic AI systems is easy, inspired by how a collection of humans would solve any given problem; realizing that collective in terms of modern technology is a tall order!
