January 31, 2025
In the last article, we looked at Retrieval Augmented Generation (RAG), a system that helps an LLM provide accurate answers based on a given context. Today, we're looking at AI agents. They take the answer and actually do something with it, like making decisions, executing tasks, and coordinating multiple steps to achieve a goal.
With AI agents, we can use LLMs to accomplish end-to-end tasks. A RAG-enhanced LLM can help answer questions by pulling in relevant information, but an AI agent can take that information further: analyzing it, acting on it, and coordinating with other agents or systems when needed.
The concept of agents, where a software program can accept input and take actions based on rules, has existed for years. However, with AI agents, we're seeing something new: the ability to interpret context without predefined rules, adapt decisions on the fly, and learn from each interaction. AI agents are not just bots working within a fixed set of rules; they are systems capable of making complex decisions in real time.
AI agents are software applications that use LLMs to perform specific tasks autonomously. They're handy for tasks that demand autonomy and complex decision-making. They'll be especially helpful in dynamic environments where interactions could benefit from automation or the workflow involves multiple steps.
In its Mastering AI Agents guide, Galileo.ai highlights the following types of agents: Fixed Automation, LLM Enhanced, ReAct, ReAct + RAG, Tool Enhanced, Self-Reflecting, Memory Enhanced, Environment Controllers, and Self Learning. Below is a description of each:
1. Fixed-Automation Agent
This type of AI agent is the simplest and most rigid form. These agents don't think or adapt—they simply execute pre-programmed instructions. They are efficient but inflexible, like assembly-line workers, which makes them well suited to repetitive tasks.
Their best use cases are routine tasks with minimal need for adaptability, such as invoice processing.
2. LLM-Enhanced Agent
These agents leverage LLMs to provide contextual understanding while operating within strict boundaries. LLM-Enhanced Agents combine intelligence and simplicity, making them efficient for high-volume, low-complexity tasks. Because they are rule-constrained, every decision is validated against predefined rules, and they maintain no long-term memory.
The best use cases for LLM-Enhanced agents are tasks requiring flexibility with ambiguous inputs, such as email filters or AI-enhanced content moderation.
The workflow starts with the agent using LLM capabilities to analyze and understand the input context. Next, its analysis passes through rule-based constraints to keep the agent within defined boundaries and produce an output.
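As a rough sketch in Python, that workflow might look like the following. The `llm()` function here is a placeholder for a real model call, and the keyword matching inside it is purely illustrative; the point is the shape of the pipeline, where the LLM's analysis is checked against a fixed rule set before anything is emitted.

```python
# Minimal sketch of an LLM-enhanced agent: the LLM interprets ambiguous
# input, then hard-coded rules constrain the final output.

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a model API here.
    text = prompt.lower()
    if "refund" in text or "money back" in text:
        return "billing"
    return "general"

ALLOWED_LABELS = {"billing", "general"}  # the rule-based boundary

def classify_email(body: str) -> str:
    label = llm(f"Classify this email: {body}")
    # Rule constraint: never emit a label outside the approved set.
    return label if label in ALLOWED_LABELS else "general"

print(classify_email("I want my money back"))  # billing
```

The constraint step is what keeps this agent "simple": however creative the model's analysis is, the output space stays fixed and auditable.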
3. ReAct Agent
ReAct agents combine Reasoning and Action to perform tasks. They break complex tasks into manageable steps, using strategic thinking and multi-step decision-making to reason through problems and act dynamically based on their analysis.
These agents mimic human problem-solving by thinking through a problem before executing the next step. They handle multi-step workflows by breaking them into smaller, actionable parts.
The best use cases for ReAct agents are in strategic planning situations and multi-stage queries. Unlike simpler agents, this kind can loop between thinking and acting repeatedly until the desired outcome is achieved before producing its final output. It is like a problem solver that keeps adjusting its approach.
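The think-act loop can be sketched as follows. Both `reason()` and `act()` are toy stand-ins (a real agent would delegate reasoning to an LLM and acting to tools); what matters is the alternation between deciding on an action and observing its result until the reasoner declares the task done.

```python
# Sketch of the ReAct loop: alternate a reasoning step (which picks an
# action) with an acting step (which produces an observation) until the
# reasoner decides it is finished.

def reason(goal: str, observations: list[str]) -> str:
    # Placeholder reasoner: search first, then answer from what was found.
    if not observations:
        return "search"
    return "finish"

def act(action: str, goal: str) -> str:
    if action == "search":
        return f"search results for '{goal}'"
    return ""

def react(goal: str, max_steps: int = 5) -> list[str]:
    observations: list[str] = []
    for _ in range(max_steps):  # hard step cap so the loop always ends
        action = reason(goal, observations)
        if action == "finish":
            break
        observations.append(act(action, goal))
    return observations

print(react("population of Lisbon"))
```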
4. ReAct + RAG Agent
These agents combine reasoning and action with real-time access to external data sources. ReAct + RAG agents are ideal for precision-critical tasks. They employ a RAG workflow, combining LLMs with external knowledge sources for enhanced context and accuracy.
They are grounded in real-time or domain-specific knowledge and use reasoning to break down tasks and dynamically retrieve information. Their best use cases are in domain-specific applications or tasks needing real-time updates, such as legal research tools, technical troubleshooting, or medical assistants that reference clinical studies.
ReAct + RAG agents are like problem solvers who think, act, and fact-check against reliable sources.
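A minimal sketch of that fact-checking step, assuming a tiny in-memory corpus in place of a real vector store and keyword matching in place of embedding search:

```python
# Sketch of a ReAct + RAG step: the agent retrieves external knowledge
# and grounds its answer in what it retrieved, refusing to answer when
# nothing relevant is found.

CORPUS = {
    "gdpr": "GDPR is an EU data-protection regulation effective 2018.",
    "hipaa": "HIPAA governs US health-data privacy.",
}

def retrieve(query: str) -> str:
    # Naive retrieval: pick the document whose key appears in the query.
    for key, doc in CORPUS.items():
        if key in query.lower():
            return doc
    return ""

def answer(query: str) -> str:
    context = retrieve(query)      # act: fetch external knowledge
    if not context:
        return "I don't know."     # refuse rather than hallucinate
    return f"Based on the source: {context}"

print(answer("When did GDPR take effect?"))
```

The explicit "I don't know" branch is the precision-critical part: the agent only answers when it can cite retrieved material.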
5. Tool-Enhanced Agent
These agents are versatile problem solvers that leverage APIs and databases and integrate software and multiple tools to handle complex, multi-domain workflows. They can retrieve, reason, and execute to achieve seamless, dynamic task completion.
After an initial reasoning phase, a Tool-Enhanced agent selects the appropriate tool for the task. By integrating diverse tools, these agents can automate repetitive processes. Their best use cases are jobs requiring diverse, complex, or multi-stage automation; examples include code-generation tools like GitHub Copilot.
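The tool-selection step can be sketched as a registry of callables plus a router. The `route()` function below uses a keyword heuristic as a stand-in for the LLM decision a real agent would make:

```python
# Sketch of a tool-enhanced agent: a registry of callable tools and a
# routing step that picks one based on the request.

def calculator(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))  # toy only; never eval untrusted input

def word_count(text: str) -> str:
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_count": word_count}

def route(request: str) -> str:
    # Placeholder router: keyword heuristic instead of an LLM decision.
    return "calculator" if any(c.isdigit() for c in request) else "word_count"

def run(request: str) -> str:
    tool = TOOLS[route(request)]   # select the appropriate tool
    return tool(request)           # execute it

print(run("2 + 3"))  # routes to the calculator
```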
6. Self-Reflecting Agent
A Self-Reflecting Agent thinks about its thinking. It exhibits meta-cognition, which means it evaluates its own thought processes and decision outcomes. It can analyze its reasoning, assess its decisions, and learn from its mistakes. Furthermore, it reflects on its performance after each execution. Thus, Self-Reflecting Agents can solve tasks, explain their reasoning, and improve over time.
This type of agent is best suited for tasks that require accountability and continuous improvement, such as quality assurance tasks.
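The reflect-and-revise cycle can be sketched as below. Here `draft()`, `critique()`, and `revise()` are trivial stand-ins for LLM calls (the "flaw" is just a missing period), but the control flow is the meta-cognitive loop: produce, evaluate your own output, and revise until the critique passes.

```python
# Sketch of a self-reflection loop: draft, critique own output, revise.

def draft(task: str) -> str:
    return f"Answer to {task}"      # first attempt, deliberately flawed

def critique(text: str) -> str:
    # Placeholder self-evaluation: a real agent would ask an LLM to
    # assess its own reasoning and output.
    return "" if text.endswith(".") else "missing period"

def revise(text: str, feedback: str) -> str:
    return text + "." if feedback == "missing period" else text

def solve(task: str, max_rounds: int = 3) -> str:
    text = draft(task)
    for _ in range(max_rounds):
        feedback = critique(text)   # reflect on own output
        if not feedback:
            break                   # critique passed
        text = revise(text, feedback)
    return text

print(solve("Q1"))
```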
7. Memory-Enhanced Agent
These agents act as adaptive personal assistants, delivering personalization by remembering user preferences and tracking interaction history. They maintain historical context and learn over time to provide tailored experiences.
Memory-Enhanced Agents excel at tasks requiring individualized experiences with tailored recommendations, such as customer service bots that track interactions and personalized shopping assistants.
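A bare-bones sketch of per-user memory, with an in-memory dictionary standing in for the persistent store (and summarization into the LLM prompt) a real assistant would use:

```python
# Sketch of a memory-enhanced agent: stated preferences are stored per
# user and drawn on in later turns.

class MemoryAgent:
    def __init__(self):
        self.memory: dict[str, list[str]] = {}  # user -> remembered preferences

    def chat(self, user: str, message: str) -> str:
        history = self.memory.setdefault(user, [])
        if message.startswith("I like "):
            history.append(message[len("I like "):])  # remember the preference
            return "Noted."
        if message == "recommend":
            if history:
                return f"Since you like {history[-1]}, try more of it."
            return "Tell me what you like first."
        return "OK."

agent = MemoryAgent()
agent.chat("ada", "I like jazz")
print(agent.chat("ada", "recommend"))  # personalized reply
```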
8. Environment Controller Agents
These agents extend beyond decision-making and interaction. Environment Controllers can actively manipulate and control environments in real time. They can perform tasks that influence the digital or physical world, making them ideal for applications in automation, robotics, and adaptive systems.
Environment Controller Agents are adaptive and scalable and can adjust to changing conditions. They employ autonomous learning to refine models and processes based on data, feedback, or environmental changes without manual updates. Such features make them best suited for cutting-edge research and autonomous learning systems.
9. Self-Learning Agents
These agents represent the holy grail of AI. Self-Learning Agents can learn, adapt, and evolve without constant human intervention. They improve themselves over time, learning from new environments and interactions.
Self-Learning Agents combine memory, reasoning, environment control, and self-reflection with autonomous learning capabilities. These capabilities enable them to adapt and optimize their behavior, potentially making them the future of AI.
This last type of agent is best suited for cutting-edge research and autonomous learning systems, such as AI agent swarms, autonomous robotics, and financial prediction models.
The main thing to note about these AI agents is that there is no "one-size-fits-all" solution. The key is to match user needs with the right agent type by determining where each type excels.
Still, users need to gauge whether a task calls for an AI agent at all. AI agents are beneficial when tasks require autonomy, adaptability, and complex decision-making. They excel in dynamic workflows that involve multiple steps or interactions that can benefit from automation.
These environments include research and data analysis, financial trading, real-time data processing, and software development. AI agents bring many advantages but are not always the best option. Straightforward tasks that occur infrequently or require minimal automation don't need AI agent involvement.
Evaluating AI agents is like monitoring the work of a new employee. You have to make sure they're doing their job correctly. Without regular checks and feedback, it is not easy to trust the information the agents provide. Nonetheless, remember the goal isn't perfection but establishing AI agents that are reliable, measurable, and continuously improving to deliver consistent value across the four dimensions of Technical Efficiency, Task Completion, Quality Control, and Tool Interaction.
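One way to make those four dimensions concrete is a per-run evaluation record. The dimension names come from the discussion above; the fields, thresholds, and pass criteria below are illustrative assumptions, not a standard:

```python
# Sketch of a per-run agent evaluation covering the four dimensions:
# Technical Efficiency, Task Completion, Quality Control, Tool Interaction.

from dataclasses import dataclass

@dataclass
class AgentRunEval:
    latency_s: float        # Technical Efficiency
    task_completed: bool    # Task Completion
    quality_score: float    # Quality Control, 0..1 (e.g., rubric or judge model)
    tool_errors: int        # Tool Interaction

    def passed(self, max_latency: float = 5.0, min_quality: float = 0.8) -> bool:
        # Thresholds are illustrative; tune them per task.
        return (self.latency_s <= max_latency
                and self.task_completed
                and self.quality_score >= min_quality
                and self.tool_errors == 0)

print(AgentRunEval(1.2, True, 0.9, 0).passed())
```

Logging a record like this for every run is what turns "checking the new employee's work" into a measurable, continuous process.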
Some AI agents fail to deliver on their potential for the following reasons:
1. Poorly Defined Task
A well-defined persona or task is essential for agents to perform effectively. So, you should specify each agent's goals, constraints, and expected outcomes beforehand.
2. Poor Evaluation Practices
Evaluation helps identify agent weaknesses and ensures they operate reliably in dynamic environments. However, evaluating an agent's performance can be challenging and needs to be continuous to be effective.
3. LLM Issues
It is essential to steer LLMs towards specific tasks for consistent and reliable performance. Effective steering ensures that AI agents will perform accurately and efficiently.
4. Poor Planning
Effective planning is vital for AI agents to perform complicated tasks. Proper planning enables agents to make informed decisions and execute tasks properly.
5. Reasoning Issues
Reasoning capabilities are crucial for agents to make decisions and solve problems; otherwise, they can't understand complex environments.
6. Scaling Problems
Agents' ability to scale is crucial to handle increased workloads and more complex tasks.
7. Fault Tolerance
Agents must be fault-tolerant to recover from errors and continue to operate effectively.
8. Infinite Looping
Agents need looping mechanisms to perform iterative tasks and refine their actions based on feedback, but they can get stuck in those loops. Clear termination criteria are therefore necessary so an agent can break out of a loop that isn't making progress.
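In practice, that usually means layering several guards. A minimal sketch (the `step` function stands in for one agent iteration, returning a new state and a done flag):

```python
# Sketch of loop guards for an iterative agent: a hard step cap plus a
# no-progress check, so the loop terminates even if the agent keeps
# proposing the same action.

def run_with_guards(step, state, max_steps: int = 10):
    seen = set()
    for _ in range(max_steps):   # criterion 1: hard step cap
        if state in seen:        # criterion 2: state repeated, no progress
            break
        seen.add(state)
        new_state, done = step(state)
        if done:                 # criterion 3: explicit success signal
            return new_state
        state = new_state
    return state                 # best effort once a guard trips

# Example: a step that stops making progress once it reaches 3.
print(run_with_guards(lambda s: (min(s + 1, 3), False), 0))
```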
AI agents offer tremendous potential to businesses and humankind alike. However, users must match the right agent type with the right task. More importantly, some of the higher-level agents like the Self-Learning type require careful monitoring. Without consistent human observation, evaluation, regulation, and oversight, these types of agents could lead us to the negative future that dystopians fear.