Lesson 8: What are AI Agents?
- The Limitation: Why standard models are "trapped in the box."
- Compound AI Systems: Building a team around the model.
- The Brain: Shift from Hard-coded Logic to Probabilistic Reasoning.
- The Architecture: The ReAct Pattern (Reason, Act, Observe).
If an LLM is a brain in a jar, an AI Agent is that brain connected to a body. It moves us from asking questions ("How do I file my taxes?") to delegating tasks ("File my taxes.").
1. The Problem: The "Stuck in the Box" Issue
A standard ChatGPT session has a hard boundary. The AI can generate text, but it cannot affect the world. It can write an email, but it can't click "Send." It can write code, but it can't "Deploy."
To solve this, we stop treating the Model as the whole product and start treating it as just one component in a Compound AI System.
2. Compound AI Systems
A Compound System surrounds the LLM with three critical components:
- Tools (The Hands): APIs that allow the model to interact with software (Calculator, Calendar, SQL Database).
- Memory (The Notebook): A storage layer (like a Vector Database) to remember past actions and user preferences.
- Verifiers (The Editor): Logical checks to ensure the model's output is safe and correct before the user sees it.
Analogy: The LLM is the CEO. The Compound System is the entire company infrastructure that executes the CEO's vision.
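To make the three components concrete, here is a minimal sketch of a compound system in Python. The function names (`call_llm`, `vacation_days_tool`, `verifier`) are hypothetical placeholders, not a real library API; a production system would use a real model endpoint and a vector database instead of these stubs.

```python
# Minimal Compound AI System sketch: an LLM wrapped with a tool, a memory
# store, and a verifier. All functions below are illustrative stand-ins.

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g. an HTTP request to an LLM API)."""
    return "You have 12 vacation days remaining."

def vacation_days_tool(user_id: str) -> int:
    """Tools (The Hands): a fake HR lookup standing in for a real API."""
    return 12

memory: list[str] = []  # Memory (The Notebook): a simple log instead of a vector DB

def verifier(answer: str) -> bool:
    """Verifiers (The Editor): a trivial safety/consistency check before output."""
    return "vacation" in answer.lower() and len(answer) < 500

def answer_question(user_id: str, question: str) -> str:
    days = vacation_days_tool(user_id)              # 1. call a tool
    memory.append(f"{user_id} asked: {question}")   # 2. record the interaction
    draft = call_llm(
        f"Question: {question}\nFact from HR tool: {days} days remaining.\nAnswer briefly."
    )
    if not verifier(draft):                         # 3. check before the user sees it
        return "Sorry, I couldn't produce a reliable answer."
    return draft

print(answer_question("u42", "How many vacation days do I have left?"))
```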
3. The Shift in Control Logic
This is the most technical but most important shift.
- Traditional Software (Hard-Coded): A human developer writes the path: If user says "Weather" -> Call Weather API -> Print Result.
  - Pro: Predictable.
  - Con: Brittle. If the user asks "Is it sunny?", the rigid code might fail because it only looks for the keyword "Weather."
- AI Agents (Probabilistic): We replace the hard-coded logic with the LLM itself.
  - Prompt: "You have a Weather Tool. Decide if you need it."
  - User: "Should I wear a raincoat?"
  - Agent Reason: "Raincoats are for rain. I need to check the weather. Action: Call Weather API."
The Agent decides its own path at runtime based on the goal.
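The sketch below contrasts the two approaches. It assumes a hypothetical `call_llm` function and a toy `get_weather` tool; the hard-coded router fails on the raincoat question, while the agent-style router lets the model decide at runtime.

```python
# Contrast sketch: hard-coded keyword routing vs. letting the model choose.
# `call_llm` and `get_weather` are illustrative placeholders, not a real API.

def get_weather(city: str) -> str:
    return "Rain expected in the afternoon."

# Traditional, hard-coded logic: brittle keyword matching.
def handle_hardcoded(user_message: str) -> str:
    if "weather" in user_message.lower():
        return get_weather("Berlin")
    return "Sorry, I don't understand."

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call; here it always chooses the tool."""
    return "CALL_TOOL get_weather"

# Agent-style logic: the model decides at runtime whether the tool is needed.
def handle_agent(user_message: str) -> str:
    decision = call_llm(
        "You have a tool: get_weather(city). "
        f"User said: '{user_message}'. Reply 'CALL_TOOL get_weather' or answer directly."
    )
    if decision.startswith("CALL_TOOL get_weather"):
        return get_weather("Berlin")
    return decision

print(handle_hardcoded("Should I wear a raincoat?"))  # fails: no 'weather' keyword
print(handle_agent("Should I wear a raincoat?"))      # works: model routed to the tool
```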
4. The Cognitive Architecture: ReAct
How do we teach a text model to behave like an agent? The industry standard is the ReAct pattern (Reason + Act).
It forces the model into a specific loop:
- Reason: The model "thinks" about the user's request. ("The user wants to know if they have enough vacation days.")
- Act: The model chooses a tool to call. (check_hr_database(user_id))
- Observe: The model stops and waits for the tool to run. It reads the output. ("Output: 12 days remaining.")
- Iterate: It looks at the new fact and loops back to Reason. ("12 days is enough for a 5-day trip. I can answer now.")
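Here is a minimal sketch of that Reason-Act-Observe-Iterate loop, using the lesson's vacation-days example. The `call_llm` stub and the `Action:`/`Final:` text format are assumptions for illustration; real ReAct implementations use an actual model and a more robust parser for tool calls.

```python
# Minimal ReAct loop sketch: Reason -> Act -> Observe -> Iterate.
# `call_llm` and `check_hr_database` are hypothetical placeholders.

TOOLS = {
    "check_hr_database": lambda user_id: "12 days remaining",
}

def call_llm(transcript: str) -> str:
    """Placeholder: a real LLM returns either 'Action: tool(arg)' or 'Final: ...'."""
    if "Observation:" not in transcript:
        return "Thought: I need the user's vacation balance.\nAction: check_hr_database(u42)"
    return "Thought: 12 days covers a 5-day trip.\nFinal: Yes, you have enough vacation days."

def react_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = call_llm(transcript)          # Reason: the model thinks in text
        transcript += "\n" + step
        if "Final:" in step:                 # the model decided it can answer
            return step.split("Final:", 1)[1].strip()
        if "Action:" in step:                # Act: parse the chosen tool call
            call = step.split("Action:", 1)[1].strip()      # e.g. "check_hr_database(u42)"
            name, arg = call.rstrip(")").split("(", 1)
            observation = TOOLS[name](arg)                  # Observe: run the tool
            transcript += f"\nObservation: {observation}"   # Iterate: feed the fact back
    return "Step limit reached without a final answer."

print(react_agent("Do I have enough vacation days for a 5-day trip?"))
```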
5. Summary
- Standard LLM: "Here is a policy about vacation days." (Text Gen)
- Compound System: "I searched the database and found you have 12 days." (RAG)
- AI Agent: "I see you have 12 days. I have tentatively blocked your calendar for next week and drafted the request email to your boss. Want me to send it?" (Action)
6. Additional Learning Materials
- AI Agents in 38 Minutes - Complete Course from Beginner to Pro – A comprehensive video guide diving deeper into the architecture and implementation of autonomous agents.