Lesson 12: The AI Tech Stack
- The Ecosystem: Why a model is not a product.
- Layers 1-3: Infrastructure, Models, and Data.
- Layer 4 (Orchestration): The "Architects" (LangChain, AutoGen, LlamaIndex).
- The Vertical Layer: The "Inspectors" (Arize Phoenix, MLflow).
- Layer 5 (Application): The interface.
Building an AI application with just an LLM is like building a car with just an engine. It might make noise, but it won't go anywhere. To build a reliable, scalable system, you need a full AI Stack.
1. The Stack Overview
We divide the ecosystem into five horizontal layers that build on each other, plus one "vertical" layer that monitors everything.
- Layer 5: Application (The Interface)
- Layer 4: Orchestration (The Brain & Logic)
- Layer 3: Data (The Context)
- Layer 2: Models (The Intelligence)
- Layer 1: Infrastructure (The Compute)
- Vertical Layer: Observability (The Monitor)
2. Layer 1: Infrastructure (The Metal)
This is the physical foundation. AI models are computationally heavy; they require GPUs (Graphics Processing Units).
- Cloud (API): You rent inference from a provider (OpenAI, Azure, AWS). Pro: Easy. Con: Data privacy concerns.
- On-Premise / Private Cloud: You host open-source models (like Llama 3) on your own servers. Pro: Privacy. Con: Expensive hardware management.
- Edge AI: Running small models directly on a user's laptop or phone. Pro: Low latency, and data never leaves the device. Con: Limited intelligence, since only small models fit.
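The trade-offs above can be sketched as a simple routing rule. This is a toy decision function for illustration; the name `choose_deployment` and its inputs are hypothetical, not from any real framework.

```python
def choose_deployment(data_is_sensitive: bool, needs_top_model: bool, offline: bool) -> str:
    """Toy decision rule mirroring the Layer 1 trade-offs (illustrative only)."""
    if offline:
        return "edge"            # small model on the user's device
    if data_is_sensitive:
        return "on_premise"      # host open weights on your own servers
    if needs_top_model:
        return "cloud_api"       # rent inference from a provider
    return "cloud_api"           # default: easiest to operate

# Example: a confidential document workflow stays in-house.
print(choose_deployment(data_is_sensitive=True, needs_top_model=False, offline=False))
```

In practice this decision also weighs cost, compliance rules, and expected traffic, but the ordering of concerns (offline first, then privacy, then capability) captures the core trade-off.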
3. Layer 2: Models (The Engines)
This layer provides the raw intelligence. We don't just pick "the best" model; we pick the right tool for the budget.
- Proprietary Models (GPT-4, Claude): High intelligence, high cost. Best for complex reasoning.
- Open Models (Llama, Mistral): The weights are free to use, but you pay for hosting. Best for fine-tuning.
- SLMs (Small Language Models): Tiny models designed for specific tasks (like summarizing text) that run cheaply and fast.
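"The right tool for the budget" can be expressed as cheapest-capable-tier routing. The price and capability table below is hypothetical; real costs and model abilities vary by provider.

```python
# Hypothetical tiers; real per-token prices and capabilities vary by provider.
MODEL_TIERS = {
    "slm":         {"cost_per_1k_tokens": 0.0001, "good_for": {"summarize", "classify"}},
    "open_model":  {"cost_per_1k_tokens": 0.001,  "good_for": {"summarize", "classify", "chat"}},
    "proprietary": {"cost_per_1k_tokens": 0.01,   "good_for": {"summarize", "classify", "chat", "reasoning"}},
}

def pick_model(task: str) -> str:
    """Pick the cheapest tier that can handle the task (the Layer 2 trade-off)."""
    for name, tier in sorted(MODEL_TIERS.items(),
                             key=lambda kv: kv[1]["cost_per_1k_tokens"]):
        if task in tier["good_for"]:
            return name
    raise ValueError(f"No tier supports task: {task}")

print(pick_model("summarize"))   # an SLM suffices for a narrow task
print(pick_model("reasoning"))   # complex reasoning escalates to the top tier
```

Production systems often layer a fallback on top of this: try the cheap model first, and escalate to the expensive one only if the output fails a quality check.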
4. Layer 3: Data (The Business Value)
A generic model knows English, but it doesn't know Your Business. The Data Layer bridges this gap using RAG (Retrieval-Augmented Generation).
- Vector Databases: (Qdrant, Pinecone) Store your PDFs, wikis, and emails as mathematical embeddings.
- Knowledge Graphs: Map relationships between entities (e.g., "Alice" manages "Project X").
- Function: This layer supplies the Context (Lesson 5). Without grounding in your data, the model is prone to hallucination.
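The "R" in RAG boils down to similarity search: embed the query, embed the documents, and return the closest matches. The sketch below uses bag-of-words counts as a stand-in embedding so it runs with no dependencies; a real system would use a learned embedding model and a vector database such as Qdrant or Pinecone.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real systems use learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Alice manages Project X and reports quarterly.",
    "The cafeteria menu changes every Monday.",
]
print(retrieve("who manages project x", docs))
```

The retrieved text is then pasted into the prompt as context, which is what grounds the model's answer in your data rather than its training set.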
5. Layer 4: Orchestration (The Architects)
This is where the logic lives. It manages the "Sense-Think-Act" loop (Lesson 9). These frameworks are the Architects—they define the plan.
- LangChain / Semantic Kernel: The standard glue code for chaining prompts and managing memory.
- Microsoft AutoGen / CrewAI: Frameworks for Multi-Agent Systems. They allow you to spawn specialized agents (e.g., a "Coder" and a "Reviewer") that talk to each other to solve problems.
- LlamaIndex: Specialized for data-heavy agents that need complex query planning over documents.
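At its core, "chaining" means: fill a prompt template, call the model, parse the output, and pass the result along. The hand-rolled sketch below shows the pattern these frameworks automate; `fake_llm` is a stand-in for a real model call, and none of these names come from an actual library.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (a chain would call a hosted or local LLM here)."""
    return f"ANSWER: echo of [{prompt}]"

def make_chain(template: str, llm, parser):
    """Compose the three steps that orchestration frameworks manage for you."""
    def run(**variables) -> str:
        prompt = template.format(**variables)   # 1. fill the prompt template
        raw = llm(prompt)                       # 2. call the model
        return parser(raw)                      # 3. parse/clean the output
    return run

chain = make_chain(
    template="Summarize for {audience}: {text}",
    llm=fake_llm,
    parser=lambda raw: raw.removeprefix("ANSWER: "),
)
print(chain(audience="executives", text="Q3 revenue grew 12%."))
```

Real frameworks add what this sketch omits: memory across turns, retries, streaming, and tool calls, which is exactly why the orchestration layer exists.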
6. The Vertical Layer: Observability (The Inspectors)
You cannot ship a system you can't see inside. Because LLMs are probabilistic (the same input can produce different outputs), they fail in unpredictable ways. This layer acts as a "Black Box Recorder."
- Arize Phoenix: An observability platform for Tracing. It visualizes the entire chain of thought: What did the user ask? What document did we retrieve? Why did the tool fail?
- MLflow: An MLOps platform for Lifecycle Management. It tracks your experiments: Which prompt version performed best? Did the new model update break our accuracy?
The Rule: Orchestrators (Layer 4) do the work. Observability tools (Vertical Layer) watch the work to ensure quality.
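The essence of tracing can be shown in a few lines: wrap each pipeline step so its inputs, outputs, and latency are recorded in order. This in-memory recorder is a conceptual sketch of what tools like Arize Phoenix capture, not their actual API.

```python
import functools
import time

TRACE: list = []   # in-memory "black box recorder" (illustrative only)

def traced(step_name: str):
    """Decorator that records each step's inputs, output, and latency."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "inputs": args,
                "output": result,
                "ms": (time.perf_counter() - start) * 1000,
            })
            return result
        return wrapper
    return decorator

@traced("retrieve")
def retrieve(query: str) -> str:
    return "policy_doc.pdf"          # stand-in for a vector DB lookup

@traced("generate")
def generate(query: str, context: str) -> str:
    return f"Based on {context}: answer to '{query}'"   # stand-in for an LLM call

generate("refund policy?", retrieve("refund policy?"))
print([span["step"] for span in TRACE])   # the chain, in execution order
```

With a trace like this, the question "why did the answer go wrong?" becomes inspectable: you can see exactly which document was retrieved and what the model was actually given.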
7. Layer 5: Application (The Interface)
The top layer is what the user touches.
- Beyond Chat: The best AI apps often have "Invisible AI"—buttons that auto-fill forms or categorize emails without a chat window.
- Integration: Connecting the AI's output to real-world APIs (Slack, Jira, CRM) to deliver the final outcome.
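"Invisible AI" means the classification and routing happen behind a button or a trigger, with no chat window. In this sketch, a keyword table stands in for a model call, and the destination names are hypothetical stand-ins for real integrations (Slack, Jira, a CRM).

```python
# Hypothetical routing table; in production, a model would classify the
# email and the destinations would be real API integrations.
RULES = {
    "invoice": "accounting_queue",
    "bug": "jira",
    "meeting": "calendar",
}

def categorize(email_body: str) -> str:
    """Route an email to a downstream tool with no chat UI involved."""
    body = email_body.lower()
    for keyword, destination in RULES.items():
        if keyword in body:
            return destination
    return "inbox"   # no match: fall through to a human

print(categorize("Found a bug in the checkout flow"))
```

The user never sees the AI; they just notice that tickets file themselves, which is often a better product experience than a chatbot.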