
Lesson 12: The AI Tech Stack

Topics Covered
  • The Ecosystem: Why a model is not a product.
  • Layer 1-3: Infrastructure, Models, and Data.
  • Layer 4 (Orchestration): The "Architects" (LangChain, AutoGen, LlamaIndex).
  • The Vertical Layer: The "Inspectors" (Arize Phoenix, MLflow).
  • Layer 5 (Application): The interface.

Building an AI application with just an LLM is like building a car with just an engine. It might make noise, but it won't go anywhere. To build a reliable, scalable system, you need a full AI Stack.

1. The Stack Overview

We divide the ecosystem into five horizontal layers that build on each other, plus one "vertical" layer that monitors everything.

  • Layer 5: Application (The Interface)
  • Layer 4: Orchestration (The Brain & Logic)
  • Layer 3: Data (The Context)
  • Layer 2: Models (The Intelligence)
  • Layer 1: Infrastructure (The Compute)
  • Vertical Layer: Observability (The Monitor)

2. Layer 1: Infrastructure (The Metal)

This is the physical foundation. AI models are computationally heavy; they require GPUs (Graphics Processing Units).

  • Cloud (API): You rent inference from a provider (OpenAI, Azure, AWS). Pro: Easy. Con: Data privacy concerns.
  • On-Premise / Private Cloud: You host open-source models (like Llama 3) on your own servers. Pro: Privacy. Con: Expensive hardware management.
  • Edge AI: Running small models directly on a user's laptop or phone. Pro: No network latency, and data never leaves the device. Con: Limited intelligence.
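The trade-offs above can be sketched as a simple decision function. This is purely illustrative (the function name and the rules are assumptions for this lesson, not a real API), but it captures the logic a team walks through when choosing where to run inference:

```python
def choose_deployment(data_is_sensitive: bool, needs_offline: bool,
                      task_is_complex: bool) -> str:
    """Toy decision rule mirroring the three infrastructure options above."""
    if needs_offline:
        return "edge"        # small model on the user's device
    if data_is_sensitive:
        return "on_premise"  # open-source model on your own GPUs
    if task_is_complex:
        return "cloud_api"   # rent frontier-model inference from a provider
    return "cloud_api"       # default: the easiest option to operate

# A complex task over sensitive data: privacy wins, so host it yourself.
print(choose_deployment(data_is_sensitive=True, needs_offline=False,
                        task_is_complex=True))   # on_premise
```

In practice the decision also involves cost and team expertise, but privacy and connectivity constraints usually dominate.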

3. Layer 2: Models (The Engines)

This layer provides the raw intelligence. We don't just pick "the best" model; we pick the right tool for the budget.

  • Proprietary Models (GPT-4, Claude): High intelligence, high cost. Best for complex reasoning.
  • Open Models (Llama, Mistral): The weights are free to use, but you pay for the hosting. Best for fine-tuning.
  • SLMs (Small Language Models): Tiny models designed for specific tasks (like summarizing text) that run fast and cheaply.
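"The right tool for the budget" is concrete arithmetic. The sketch below uses hypothetical per-million-token prices (placeholders, not any provider's real pricing) to show why tier choice matters at volume:

```python
# Hypothetical per-million-token prices -- placeholders, NOT real pricing.
PRICE_PER_M_TOKENS = {
    "proprietary": 10.00,  # e.g. a frontier model behind an API
    "open_hosted": 1.00,   # open weights on rented GPUs
    "slm": 0.10,           # small task-specific model
}

def cost_usd(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost of one call: (total tokens / 1,000,000) * price."""
    total = input_tokens + output_tokens
    return total / 1_000_000 * PRICE_PER_M_TOKENS[tier]

# Summarizing a 2,000-token email 10,000 times a day:
daily = 10_000 * cost_usd("proprietary", 2_000, 200)
print(f"${daily:.2f}/day on the frontier model")                      # $220.00/day
print(f"${10_000 * cost_usd('slm', 2_000, 200):.2f}/day on an SLM")   # $2.20/day
```

A two-order-of-magnitude price gap is why high-volume, simple tasks (summarization, classification) get routed to SLMs, while complex reasoning goes to the expensive tier.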

4. Layer 3: Data (The Business Value)

A generic model knows English, but it doesn't know Your Business. The Data Layer bridges this gap using RAG (Retrieval-Augmented Generation).

  • Vector Databases: (Qdrant, Pinecone) Store your PDFs, wikis, and emails as mathematical embeddings.
  • Knowledge Graphs: Map relationships between entities (e.g., "Alice" manages "Project X").
  • Function: This layer provides the Context (Lesson 5). Without it, the model hallucinates (invents plausible-sounding but wrong answers).
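A minimal sketch of the RAG loop: retrieve the most relevant document, then build a prompt around it. Real systems store learned vector embeddings in a database like Qdrant and compare them with cosine similarity; here, simple word overlap stands in for that similarity so the example runs on its own:

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, doc: str) -> float:
    """Stand-in for embedding similarity: fraction of query words in the doc.
    Real RAG systems compare dense vector embeddings instead."""
    q = tokens(query)
    return len(q & tokens(doc)) / len(q)

DOCS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Alice manages Project X and reports to the VP of Engineering.",
]

def retrieve(query: str) -> str:
    """Return the single most relevant document (top-1 retrieval)."""
    return max(DOCS, key=lambda doc: score(query, doc))

def build_prompt(query: str) -> str:
    """Ground the model: answer from retrieved context, not from memory."""
    return (f"Answer using only this context:\n{retrieve(query)}\n\n"
            f"Question: {query}")

print(build_prompt("What is the refund policy?"))
```

The key design point is the prompt: by instructing the model to answer only from the retrieved context, you trade the model's (possibly stale or hallucinated) memory for your business's actual data.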

5. Layer 4: Orchestration (The Architects)

This is where the logic lives. It manages the "Sense-Think-Act" loop (Lesson 9). These frameworks are the Architects—they define the plan.

  • LangChain / Semantic Kernel: The standard glue code for chaining prompts and managing memory.
  • Microsoft AutoGen / CrewAI: Frameworks for Multi-Agent Systems. They allow you to spawn specialized agents (e.g., a "Coder" and a "Reviewer") that talk to each other to solve problems.
  • LlamaIndex: Specialized for data-heavy agents that need complex query planning over documents.

6. The Vertical Layer: Observability (The Inspectors)

You cannot ship a system if you can't see inside it. Because LLMs are probabilistic (non-deterministic), they fail in unpredictable ways. This layer creates a "Black Box Recorder."

  • Arize Phoenix: An observability platform for Tracing. It visualizes the entire chain of thought: What did the user ask? What document did we retrieve? Why did the tool fail?
  • MLflow: An MLOps platform for Lifecycle Management. It tracks your experiments: Which prompt version performed best? Did the new model update break our accuracy?

The Rule: Orchestrators (Layer 4) do the work. Observability tools (Vertical Layer) watch the work to ensure quality.
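The "black box recorder" idea can be shown with a minimal tracing decorator. Platforms like Arize Phoenix capture richer spans via instrumentation; this sketch (an illustration, not their API) just records what a span contains: the step's name, inputs, output, and duration.

```python
import functools
import time

TRACE: list[dict] = []   # in-memory "black box recorder"

def traced(fn):
    """Record every call as a span: name, inputs, output, duration."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACE.append({
            "span": fn.__name__,
            "inputs": args,
            "output": result,
            "ms": (time.perf_counter() - start) * 1000,
        })
        return result
    return wrapper

@traced
def retrieve(query):                  # stand-in for the RAG step
    return "refund policy doc"

@traced
def generate(query, context):         # stand-in for the LLM call
    return f"Answer based on: {context}"

generate("What is the refund policy?", retrieve("refund policy"))
for span in TRACE:
    print(span["span"], "->", span["output"])
```

When the pipeline gives a wrong answer, the trace answers the debugging questions from above: was the wrong document retrieved, or did the model misuse a good one?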

7. Layer 5: Application (The Interface)

The top layer is what the user touches.

  • Beyond Chat: The best AI apps often have "Invisible AI"—buttons that auto-fill forms or categorize emails without a chat window.
  • Integration: Connecting the AI's output to real-world APIs (Slack, Jira, CRM) to deliver the final outcome.
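"Invisible AI" plus integration fits in a few lines: an event handler classifies an incoming email and routes it to a ticketing queue, with no chat window anywhere. Everything here is a stub (the classifier would be an SLM call, the outbox a Jira or Slack client); the point is the shape of the application layer:

```python
def classify(email_body: str) -> str:
    """Stub classifier -- a real app would call an SLM here."""
    text = email_body.lower()
    if "refund" in text or "charge" in text:
        return "billing"
    if "crash" in text or "error" in text:
        return "bug"
    return "general"

OUTBOX: list[dict] = []   # stand-in for a Jira/Slack API client

def create_ticket(category: str, body: str) -> None:
    OUTBOX.append({"queue": category, "body": body})

def on_email_received(body: str) -> None:
    """The whole 'app': no chat window, just classify-and-route."""
    create_ticket(classify(body), body)

on_email_received("The app shows an error when I log in.")
print(OUTBOX[0]["queue"])   # bug
```

The user never "talks to the AI"; they just notice that tickets land in the right queue. That is usually a better product than a chat box.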