Lesson 8: Tool Use & Function Calling
- The Concept: How LLMs "call" functions (spoiler: they don't—you do).
- Tool Definition: Describing functions so LLMs can use them.
- The Loop: Request → Tool Call → Execute → Return → Response.
- Provider APIs: OpenAI and Claude tool use side-by-side.
- Practical Tools: Weather, database, calculator, web search.
- Multi-Tool Agents: Orchestrating multiple tools in one conversation.
- Error Handling: What happens when tools fail.
LLMs can generate text, analyze images, and extract structured data. But they can't act on the world: they can't check the weather, query a database, or send an email. Tool use changes that. In this lesson, you'll learn to give LLMs real capabilities by letting them call your functions.
1. How Tool Calling Actually Works
Here's the key insight: LLMs don't actually call functions. They generate a structured request saying "I want to call function X with arguments Y." Your code then:
- Executes the function
- Returns the result to the LLM
- LLM incorporates the result into its response
The LLM's role: Decide WHEN to use a tool and with WHAT arguments. Your code's role: Actually execute the tool and handle errors.
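Concretely, the model's "call" is just structured output. In OpenAI's format, a tool-call request looks roughly like this (the id and values are illustrative):

# What the model actually emits: a structured request, not an execution.
# (OpenAI-style shape; the id is illustrative.)
{
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": "{\"city\": \"Tokyo\"}"  # arguments arrive as a JSON string
    }
}

Your code parses this, runs get_weather(city="Tokyo"), and sends the result back as a new message.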
2. Defining Tools
Tools are defined as JSON schemas that describe:
- Name: What to call the function
- Description: When/why to use it (crucial for LLM decision-making)
- Parameters: What arguments it accepts
OpenAI Tool Format
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city. Use this when the user asks about weather conditions.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., 'Paris' or 'New York'"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
}
]
Claude Tool Format
tools = [
{
"name": "get_weather",
"description": "Get the current weather for a city. Use this when the user asks about weather conditions.",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., 'Paris' or 'New York'"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
]
Key Differences
| Aspect | OpenAI | Claude |
|---|---|---|
| Wrapper | {"type": "function", "function": {...}} | Direct object |
| Schema key | parameters | input_schema |
| Otherwise | Identical JSON Schema | Identical JSON Schema |
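Because the schemas are otherwise identical, you can maintain one OpenAI-style definition and convert it for Claude. A minimal sketch (the function name is my own):

def openai_to_claude(tool: dict) -> dict:
    """Convert an OpenAI-style tool definition to Claude's format:
    unwrap the {"type": "function", "function": {...}} envelope and
    rename "parameters" to "input_schema"."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }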
3. Your First Tool: OpenAI
Let's build a complete example with a weather tool:
"""
OpenAI Function Calling
=======================
Give GPT the ability to check the weather.
"""
import json
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
client = OpenAI()
# ─────────────────────────────────────────────────────────────────────────────
# Step 1: Define Your Tools
# ─────────────────────────────────────────────────────────────────────────────
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city. Call this when the user asks about weather, temperature, or conditions in a location.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name (e.g., 'London', 'Tokyo', 'New York')"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"default": "celsius"
}
},
"required": ["city"]
}
}
}
]
# ─────────────────────────────────────────────────────────────────────────────
# Step 2: Implement Your Tools
# ─────────────────────────────────────────────────────────────────────────────
def get_weather(city: str, units: str = "celsius") -> dict:
"""
Fake weather API for demonstration.
In production, call a real weather API like OpenWeatherMap.
"""
# Simulated weather data
weather_data = {
"london": {"temp": 12, "condition": "rainy", "humidity": 80},
"tokyo": {"temp": 22, "condition": "sunny", "humidity": 45},
"new york": {"temp": 18, "condition": "cloudy", "humidity": 60},
"paris": {"temp": 15, "condition": "partly cloudy", "humidity": 55},
}
city_lower = city.lower()
if city_lower in weather_data:
data = weather_data[city_lower]
temp = data["temp"]
if units == "fahrenheit":
temp = (temp * 9/5) + 32
return {
"city": city,
"temperature": temp,
"units": units,
"condition": data["condition"],
"humidity": data["humidity"]
}
else:
return {"error": f"Weather data not available for {city}"}
# Map function names to actual functions
TOOL_FUNCTIONS = {
"get_weather": get_weather,
}
# ─────────────────────────────────────────────────────────────────────────────
# Step 3: The Tool Calling Loop
# ─────────────────────────────────────────────────────────────────────────────
def chat_with_tools(user_message: str) -> str:
"""
Chat with the LLM, handling any tool calls it makes.
"""
messages = [{"role": "user", "content": user_message}]
# First API call - might return a tool call
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
tool_choice="auto", # Let the model decide
)
assistant_message = response.choices[0].message
# Check if the model wants to call a tool
if assistant_message.tool_calls:
# Add assistant's response (with tool calls) to messages
messages.append(assistant_message)
# Process each tool call
for tool_call in assistant_message.tool_calls:
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
print(f"🔧 Calling {function_name}({function_args})")
# Execute the function
if function_name in TOOL_FUNCTIONS:
result = TOOL_FUNCTIONS[function_name](**function_args)
else:
result = {"error": f"Unknown function: {function_name}"}
# Add tool result to messages
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
# Second API call - with tool results
final_response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
)
return final_response.choices[0].message.content
# No tool call - direct response
return assistant_message.content
if __name__ == "__main__":
# Test queries
queries = [
"What's the weather like in Tokyo?",
"Is it warmer in London or Paris right now?",
"What's 2 + 2?", # No tool needed
]
for query in queries:
print(f"\n{'='*60}")
print(f"User: {query}")
response = chat_with_tools(query)
print(f"Assistant: {response}")
Output
============================================================
User: What's the weather like in Tokyo?
🔧 Calling get_weather({'city': 'Tokyo'})
Assistant: The weather in Tokyo is currently sunny with a temperature of 22°C and 45% humidity.
============================================================
User: Is it warmer in London or Paris right now?
🔧 Calling get_weather({'city': 'London'})
🔧 Calling get_weather({'city': 'Paris'})
Assistant: Paris is slightly warmer at 15°C compared to London at 12°C.
============================================================
User: What's 2 + 2?
Assistant: 2 + 2 equals 4.
4. Claude Tool Use
Claude's API is similar but with key structural differences:
import json
from anthropic import Anthropic
client = Anthropic()
# Tool definitions use "input_schema" instead of "parameters"
tools = [{
"name": "get_weather",
"description": "Get weather for a city",
"input_schema": { # Not "parameters"!
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}]
# The tool loop (wrapped in a function so we can return the final text)
def chat_with_claude(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "tool_use":
            # Add assistant response (including tool_use blocks) to messages
            messages.append({"role": "assistant", "content": response.content})
            # Process tool calls
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # execute_tool is your name-to-function dispatcher,
                    # like TOOL_FUNCTIONS above. block.input is already a dict!
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })
            # Results go back to Claude as a user message
            messages.append({"role": "user", "content": tool_results})
        else:
            # "end_turn" (or any other stop reason) - extract the final text
            return next(b.text for b in response.content if b.type == "text")
Key Differences: OpenAI vs Claude
| Aspect | OpenAI | Claude |
|---|---|---|
| Tool call location | message.tool_calls list | content blocks with type: "tool_use" |
| Arguments format | JSON string (needs json.loads) | Already parsed dict (block.input) |
| Result format | role: "tool" message | role: "user" with tool_result content |
| Stop indicator | Check tool_calls existence | Check stop_reason == "tool_use" |
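If you support both providers, it helps to normalize tool calls into one shape before dispatching. A sketch following the table above (extract_tool_calls is my own helper name; response is the raw SDK response object):

import json

def extract_tool_calls(response, provider: str) -> list[tuple[str, str, dict]]:
    """Normalize tool calls from either provider into (call_id, name, args)."""
    calls = []
    if provider == "openai":
        message = response.choices[0].message
        for tc in (message.tool_calls or []):
            # OpenAI: arguments are a JSON string that must be parsed
            calls.append((tc.id, tc.function.name, json.loads(tc.function.arguments)))
    else:  # claude
        for block in response.content:
            if block.type == "tool_use":
                # Claude: input is already a parsed dict
                calls.append((block.id, block.name, block.input))
    return calls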
5. Multiple Tools: Building a Mini Agent
Let's create an agent with multiple capabilities:
"""
Multi-Tool Agent
================
An agent with weather, calculator, and database tools.
"""
import json
import math
from datetime import datetime
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
client = OpenAI()
# ─────────────────────────────────────────────────────────────────────────────
# Tool Definitions
# ─────────────────────────────────────────────────────────────────────────────
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"}
},
"required": ["city"]
}
}
},
{
"type": "function",
"function": {
"name": "calculate",
"description": "Perform mathematical calculations. Supports basic arithmetic, trigonometry, and common math functions.",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "Math expression to evaluate (e.g., '2 + 2', 'sqrt(16)', 'sin(pi/2)')"
}
},
"required": ["expression"]
}
}
},
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Get the current date and time",
"parameters": {
"type": "object",
"properties": {
"timezone": {
"type": "string",
"description": "Timezone (e.g., 'UTC', 'US/Eastern', 'Europe/London')"
}
},
"required": []
}
}
},
{
"type": "function",
"function": {
"name": "search_products",
"description": "Search the product database",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"},
"category": {
"type": "string",
"enum": ["electronics", "clothing", "books", "all"],
"description": "Product category filter"
},
"max_price": {"type": "number", "description": "Maximum price filter"}
},
"required": ["query"]
}
}
}
]
# ─────────────────────────────────────────────────────────────────────────────
# Tool Implementations
# ─────────────────────────────────────────────────────────────────────────────
def get_weather(city: str) -> dict:
"""Simulated weather."""
import random
return {
"city": city,
"temperature": random.randint(10, 30),
"condition": random.choice(["sunny", "cloudy", "rainy"]),
"humidity": random.randint(30, 80)
}
def calculate(expression: str) -> dict:
    """
    Restricted math expression evaluator.
    Note: eval with stripped builtins blocks the obvious attacks, but it is
    not truly safe against hostile input - prefer an ast-based parser in production.
    """
# Define safe functions
safe_dict = {
"abs": abs,
"round": round,
"min": min,
"max": max,
"sum": sum,
"pow": pow,
"sqrt": math.sqrt,
"sin": math.sin,
"cos": math.cos,
"tan": math.tan,
"log": math.log,
"log10": math.log10,
"exp": math.exp,
"pi": math.pi,
"e": math.e,
}
try:
# Only allow safe operations
result = eval(expression, {"__builtins__": {}}, safe_dict)
return {"expression": expression, "result": result}
except Exception as e:
return {"expression": expression, "error": str(e)}
def get_current_time(timezone: str = "UTC") -> dict:
    """Get the current time in an IANA timezone, falling back to UTC."""
    from zoneinfo import ZoneInfo
    try:
        tz = ZoneInfo(timezone)
    except Exception:
        # Unknown timezone key - fall back to UTC
        # (avoids the deprecated, naive datetime.utcnow())
        tz = ZoneInfo("UTC")
        timezone = "UTC"
    now = datetime.now(tz)
return {
"timezone": timezone,
"datetime": now.isoformat(),
"date": now.strftime("%Y-%m-%d"),
"time": now.strftime("%H:%M:%S")
}
def search_products(query: str, category: str = "all", max_price: float | None = None) -> dict:
"""Simulated product search."""
# Fake product database
products = [
{"name": "Laptop Pro", "category": "electronics", "price": 1299},
{"name": "Wireless Mouse", "category": "electronics", "price": 49},
{"name": "Python Handbook", "category": "books", "price": 45},
{"name": "Winter Jacket", "category": "clothing", "price": 120},
{"name": "USB-C Hub", "category": "electronics", "price": 79},
]
# Filter
results = products
if category != "all":
results = [p for p in results if p["category"] == category]
    if max_price is not None:  # explicit check so a max_price of 0 still filters
        results = [p for p in results if p["price"] <= max_price]
# Search by query
query_lower = query.lower()
results = [p for p in results if query_lower in p["name"].lower()]
return {"query": query, "results": results, "count": len(results)}
TOOL_FUNCTIONS = {
"get_weather": get_weather,
"calculate": calculate,
"get_current_time": get_current_time,
"search_products": search_products,
}
# ─────────────────────────────────────────────────────────────────────────────
# Agent Loop
# ─────────────────────────────────────────────────────────────────────────────
def run_agent(user_message: str, max_iterations: int = 5) -> str:
"""
Run the agent with a maximum number of tool-calling iterations.
Prevents infinite loops if the model keeps calling tools.
"""
messages = [
{
"role": "system",
"content": """You are a helpful assistant with access to tools.
Use the tools when they would help answer the user's question.
Be concise in your responses."""
},
{"role": "user", "content": user_message}
]
for iteration in range(max_iterations):
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
tool_choice="auto",
)
assistant_message = response.choices[0].message
# No tool calls - we're done
if not assistant_message.tool_calls:
return assistant_message.content
# Process tool calls
messages.append(assistant_message)
print(f"\n[Iteration {iteration + 1}]")
for tool_call in assistant_message.tool_calls:
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
print(f" 🔧 {function_name}({json.dumps(function_args)})")
# Execute
result = TOOL_FUNCTIONS.get(function_name, lambda **x: {"error": "Unknown tool"})(**function_args)
print(f" 📤 {json.dumps(result)}")
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
return "Max iterations reached"
if __name__ == "__main__":
queries = [
"What's the weather in London and what time is it there?",
"Calculate the square root of 144 plus 5 squared",
"Find me electronics under $100",
]
for query in queries:
print(f"\n{'='*70}")
print(f"User: {query}")
result = run_agent(query)
print(f"\nAssistant: {result}")
6. Forcing Tool Use
Sometimes you want to ensure a specific tool is called:
# Let the model decide (default)
tool_choice = "auto"
# Force the model to call a specific tool
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
# Force the model to call SOME tool (any tool)
tool_choice = "required" # OpenAI
tool_choice = {"type": "any"} # Claude
# Prevent any tool calls
tool_choice = "none"
Use Cases
| Scenario | Setting |
|---|---|
| Normal chat with optional tools | auto |
| User explicitly asks for weather | Force get_weather |
| Multi-step workflow requiring data | required |
| Override in specific prompts | none |
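For example, forcing get_weather with OpenAI; note that a forced call still goes through the normal loop, because you must execute the tool and send the result back:

# Force the model to emit a get_weather call on this turn
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me about Paris"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
# response.choices[0].message.tool_calls is now guaranteed to contain
# a get_weather call - execute it and follow up as usual.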
7. Error Handling
Tools fail. Networks time out. APIs return errors. Here's how to handle it:
"""
Robust Tool Execution
=====================
Handle tool failures gracefully.
"""
import json
import traceback
from typing import Callable, Any
from functools import wraps
def safe_tool(func: Callable) -> Callable:
"""
Decorator that catches exceptions and returns error objects.
"""
@wraps(func)
def wrapper(*args, **kwargs) -> dict:
try:
result = func(*args, **kwargs)
return {"success": True, "data": result}
except Exception as e:
return {
"success": False,
"error": str(e),
"error_type": type(e).__name__,
}
return wrapper
@safe_tool
def get_weather_real(city: str) -> dict:
"""Real weather API call (might fail)."""
import requests
# This would be a real API call
response = requests.get(
f"https://api.weather.example.com/{city}",
timeout=5
)
response.raise_for_status()
return response.json()
def execute_tool_safely(
function_name: str,
function_args: dict,
tool_functions: dict,
) -> str:
"""
Execute a tool with comprehensive error handling.
Returns a JSON string suitable for the tool result.
"""
if function_name not in tool_functions:
return json.dumps({
"error": f"Unknown tool: {function_name}",
"available_tools": list(tool_functions.keys())
})
try:
func = tool_functions[function_name]
result = func(**function_args)
return json.dumps(result)
except TypeError as e:
# Wrong arguments
return json.dumps({
"error": f"Invalid arguments: {e}",
"received_args": function_args
})
except Exception as e:
# Any other error
return json.dumps({
"error": f"Tool execution failed: {e}",
"error_type": type(e).__name__,
})
# ─────────────────────────────────────────────────────────────────────────────
# Retry Logic
# ─────────────────────────────────────────────────────────────────────────────
def execute_with_retry(
func: Callable,
args: dict,
max_retries: int = 3,
delay: float = 1.0,
) -> Any:
"""Execute a tool with retries for transient failures."""
import time
last_error = None
for attempt in range(max_retries):
try:
return func(**args)
except Exception as e:
last_error = e
if attempt < max_retries - 1:
                time.sleep(delay * (2 ** attempt))  # Exponential backoff: delay, 2*delay, 4*delay, ...
raise last_error
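Usage might look like this; fetch_stock_price is a hypothetical tool that raises on transient network errors (note that a @safe_tool-wrapped function never raises, so retries would not trigger on it):

try:
    data = execute_with_retry(fetch_stock_price, {"ticker": "AAPL"}, max_retries=3)
except Exception as e:
    # After the final attempt fails, surface a structured error the model can act on
    data = {"error": f"Tool failed after retries: {e}"}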
Letting the LLM Know About Errors
When a tool fails, return a helpful error message so the LLM can:
- Inform the user
- Try a different approach
- Ask for different input
# Bad: Just crash or return nothing
result = None
# Good: Return structured error
result = {
"error": "City not found",
"message": "Could not find weather data for 'Atlantis'. Please try a real city.",
"suggestion": "Try cities like 'London', 'Tokyo', or 'New York'"
}
8. Parallel & Practical Tools
Parallel Tool Calls: Modern models can request multiple tools at once. Process them with asyncio.gather() for concurrent execution.
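One way to run them concurrently, reusing the TOOL_FUNCTIONS registry from earlier (execute_tool_async and execute_all are my own names; sync tools are pushed onto worker threads):

import asyncio
import json

async def execute_tool_async(name: str, args: dict) -> dict:
    """Run one tool without blocking the event loop."""
    func = TOOL_FUNCTIONS[name]
    return await asyncio.to_thread(func, **args)

async def execute_all(tool_calls) -> list[dict]:
    """Execute every tool call from one assistant message concurrently."""
    tasks = [
        execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
        for tc in tool_calls
    ]
    results = await asyncio.gather(*tasks)
    # Pair each result with its tool_call_id for the message history
    return [
        {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(r)}
        for tc, r in zip(tool_calls, results)
    ]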
Database Tools: Wrap SQL queries with security checks (SELECT only, allowed tables). Return schema info via a separate tool.
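A minimal sketch of that guard, assuming a SQLite file shop.db with hypothetical products and orders tables (production code should use a real SQL parser, not a regex):

import re
import sqlite3

ALLOWED_TABLES = {"products", "orders"}  # hypothetical schema

def run_query(sql: str) -> dict:
    """Read-only query tool with a SELECT-only, allow-listed-tables guard."""
    if not sql.strip().lower().startswith("select"):
        return {"error": "Only SELECT statements are allowed"}
    tables = {t.lower() for t in re.findall(r"from\s+(\w+)", sql, re.IGNORECASE)}
    if not tables <= ALLOWED_TABLES:
        return {"error": f"Queries may only touch: {sorted(ALLOWED_TABLES)}"}
    with sqlite3.connect("shop.db") as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(sql).fetchmany(50)  # cap the result size
        return {"rows": [dict(r) for r in rows]}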
Web Search Tools: A useful pattern for questions about current events or anything past the model's training cutoff. Always include a num_results parameter so result size stays bounded.
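A sketch of such a tool definition; the backing search API is whatever you wire up:

search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information. Use this for recent events or facts that may have changed since training.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return (1-10)",
                    "default": 5
                }
            },
            "required": ["query"]
        }
    }
}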
9. Tool Use Best Practices
Writing Good Tool Descriptions
# ❌ Bad: Vague description
{
"name": "do_thing",
"description": "Does a thing"
}
# ✅ Good: Clear when to use and what it does
{
"name": "get_stock_price",
"description": "Get the current stock price for a ticker symbol. Use this when the user asks about stock prices, market values, or share prices. Returns price in USD."
}
Parameter Descriptions Matter
# ❌ Bad: No context
"properties": {
"date": {"type": "string"}
}
# ✅ Good: Format and examples
"properties": {
"date": {
"type": "string",
"description": "Date in YYYY-MM-DD format, e.g., '2024-01-15'"
}
}
Use Enums When Possible
# ❌ Bad: Free-form string
"units": {"type": "string", "description": "Temperature units"}
# ✅ Good: Constrained options
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit", "kelvin"],
"description": "Temperature units"
}
10. Common Pitfalls
| Symptom | Cause | Fix |
|---|---|---|
| Model never calls tools | Description unclear about when to use | Add "Use this when..." to description |
| Wrong arguments passed | Poor parameter descriptions | Add examples and constraints |
| Model keeps calling tools | No termination condition | Add max_iterations limit |
| Tool results ignored | Not added to message history | Ensure tool results are in messages |
| "I don't have access to tools" | Tools not in API call | Pass tools= parameter |
| JSON parse errors | Model returned malformed args | Use try/catch, return error to model |
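For the JSON-parse row in particular, never assume the model's argument string is valid JSON. A small defensive parser you can put in front of dispatch (parse_args is my own name):

import json

def parse_args(raw: str) -> tuple[dict | None, str | None]:
    """Parse a tool-call argument string, returning (args, error)."""
    try:
        args = json.loads(raw)
        if not isinstance(args, dict):
            return None, "Arguments must be a JSON object"
        return args, None
    except json.JSONDecodeError as e:
        # Return the parse error to the model so it can correct itself
        return None, f"Malformed arguments: {e}"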
11. Try It Yourself
Challenge 1: Email Tool
Build a tool that can "send emails" (simulated):
def send_email(to: str, subject: str, body: str) -> dict:
# Validate email format
# Log the "sent" email
# Return confirmation
pass
Challenge 2: Multi-Step Agent
Create an agent that can:
- Search for products
- Compare prices
- Add to cart
- Checkout
Each step should be a separate tool.
Challenge 3: Tool Chaining
Build tools that can call other tools:
- analyze_website(url) → fetches the URL, then calls summarize_text()
- Handle the nested tool calls properly
12. Key Takeaways
- LLMs don't call functions—they request calls. You execute and return results.
- Descriptions are crucial. The model decides when to use tools based on descriptions.
- Always add the assistant message before tool results. The message history must be complete.
- Handle errors gracefully. Return structured errors so the model can adapt.
- Limit iterations. Prevent infinite loops with a max_iterations counter.
- Parallel execution is possible. Models can request multiple tools at once.
- Security matters. Validate inputs, restrict database access, sanitize queries.
13. What's Next
You've given LLMs the ability to take actions. In Lesson 9: Introduction to MCP, we'll explore the Model Context Protocol—a standardized way to connect LLMs to data sources and tools that works across different applications and providers.
14. Additional Resources
- OpenAI Function Calling Guide — Official documentation
- Anthropic Tool Use Guide — Claude tool use docs
- LangChain Tools — Tool abstractions library
- Function Calling Patterns — OpenAI cookbook