Lesson 8: Tool Use & Function Calling
- The Concept: How LLMs "call" functions (spoiler: they don't—you do).
- Tool Definition: Describing functions so LLMs can use them.
- The Loop: Request → Tool Call → Execute → Return → Response.
- Provider APIs: OpenAI and Claude tool use side-by-side.
- Practical Tools: Weather, database, calculator, web search.
- Multi-Tool Agents: Orchestrating multiple tools in one conversation.
- Error Handling: What happens when tools fail.
LLMs can generate text, analyze images, and extract structured data. But they can't act on the world: they can't check the weather, query a database, or send an email. Tool use changes that. In this lesson, you'll learn to give LLMs real capabilities by letting them call your functions.
1. How Tool Calling Actually Works
Here's the key insight: LLMs don't actually call functions. They generate a structured request saying "I want to call function X with arguments Y." Your code then:
- Executes the function
- Returns the result to the LLM
- LLM incorporates the result into its response
The LLM's role: Decide WHEN to use a tool and with WHAT arguments. Your code's role: Actually execute the tool and handle errors.
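Concretely, the model's "call" is just structured output. In OpenAI's format, a tool-call request looks roughly like this (the id and values are illustrative):

# What the model actually emits: a structured request, not an execution.
# (OpenAI-style shape; the id is illustrative.)
{
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": "{\"city\": \"Tokyo\"}"  # arguments arrive as a JSON string
    }
}

Your code parses this, runs get_weather(city="Tokyo"), and sends the result back as a new message.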
2. Defining Tools
Tools are defined as JSON schemas that describe:
- Name: What to call the function
- Description: When/why to use it (crucial for LLM decision-making)
- Parameters: What arguments it accepts
OpenAI Tool Format
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city. Use this when the user asks about weather conditions.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., 'Paris' or 'New York'"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
}
]
Claude Tool Format
tools = [
{
"name": "get_weather",
"description": "Get the current weather for a city. Use this when the user asks about weather conditions.",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., 'Paris' or 'New York'"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
]
Key Differences
| Aspect | OpenAI | Claude |
|---|---|---|
| Wrapper | {"type": "function", "function": {...}} | Direct object |
| Schema key | parameters | input_schema |
| Otherwise | Identical JSON Schema | Identical JSON Schema |
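Because the schemas are otherwise identical, you can maintain one OpenAI-style definition and convert it for Claude. A minimal sketch (the function name is my own):

def openai_to_claude(tool: dict) -> dict:
    """Convert an OpenAI-style tool definition to Claude's format:
    unwrap the {"type": "function", "function": {...}} envelope and
    rename "parameters" to "input_schema"."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }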
3. Your First Tool: OpenAI
Let's build a complete example with a weather tool:
"""
OpenAI Function Calling
=======================
Give GPT the ability to check the weather.
"""
import json
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
client = OpenAI()
# ─────────────────────────────────────────────────────────────────────────────
# Step 1: Define Your Tools
# ─────────────────────────────────────────────────────────────────────────────
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a city. Call this when the user asks about weather, temperature, or conditions in a location.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name (e.g., 'London', 'Tokyo', 'New York')"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"default": "celsius"
}
},
"required": ["city"]
}
}
}
]
# ─────────────────────────────────────────────────────────────────────────────
# Step 2: Implement Your Tools
# ─────────────────────────────────────────────────────────────────────────────
def get_weather(city: str, units: str = "celsius") -> dict:
"""
Fake weather API for demonstration.
In production, call a real weather API like OpenWeatherMap.
"""
# Simulated weather data
weather_data = {
"london": {"temp": 12, "condition": "rainy", "humidity": 80},
"tokyo": {"temp": 22, "condition": "sunny", "humidity": 45},
"new york": {"temp": 18, "condition": "cloudy", "humidity": 60},
"paris": {"temp": 15, "condition": "partly cloudy", "humidity": 55},
}
city_lower = city.lower()
if city_lower in weather_data:
data = weather_data[city_lower]
temp = data["temp"]
if units == "fahrenheit":
temp = (temp * 9/5) + 32
return {
"city": city,
"temperature": temp,
"units": units,
"condition": data["condition"],
"humidity": data["humidity"]
}
else:
return {"error": f"Weather data not available for {city}"}
# Map function names to actual functions
TOOL_FUNCTIONS = {
"get_weather": get_weather,
}
# ─────────────────────────────────────────────────────────────────────────────
# Step 3: The Tool Calling Loop
# ─────────────────────────────────────────────────────────────────────────────
def chat_with_tools(user_message: str) -> str:
"""
Chat with the LLM, handling any tool calls it makes.
"""
messages = [{"role": "user", "content": user_message}]
# First API call - might return a tool call
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
tool_choice="auto", # Let the model decide
)
assistant_message = response.choices[0].message
# Check if the model wants to call a tool
if assistant_message.tool_calls:
# Add assistant's response (with tool calls) to messages
messages.append(assistant_message)
# Process each tool call
for tool_call in assistant_message.tool_calls:
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
print(f"🔧 Calling {function_name}({function_args})")
# Execute the function
if function_name in TOOL_FUNCTIONS:
result = TOOL_FUNCTIONS[function_name](**function_args)
else:
result = {"error": f"Unknown function: {function_name}"}
# Add tool result to messages
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
# Second API call - with tool results
final_response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
)
return final_response.choices[0].message.content
# No tool call - direct response
return assistant_message.content
if __name__ == "__main__":
# Test queries
queries = [
"What's the weather like in Tokyo?",
"Is it warmer in London or Paris right now?",
"What's 2 + 2?", # No tool needed
]
for query in queries:
print(f"\n{'='*60}")
print(f"User: {query}")
response = chat_with_tools(query)
print(f"Assistant: {response}")
Output
============================================================
User: What's the weather like in Tokyo?
🔧 Calling get_weather({'city': 'Tokyo'})
Assistant: The weather in Tokyo is currently sunny with a temperature of 22°C and 45% humidity.
============================================================
User: Is it warmer in London or Paris right now?
🔧 Calling get_weather({'city': 'London'})
🔧 Calling get_weather({'city': 'Paris'})
Assistant: Paris is slightly warmer at 15°C compared to London at 12°C.
============================================================
User: What's 2 + 2?
Assistant: 2 + 2 equals 4.
4. Claude Tool Use
Claude's API is similar but with key structural differences:
import json
from anthropic import Anthropic
client = Anthropic()
# Tool definitions use "input_schema" instead of "parameters"
tools = [{
"name": "get_weather",
"description": "Get weather for a city",
"input_schema": { # Not "parameters"!
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}]
# The tool loop (wrapped in a function so we can return the final text)
def chat_with_claude(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "tool_use":
            # Add assistant response (including tool_use blocks) to messages
            messages.append({"role": "assistant", "content": response.content})
            # Process tool calls
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # execute_tool is your name-to-function dispatcher,
                    # like TOOL_FUNCTIONS above. block.input is already a dict!
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })
            # Results go back to Claude as a user message
            messages.append({"role": "user", "content": tool_results})
        else:
            # "end_turn" (or any other stop reason) - extract the final text
            return next(b.text for b in response.content if b.type == "text")
Key Differences: OpenAI vs Claude
| Aspect | OpenAI | Claude |
|---|---|---|
| Tool call location | message.tool_calls list | content blocks with type: "tool_use" |
| Arguments format | JSON string (needs json.loads) | Already parsed dict (block.input) |
| Result format | role: "tool" message | role: "user" with tool_result content |
| Stop indicator | Check tool_calls existence | Check stop_reason == "tool_use" |
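If you support both providers, it helps to normalize tool calls into one shape before dispatching. A sketch following the table above (extract_tool_calls is my own helper name; response is the raw SDK response object):

import json

def extract_tool_calls(response, provider: str) -> list[tuple[str, str, dict]]:
    """Normalize tool calls from either provider into (call_id, name, args)."""
    calls = []
    if provider == "openai":
        message = response.choices[0].message
        for tc in (message.tool_calls or []):
            # OpenAI: arguments are a JSON string that must be parsed
            calls.append((tc.id, tc.function.name, json.loads(tc.function.arguments)))
    else:  # claude
        for block in response.content:
            if block.type == "tool_use":
                # Claude: input is already a parsed dict
                calls.append((block.id, block.name, block.input))
    return calls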
5. Multiple Tools: Building a Mini Agent
Let's create an agent with multiple capabilities:
"""
Multi-Tool Agent
================
An agent with weather, calculator, and database tools.
"""
import json
import math
from datetime import datetime
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
client = OpenAI()
# ─────────────────────────────────────────────────────────────────────────────
# Tool Definitions
# ─────────────────────────────────────────────────────────────────────────────
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"}
},
"required": ["city"]
}
}
},
{
"type": "function",
"function": {
"name": "calculate",
"description": "Perform mathematical calculations. Supports basic arithmetic, trigonometry, and common math functions.",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "Math expression to evaluate (e.g., '2 + 2', 'sqrt(16)', 'sin(pi/2)')"
}
},
"required": ["expression"]
}
}
},
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Get the current date and time",
"parameters": {
"type": "object",
"properties": {
"timezone": {
"type": "string",
"description": "Timezone (e.g., 'UTC', 'US/Eastern', 'Europe/London')"
}
},
"required": []
}
}
},
{
"type": "function",
"function": {
"name": "search_products",
"description": "Search the product database",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"},
"category": {
"type": "string",
"enum": ["electronics", "clothing", "books", "all"],
"description": "Product category filter"
},
"max_price": {"type": "number", "description": "Maximum price filter"}
},
"required": ["query"]
}
}
}
]
# ─────────────────────────────────────────────────────────────────────────────
# Tool Implementations
# ─────────────────────────────────────────────────────────────────────────────
def get_weather(city: str) -> dict:
"""Simulated weather."""
import random
return {
"city": city,
"temperature": random.randint(10, 30),
"condition": random.choice(["sunny", "cloudy", "rainy"]),
"humidity": random.randint(30, 80)
}
def calculate(expression: str) -> dict:
    """
    Restricted math expression evaluator.
    Note: eval with stripped builtins blocks the obvious attacks, but it is
    not truly safe against hostile input - prefer an ast-based parser in production.
    """
# Define safe functions
safe_dict = {
"abs": abs,
"round": round,
"min": min,
"max": max,
"sum": sum,
"pow": pow,
"sqrt": math.sqrt,
"sin": math.sin,
"cos": math.cos,
"tan": math.tan,
"log": math.log,
"log10": math.log10,
"exp": math.exp,
"pi": math.pi,
"e": math.e,
}
try:
# Only allow safe operations
result = eval(expression, {"__builtins__": {}}, safe_dict)
return {"expression": expression, "result": result}
except Exception as e:
return {"expression": expression, "error": str(e)}
def get_current_time(timezone: str = "UTC") -> dict:
    """Get the current time in an IANA timezone, falling back to UTC."""
    from zoneinfo import ZoneInfo
    try:
        tz = ZoneInfo(timezone)
    except Exception:
        # Unknown timezone key - fall back to UTC
        # (avoids the deprecated, naive datetime.utcnow())
        tz = ZoneInfo("UTC")
        timezone = "UTC"
    now = datetime.now(tz)
return {
"timezone": timezone,
"datetime": now.isoformat(),
"date": now.strftime("%Y-%m-%d"),
"time": now.strftime("%H:%M:%S")
}
def search_products(query: str, category: str = "all", max_price: float | None = None) -> dict:
"""Simulated product search."""
# Fake product database
products = [
{"name": "Laptop Pro", "category": "electronics", "price": 1299},
{"name": "Wireless Mouse", "category": "electronics", "price": 49},
{"name": "Python Handbook", "category": "books", "price": 45},
{"name": "Winter Jacket", "category": "clothing", "price": 120},
{"name": "USB-C Hub", "category": "electronics", "price": 79},
]
# Filter
results = products
if category != "all":
results = [p for p in results if p["category"] == category]
    if max_price is not None:  # explicit check so a max_price of 0 still filters
        results = [p for p in results if p["price"] <= max_price]
# Search by query
query_lower = query.lower()
results = [p for p in results if query_lower in p["name"].lower()]
return {"query": query, "results": results, "count": len(results)}
TOOL_FUNCTIONS = {
"get_weather": get_weather,
"calculate": calculate,
"get_current_time": get_current_time,
"search_products": search_products,
}
# ─────────────────────────────────────────────────────────────────────────────
# Agent Loop
# ─────────────────────────────────────────────────────────────────────────────
def run_agent(user_message: str, max_iterations: int = 5) -> str:
"""
Run the agent with a maximum number of tool-calling iterations.
Prevents infinite loops if the model keeps calling tools.
"""
messages = [
{
"role": "system",
"content": """You are a helpful assistant with access to tools.
Use the tools when they would help answer the user's question.
Be concise in your responses."""
},
{"role": "user", "content": user_message}
]
for iteration in range(max_iterations):
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools,
tool_choice="auto",
)
assistant_message = response.choices[0].message
# No tool calls - we're done
if not assistant_message.tool_calls:
return assistant_message.content
# Process tool calls
messages.append(assistant_message)
print(f"\n[Iteration {iteration + 1}]")
for tool_call in assistant_message.tool_calls:
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
print(f" 🔧 {function_name}({json.dumps(function_args)})")
# Execute
result = TOOL_FUNCTIONS.get(function_name, lambda **x: {"error": "Unknown tool"})(**function_args)
print(f" 📤 {json.dumps(result)}")
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
return "Max iterations reached"
if __name__ == "__main__":
queries = [
"What's the weather in London and what time is it there?",
"Calculate the square root of 144 plus 5 squared",
"Find me electronics under $100",
]
for query in queries:
print(f"\n{'='*70}")
print(f"User: {query}")
result = run_agent(query)
print(f"\nAssistant: {result}")
6. Forcing Tool Use
Sometimes you want to ensure a specific tool is called:
# Let the model decide (default)
tool_choice = "auto"
# Force the model to call a specific tool
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
# Force the model to call SOME tool (any tool)
tool_choice = "required" # OpenAI
tool_choice = {"type": "any"} # Claude
# Prevent any tool calls
tool_choice = "none"
Use Cases
| Scenario | Setting |
|---|---|
| Normal chat with optional tools | auto |
| User explicitly asks for weather | Force get_weather |
| Multi-step workflow requiring data | required |
| Override in specific prompts | none |
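For example, forcing get_weather with OpenAI; note that a forced call still goes through the normal loop, because you must execute the tool and send the result back:

# Force the model to emit a get_weather call on this turn
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me about Paris"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
# response.choices[0].message.tool_calls is now guaranteed to contain
# a get_weather call - execute it and follow up as usual.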
7. Error Handling
Tools fail. Networks time out. APIs return errors. Here's how to handle it:
"""
Robust Tool Execution
=====================
Handle tool failures gracefully.
"""
import json
import traceback
from typing import Callable, Any
from functools import wraps
def safe_tool(func: Callable) -> Callable:
"""
Decorator that catches exceptions and returns error objects.
"""
@wraps(func)
def wrapper(*args, **kwargs) -> dict:
try:
result = func(*args, **kwargs)
return {"success": True, "data": result}
except Exception as e:
return {
"success": False,
"error": str(e),
"error_type": type(e).__name__,
}
return wrapper
@safe_tool
def get_weather_real(city: str) -> dict:
"""Real weather API call (might fail)."""
import requests
# This would be a real API call
response = requests.get(
f"https://api.weather.example.com/{city}",
timeout=5
)
response.raise_for_status()
return response.json()
def execute_tool_safely(
function_name: str,
function_args: dict,
tool_functions: dict,
) -> str:
"""
Execute a tool with comprehensive error handling.
Returns a JSON string suitable for the tool result.
"""
if function_name not in tool_functions:
return json.dumps({
"error": f"Unknown tool: {function_name}",
"available_tools": list(tool_functions.keys())
})
try:
func = tool_functions[function_name]
result = func(**function_args)
return json.dumps(result)
except TypeError as e:
# Wrong arguments
return json.dumps({
"error": f"Invalid arguments: {e}",
"received_args": function_args
})
except Exception as e:
# Any other error
return json.dumps({
"error": f"Tool execution failed: {e}",
"error_type": type(e).__name__,
})
# ─────────────────────────────────────────────────────────────────────────────
# Retry Logic
# ─────────────────────────────────────────────────────────────────────────────
def execute_with_retry(
func: Callable,
args: dict,
max_retries: int = 3,
delay: float = 1.0,
) -> Any:
"""Execute a tool with retries for transient failures."""
import time
last_error = None
for attempt in range(max_retries):
try:
return func(**args)
except Exception as e:
last_error = e
if attempt < max_retries - 1:
                time.sleep(delay * (2 ** attempt))  # Exponential backoff: delay, 2*delay, 4*delay, ...
raise last_error
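Usage might look like this; fetch_stock_price is a hypothetical tool that raises on transient network errors (note that a @safe_tool-wrapped function never raises, so retries would not trigger on it):

try:
    data = execute_with_retry(fetch_stock_price, {"ticker": "AAPL"}, max_retries=3)
except Exception as e:
    # After the final attempt fails, surface a structured error the model can act on
    data = {"error": f"Tool failed after retries: {e}"}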
Letting the LLM Know About Errors
When a tool fails, return a helpful error message so the LLM can:
- Inform the user
- Try a different approach
- Ask for different input
# Bad: Just crash or return nothing
result = None
# Good: Return structured error
result = {
"error": "City not found",
"message": "Could not find weather data for 'Atlantis'. Please try a real city.",
"suggestion": "Try cities like 'London', 'Tokyo', or 'New York'"
}
8. Parallel & Practical Tools
Parallel Tool Calls: Modern models can request multiple tools at once. Process them with asyncio.gather() for concurrent execution.
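One way to run them concurrently, reusing the TOOL_FUNCTIONS registry from earlier (execute_tool_async and execute_all are my own names; sync tools are pushed onto worker threads):

import asyncio
import json

async def execute_tool_async(name: str, args: dict) -> dict:
    """Run one tool without blocking the event loop."""
    func = TOOL_FUNCTIONS[name]
    return await asyncio.to_thread(func, **args)

async def execute_all(tool_calls) -> list[dict]:
    """Execute every tool call from one assistant message concurrently."""
    tasks = [
        execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
        for tc in tool_calls
    ]
    results = await asyncio.gather(*tasks)
    # Pair each result with its tool_call_id for the message history
    return [
        {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(r)}
        for tc, r in zip(tool_calls, results)
    ]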
Database Tools: Wrap SQL queries with security checks (SELECT only, allowed tables). Return schema info via a separate tool.
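A minimal sketch of that guard, assuming a SQLite file shop.db with hypothetical products and orders tables (production code should use a real SQL parser, not a regex):

import re
import sqlite3

ALLOWED_TABLES = {"products", "orders"}  # hypothetical schema

def run_query(sql: str) -> dict:
    """Read-only query tool with a SELECT-only, allow-listed-tables guard."""
    if not sql.strip().lower().startswith("select"):
        return {"error": "Only SELECT statements are allowed"}
    tables = {t.lower() for t in re.findall(r"from\s+(\w+)", sql, re.IGNORECASE)}
    if not tables <= ALLOWED_TABLES:
        return {"error": f"Queries may only touch: {sorted(ALLOWED_TABLES)}"}
    with sqlite3.connect("shop.db") as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(sql).fetchmany(50)  # cap the result size
        return {"rows": [dict(r) for r in rows]}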
Web Search Tools: A useful pattern for questions about current events or anything past the model's training cutoff. Always include a num_results parameter so result size stays bounded.
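A sketch of such a tool definition; the backing search API is whatever you wire up:

search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information. Use this for recent events or facts that may have changed since training.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return (1-10)",
                    "default": 5
                }
            },
            "required": ["query"]
        }
    }
}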
9. Tool Use Best Practices
Writing Good Tool Descriptions
# ❌ Bad: Vague description
{
"name": "do_thing",
"description": "Does a thing"
}
# ✅ Good: Clear when to use and what it does
{
"name": "get_stock_price",
"description": "Get the current stock price for a ticker symbol. Use this when the user asks about stock prices, market values, or share prices. Returns price in USD."
}
Parameter Descriptions Matter
# ❌ Bad: No context
"properties": {
"date": {"type": "string"}
}
# ✅ Good: Format and examples
"properties": {
"date": {
"type": "string",
"description": "Date in YYYY-MM-DD format, e.g., '2024-01-15'"
}
}
Use Enums When Possible
# ❌ Bad: Free-form string
"units": {"type": "string", "description": "Temperature units"}
# ✅ Good: Constrained options
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit", "kelvin"],
"description": "Temperature units"
}
10. Common Pitfalls
| Symptom | Cause | Fix |
|---|---|---|
| Model never calls tools | Description unclear about when to use | Add "Use this when..." to description |
| Wrong arguments passed | Poor parameter descriptions | Add examples and constraints |
| Model keeps calling tools | No termination condition | Add max_iterations limit |
| Tool results ignored | Not added to message history | Ensure tool results are in messages |
| "I don't have access to tools" | Tools not in API call | Pass tools= parameter |
| JSON parse errors | Model returned malformed args | Use try/catch, return error to model |
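For the JSON-parse row in particular, never assume the model's argument string is valid JSON. A small defensive parser you can put in front of dispatch (parse_args is my own name):

import json

def parse_args(raw: str) -> tuple[dict | None, str | None]:
    """Parse a tool-call argument string, returning (args, error)."""
    try:
        args = json.loads(raw)
        if not isinstance(args, dict):
            return None, "Arguments must be a JSON object"
        return args, None
    except json.JSONDecodeError as e:
        # Return the parse error to the model so it can correct itself
        return None, f"Malformed arguments: {e}"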
11. Try It Yourself
Challenge 1: Email Tool
Build a tool that can "send emails" (simulated):
def send_email(to: str, subject: str, body: str) -> dict:
# Validate email format
# Log the "sent" email
# Return confirmation
pass
Challenge 2: Multi-Step Agent
Create an agent that can:
- Search for products
- Compare prices
- Add to cart
- Checkout
Each step should be a separate tool.
Challenge 3: Tool Chaining
Build tools that can call other tools:
- analyze_website(url) → fetches the URL, then calls summarize_text()
- Handle the nested tool calls properly
12. Key Takeaways
- LLMs don't call functions—they request calls. You execute and return results.
- Descriptions are crucial. The model decides when to use tools based on descriptions.
- Always add the assistant message before tool results. The message history must be complete.
- Handle errors gracefully. Return structured errors so the model can adapt.
- Limit iterations. Prevent infinite loops with a max_iterations counter.
- Parallel execution is possible. Models can request multiple tools at once.
- Security matters. Validate inputs, restrict database access, sanitize queries.
13. What's Next
You've given LLMs the ability to take actions. In Lesson 9: Introduction to MCP, we'll explore the Model Context Protocol—a standardized way to connect LLMs to data sources and tools that works across different applications and providers.
14. Additional Resources
- OpenAI Function Calling Guide — Official documentation
- Anthropic Tool Use Guide — Claude tool use docs
- LangChain Tools — Tool abstractions library
- Function Calling Patterns — OpenAI cookbook