FOUNDATION · 35 min · 65 sections

Designing Agent Tools

THIS WEEK'S JOURNEY

Building the Bridge Between Agent Intelligence and Real-World Action

Tools are the hands and eyes of your AI agents: they transform intelligent reasoning into tangible actions that interact with databases, APIs, and external systems. In production environments on AWS, poorly designed tools are frequently the primary source of agent failures, reportedly accounting for up to 67% of all agent errors in internal studies at major AI companies.

Key Insight

Tools Are Contracts, Not Just Functions

The fundamental shift in thinking about agent tools is recognizing them as contracts between your agent and external systems, not merely function wrappers. When a model such as Anthropic's Claude or OpenAI's GPT-4 selects and invokes a tool, it relies entirely on the name, description, and schema you provide; it cannot inspect the implementation at runtime or look up documentation.

73%

of agent failures traced to tool design issues

When Anthropic analyzed failure modes across their enterprise Claude deployments, nearly three-quarters of all agent failures stemmed from tool-related issues rather than reasoning errors.

Framework

The CLEAR Tool Design Framework

Contextual

Every tool description must provide context about when and why to use it, not just what it does. Inc...

Literal

Use precise, unambiguous language that leaves no room for interpretation. Avoid metaphors, colloquia...

Exhaustive

Document every parameter, every return value, every error condition. Agents cannot infer missing inf...

Actionable

Describe the concrete action the tool performs in active voice. 'Creates a new customer record in th...

Tool Descriptions: Vague vs. Production-Ready

Vague Description (Failure-Prone)

get_user: Gets user information

No parameter context or constraints

Missing return value documentation

No error handling guidance

Production-Ready Description

get_user: Retrieves complete profile for a single user by th...

Parameters: user_id (string, UUID v4 format, required)

Returns: UserProfile object with name, email, created_at, su...

Errors: Returns null if user not found, throws AuthError if ...

Stripe

Redesigning Payment Tools for Agent Reliability

Agent-related support errors dropped by 89% within 30 days of deploying the rede...

Production Tool Definition with AWS Lambda (TypeScript)

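// The published package is @anthropic-ai/sdk; the Tool type is reached via the Anthropic namespace.
import Anthropic from '@anthropic-ai/sdk';

const createSupportTicketTool: Anthropic.Tool = {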
  name: 'create_support_ticket',
  description: `Creates a new support ticket in the AWS-hosted helpdesk system.
    
    USE THIS TOOL WHEN:
    - User reports a new issue or problem
    - User requests help with a feature
    - User needs to escalate a concern
    
    DO NOT USE WHEN:

Description Length Directly Correlates with Reliability

Internal benchmarks at Anthropic and OpenAI reportedly show that tool descriptions under 100 words have 3x higher error rates than descriptions over 300 words. Don't be afraid of verbose descriptions: agents process them efficiently, and the additional context improves tool selection accuracy.

Key Insight

Parameter Schemas Are Your First Line of Defense

JSON Schema validation for tool parameters catches errors before they reach your backend systems, preventing cascading failures and data corruption. On AWS, you can leverage API Gateway's request validation combined with Lambda's input processing to create multiple validation layers.
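As a rough illustration of the first validation layer, the sketch below checks agent-supplied parameters against a schema before any backend call. It is a minimal, standard-library-only stand-in that handles just a few JSON Schema keywords; `TICKET_SCHEMA` and `validate_params` are hypothetical names, and a real deployment would more likely rely on a full JSON Schema validator or on API Gateway request validation.

```python
from typing import Any

# Hypothetical schema for illustration; only a small subset of JSON Schema
# keywords ("type", "required", "enum", "minLength") is handled here.
TICKET_SCHEMA = {
    "type": "object",
    "required": ["title", "priority"],
    "properties": {
        "title": {"type": "string", "minLength": 5},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
}

_PY_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def validate_params(params: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name in schema.get("required", []):
        if name not in params:
            problems.append(f"missing required parameter '{name}'")
    for name, value in params.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown parameter '{name}'")
            continue
        if not isinstance(value, _PY_TYPES[spec["type"]]):
            problems.append(f"'{name}' must be of type {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"'{name}' must be one of {spec['enum']}")
        if "minLength" in spec and len(value) < spec["minLength"]:
            problems.append(f"'{name}' must be at least {spec['minLength']} characters")
    return problems
```

Returning every problem at once, rather than failing on the first, gives the agent enough information to correct the whole call in a single retry.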

Designing Bulletproof Parameter Schemas

Start with Business Requirements

Choose Precise Types

Add Semantic Constraints

Document Every Property

Handle Optional vs Required Explicitly

Anti-Pattern: The 'Stringly Typed' Parameter Trap

❌ Problem

Agents generate inconsistent date formats across invocations, leading to parsing...

✓ Solution

Use strict types with explicit formats. Define 'date: string, format: date' for ...
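On the receiving side, a strict parser can enforce the same date format the schema declares. The following is a small sketch, assuming the tool accepts only the YYYY-MM-DD form; `parse_iso_date` is a hypothetical helper name.

```python
from datetime import date


def parse_iso_date(raw: str) -> date:
    """Accept only the YYYY-MM-DD form that JSON Schema's 'format: date' describes."""
    # Check the shape first: newer Python versions let fromisoformat accept
    # other ISO 8601 spellings such as '20240315'.
    if len(raw) != 10 or raw[4] != "-" or raw[7] != "-":
        raise ValueError(f"expected YYYY-MM-DD, got {raw!r}")
    return date.fromisoformat(raw)
```

Rejecting every other spelling with a message that names the expected format lets the agent correct its next call instead of guessing.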

Notion

Building a Type-Safe Tool Layer for AI Features

AI feature accuracy improved from 71% to 94% for content creation tasks. User co...

Tool Validation Pipeline on AWS

Agent Output

JSON Schema Validati...

Business Rule Valida...

Authorization Check ...
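The staged flow above can be sketched as an ordered list of checks that stops at the first failure. This is only an illustration: the refund scenario, the `refunds:write` scope, the 50000-cent limit, and all function names are assumptions, and the original diagram may contain further stages that are not shown here.

```python
from typing import Callable, Optional

Stage = Callable[[dict], Optional[str]]  # returns an error message or None


def check_schema(call: dict) -> Optional[str]:
    amount = call.get("params", {}).get("amount_cents")
    return None if isinstance(amount, int) else "amount_cents must be an integer"


def check_business_rules(call: dict) -> Optional[str]:
    # Hypothetical rule: large refunds are not allowed through the agent.
    if call["params"]["amount_cents"] > 50_000:
        return "refunds above 50000 cents need manual review"
    return None


def check_authorization(call: dict) -> Optional[str]:
    if "refunds:write" not in call.get("scopes", []):
        return "caller lacks the refunds:write scope"
    return None


PIPELINE: list[tuple[str, Stage]] = [
    ("schema", check_schema),
    ("business_rule", check_business_rules),
    ("authorization", check_authorization),
]


def validate_call(call: dict) -> dict:
    """Run stages in order and stop at the first failure."""
    for stage_name, stage in PIPELINE:
        error = stage(call)
        if error:
            return {"ok": False, "failed_stage": stage_name, "message": error}
    return {"ok": True}
```

Reporting which stage failed tells the agent whether to fix its input, choose a different action, or hand off to a human.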

Key Insight

Enums Are Your Most Powerful Schema Feature

When a parameter has a finite set of valid values, always use an enum rather than a freeform string with validation. Enums serve double duty: they constrain agent outputs to valid options, and because the enum values are part of the schema sent to the model, they also teach the agent what values are possible.
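Enums constrain individual values, but rules that span several fields still need a server-side check. The sketch below uses the action values and field names from this section's decision schema (approve, reject, escalate, defer) and enforces the dependencies its descriptions state; `check_decision` is a hypothetical helper, not part of any library.

```python
ACTIONS = {"approve", "reject", "escalate", "defer"}

# Fields that must accompany each action, as stated in the schema descriptions.
REQUIRED_WITH_ACTION = {
    "reject": ["rejection_reason"],
    "defer": ["defer_until", "defer_reason"],
}


def check_decision(params: dict) -> list[str]:
    """Return problems with a decision payload; an empty list means it is consistent."""
    action = params.get("action")
    if action not in ACTIONS:
        return [f"action must be one of {sorted(ACTIONS)}"]
    problems = [
        f"'{field}' is required when action is '{action}'"
        for field in REQUIRED_WITH_ACTION.get(action, [])
        if field not in params
    ]
    if params.get("rejection_reason") == "other" and "rejection_notes" not in params:
        problems.append("'rejection_notes' is required when rejection_reason is 'other'")
    return problems
```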

Enum-Rich Schema with Semantic Descriptions (JSON)

{
  "type": "object",
  "properties": {
    "action": {
      "type": "string",
      "enum": ["approve", "reject", "escalate", "defer"],
      "description": "Decision action for the request. approve: Grant the request immediately, notify user of success. reject: Deny the request, must provide rejection_reason. escalate: Forward to human reviewer for complex cases or policy exceptions. defer: Postpone decision, requires defer_until date and defer_reason."
    },
    "rejection_reason": {
      "type": "string",
      "enum": ["policy_violation", "insufficient_documentation", "budget_exceeded", "duplicate_request", "other"],
      "description": "Required when action is 'reject'. policy_violation: Request violates company policy (cite specific policy). insufficient_documentation: Missing required attachments or details. budget_exceeded: Requested amount exceeds available budget. duplicate_request: Similar request already exists. other: Requires custom explanation in rejection_notes."

Tool Description Quality Checklist

Use an Amazon Bedrock Playground for Testing Tool Use

Before deploying tools to production, test them interactively with various prompts, for example in an Amazon Bedrock console playground where tool use is available for your model. Try edge cases, ambiguous requests, and scenarios where multiple tools might apply.

Practice Exercise

Design Your First Production-Ready Tool

45 min

Key Insight

Tool Names Are Part of the Interface

Tool naming follows different rules than function naming in traditional code. While developers might use terse or abbreviated names like 'getUserById' or 'createTkt', agent tools benefit from explicit, descriptive names that convey purpose at a glance, such as the 'create_support_ticket' and 'lookup_customer' names used in this chapter.
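A naming convention is easier to keep when it is checked automatically. The sketch below assumes a house style of lowercase snake_case with at least two words and a 64-character cap; both rules are illustrative assumptions, and the check covers casing and length only, so it cannot detect abbreviations.

```python
import re

# Assumption: lowercase verb_noun names with at least two words.
# This is a house style for illustration, not a rule imposed by any provider.
_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z0-9]+)+$")
_MAX_LENGTH = 64  # assumed limit; check your model provider's documentation


def name_problems(name: str) -> list[str]:
    """Return style problems for a tool name; an empty list means it passes."""
    problems = []
    if not _NAME_PATTERN.match(name):
        problems.append("use lowercase snake_case with at least two words")
    if len(name) > _MAX_LENGTH:
        problems.append(f"keep the name to {_MAX_LENGTH} characters or fewer")
    return problems
```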

Framework

Tool Description Architecture (TDA)

Purpose Statement

A single, clear sentence explaining the tool's primary function. Avoid technical jargon and focus on...

Capability Boundaries

Explicit statements about what the tool can and cannot do. This prevents the agent from attempting i...

Usage Context

Describe scenarios where this tool is the right choice versus alternatives. Help the agent understan...

Parameter Guidance

Beyond schema validation, provide semantic guidance for each parameter. Explain acceptable value ran...

Production-Grade Tool Definition with TDA Framework (TypeScript)

const customerLookupTool: AgentTool = {
  name: 'lookup_customer',
  description: `Retrieves detailed customer information from the CRM database.

    PURPOSE: Find customer records by email, phone, or customer ID to answer 
    account-related questions or verify identity.

    CAPABILITIES:
    - Returns: name, email, phone, account status, subscription tier, 
      created date, last activity, support history summary
    - Search by: exact email, phone (any format), or customer ID
    - Data freshness: Real-time, reflects changes within 30 seconds

Intercom

Reducing Tool Selection Errors by 73% Through Description Engineering

Tool selection accuracy improved from 71% to 94%. Average conversation resolutio...

Basic vs. Production Parameter Schemas

Basic Schema (Fragile)

Uses only type and required fields

Generic descriptions like 'The user ID'

No validation beyond type checking

Missing defaults force LLM to guess

Production Schema (Robust)

Full JSON Schema vocabulary with patterns and bounds

Descriptions include format, examples, and edge cases

Semantic validation with format specifiers

Strategic defaults for common cases

Production Parameter Schema with Full Validation (JSON)

{
  "type": "object",
  "properties": {
    "customer_id": {
      "type": "string",
      "pattern": "^cust_[a-zA-Z0-9]{10,20}$",
      "description": "Unique customer identifier. Format: cust_ followed by 10-20 alphanumeric characters. Example: cust_abc123def456"
    },
    "date_range": {
      "type": "object",
      "properties": {
        "start": {

Framework

Error Taxonomy and Recovery Framework (ETRF)

Transient Errors (Retry)

Temporary failures that typically resolve on retry: network timeouts, rate limits, temporary service...

Input Errors (Correct)

Invalid parameters that the agent can potentially fix: malformed dates, invalid IDs, out-of-range va...

State Errors (Adapt)

Valid request but invalid state: resource doesn't exist, action not allowed in current state, stale ...

Permission Errors (Escalate)

Authorization failures that require human intervention: missing permissions, expired tokens, access ...

Implementing ETRF in Tool Execution (TypeScript)

enum ErrorCategory {
  TRANSIENT = 'transient',
  INPUT = 'input',
  STATE = 'state',
  PERMISSION = 'permission',
  SYSTEM = 'system'
}

interface ClassifiedError {
  category: ErrorCategory;
  originalError: Error;
  recoveryHint: string;

Linear

Building Self-Healing Tools for Their AI Assistant

Tool call success rate improved from 85% to 99.2%. Mean time to recovery for tra...

Anti-Pattern: The Catch-All Error Handler

❌ Problem

Agents waste tokens and time on futile retries of permission errors while giving...

✓ Solution

Implement error classification at the tool level, returning structured error res...
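One way to avoid the catch-all handler is to classify failures where the tool runs and retry only the transient ones. The sketch below is a simplified illustration: the mapping from Python exception types to categories and the shape of the returned dictionary are assumptions, and `run_tool` is a hypothetical wrapper.

```python
import time
from typing import Callable

# Assumption: which exception types count as retryable is a local policy choice.
RETRYABLE = (TimeoutError, ConnectionError)


def run_tool(tool: Callable[[], dict], max_attempts: int = 3, base_delay: float = 0.5) -> dict:
    """Run a tool, retrying only transient failures and classifying the rest."""
    for attempt in range(1, max_attempts + 1):
        try:
            return {"ok": True, "result": tool()}
        except RETRYABLE as exc:
            if attempt == max_attempts:
                return {"ok": False, "category": "transient",
                        "recovery": "retry_later", "message": str(exc)}
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        except PermissionError as exc:
            # Retrying cannot fix missing permissions, so hand off immediately.
            return {"ok": False, "category": "permission",
                    "recovery": "escalate", "message": str(exc)}
        except ValueError as exc:
            # The agent may be able to fix its input and call again.
            return {"ok": False, "category": "input",
                    "recovery": "correct", "message": str(exc)}
    raise ValueError("max_attempts must be at least 1")
```

Exceptions outside these categories propagate unchanged, so unknown failures are surfaced rather than silently mislabeled.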

Key Insight

Tool Composition: Building Complex Capabilities from Simple Primitives

The most powerful agent tools are often compositions of simpler tools, not monolithic functions. Instead of building a 'create_and_send_invoice' tool, build separate 'create_invoice', 'validate_invoice', and 'send_invoice' tools that the agent can compose.
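The invoice example can be sketched as three small functions plus one possible composition. The field names and the in-memory "send" step are placeholders; a real `send_invoice` would call a billing or email service.

```python
def create_invoice(customer_id: str, amount_cents: int) -> dict:
    """Build a draft invoice; nothing is persisted or sent."""
    return {"customer_id": customer_id, "amount_cents": amount_cents, "status": "draft"}


def validate_invoice(invoice: dict) -> list[str]:
    """Return problems with the invoice; an empty list means it can be sent."""
    problems = []
    if invoice["amount_cents"] <= 0:
        problems.append("amount must be positive")
    if not invoice["customer_id"]:
        problems.append("customer_id is required")
    return problems


def send_invoice(invoice: dict) -> dict:
    # Placeholder for the real delivery step.
    return {**invoice, "status": "sent"}


def create_and_send(customer_id: str, amount_cents: int) -> dict:
    """One possible composition; an agent could equally stop after validation."""
    invoice = create_invoice(customer_id, amount_cents)
    problems = validate_invoice(invoice)
    if problems:
        return {"ok": False, "problems": problems, "invoice": invoice}
    return {"ok": True, "invoice": send_invoice(invoice)}
```

Because each primitive is usable on its own, the agent can also create and validate an invoice, show it to the user, and send it only after confirmation.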

Tool Composition Architecture

Primitive Tools

Validation Layer

Agent Orchestration

Composite Workflows

Composable Tool Design Pattern (TypeScript)

// Primitive tools - single responsibility
const primitiveTools = {
  getCustomer: {
    name: 'get_customer',
    description: 'Retrieve customer record by ID. Returns customer details or null if not found.',
    execute: async (params: { customerId: string }) => {
      return await db.customers.findById(params.customerId);
    }
  },
  
  calculateInvoice: {
    name: 'calculate_invoice',

Tool Testing Checklist for Production Readiness

Notion

Building a Comprehensive Tool Testing Pipeline

Production incidents from tool failures dropped from 4 per month to 0.3 per mont...

94%

of tool-related production incidents are preventable with comprehensive testing

Analysis of 200+ production AI systems found that the vast majority of tool failures stem from untested edge cases, inadequate error handling, or description ambiguities.

Practice Exercise

Build a Production-Grade Tool with Full Error Handling

45 min

Tools Are Your Agent's Hands—Design Them Carefully

Every limitation in your tools becomes a limitation in your agent. Every ambiguity in your tool descriptions becomes confusion in your agent's decisions.

Essential Resources for Tool Design Mastery

OpenAI Function Calling Best Practices

article

Anthropic Tool Use Documentation

article

JSON Schema Specification

article

AWS Step Functions Developer Guide

article

THIS WEEK'S JOURNEY

From Theory to Mastery: Practicing Production Tool Development

Understanding tool design principles is only the beginning—true mastery comes from hands-on practice with realistic scenarios. This section provides structured exercises that simulate real production challenges, from designing tool schemas for complex domains to implementing robust error handling patterns.

Practice Exercise

Design a Complete E-Commerce Tool Suite

90 min

Production-Ready Tool Implementation Pattern (Python)

from typing import Optional, List
from pydantic import BaseModel, Field, field_validator  # Pydantic V2 name; V1's `validator` is deprecated
from enum import Enum
import structlog
import time

logger = structlog.get_logger()

class ToolError(Exception):
    """Base exception for tool errors with structured context"""
    def __init__(self, message: str, error_code: str, recoverable: bool = True, 
                 suggested_action: Optional[str] = None):

Practice Exercise

Implement Graceful Degradation for External APIs

60 min

Tool Testing Checklist

Anti-Pattern: The Swiss Army Knife Tool

❌ Problem

Anthropic's research shows that tools with more than 5-6 parameters see accuracy...

✓ Solution

Follow the Single Responsibility Principle for tools. Create separate tools for ...

Anti-Pattern: The Silent Failure

❌ Problem

Silent failures compound through agent workflows. An agent that receives an empt...

✓ Solution

Make every response explicit about what happened. Return structured responses wi...
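A structured result type makes the difference between "nothing matched" and "the lookup itself failed" visible to the agent. The sketch below is one possible shape; `ToolResult`, its status values, and `find_orders` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolResult:
    status: str          # "success", "not_found", or "error"
    data: Any = None
    message: str = ""


def find_orders(customer_id: str, orders_by_customer: dict[str, list]) -> ToolResult:
    """Distinguish an unknown customer from a known customer with no orders."""
    if customer_id not in orders_by_customer:
        return ToolResult("not_found", message=f"no customer with id {customer_id}")
    orders = orders_by_customer[customer_id]
    return ToolResult("success", data=orders, message=f"{len(orders)} order(s) found")
```

With a bare empty list, both cases would look identical to the agent; here the status and message state what actually happened.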

Anti-Pattern: The Undocumented Side Effect

❌ Problem

Agents cannot reason about side effects they don't know exist. An agent might ca...

✓ Solution

Tools should do exactly what their description says—no more, no less. If side ef...

Practice Exercise

Build a Tool Composition Pipeline

120 min

Tool Description Template with Examples (JSON)

{
  "name": "search_knowledge_base",
  "description": "Searches the company knowledge base for articles, documentation, and FAQs. Returns relevant excerpts with source links. Use this tool when users ask questions about products, policies, or procedures. Results are ranked by relevance and recency.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Natural language search query. Be specific and include key terms. Example: 'return policy electronics 30 days' not just 'returns'",
        "minLength": 3,
        "maxLength": 200
      },

Framework

The CLEAR Tool Design Framework

Constrained

Limit each tool to a single, well-defined operation. If you need 'and' to describe what the tool doe...

Logged

Every tool invocation should generate structured logs with request parameters, response summary, dur...

Explicit

No hidden behaviors, implicit defaults, or silent failures. Every parameter should have a clear defa...

Atomic

Tools should either complete fully or fail cleanly—no partial states. If a tool modifies multiple re...
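The "Logged" principle can be approximated with a decorator that emits one structured log line per invocation. This sketch uses only the standard library; the logger name, the field names, and the keyword-only calling convention are assumptions, and parameters may need redaction before logging in a real system.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("agent.tools")


def logged_tool(func):
    """Emit one JSON log line per invocation with parameters, outcome, and duration."""
    @functools.wraps(func)
    def wrapper(**params):
        start = time.perf_counter()
        status = "error"
        try:
            result = func(**params)
            status = "ok"
            return result
        finally:
            # Runs on success and on failure, so every call leaves a record.
            logger.info(json.dumps({
                "tool": func.__name__,
                "params": params,
                "status": status,
                "duration_ms": round((time.perf_counter() - start) * 1000, 2),
            }, default=str))
    return wrapper


@logged_tool
def get_customer(customer_id: str) -> dict:
    # Placeholder body standing in for a real lookup.
    return {"customer_id": customer_id, "tier": "pro"}
```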

Essential Tool Development Resources

OpenAPI Specification 3.1

documentation

JSON Schema Validation (ajv)

tool

Pydantic V2 Documentation

documentation

AWS Step Functions Developer Guide

documentation

Practice Exercise

Tool Security Audit Exercise

45 min

Tool Versioning Strategy

Never modify existing tool schemas in ways that break backward compatibility. Instead, create new versions (search_v2) and deprecate old ones gradually.
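A small registry illustrates the idea: the old name keeps working while callers are warned toward the new one. The registry layout and function names below are hypothetical, and the two search functions are placeholders.

```python
import warnings

TOOLS = {}       # tool name -> callable
DEPRECATED = {}  # old tool name -> suggested replacement


def register(name, func, replaces=None):
    """Register a tool; optionally mark an older tool as superseded by it."""
    TOOLS[name] = func
    if replaces:
        DEPRECATED[replaces] = name


def call_tool(name, **params):
    """Dispatch by name, warning (but still working) for deprecated tools."""
    if name in DEPRECATED:
        warnings.warn(f"{name} is deprecated; use {DEPRECATED[name]}",
                      DeprecationWarning, stacklevel=2)
    return TOOLS[name](**params)


def search(query):
    return {"results": [query]}


def search_v2(query, limit=10):
    # New optional parameter lives in a new version; the old schema is untouched.
    return {"results": [query][:limit], "limit": limit}


register("search", search)
register("search_v2", search_v2, replaces="search")
```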

3.2x

Improvement in agent task completion when tools include usage examples

Including 2-3 example invocations in tool descriptions dramatically improves agent accuracy.
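Examples are easier to keep accurate when they are stored as data and rendered into the description. A minimal sketch follows; `describe_with_examples` and the `when`/`input` keys are hypothetical conventions.

```python
import json


def describe_with_examples(summary: str, examples: list[dict]) -> str:
    """Append example invocations, rendered as JSON, to a tool description."""
    lines = [summary, "", "EXAMPLES:"]
    for example in examples:
        rendered = json.dumps(example["input"], sort_keys=True)
        lines.append(f"- {example['when']}: {rendered}")
    return "\n".join(lines)
```

Because the examples are plain dictionaries, the same data can be fed through the tool's parameter validation in a test, which keeps the description from drifting away from the schema.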

Tool Metrics and Monitoring Setup (Python)

from prometheus_client import Counter, Histogram, Gauge
import functools
import time

# Define metrics
TOOL_INVOCATIONS = Counter(
    'agent_tool_invocations_total',
    'Total tool invocations',
    ['tool_name', 'status', 'error_code']
)

TOOL_DURATION = Histogram(

Tool Description A/B Testing

Run experiments with different tool descriptions to optimize agent accuracy. Deploy two versions of the same tool with different descriptions and measure which produces better outcomes.
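A description experiment needs stable assignment and a comparable outcome measure. The sketch below assigns each conversation to a variant by hashing its ID and computes a per-variant success rate; the variant texts, the 50/50 split, and the boolean notion of "success" are assumptions.

```python
import hashlib

# Hypothetical description variants for one tool.
VARIANTS = {
    "A": "Searches the knowledge base.",
    "B": "Searches the knowledge base. Use when users ask about products, "
         "policies, or procedures.",
}


def assign_variant(conversation_id: str) -> str:
    """Deterministically assign a conversation to variant A or B."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def success_rate(outcomes: list[tuple[str, bool]], variant: str) -> float:
    """Fraction of successful outcomes recorded for one variant."""
    results = [ok for v, ok in outcomes if v == variant]
    return sum(results) / len(results) if results else 0.0
```

Hashing the conversation ID, rather than choosing randomly per call, keeps the description consistent for the whole conversation.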

Vercel

Building Self-Documenting Tools for v0

Tool selection accuracy improved from 71% to 94%. User satisfaction scores incre...

Tool Lifecycle Management

Design & Schema

Security Review

Staging Deploy

Shadow Testing

Pre-Production Tool Launch Checklist

Chapter Complete!

Tool descriptions are the primary interface between agents a...

Parameter schemas should be strict and validated at multiple...

Error handling determines whether agents can recover from fa...

Tool composition enables complex workflows while maintaining...

Next: Start by auditing your existing tools against the CLEAR framework: identify which tools violate single responsibility, have unclear descriptions, or lack proper error handling.
