Available on all Portkey plans.
The Messages API is Anthropic’s native format for interacting with Claude models. Portkey extends it to work with all providers — use the Anthropic SDK pointed at Portkey’s base URL, and switch between providers by changing the model string.

Quick Start

Use the Anthropic SDK with Portkey’s base URL. The @provider-slug/model format routes requests to the correct provider.
import anthropic

client = anthropic.Anthropic(
    api_key="PORTKEY_API_KEY",
    base_url="https://api.portkey.ai"
)

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)

print(message.content[0].text)
max_tokens is required for the Messages API. Switch the model string to route to any provider, e.g. @openai-provider/gpt-4o or @google-provider/gemini-2.0-flash.

How It Works

Portkey receives Messages API requests and translates them to each provider’s native format:
  • Anthropic — requests pass through directly
  • All other providers — Portkey’s adapter translates between Messages format and the provider’s native format
The response always comes back in Anthropic Messages format, regardless of which provider handles the request.
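For example, the same Messages-format request can be served by a different backend just by swapping the model string. A minimal sketch, reusing the client from the Quick Start (the provider slugs are placeholders for whatever you have configured):
# Same request body against two different backends; the response comes
# back in Anthropic Messages format in both cases.
for model in ["@anthropic-provider/claude-sonnet-4-5-20250929", "@openai-provider/gpt-4o"]:
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(model, "->", message.content[0].text)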

System Prompt

Set a system prompt with the top-level system parameter (not inside messages):
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system="You are a pirate. Always respond in pirate speak.",
    messages=[{"role": "user", "content": "Say hello."}]
)
The system parameter also accepts an array of content blocks for prompt caching:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system=[
        {"type": "text", "text": "You are an expert on this topic..."},
        {"type": "text", "text": "Here is the reference material...", "cache_control": {"type": "ephemeral"}}
    ],
    messages=[{"role": "user", "content": "Summarize the key points"}]
)

Streaming

Stream responses with the SDK's messages.stream() helper (or by passing stream=True to messages.create); in cURL, set "stream": true in the request body.
with client.messages.stream(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about AI"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
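If you need the raw event stream instead of the text helper, pass stream=True to messages.create and handle the events yourself. A minimal sketch:
# stream=True returns an iterator of server-sent events rather than a helper object.
stream = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250929",
    max_tokens=1024,
    stream=True,
    messages=[{"role": "user", "content": "Write a haiku about AI"}]
)
for event in stream:
    # Text arrives in content_block_delta events carrying a text_delta payload.
    if event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="", flush=True)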

Tool Use

Define tools with name, description, and input_schema (the Messages API uses input_schema where Chat Completions uses parameters):
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    tools=[{
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }]
)

for block in message.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}, Input: {block.input}")

Tool Results

Pass tool results back to continue the conversation. Tool results go in a user message with tool_result content blocks:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What's the weather in Paris?"},
        {"role": "assistant", "content": [
            {"type": "tool_use", "id": "tool_123", "name": "get_weather", "input": {"location": "Paris"}}
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "tool_123", "content": '{"temp": "22°C", "condition": "sunny"}'}
        ]}
    ],
    tools=[{
        "name": "get_weather",
        "description": "Get weather for a location",
        "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}
    }]
)

print(message.content[0].text)
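In practice the tool result comes from code you run locally. A minimal round-trip sketch; get_weather here is a hypothetical local function standing in for a real lookup:
import json

def get_weather(location: str) -> str:
    # Hypothetical stand-in for a real weather lookup.
    return json.dumps({"temp": "22°C", "condition": "sunny"})

tools = [{
    "name": "get_weather",
    "description": "Get weather for a location",
    "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}
}]
MODEL = "@anthropic-provider/claude-sonnet-4-5-20250929"
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

response = client.messages.create(model=MODEL, max_tokens=1024, messages=messages, tools=tools)
while response.stop_reason == "tool_use":
    # Echo the assistant turn back, then answer each tool_use block with a tool_result.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": block.id, "content": get_weather(**block.input)}
        for block in response.content if block.type == "tool_use"
    ]})
    response = client.messages.create(model=MODEL, max_tokens=1024, messages=messages, tools=tools)

print(response.content[0].text)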

Vision

Send images using content blocks. Supports both URLs and base64-encoded data.
# From URL
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "url", "url": "https://example.com/image.jpg"}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)

print(message.content[0].text)
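To send a local file instead, base64-encode it and use a base64 source block (image.jpg is a hypothetical local file):
import base64

# Read and encode the image; media_type must match the actual file format.
with open("image.jpg", "rb") as f:
    data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": data}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)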

Extended Thinking

Enable extended thinking for complex reasoning tasks. Requires max_tokens greater than budget_tokens.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Analyze the implications of quantum computing on cryptography"}]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Response: {block.text}")
Extended thinking output counts toward max_tokens. Set max_tokens high enough to accommodate both thinking and the final response.
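Thinking can also be streamed: thinking deltas arrive as content_block_delta events alongside text deltas. A minimal sketch:
with client.messages.stream(
    model="@anthropic-provider/claude-sonnet-4-5-20250929",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Analyze the implications of quantum computing on cryptography"}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)  # reasoning tokens
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)      # final response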

Prompt Caching

Use cache_control on system prompts, messages, and tool definitions to cache frequently used content.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "You are an expert analyst. Here is a very long reference document...",
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{"role": "user", "content": "Summarize the key points"}]
)
Cached content is reused across requests, reducing latency and costs. Cache usage is reflected in the response usage object.
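You can confirm cache behavior from the usage fields on the response: cache_creation_input_tokens is nonzero on the request that writes the cache, cache_read_input_tokens on subsequent hits:
print(message.usage.cache_creation_input_tokens)  # tokens written to the cache
print(message.usage.cache_read_input_tokens)      # tokens served from the cache
print(message.usage.input_tokens)                 # uncached input tokens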

Multi-turn Conversations

Build conversations by passing the full message history. Messages must alternate between user and assistant roles.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "My name is Alice."},
        {"role": "assistant", "content": "Hello Alice! How can I help you?"},
        {"role": "user", "content": "What is my name?"}
    ]
)

print(message.content[0].text)  # "Your name is Alice."
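A chat loop just appends each assistant reply to the history before the next user turn, which preserves the alternation. A minimal sketch:
history = []
for user_input in ["My name is Alice.", "What is my name?"]:
    history.append({"role": "user", "content": user_input})
    reply = client.messages.create(
        model="@anthropic-provider/claude-sonnet-4-5-20250929",
        max_tokens=1024,
        messages=history
    )
    # Append the assistant turn so the next request sees the full conversation.
    history.append({"role": "assistant", "content": reply.content})
    print(reply.content[0].text)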

Using with Portkey Features

The Messages API works with all Portkey gateway features. Pass Portkey-specific headers alongside the Anthropic request:
import anthropic

client = anthropic.Anthropic(
    api_key="PORTKEY_API_KEY",
    base_url="https://api.portkey.ai",
    default_headers={
        "x-portkey-config": "pp-config-xxx"  # Config with fallbacks, load balancing, etc.
    }
)

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
  • Configs — Route, load balance, and set fallbacks
  • Caching — Cache responses for faster, cheaper calls
  • Guardrails — Add input/output guardrails
  • Observability — Full logging and tracing
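Headers can also be attached per request with the SDK's extra_headers option, e.g. to set a trace ID for observability (the trace ID value here is hypothetical):
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={"x-portkey-trace-id": "checkout-flow-42"}  # hypothetical trace ID
)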
