Claude Opus 4.6

Routes inference requests for Claude Opus 4.6 across supported commercial AWS Regions worldwide

Provider: AWS Bedrock

Inference regions: af-south-1, ap-northeast-1, ap-northeast-2, ap-northeast-3, ap-south-1, ap-southeast-1, ap-southeast-2, ap-southeast-4, ca-central-1, eu-central-1, eu-north-1, eu-west-1, eu-west-2, eu-west-3, sa-east-1, us-east-1, us-east-2, us-west-1, us-west-2

API Endpoint

https://bedrock-runtime.us-east-1.amazonaws.com

The endpoint is Region-specific and follows the pattern bedrock-runtime.<region>.amazonaws.com; substitute another inference region as needed.

Quick Start (Python)

Install: pip install boto3

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="global.anthropic.claude-opus-4-6-v1",
    messages=[
        {
            "role": "user",
            "content": [{"text": "Hello, how are you?"}],
        }
    ],
    inferenceConfig={
        "maxTokens": 1024,
        "temperature": 0.7,
    },
)

print(response["output"]["message"]["content"][0]["text"])

Additional examples: Basic invoke, Streaming, Extended Thinking
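For streaming, boto3 exposes `converse_stream`, which yields small event dicts instead of one response; text fragments arrive in `contentBlockDelta` events. A minimal sketch of reassembling the streamed text (the event shapes follow the Converse Stream API; the live call at the bottom is illustrative):

```python
def collect_stream_text(events):
    """Concatenate text fragments from Converse Stream events.

    The stream interleaves lifecycle events (messageStart, messageStop)
    with contentBlockDelta events; joining the text deltas in order
    reproduces the full reply.
    """
    parts = []
    for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            parts.append(delta["text"])
    return "".join(parts)

# With a live client:
# response = client.converse_stream(
#     modelId="global.anthropic.claude-opus-4-6-v1",
#     messages=[{"role": "user", "content": [{"text": "Hello"}]}],
#     inferenceConfig={"maxTokens": 1024},
# )
# text = collect_stream_text(response["stream"])
```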

Supported Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| max_tokens | integer | Maximum number of tokens to generate in the response. Minimum: 1. |
| temperature | float | Controls randomness; lower values make output more deterministic. Range: 0–1. Default: 1. |
| top_p | float | Nucleus sampling threshold; considers tokens with cumulative probability up to this value. Range: 0–1. Default: 1. |
| stop_sequences | array of strings | Up to 4 sequences at which the model stops generating. |
| top_k | integer | Sample only from the top K most likely tokens at each step. Range: 0–500. Default: 250. |
| thinking.type | enum | Enables or disables extended thinking. Default: disabled. |
| thinking.budget_tokens | integer | Token budget for extended thinking. Range: 1024–128000. |
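When using the Converse API, the generic sampling parameters map to camelCase keys in `inferenceConfig`, while model-specific fields such as `top_k` are passed through `additionalModelRequestFields`. A sketch of that mapping (the model ID is taken from the quick start above; parameter placement follows the Converse API's split between common and model-specific fields):

```python
def build_request(prompt, *, max_tokens=1024, temperature=None,
                  top_p=None, stop_sequences=None, top_k=None):
    """Map the supported parameters onto Converse API keyword arguments."""
    inference_config = {"maxTokens": max_tokens}
    if temperature is not None:
        inference_config["temperature"] = temperature
    if top_p is not None:
        inference_config["topP"] = top_p
    if stop_sequences is not None:
        inference_config["stopSequences"] = stop_sequences  # up to 4 entries
    kwargs = {
        "modelId": "global.anthropic.claude-opus-4-6-v1",
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": inference_config,
    }
    # top_k is Anthropic-specific, so it rides in
    # additionalModelRequestFields rather than inferenceConfig.
    if top_k is not None:
        kwargs["additionalModelRequestFields"] = {"top_k": top_k}
    return kwargs

# Usage: client.converse(**build_request("Hello", temperature=0.7))
```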

Feature Guides

On-Demand Inference

Default mode. Pay per token with no upfront commitment.

Documentation

Extended Thinking

Enables chain-of-thought reasoning before generating the final response.

  1. Set thinking.type to "enabled" in the request body
  2. Set thinking.budget_tokens to allocate reasoning tokens (1024-128000)
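The two steps above can be sketched as a request builder; as an assumption here, the `thinking` object is passed through `additionalModelRequestFields` on the Converse API, and `maxTokens` is set above `budget_tokens` so the final answer has room after reasoning:

```python
def thinking_request(prompt, budget_tokens=4096):
    """Build a Converse request with extended thinking enabled (steps 1-2)."""
    return {
        "modelId": "global.anthropic.claude-opus-4-6-v1",
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        # maxTokens must cover both the reasoning budget and the final
        # answer, so keep it larger than budget_tokens.
        "inferenceConfig": {"maxTokens": budget_tokens + 1024},
        "additionalModelRequestFields": {
            "thinking": {"type": "enabled", "budget_tokens": budget_tokens}
        },
    }

# Usage: response = client.converse(**thinking_request("Explain step by step..."))
```

In the response, reasoning output comes back as separate content blocks alongside the final text, so iterate over `response["output"]["message"]["content"]` rather than reading only the first block.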

Documentation