Qwen3 Coder Next (São Paulo)

OpenAI-compatible Bedrock Mantle model available in São Paulo. Requests are served directly in sa-east-1, with no cross-region routing.

Provider: Qwen | AWS Bedrock

Inference regions: sa-east-1

API Endpoint

https://bedrock-mantle.sa-east-1.api.aws/v1/responses

Quick Start (Python)

Install: pip install boto3

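import boto3

# Bedrock Runtime client pinned to São Paulo, so the request is sent to sa-east-1.
client = boto3.client("bedrock-runtime", region_name="sa-east-1")

# Converse API call: a single user turn with basic generation settings.
response = client.converse(
    modelId="qwen.qwen3-coder-next",
    messages=[
        {
            "role": "user",
            "content": [{"text": "Hello, how are you?"}],
        }
    ],
    inferenceConfig={
        "maxTokens": 1024,
        "temperature": 0.7,
    },
)

# The reply text is in the first content block of the output message.
print(response["output"]["message"]["content"][0]["text"])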


Supported Parameters

Parameter          Type     Description
max_output_tokens  integer  Maximum number of visible output tokens to generate. Minimum: 1.
stream             boolean  Stream response events as they are generated. Default: false.
store              boolean  Store response state for follow-up turns. Set to false for zero-retention request handling. Default: false.
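
As a rough illustration of how these parameters fit into a request, the sketch below builds a JSON body for the endpoint listed above using only the Python standard library. The "model" and "input" field names follow the OpenAI Responses API and are assumed to apply here; the Bearer-token Authorization header, the reuse of the Quick Start model ID, and the helper names build_body and build_request are likewise assumptions for illustration, not confirmed details of this service.

```python
import json
import urllib.request

# Endpoint as listed under "API Endpoint".
ENDPOINT = "https://bedrock-mantle.sa-east-1.api.aws/v1/responses"


def build_body(prompt, max_output_tokens=1024, stream=False, store=False):
    """Assemble a request body from the supported parameters (hypothetical helper)."""
    if max_output_tokens < 1:
        # The parameter table gives 1 as the minimum value.
        raise ValueError("max_output_tokens must be >= 1")
    return {
        "model": "qwen.qwen3-coder-next",  # assumption: same ID as in the Quick Start
        "input": prompt,                   # assumption: OpenAI Responses API field name
        "max_output_tokens": max_output_tokens,
        "stream": stream,
        "store": store,                    # False requests zero-retention handling
    }


def build_request(body, api_key):
    """Wrap the body in a POST request (assumes Bearer-token authentication)."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


body = build_body("Explain list comprehensions in one sentence.", max_output_tokens=256)
print(json.dumps(body, indent=2))
# To send it: urllib.request.urlopen(build_request(body, "<your Bedrock API key>"))
```

Keeping store at its default of false means no response state is retained, so each request must carry its full context.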

Feature Guides

OpenAI-compatible Responses API

Use the OpenAI SDK with a Bedrock API key and the regional bedrock-mantle endpoint.
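
A minimal sketch of that setup, assuming the openai Python package is installed (pip install openai) and that the SDK base URL is the endpoint above without the trailing /responses path. The environment variable name BEDROCK_API_KEY and the helpers make_client and ask are hypothetical, and the model ID is carried over from the Quick Start on the assumption that it is also valid on this endpoint.

```python
import os

# Assumption: the SDK appends "/responses" itself, so the base URL stops at /v1.
BASE_URL = "https://bedrock-mantle.sa-east-1.api.aws/v1"
MODEL_ID = "qwen.qwen3-coder-next"  # assumption: same ID as in the Quick Start


def make_client(api_key):
    """Create an OpenAI SDK client pointed at the regional bedrock-mantle endpoint."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    return OpenAI(base_url=BASE_URL, api_key=api_key)


def ask(client, prompt):
    """Send one prompt through the Responses API and return the generated text."""
    response = client.responses.create(
        model=MODEL_ID,
        input=prompt,
        max_output_tokens=1024,
        store=False,  # zero-retention request handling
    )
    return response.output_text


# Only call the service when a key is present (hypothetical variable name).
api_key = os.environ.get("BEDROCK_API_KEY")
if api_key:
    print(ask(make_client(api_key), "Write a function that reverses a string."))
```

Passing the client into ask keeps the request logic separate from credential handling, which also makes it straightforward to substitute a stub client in tests.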
