Titan Text Embeddings V2 (Ohio)

Foundation model available in the US East (Ohio) region, us-east-2. Requests are served in-region, without cross-region routing.

Provider: Amazon (AWS Bedrock)

Inference regions: us-east-2

API Endpoint

https://bedrock-runtime.us-east-2.amazonaws.com

Quick Start (Python)

Install: pip install boto3

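import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-2")

# Titan Text Embeddings V2 is an embeddings model, so it is called through
# invoke_model with a JSON body; the Converse API is for chat models.
response = client.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(
        {
            "inputText": "Hello, how are you?",
            "dimensions": 1024,  # 256, 512, or 1024
            "normalize": True,   # return a unit-length vector
        }
    ),
)

# The response body is a stream containing JSON with the embedding vector.
result = json.loads(response["body"].read())
print(len(result["embedding"]))          # vector length, matches "dimensions"
print(result["inputTextTokenCount"])     # tokens consumed by the input text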
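Embeddings are usually compared rather than printed. The following is a minimal sketch of cosine similarity between two embedding vectors in plain Python. The function name cosine_similarity and the toy three-element vectors are illustrative assumptions; real vectors would come from the "embedding" field of the model response.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption, not model output).
query = [0.6, 0.8, 0.0]
doc_close = [0.6, 0.8, 0.0]
doc_far = [0.0, 0.0, 1.0]

print(cosine_similarity(query, doc_close))  # identical direction, close to 1
print(cosine_similarity(query, doc_far))    # orthogonal vectors, 0
```

When embeddings are requested in normalized form, the vectors already have unit length, so the dot product alone gives the same ranking.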


Supported Parameters

Parameter	Type	Description
inputText	string	Text to embed. Required. (up to 8,192 tokens)
dimensions	integer	Length of the output embedding vector. (256, 512, or 1024) Default: 1024.
normalize	boolean	Whether to normalize the output embedding to unit length. Default: true.
embeddingTypes	list of strings	Embedding formats to return: "float", "binary", or both. Default: float.
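For this embeddings model, options travel in the JSON request body passed to invoke_model rather than in generation settings. The sketch below builds and validates such a body, assuming the body fields inputText, dimensions (256, 512, or 1024), and normalize. The helper name build_embedding_request and the constant ALLOWED_DIMENSIONS are hypothetical and not part of boto3.

```python
import json

# Output sizes assumed to be accepted by Titan Text Embeddings V2.
ALLOWED_DIMENSIONS = (256, 512, 1024)


def build_embedding_request(text, dimensions=1024, normalize=True):
    """Hypothetical helper: serialize an embeddings request body to JSON."""
    if not text:
        raise ValueError("text must be a non-empty string")
    if dimensions not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimensions must be one of {ALLOWED_DIMENSIONS}")
    return json.dumps(
        {
            "inputText": text,
            "dimensions": dimensions,
            "normalize": bool(normalize),
        }
    )


# The returned string can be passed as the body argument of invoke_model.
body = build_embedding_request("Hello, how are you?", dimensions=256)
print(body)
```

Validating on the client side catches an unsupported vector size before a request is sent and billed.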

Feature Guides

On-Demand Inference

Default mode. Pay per input token with no upfront commitment.
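Because billing is per token, a rough cost estimate is simple arithmetic over token counts. This is a minimal sketch: the function name estimate_cost is hypothetical, the token counts are made up, and the price used in the example call is a placeholder, not an AWS rate. Token counts could be taken from the "inputTextTokenCount" field that the model returns with each embedding.

```python
def estimate_cost(token_counts, price_per_1k_tokens):
    """Estimate on-demand cost from per-request input token counts."""
    total_tokens = sum(token_counts)
    return total_tokens / 1000 * price_per_1k_tokens


# Made-up token counts, standing in for values read from each response.
counts = [6, 120, 874]

# PLACEHOLDER price per 1,000 tokens; check current pricing for us-east-2.
print(estimate_cost(counts, price_per_1k_tokens=0.01))
```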
