Fireworks Kimi K2.5 (Global)

Kimi K2.5 is Moonshot AI's flagship agentic model and a new state-of-the-art (SOTA) open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution in one model. On Fireworks, you can control the model's reasoning behavior and inspect its reasoning history for greater transparency.

Provider: All Moonshot models | Fireworks AI

API Endpoint

https://api.fireworks.ai/inference/v1/chat/completions
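
Because the endpoint is plain HTTPS, a request can also be assembled without an SDK. The sketch below builds (but does not send) a POST request using only the Python standard library. It assumes bearer-token authentication in the Authorization header and an OpenAI-style JSON body; build_request is a hypothetical helper name, not part of any Fireworks library.

```python
import json
import urllib.request

ENDPOINT = "https://api.fireworks.ai/inference/v1/chat/completions"
MODEL = "accounts/fireworks/models/kimi-k2p5"


def build_request(api_key, prompt, max_tokens=1024):
    """Build (but do not send) a chat-completions POST request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON response; under the OpenAI-compatible format, the reply text is expected at choices[0].message.content.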

Quick Start (Python)

Install: pip install openai

from openai import OpenAI

# Point the OpenAI client at the Fireworks base URL.
client = OpenAI(
    api_key="your-fireworks-api-key",  # replace with your own key
    base_url="https://api.fireworks.ai/inference/v1",
)

# Send a single-turn chat request to Kimi K2.5.
response = client.chat.completions.create(
    model="accounts/fireworks/models/kimi-k2p5",
    messages=[
        {"role": "user", "content": "Hello, how are you?"},
    ],
    max_tokens=1024,
    temperature=0.7,
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)

Additional examples: Basic invoke, Streaming
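
A streaming call could look roughly like the sketch below. It assumes the client object from the Quick Start and the OpenAI-style chunk shape (choices[0].delta.content); collect_stream and stream_reply are hypothetical helper names used only for illustration.

```python
def collect_stream(chunks):
    """Concatenate the text deltas from an iterable of streamed chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on the final chunk
            parts.append(delta)
    return "".join(parts)


def stream_reply(client, prompt):
    """Request a streamed completion and return the assembled text."""
    stream = client.chat.completions.create(
        model="accounts/fireworks/models/kimi-k2p5",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
        stream=True,  # ask for incremental chunks instead of one response
    )
    return collect_stream(stream)
```

In an interactive application you would typically print each delta as it arrives rather than joining them at the end; the helper joins them only to keep the sketch easy to check.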

Supported Parameters

Parameter    Type             Description
max_tokens   integer          Maximum number of tokens to generate (≥1).
temperature  float            Controls randomness (0–2). Default: 0.7.
top_p        float            Nucleus sampling threshold (0–1). Default: 1.
stream       boolean          Stream response chunks as they are generated. Default: false.
stop         string or array  Stop sequence or array of stop sequences.
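
The documented ranges can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the ranges above are inclusive; check_sampling_params is a hypothetical helper and is not part of the Fireworks or OpenAI libraries.

```python
def check_sampling_params(max_tokens=None, temperature=0.7, top_p=1.0, stop=None):
    """Return a list of problems; an empty list means the values look valid."""
    errors = []
    if max_tokens is not None and max_tokens < 1:
        errors.append("max_tokens must be >= 1")
    if not 0 <= temperature <= 2:  # assumed inclusive range
        errors.append("temperature must be within 0-2")
    if not 0 <= top_p <= 1:  # assumed inclusive range
        errors.append("top_p must be within 0-1")
    if stop is not None and not isinstance(stop, (str, list)):
        errors.append("stop must be a string or a list of strings")
    return errors
```

For example, check_sampling_params(max_tokens=0, temperature=2.5) reports two problems, while the Quick Start values (max_tokens=1024, temperature=0.7) report none.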

Feature Guides

Serverless Inference

Pay per token for public open models without managing GPU deployments.


OpenAI Compatibility

Use OpenAI-compatible client libraries by changing the API base URL.
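
One way to keep the base URL swap in a single place is to build the client's constructor arguments from the environment. This is a sketch under stated assumptions: the variable names FIREWORKS_API_KEY and FIREWORKS_BASE_URL and the helper fireworks_client_kwargs are illustrative choices, not documented Fireworks conventions.

```python
import os

DEFAULT_BASE_URL = "https://api.fireworks.ai/inference/v1"


def fireworks_client_kwargs(env=os.environ):
    """Return keyword arguments for an OpenAI-compatible client."""
    return {
        # Assumed variable name; raises KeyError if the key is not set.
        "api_key": env["FIREWORKS_API_KEY"],
        # Optional override; falls back to the Fireworks base URL.
        "base_url": env.get("FIREWORKS_BASE_URL", DEFAULT_BASE_URL),
    }
```

With the openai package installed, the client can then be created as OpenAI(**fireworks_client_kwargs()), and the rest of the application code stays unchanged.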
