Fireworks Qwen3 Embedding 8B (Global)

The Qwen3 Embedding 8B model is the latest model of the Qwen family designed specifically for text embedding tasks. Built on the dense foundation models of the Qwen3 series, it inherits their multilingual capabilities, long-text understanding, and reasoning skills. The model advances on several text embedding tasks, including text retrieval, code retrieval, text classification, and text clustering.

Provider: Fireworks AI

API Endpoint

https://api.fireworks.ai/inference/v1/chat/completions

Supported Parameters

Parameter | Type | Description
max_tokens | integer | Maximum tokens to generate. (≥1)
temperature | float | Controls randomness. (0–2) Default: 0.7.
top_p | float | Nucleus sampling threshold. (0–1) Default: 1.
stream | boolean | Stream response chunks as they are generated. Default: false.
stop | string | Stop sequence or array of stop sequences.
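
The ranges and defaults listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any Fireworks SDK: the helper name build_sampling_params is hypothetical, and treating the listed ranges as inclusive is an assumption.

```python
from typing import Any, Dict, List, Optional, Union


def build_sampling_params(
    max_tokens: int,
    temperature: float = 0.7,   # default taken from the parameter list
    top_p: float = 1.0,         # default taken from the parameter list
    stream: bool = False,       # default taken from the parameter list
    stop: Optional[Union[str, List[str]]] = None,
) -> Dict[str, Any]:
    """Validate the listed parameters and return them as a JSON-ready dict."""
    # Bounds are assumed to be inclusive; the page only gives the ranges.
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")

    params: Dict[str, Any] = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    # Only include "stop" when the caller supplied one.
    if stop is not None:
        params["stop"] = stop
    return params
```

The returned dict can be merged into a request body alongside whatever other fields the endpoint expects.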

Feature Guides

Serverless Inference

Pay per token for public open models without managing GPU deployments.


OpenAI Compatibility

Use OpenAI-compatible client libraries by changing the API base URL.
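
As a rough illustration of changing the base URL, the sketch below derives the API root from the endpoint listed on this page and passes it to the openai Python package (version 1.0 or later, installed separately). The environment variable name FIREWORKS_API_KEY and the helper make_client are assumptions for illustration, not names taken from this page.

```python
import os

# Endpoint as listed on this page.
ENDPOINT = "https://api.fireworks.ai/inference/v1/chat/completions"

# OpenAI-style clients take the API root and append the route themselves,
# so strip the route to get the base URL (str.removesuffix needs Python 3.9+).
BASE_URL = ENDPOINT.removesuffix("/chat/completions")


def make_client(api_key=None):
    """Return an OpenAI client pointed at the Fireworks base URL."""
    # Imported lazily so this module loads even without the package installed.
    from openai import OpenAI

    # FIREWORKS_API_KEY is an assumed environment variable name.
    key = api_key or os.environ["FIREWORKS_API_KEY"]
    return OpenAI(base_url=BASE_URL, api_key=key)
```

Only the base URL and the API key differ from a default OpenAI setup; the rest of the client code stays the same.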
