Llama 3.3 70B Instruct fp8 Fast

Models

Model details

Model
@cf/meta/llama-3.3-70b-instruct-fp8-fast
Provider
cloudflare-workers-ai
API
openai-completions
Base URL
https://api.cloudflare.com/client/v4/accounts/{CLOUDFLARE_ACCOUNT_ID}/ai/v1
Input
text
Reasoning
No
Context window
24,000
Max tokens
24,000
Cost / million input
$0.293
Cost / million output
$2.253
Cost / million cache read
$0
Cost / million cache write
$0
Model config JSON
{
  "providers": {
    "cloudflare-workers-ai": {
      "apiKey": "YOUR_API_KEY",
      "models": [
        {
          "id": "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
          "name": "Llama 3.3 70B Instruct fp8 Fast",
          "reasoning": false,
          "input": [
            "text"
          ],
          "contextWindow": 24000,
          "maxTokens": 24000,
          "cost": {
            "input": 0.293,
            "output": 2.253,
            "cacheRead": 0,
            "cacheWrite": 0
          },
          "compat": {
            "sendSessionAffinityHeaders": true
          }
        }
      ],
      "api": "openai-completions",
      "baseUrl": "https://api.cloudflare.com/client/v4/accounts/{CLOUDFLARE_ACCOUNT_ID}/ai/v1"
    }
  }
}