@cf/meta/llama-3.3-70b-instruct-fp8-fast — Free AI Model & API

chat

Context Window: 131K

Max Output: 131K

Rate Limit: 10K neurons/day (shared)

Cost: $0.00 (free)

Free Period: Since May 10, 2026

Credit Card: Not required

Status: Online

Overview

Llama 3.3 70B Instruct runs on Cloudflare Workers AI's global edge network, which serves Meta's 70B-parameter model from Cloudflare data centers around the world (availability can vary by region). Because requests are routed to a nearby Cloudflare PoP rather than a single-region GPU cluster, latency can be lower than with centralized API providers. The free tier allocates 10,000 Neurons (Cloudflare's compute units) per day across all models on your account, so available capacity depends on your other Workers AI usage. The API uses a Cloudflare-specific REST format rather than OpenAI SDK compatibility, and the endpoint URL must include your Cloudflare account ID: a lightweight setup for existing Cloudflare users, but an extra step for everyone else.

Model ID

@cf/meta/llama-3.3-70b-instruct-fp8-fast

Base URL

https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run
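
The base URL is not the complete endpoint: in Cloudflare's native REST format the model identifier is appended as the final path segment. A minimal sketch of assembling the full URL, assuming the identifier is the `@cf/...` name shown in the page title. `build_run_url` and the sample account ID are hypothetical names used only for illustration.

```python
BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run"


def build_run_url(account_id: str, model: str) -> str:
    """Return the run endpoint for one model under one Cloudflare account."""
    # Reject an empty value or an unfilled "{account_id}" placeholder.
    if not account_id or "{" in account_id:
        raise ValueError("account_id must be a real Cloudflare account ID")
    return BASE_URL.format(account_id=account_id) + "/" + model.lstrip("/")


# "0123456789abcdef" is a made-up account ID for illustration only.
print(build_run_url("0123456789abcdef", "@cf/meta/llama-3.3-70b-instruct-fp8-fast"))
```

Keeping the account ID out of the model string makes it easy to switch models without touching the account-specific part of the URL.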

Specifications

Context: 131K · Output: 131K · Modality: text · OpenAI Compat: No

Quick Start

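Integrate @cf/meta/llama-3.3-70b-instruct-fp8-fast with a few lines of code. Replace {account_id} with your Cloudflare account ID and YOUR_API_KEY with your Workers AI API token. See the config generator for Claude Code, Cursor, and more.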

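# Python (requests), using the native Workers AI REST format
import requests

# Replace with your Cloudflare account ID and a Workers AI API token.
ACCOUNT_ID = "{account_id}"
API_KEY = "YOUR_API_KEY"
MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast"

# The model ID is appended to the base URL as the final path segment.
url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"

response = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"messages": [{"role": "user", "content": "Hello!"}]},
    timeout=60,
)
response.raise_for_status()

# The generated text is returned under result.response.
print(response.json()["result"]["response"])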

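// JavaScript (fetch), using the native Workers AI REST format
// Replace with your Cloudflare account ID and a Workers AI API token.
const ACCOUNT_ID = "{account_id}";
const API_KEY = "YOUR_API_KEY";
const MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

const response = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/${MODEL}`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ messages: [{ role: "user", content: "Hello!" }] }),
  }
);

const data = await response.json();

// The generated text is returned under result.response.
console.log(data.result.response);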

# cURL: the token goes in the Authorization header, not in the query string
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/@cf/meta/llama-3.3-70b-instruct-fp8-fast" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
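
Because the API uses a Cloudflare-specific REST format, the reply arrives in Cloudflare's own JSON envelope rather than an OpenAI-style `choices` list. The sketch below shows one way to unpack it, assuming the envelope carries `success`, `errors`, and `result.response` fields. `extract_text` is a hypothetical helper, and the sample body is hand-written for illustration, not captured from a real call.

```python
import json


def extract_text(body: str) -> str:
    """Pull the generated text out of a Workers AI JSON response body."""
    payload = json.loads(body)
    if not payload.get("success", False):
        # Assumed error shape: a list of {"code": ..., "message": ...} objects.
        errors = payload.get("errors") or []
        messages = [
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        ]
        detail = "; ".join(messages) or "unknown error"
        raise RuntimeError(f"Workers AI request failed: {detail}")
    return payload["result"]["response"]


# Hand-written sample body for illustration only.
sample = '{"success": true, "errors": [], "result": {"response": "Hello there!"}}'
print(extract_text(sample))
```

Checking `success` before reading `result` avoids a confusing `KeyError` or `TypeError` when a request is rejected, for example after the daily Neuron allowance runs out.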

Other Free Models from Cloudflare Workers AI

@cf/meta/llama-3.1-8b-instruct-fp8-fast

131K context · No card

@cf/meta/llama-3.2-11b-vision-instruct

131K context · No card

@cf/meta/llama-4-scout-17b-16e-instruct

10.0M context · No card

@cf/mistralai/mistral-small-3.1-24b-instruct

128K context · No card

@cf/google/gemma-4-26b-a4b-it

256K context · No card

Rate Limits & Constraints

Rate Limit: 10K neurons/day (shared)

Context Window: 131K

Max Output Tokens: 131K

Cost: Free since May 10, 2026

Credit Card: Not required

OpenAI Compatible: No (uses provider-native API)

Cloudflare Workers AI Platform Limitations

Neurons billing is opaque — hard to predict exact request counts
Model availability varies by Cloudflare region
10,000 Neurons/day shared across all models
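
Since the allowance is counted in Neurons rather than requests, a rough budget calculation can help with planning. The sketch below assumes you have measured an average Neuron cost per request yourself. The 25-Neuron figure and the 2,000 Neurons of other usage are made-up placeholders, not published prices, and `requests_per_day` is a hypothetical helper.

```python
DAILY_NEURONS = 10_000  # free-tier allowance, shared across all models


def requests_per_day(neurons_per_request: float, other_usage: float = 0.0) -> int:
    """Estimate how many requests fit in what remains of the daily allowance."""
    if neurons_per_request <= 0:
        raise ValueError("neurons_per_request must be positive")
    # Neurons already spent on other models reduce what is left for this one.
    remaining = max(DAILY_NEURONS - other_usage, 0)
    return int(remaining // neurons_per_request)


# Hypothetical numbers: 25 Neurons per chat request, 2,000 used by other models.
print(requests_per_day(25, other_usage=2_000))  # 320
```

Actual Neuron cost depends on prompt and output length, so treat the result as an estimate and re-measure when your workload changes.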

Features & Use Cases

Best For

Chat

Modality Support

text

Cloudflare Workers AI Highlights

50+ models on the free tier
10,000 Neurons/day
Global edge network for low latency
Text, image, audio, and embedding models

Frequently Asked Questions

How do I get an API key for @cf/meta/llama-3.3-70b-instruct-fp8-fast?

Sign up for a Cloudflare account, then create an API token with Workers AI access in the Cloudflare dashboard and note your account ID. No credit card is required, just an email sign-up. Once you have the token and account ID, use the code snippets in the Quick Start section above.

Is @cf/meta/llama-3.3-70b-instruct-fp8-fast really free?

Yes. @cf/meta/llama-3.3-70b-instruct-fp8-fast is available on Cloudflare Workers AI's free tier and has been free since May 10, 2026. Rate limits apply: 10K neurons/day (shared). Always check the provider's terms for any changes to the free tier.

What are @cf/meta/llama-3.3-70b-instruct-fp8-fast's rate limits?

The rate limit is 10K neurons/day, shared across all models on your account. Context window: 131K. Max output: 131K. No credit card required.

What are the best free alternatives to @cf/meta/llama-3.3-70b-instruct-fp8-fast?

Popular free alternatives include inclusionAI: Ring-2.6-1T, Baidu Qianfan: CoBuddy (free), and Owl Alpha. You can also browse all 147+ free models on our site.