@cf/meta/llama-4-scout-17b-16e-instruct — Free AI Model & API
cloudflare-workers-ai/cf-meta-llama-4-scout-17b-16e-instruct Overview
Llama 4 Scout 17B is Meta's latest-generation small-footprint model with an exceptionally large 10M-token context window — meaning you can ingest entire codebases, multi-hour transcripts, or thousand-page documents in a single request. Running on Cloudflare's global edge network, it combines the efficiency of a 17B-parameter model (with 16 active experts via MoE) with a context size that exceeds most flagship models. On the free tier it shares the 10,000 Neurons/day pool and uses Cloudflare's native API. For developers evaluating long-context architectures or building retrieval-free RAG alternatives, this is one of the few free options that can genuinely test 10M-token prompts.
Quick Start
Integrate @cf/meta/llama-4-scout-17b-16e-instruct with 3 lines of code. See the config generator for Claude Code, Cursor, and more.
Other Free Models from Cloudflare Workers AI
@cf/meta/llama-3.3-70b-instruct-fp8-fast
131K context · No card
@cf/meta/llama-3.1-8b-instruct-fp8-fast
131K context · No card
@cf/meta/llama-3.2-11b-vision-instruct
131K context · No card
@cf/mistralai/mistral-small-3.1-24b-instruct
128K context · No card
@cf/google/gemma-4-26b-a4b-it
256K context · No card
Rate Limits & Constraints
Cloudflare Workers AI Platform Limitations
- Neurons billing is opaque — hard to predict exact request counts
- Model availability varies by Cloudflare region
- 10,000 Neurons/day shared across all models
Features & Use Cases
Best For
Modality Support
Cloudflare Workers AI Highlights
- 50+ models on the free tier
- 10,000 Neurons/day
- Global edge network for low latency
- Text, image, audio, and embedding models
Playground — Test @cf/meta/llama-4-scout-17b-16e-instruct
Test @cf/meta/llama-4-scout-17b-16e-instruct directly in your browser. Your API key is sent directly to Cloudflare Workers AI — never stored.
🔒 Your key is never stored — sent directly to the model provider via our server proxy.
Ready to chat with @cf/meta/llama-4-scout-17b-16e-instruct.
Frequently Asked Questions
How do I get an API key for @cf/meta/llama-4-scout-17b-16e-instruct?
Sign up at Cloudflare Workers AI to get your API key. No credit card is required — just an email sign-up. Once you have the key, use the code snippets in the Quick Start section above.
Is @cf/meta/llama-4-scout-17b-16e-instruct really free?
Yes. @cf/meta/llama-4-scout-17b-16e-instruct is available on Cloudflare Workers AI's free tier and has been free since May 10, 2026. Rate limits apply: 10K neurons/day (shared). Always check the provider's terms for any changes to the free tier.
What are @cf/meta/llama-4-scout-17b-16e-instruct's rate limits?
10K neurons/day (shared) Context window: 10.0M. Max output: 131K. No credit card required.
What are the best free alternatives to @cf/meta/llama-4-scout-17b-16e-instruct?
Popular free alternatives include inclusionAI: Ring-2.6-1T, Baidu Qianfan: CoBuddy (free), Owl Alpha. You can also browse all 147+ free models on our site.
More questions? See our full FAQ →