llama-4-scout-17b-16e-instruct — Free AI Model & API
groq/llama-4-scout-17b-16e-instruct Overview
Llama 4 Scout 17B on Groq runs Meta's latest MoE generation model with Groq's ultra-fast LPU inference. The Scout variant uses 16 active experts to deliver broad capability in a compact 17B active footprint, with 8K output per request. Combined with Groq's sub-200ms time-to-first-token, it offers a responsive experience for interactive chat and agent workflows. Rate limits are 14,400 requests per day at 30 RPM — sufficient for sustained prototyping and light production use. OpenAI SDK compatible; registration required but no credit card needed.
Quick Start
Integrate llama-4-scout-17b-16e-instruct with 3 lines of code. See the config generator for Claude Code, Cursor, and more.
Other Free Models from Groq
Rate Limits & Constraints
Groq Platform Limitations
- Rate limits vary significantly by model — check per-model limits
- Some models have token-per-minute caps in addition to RPM
- LPU availability may cause queuing during peak usage
Features & Use Cases
Best For
Modality Support
Groq Highlights
- Ultra-fast inference (~2,600 tok/s)
- Free tier: 14,400 RPD for most models
- Supports Llama 4, Qwen3, DeepSeek-R1
- OpenAI-compatible
Playground — Test llama-4-scout-17b-16e-instruct
Test llama-4-scout-17b-16e-instruct directly in your browser. Your API key is sent directly to Groq — never stored.
🔒 Your key is never stored — sent directly to the model provider via our server proxy.
Ready to chat with llama-4-scout-17b-16e-instruct.
Frequently Asked Questions
How do I get an API key for llama-4-scout-17b-16e-instruct?
Sign up at Groq to get your API key. No credit card is required — just an email sign-up. Once you have the key, use the code snippets in the Quick Start section above.
Is llama-4-scout-17b-16e-instruct really free?
Yes. llama-4-scout-17b-16e-instruct is available on Groq's free tier and has been free since May 10, 2026. Rate limits apply: 30 RPM, 14,400 RPD. Always check the provider's terms for any changes to the free tier.
What are llama-4-scout-17b-16e-instruct's rate limits?
30 RPM, 14,400 RPD Context window: 131K. Max output: 8K. No credit card required.
What are the best free alternatives to llama-4-scout-17b-16e-instruct?
Popular free alternatives include inclusionAI: Ring-2.6-1T, Baidu Qianfan: CoBuddy (free), Owl Alpha. You can also browse all 147+ free models on our site.
More questions? See our full FAQ →