llama-4-maverick-17b-128e-instruct — Free AI Model & API
groq/llama-4-maverick-17b-128e-instruct Overview
Llama 4 Maverick 17B on Groq is Meta's highest-expert-count MoE model with 128 active experts, running on Groq's LPU hardware for fast inference. The large expert count gives it broader knowledge and stronger instruction-following than the Scout variant, making it the better Groq option for complex tasks. Rate limits are notably tighter than Groq's other endpoints at 15 RPM and 500 requests per day, so it is best reserved for evaluations and high-value queries rather than high-volume traffic. OpenAI SDK compatible; registration required, no credit card.
Quick Start
Integrate llama-4-maverick-17b-128e-instruct with 3 lines of code. See the config generator for Claude Code, Cursor, and more.
Other Free Models from Groq
Rate Limits & Constraints
Groq Platform Limitations
- Rate limits vary significantly by model — check per-model limits
- Some models have token-per-minute caps in addition to RPM
- LPU availability may cause queuing during peak usage
Features & Use Cases
Best For
Modality Support
Groq Highlights
- Ultra-fast inference (~2,600 tok/s)
- Free tier: 14,400 RPD for most models
- Supports Llama 4, Qwen3, DeepSeek-R1
- OpenAI-compatible
Playground — Test llama-4-maverick-17b-128e-instruct
Test llama-4-maverick-17b-128e-instruct directly in your browser. Your API key is sent directly to Groq — never stored.
🔒 Your key is never stored — sent directly to the model provider via our server proxy.
Ready to chat with llama-4-maverick-17b-128e-instruct.
Frequently Asked Questions
How do I get an API key for llama-4-maverick-17b-128e-instruct?
Sign up at Groq to get your API key. No credit card is required — just an email sign-up. Once you have the key, use the code snippets in the Quick Start section above.
Is llama-4-maverick-17b-128e-instruct really free?
Yes. llama-4-maverick-17b-128e-instruct is available on Groq's free tier and has been free since May 10, 2026. Rate limits apply: 15 RPM, 500 RPD. Always check the provider's terms for any changes to the free tier.
What are llama-4-maverick-17b-128e-instruct's rate limits?
15 RPM, 500 RPD Context window: 131K. Max output: 8K. No credit card required.
What are the best free alternatives to llama-4-maverick-17b-128e-instruct?
Popular free alternatives include inclusionAI: Ring-2.6-1T, Baidu Qianfan: CoBuddy (free), Owl Alpha. You can also browse all 147+ free models on our site.
More questions? See our full FAQ →