How to Get a Free Cerebras API Key (2026)

5 free models available — no credit card required. Get your Cerebras API key →

Overview

Ultra-fast inference on Cerebras WSE chips — 1M tokens/day.

Cerebras Cloud offers free API access to Llama and GPT-OSS models running on the Cerebras Wafer-Scale Engine, one of the fastest AI accelerators available. The free tier provides 1 million tokens/day and 14,400 requests/day per model with no credit card required. Context window is limited to 8K on the free tier.

Ultra-fast inference on WSE chips
1M tokens/day free
No credit card required
Llama 3.1 8B + GPT-OSS 120B available

API Compatibility: OpenAI SDK-compatible (Chat Completions)

Quick Start Guide

1
Sign up at cloud.cerebras.ai Email or GitHub. No credit card.
2
Go to API Keys
3
Generate an API key
4
Choose a model Llama 3.3 70B or GPT-OSS 120B available for free.
5
Configure OpenAI client Base URL: https://api.cerebras.ai/v1

All Free Cerebras Models — Context Windows & Rate Limits

Model	Context	Max Output	Modality	Rate Limit	Released	Status
llama3.1-8b	128K	8K	text	30 RPM, 14,400 RPD, 1M TPD	—	Online	Details
gpt-oss-120b	128K	8K	text	30 RPM, 14,400 RPD, 1M TPD	—	Online	Details
qwen-3-235b-a22b-instruct-2507	131K	8K	text	30 RPM, 14,400 RPD, 1M TPD	—	Online	Details
zai-glm-4.7	128K	8K	text	10 RPM, 100 RPD, 1M TPD	—	Online	Details
Llama 3.1 70B	131K	0	text	See provider page	—	Online	Details