Best Free LLM APIs for Reasoning

24 free models available for reasoning.

For complex reasoning, math, and logic problems, look for models that support chain-of-thought (CoT) and score well on benchmarks like MATH and GPQA. DeepSeek-R1 — available for free via Groq, NVIDIA NIM, and OpenRouter — is a dedicated reasoning model with visible CoT. Gemini 2.5 Flash and Qwen3 also perform well on reasoning tasks.
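When a model exposes visible CoT, the reasoning usually has to be separated from the final answer before the answer is used. The sketch below assumes the reasoning arrives inline, wrapped in `<think>...</think>` tags, as DeepSeek-R1's raw output does; some providers return it in a separate response field instead, so check the provider page. The name `split_reasoning` is hypothetical.

```python
import re

# Assumption: the reasoning is wrapped in <think>...</think> inside the
# completion text. Providers that return it in a separate field need no parsing.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion string."""
    match = THINK_BLOCK.search(raw)
    if match is None:
        # No visible chain-of-thought: the whole string is the answer.
        return "", raw.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, answer


sample = "<think>17 * 3 = 51, then 51 + 4 = 55.</think>\nThe result is 55."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Keeping the reasoning around is useful for debugging wrong answers, but only the answer part should normally be passed to downstream code.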

What to Look for in a Reasoning Model

Reasoning models are a distinct category optimized for multi-step thinking:

Chain-of-Thought (CoT): The model shows its work step by step before giving a final answer, which improves accuracy on math, logic puzzles, and other multi-step problems. DeepSeek-R1 pioneered visible CoT in the open-source world; Qwen3 and Nemotron also support it.
Test-time compute scaling: Reasoning models can "think longer" on harder problems by spending more tokens during inference, so the same model can be fast on simple questions and thorough on hard ones. DeepSeek-R1 and Qwen3.5 support this.
Benchmark performance: Look at MATH (competition-style math problems), GPQA (graduate-level science), and AIME (invitational competition math). DeepSeek-R1 and Qwen3.5 lead the free tier on these benchmarks.
Context window for reasoning: Intermediate work counts against the context window, and CoT can easily consume 10K+ tokens of "thinking" before reaching the answer. Look for 32K+ context for non-trivial problems.
Code + reasoning crossover: Many reasoning tasks benefit from code execution (e.g., "write a Python script to verify this proof"). Models with both coding and reasoning ability (Qwen3, DeepSeek V4) are more versatile.
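The context-window guidance above comes down to simple arithmetic: prompt, thinking, and answer all have to fit in the window. A minimal sketch follows, using the 10K thinking figure from the text as a default. The function name and the 1,000-token answer allowance are assumptions for illustration, not provider values.

```python
def fits_context(context_window: int, prompt_tokens: int,
                 thinking_tokens: int = 10_000,
                 answer_tokens: int = 1_000) -> bool:
    """Check whether prompt + chain-of-thought + answer fit in the window.

    thinking_tokens defaults to the 10K figure quoted above;
    answer_tokens is an illustrative guess.
    """
    needed = prompt_tokens + thinking_tokens + answer_tokens
    return needed <= context_window


# A 2,000-token prompt on a 32K model versus an 8K model.
print(fits_context(32_000, 2_000))  # True: 13,000 tokens needed
print(fits_context(8_000, 2_000))   # False: 13,000 > 8,000
```

Some providers also cap output separately from context, so the Max Output column matters as well as Context.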

How to Choose a Free Reasoning Model

Reasoning model selection depends on problem complexity:

Math competition / Olympiad problems? → DeepSeek-R1 via Groq or NVIDIA NIM. Dedicated reasoning with visible CoT. For the hardest problems, let it "think" for 10K+ tokens.
Logical reasoning / puzzles? → Qwen3.5 397B (via NVIDIA NIM) or Gemini 2.5 Flash. Both handle logical deduction and multi-step reasoning well.
Code debugging that requires deep reasoning? → DeepSeek V4 (via OpenRouter or NVIDIA NIM). Combines reasoning with strong coding ability — can reason about code behavior and identify subtle bugs.
Scientific / research reasoning? → Qwen3.5 397B for breadth of knowledge (397B total parameters via MoE). DeepSeek-R1 for pure reasoning depth.
Quick reasoning vs deep thinking? → Most models let you control thinking depth. Gemini 2.5 Flash is fast for straightforward questions. DeepSeek-R1 is better when you can wait for deeper analysis.
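The decision list above can be written as a small lookup table, which is handy when routing requests in code. This is only a sketch of the recommendations as stated here. The task keys and the `recommend` function are hypothetical names, and the model names are display names, not provider model IDs.

```python
# Task keys are illustrative labels; values mirror the picks listed above.
RECOMMENDATIONS: dict[str, list[str]] = {
    "math": ["DeepSeek-R1"],
    "logic": ["Qwen3.5 397B", "Gemini 2.5 Flash"],
    "code_debugging": ["DeepSeek V4"],
    "science": ["Qwen3.5 397B", "DeepSeek-R1"],
    "quick": ["Gemini 2.5 Flash"],
}


def recommend(task: str) -> list[str]:
    """Return the suggested models for a task label, best match first."""
    try:
        return RECOMMENDATIONS[task]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}")


print(recommend("logic"))
```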

Top Picks for Reasoning

DeepSeek: DeepSeek R1 (free), via OpenRouter

Dedicated reasoning model with visible chain-of-thought. Available via Groq, NVIDIA NIM, and OpenRouter.

Qwen: Qwen3.5 397B A17B, via NVIDIA NIM

Massive 397B MoE model, strong on MATH and GPQA benchmarks. 40 RPM, no daily cap.

Google: Gemini 2.5 Flash, via Google

1M context for long reasoning chains. Good balance of speed and depth.

NVIDIA: Nemotron 3 Super (free), via OpenRouter

NVIDIA's own reasoning model. 262K context, strong math and logic performance.

All Free Reasoning Models

Provider	Model	Context	Max Output	Modality	Rate Limit	Released
OpenRouter	NVIDIA: Nemotron 3 Nano Omni (free)	256K	66K	text, image, audio	See provider page	Apr 28, 2026
OpenRouter	Arcee AI: Trinity Large Thinking (free)	262K	80K	text, reasoning	See provider page	Apr 1, 2026
OpenRouter	NVIDIA: Nemotron 3 Super (free)	1.0M	262K	text	See provider page	Mar 11, 2026
OpenRouter	LiquidAI: LFM2.5-1.2B-Thinking (free)	33K	8K	text, reasoning	See provider page	Jan 20, 2026
OpenRouter	NVIDIA: Nemotron 3 Nano 30B A3B (free)	256K	8K	text	See provider page	Dec 14, 2025
OpenRouter	NVIDIA: Nemotron Nano 12B 2 VL (free)	128K	128K	text, image	See provider page	Oct 28, 2025
OpenRouter	NVIDIA: Nemotron Nano 9B V2 (free)	128K	8K	text	See provider page	Sep 5, 2025
Cloudflare Workers AI	@cf/deepseek-ai/deepseek-r1-distill-qwen-32b	32K	131K	text	10K neurons/day (shared)	—
GitHub Models	DeepSeek-R1	64K	8K	text	15 RPM, 150 RPD	—
Groq	deepseek-r1-distill-70b	131K	8K	text	30 RPM, 14,400 RPD	—
Kilo Code	nvidia/nemotron-3-super-120b-a12b:free	262K	32K	text	~200 req/hr	—
Kilo Code	arcee-ai/trinity-large-thinking:free	131K	131K	text	~200 req/hr	—
LLM7.io	deepseek-r1-0528	131K	131K	text	30 RPM (120 with token)	—
Ollama Cloud	deepseek-r1:cloud	128K	131K	text	Session/weekly limits (unpublished)	—
OVHcloud AI Endpoints	DeepSeek-R1-Distill-Llama-70B	131K	32K	text	2 RPM (anonymous)	—
SiliconFlow	deepseek-ai/DeepSeek-R1-0528-Qwen3-8B	33K	16K	text	1,000 RPM, 50K TPM	—
SiliconFlow	deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	131K	131K	text	1,000 RPM, 50K TPM	—
SiliconFlow	THUDM/GLM-4.1V-9B-Thinking	66K	66K	text	1,000 RPM, 50K TPM	—
NVIDIA NIM	nvidia/llama-3.1-nemotron-ultra-253b-v1	131K	8K	text	Up to 40 RPM	—
NVIDIA NIM	nvidia/llama-3.3-nemotron-super-49b-v1.5	131K	16K	text	Up to 40 RPM	Oct 10, 2025
OpenRouter	NVIDIA: Llama Nemotron Embed VL 1B V2 (free)	131K	8K	text, image, embeddings	See provider page	Feb 25, 2026
Chutes.ai	DeepSeek-R1	131K	0	text	Community-powered, no hard cap	—
Grok (xAI)	Grok-2	131K	0	text	$25/month free credits, resets monthly	—
GitHub Models	Phi-4	131K	0	text	See provider page	—
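Where a row lists both a per-minute (RPM) and a per-day (RPD) limit, the two interact: calling at the full per-minute rate can exhaust the daily quota quickly. The sketch below works this out for the Groq and GitHub Models rows in the table. The function names are hypothetical, and it assumes RPD is counted over a 24-hour day, which each provider may define differently.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_until_daily_cap(rpm: int, rpd: int) -> float:
    """Minutes of calling at the full per-minute rate before the daily cap is hit."""
    return rpd / rpm


def sustained_rpm(rpd: int) -> float:
    """Average requests per minute if the daily quota is spread over 24 hours."""
    return rpd / MINUTES_PER_DAY


# Figures copied from the table above: (RPM, RPD).
limits = {
    "Groq deepseek-r1-distill-70b": (30, 14_400),
    "GitHub Models DeepSeek-R1": (15, 150),
}

for name, (rpm, rpd) in limits.items():
    burst = minutes_until_daily_cap(rpm, rpd)
    steady = sustained_rpm(rpd)
    print(f"{name}: cap reached after {burst:.0f} min at full rate, "
          f"{steady:.2f} RPM sustained")
```

At full rate the Groq quota lasts 480 minutes (8 hours), while the GitHub Models quota lasts 10 minutes, so for batch workloads the daily figure is usually the binding one.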

See our FAQ for common questions about free LLM APIs