Best Free LLM APIs for Reasoning

24 free models available for reasoning.

For complex reasoning, math, and logic problems, look for models that support chain-of-thought (CoT) and score well on benchmarks such as MATH and GPQA. DeepSeek-R1, available for free via Groq (as the distilled 70B variant listed below), NVIDIA NIM, and OpenRouter, is a dedicated reasoning model with visible CoT. Gemini 2.5 Flash and Qwen3 are also strong options for reasoning tasks.

What to Look for in a Reasoning Model

Reasoning models are a distinct category optimized for multi-step thinking:

  • Chain-of-Thought (CoT): The model shows its work step by step before giving a final answer. This typically improves accuracy on math, logic puzzles, and other multi-step problems. DeepSeek-R1 popularized visible CoT among open-source models; Qwen3 and Nemotron also support it.
  • Test-time compute scaling: Reasoning models can "think longer" on harder problems by spending more tokens during inference, so the same model can be fast on simple questions and thorough on hard ones. DeepSeek-R1 and Qwen3.5 support this.
  • Benchmark performance: Look at MATH (high school competition math), GPQA (graduate-level science), and AIME (invitational competition math). DeepSeek-R1 and Qwen3.5 lead the free tier on these benchmarks.
  • Context window for reasoning: Thinking tokens count against the context window, and CoT can easily consume 10K+ tokens before the model reaches its answer. Look for 32K+ context for non-trivial problems.
  • Code + reasoning crossover: Many reasoning tasks benefit from code execution (e.g., "write a Python script to verify this proof"). Models with both coding and reasoning ability (Qwen3, DeepSeek V4) are more versatile.
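As a rough illustration of the visible CoT and context-window points above, the sketch below separates the "thinking" from the final answer and checks a token budget. It assumes the provider returns the reasoning inline, wrapped in `<think>...</think>` tags ahead of the answer, as DeepSeek-R1's raw output does; some providers return it in a separate response field instead, in which case no parsing is needed. The function names and budget figures are hypothetical.

```python
import re

# Assumption: reasoning arrives inline as <think>...</think> before the answer.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion string."""
    match = _THINK_RE.search(text)
    if match is None:
        # No visible reasoning block: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    # The answer is everything outside the first <think> block.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def fits_context(prompt_tokens: int, thinking_budget: int,
                 answer_budget: int, context_window: int) -> bool:
    """Check that prompt + thinking + answer tokens fit in the window."""
    return prompt_tokens + thinking_budget + answer_budget <= context_window


if __name__ == "__main__":
    raw = "<think>2 + 2 = 4</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
    # A 2K prompt with 10K of thinking and a 1K answer against a 32K window.
    print(fits_context(2_000, 10_000, 1_000, 32_768))
```

The budget check is deliberately crude: token counts vary by tokenizer, so treat the numbers as estimates and leave headroom.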

How to Choose a Free Reasoning Model

The right reasoning model depends on the type of problem:

  • Math competition / Olympiad problems? → DeepSeek-R1 via Groq or NVIDIA NIM. Dedicated reasoning with visible CoT. For the hardest problems, let it "think" for 10K+ tokens.
  • Logical reasoning / puzzles? → Qwen3.5 397B (via NVIDIA NIM) or Gemini 2.5 Flash. Both handle logical deduction and multi-step reasoning well.
  • Code debugging that requires deep reasoning? → DeepSeek V4 (via OpenRouter or NVIDIA NIM). Combines reasoning with strong coding ability — can reason about code behavior and identify subtle bugs.
  • Scientific / research reasoning? → Qwen3.5 397B for breadth of knowledge (397B total parameters via MoE). DeepSeek-R1 for pure reasoning depth.
  • Quick reasoning vs deep thinking? → Many reasoning models let you control thinking depth. Gemini 2.5 Flash is fast on straightforward questions; DeepSeek-R1 is the better choice when you can wait for deeper analysis.
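The guide above can be written down as a small lookup table, which is handy if an application needs to route requests by problem type. This is only a sketch: the task keys and the `recommend` function are hypothetical, and the model names are the display names used in this guide, not provider API model IDs.

```python
# Display names copied from the guide above; these are NOT provider API model IDs.
RECOMMENDATIONS: dict[str, list[str]] = {
    "math": ["DeepSeek-R1"],
    "logic": ["Qwen3.5 397B", "Gemini 2.5 Flash"],
    "code_debugging": ["DeepSeek V4"],
    "science": ["Qwen3.5 397B", "DeepSeek-R1"],
    "quick": ["Gemini 2.5 Flash"],
}


def recommend(task: str) -> list[str]:
    """Return the guide's suggested models for a task type (hypothetical keys)."""
    key = task.strip().lower()
    if key not in RECOMMENDATIONS:
        raise ValueError(
            f"unknown task type: {task!r}; expected one of {sorted(RECOMMENDATIONS)}"
        )
    # Return a copy so callers cannot mutate the shared table.
    return list(RECOMMENDATIONS[key])


if __name__ == "__main__":
    for task in ("math", "logic", "code_debugging"):
        print(task, "->", ", ".join(recommend(task)))
```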

All Free Reasoning Models

| Provider | Model | Context | Max Output | Modality | Rate Limit | Released |
|---|---|---|---|---|---|---|
| OpenRouter | NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | text, image, audio | See provider page | Apr 28, 2026 |
| OpenRouter | Arcee AI: Trinity Large Thinking (free) | 262K | 80K | text, reasoning | See provider page | Apr 1, 2026 |
| OpenRouter | NVIDIA: Nemotron 3 Super (free) | 1.0M | 262K | text | See provider page | Mar 11, 2026 |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | text, reasoning | See provider page | Jan 20, 2026 |
| OpenRouter | NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | text | See provider page | Dec 14, 2025 |
| OpenRouter | NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | text, image | See provider page | Oct 28, 2025 |
| OpenRouter | NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | text | See provider page | Sep 5, 2025 |
| Cloudflare Workers AI | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | text | 10K neurons/day (shared) | |
| GitHub Models | DeepSeek-R1 | 64K | 8K | text | 15 RPM, 150 RPD | |
| Groq | deepseek-r1-distill-70b | 131K | 8K | text | 30 RPM, 14,400 RPD | |
| Kilo Code | nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | text | ~200 req/hr | |
| Kilo Code | arcee-ai/trinity-large-thinking:free | 131K | 131K | text | ~200 req/hr | |
| LLM7.io | deepseek-r1-0528 | 131K | 131K | text | 30 RPM (120 with token) | |
| Ollama Cloud | deepseek-r1:cloud | 128K | 131K | text | Session/weekly limits (unpublished) | |
| OVHcloud AI Endpoints | DeepSeek-R1-Distill-Llama-70B | 131K | 32K | text | 2 RPM (anonymous) | |
| SiliconFlow | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 33K | 16K | text | 1,000 RPM, 50K TPM | |
| SiliconFlow | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | 131K | text | 1,000 RPM, 50K TPM | |
| SiliconFlow | THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | text | 1,000 RPM, 50K TPM | |
| NVIDIA NIM | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 131K | 8K | text | Up to 40 RPM | |
| NVIDIA NIM | nvidia/llama-3.3-nemotron-super-49b-v1.5 | 131K | 16K | text | Up to 40 RPM | Oct 10, 2025 |
| OpenRouter | NVIDIA: Llama Nemotron Embed VL 1B V2 (free) | 131K | 8K | text, image, embeddings | See provider page | Feb 25, 2026 |
| Chutes.ai | DeepSeek-R1 | 131K | 0 | text | Community-powered, no hard cap | |
| Grok (xAI) | Grok-2 | 131K | 0 | text | $25/month free credits, resets monthly | |
| GitHub Models | Phi-4 | 131K | 0 | text | See provider page | |
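The RPM (requests per minute) and RPD (requests per day) figures in the table interact: a generous per-minute limit can still be capped by the daily quota. The sketch below works out the smallest spacing between requests that respects both limits if calls are spread evenly over a full day. The function name is hypothetical, and real providers may enforce limits differently (for example with token-based or burst rules), so treat the result as a planning estimate.

```python
from typing import Optional

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 86_400


def min_interval_seconds(rpm: Optional[int] = None,
                         rpd: Optional[int] = None) -> float:
    """Smallest even spacing (in seconds) that stays within both limits."""
    intervals = []
    if rpm:
        intervals.append(SECONDS_PER_MINUTE / rpm)
    if rpd:
        intervals.append(SECONDS_PER_DAY / rpd)
    if not intervals:
        raise ValueError("provide at least one of rpm or rpd")
    # The stricter limit is the one that demands the longer wait.
    return max(intervals)


if __name__ == "__main__":
    # Groq row: 30 RPM, 14,400 RPD -> the daily quota is the binding limit.
    print(min_interval_seconds(rpm=30, rpd=14_400))
    # GitHub Models DeepSeek-R1 row: 15 RPM, 150 RPD.
    print(min_interval_seconds(rpm=15, rpd=150))
```

For the Groq row, the per-minute limit alone would allow a request every 2 seconds, but the daily quota stretches that to one every 6 seconds for sustained all-day use. Short bursts at the RPM ceiling are still possible if the total stays under the daily cap.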