Best Free LLM APIs for Coding

18 free models available for coding.

For AI coding, prioritize large context windows (to process entire codebases), tool calling support, and strong instruction following. The best free coding models include Codestral (Mistral, purpose-built for code), DeepSeek V4, Qwen3-Coder, and Gemini 2.5 Flash (1M context). Models are ranked below by context window and rate limit.

What to Look for in a Coding Model

Not all LLMs are equally good at coding. Here's what separates a coding model from a general-purpose one:

  • Context window — The single most important spec for coding. Modern codebases easily exceed 50K tokens. A model with less than 32K context will struggle with multi-file edits, code review, or understanding project structure. Look for at least 128K tokens; 256K+ is ideal for monorepo work.
  • Fill-in-the-Middle (FIM) — A specialized training objective where the model learns to fill a gap between prefix and suffix code. Essential for inline code completion in IDEs. Codestral and DeepSeek Coder variants are trained with FIM.
  • Tool calling / function calling — Required for agentic coding workflows: "find all files that import X, then refactor them to use Y." Without tool calling, the model can only suggest code, not execute actions. Most OpenAI-compatible endpoints support tool calling if the underlying model does.
  • Instruction following — Coding requires precise, unambiguous outputs. Models that drift or hallucinate will introduce bugs. DeepSeek V4 and Qwen3 score particularly well on instruction-following benchmarks.
  • Max output tokens — Generating a full file or multiple functions in one shot requires high output limits. 8K output is the practical minimum; 16K+ lets the model generate entire modules at once.
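The context-window and max-output thresholds above can be turned into a quick pre-flight check. The following is a minimal sketch, not a definitive implementation: it assumes a rough heuristic of about 4 characters per token (real tokenizers vary by model and programming language) and assumes the context window has to hold both the prompt and the generated output. The names `ModelSpec`, `estimate_tokens`, and `fits` are hypothetical and do not come from any provider's SDK.

```python
from dataclasses import dataclass

# Assumption: ~4 characters per token. This is only a rough heuristic;
# use the provider's tokenizer when an exact count matters.
CHARS_PER_TOKEN = 4


@dataclass
class ModelSpec:
    """Hypothetical container for the specs listed on a model's page."""
    name: str
    context_tokens: int
    max_output_tokens: int


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(spec: ModelSpec, prompt_text: str, reserved_output: int = 8_000) -> bool:
    """Return True if the prompt plus a reserved output budget fits the model.

    Assumes the context window is shared between input and output tokens.
    """
    if reserved_output > spec.max_output_tokens:
        return False
    return estimate_tokens(prompt_text) + reserved_output <= spec.context_tokens
```

Under these assumptions, a 400,000-character codebase (roughly 100K tokens) plus an 8K output budget fits a 128K-context model, while a 500,000-character one does not.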

How to Choose a Free Coding Model

Your pick depends on how you code:

  • Using Claude Code or Cursor? → Prioritize context window and tool calling. Gemini 2.5 Flash (1M ctx) or DeepSeek V4 (256K) let the agent see your whole project. Both support tool calling via OpenAI-compatible endpoints.
  • Inline completion in VS Code / JetBrains? → Look for FIM support. Codestral (Mistral) is purpose-built for this. DeepSeek Coder variants also support FIM.
  • Code review / PR review? → Large context is critical — the diff + surrounding code + review guidelines all need to fit in one prompt. Gemini 2.5 Flash's 1M context handles this with room to spare.
  • Learning to code? → Prioritize helpfulness and explanation quality. Qwen3 and Llama 3.3 70B are known for clear, educational code explanations.
  • Rate-limit sensitive? → NVIDIA NIM allows 40 RPM with no daily cap, which suits heavy coding sessions. Groq allows 30 RPM / 14,400 RPD, enough for most solo developers.
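To see what RPM and RPD limits mean in practice, the sketch below spaces requests evenly on the client side and computes how long a daily cap lasts at full speed. It is an illustration under stated assumptions: `Pacer` and `minutes_until_daily_cap` are hypothetical helpers, not part of any provider's SDK, and real providers may enforce limits differently (for example, with token-based or burst limits).

```python
import time


class Pacer:
    """Spaces out calls so they never exceed a requests-per-minute limit."""

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self._clock = clock             # injectable for testing
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot one interval after this request's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


def minutes_until_daily_cap(rpm: int, rpd: int) -> float:
    """Minutes of continuous full-rate use before a requests-per-day cap is hit."""
    return rpd / rpm
```

Calling `Pacer(30).wait()` before each request keeps a client at one request every 2 seconds, and `minutes_until_daily_cap(30, 14_400)` returns 480.0, so a 30 RPM / 14,400 RPD allowance covers 8 hours of nonstop requests at full rate. At 40 RPM the spacing is 1.5 seconds.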

Try models in the Playground with a real coding task before committing: benchmark scores don't always predict how a model handles your specific language or framework.

Top Picks for Coding

All Free Coding Models

| Provider | Model | Context | Max Output | Modality | Rate Limit | Released |
|---|---|---|---|---|---|---|
| OpenRouter | OpenAI: gpt-oss-safeguard-20b | 131K | 66K | text | See provider page | Oct 29, 2025 |
| OpenRouter | OpenAI: gpt-oss-120b (free) | 131K | 131K | text | See provider page | Aug 5, 2025 |
| OpenRouter | OpenAI: gpt-oss-20b (free) | 131K | 8K | text | See provider page | Aug 5, 2025 |
| OpenRouter | Qwen: Qwen3 Coder 480B A35B (free) | 1.0M | 262K | text, code | See provider page | Feb 4, 2026 |
| Mistral AI | Codestral | 256K | 256K | text, code | ~1 RPS, 500K TPM | - |
| Cerebras | gpt-oss-120b | 128K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD | - |
| Kilo Code | x-ai/grok-code-fast-1:optimized:free | 131K | 131K | text, code | ~200 req/hr | - |
| LLM7.io | qwen2.5-coder-32b | 131K | 131K | text, code | 30 RPM (120 with token) | - |
| OVHcloud AI Endpoints | Qwen3-Coder-30B-A3B-Instruct | 262K | 32K | text, code | 2 RPM (anonymous) | - |
| Chutes.ai | Llama 3.1 70B | 131K | 0 | text | Community-powered, no hard cap | - |
| Glhf.chat | Llama 3.1 70B | 131K | 0 | text | Unlimited for free models | - |
| Glhf.chat | Mixtral 8x7B | 33K | 0 | text | Unlimited for free models | - |
| Groq | Moonshot Kimi K2 | 131K | 0 | text | See provider page | - |
| Groq | Moonshot Kimi K2 0905 | 131K | 0 | text | See provider page | - |
| Groq | GPT-OSS 120B | 131K | 0 | text | See provider page | - |
| GitHub Models | Mistral Large (24.11) | 131K | 0 | text | See provider page | - |
| Cerebras | Llama 3.1 70B | 131K | 0 | text | See provider page | - |
| Mistral AI | Mixtral 8x7B | 33K | 0 | text | See provider page | - |
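
To rank entries like those in the table by context window, the abbreviated sizes first have to be converted to numbers. The following is a minimal sketch using four rows copied from the table. The `parse_size` helper is hypothetical, and it assumes "K" means 1,000 and "M" means 1,000,000 (some providers count in multiples of 1,024 instead).

```python
def parse_size(value: str) -> int:
    """Convert an abbreviated token count such as '131K' or '1.0M' to an integer.

    Assumption: K = 1,000 and M = 1,000,000.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    value = value.strip().upper()
    if value and value[-1] in multipliers:
        return round(float(value[:-1]) * multipliers[value[-1]])
    return int(value)


# (provider, model, context) copied from the table
rows = [
    ("OpenRouter", "OpenAI: gpt-oss-120b (free)", "131K"),
    ("OpenRouter", "Qwen: Qwen3 Coder 480B A35B (free)", "1.0M"),
    ("Mistral AI", "Codestral", "256K"),
    ("Glhf.chat", "Mixtral 8x7B", "33K"),
]

# Largest context window first.
ranked = sorted(rows, key=lambda row: parse_size(row[2]), reverse=True)
for provider, model, context in ranked:
    print(f"{context:>5}  {provider}: {model}")
```

The same key function could be combined with a minimum-context filter (for example, keeping only rows where `parse_size(context) >= 128_000`) to shortlist models for large-codebase work.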