How to Get a Free NVIDIA NIM API Key (2026)

16 free models available — no credit card required. Get your NVIDIA NIM API key →

Overview

100+ open models from NVIDIA — no credit card, 40 RPM.

NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.

  • 100+ open models available
  • No daily token cap
  • ~40 RPM free tier
  • No credit card required

API Compatibility: OpenAI SDK-compatible (Chat Completions)

Quick Start Guide

  1. 1
    Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
  2. 2
    Go to Settings → API Keys
  3. 3
    Generate an API key
  4. 4
    Browse available models 100+ open models. Nemotron Super 49B recommended.
  5. 5
    Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1

All Free NVIDIA NIM Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
deepseek-ai/deepseek-v4-flash 1.0M 384K text Up to 40 RPM Online Details
deepseek-ai/deepseek-v4-pro 131K 8K text Up to 40 RPM Unavailable Details
meta/llama-3.1-70b-instruct 131K 16K text Up to 40 RPM Online Details
meta/llama-3.2-11b-vision-instruct 131K 16K textimage Up to 40 RPM Online Details
meta/llama-3.2-1b-instruct 131K 60K text Up to 40 RPM Online Details
meta/llama-3.2-3b-instruct 131K 8K text Up to 40 RPM Online Details
meta/llama-guard-4-12b 164K 16K textimage Up to 40 RPM Online Details
minimaxai/minimax-m2.7 205K 131K text Up to 40 RPM Online Details
mistralai/mistral-large-2-instruct 131K 8K text Up to 40 RPM Feb 26, 2024 Unavailable Details
moonshotai/kimi-k2.6 262K 8K text Up to 40 RPM Apr 20, 2026 Unavailable Details
nvidia/llama-3.1-nemotron-ultra-253b-v1 131K 8K text Up to 40 RPM Unavailable Details
nvidia/llama-3.3-nemotron-super-49b-v1.5 131K 16K text Up to 40 RPM Oct 10, 2025 Online Details
qwen/qwen3.5-122b-a10b 262K 66K textimage Up to 40 RPM Feb 25, 2026 Online Details
qwen/qwen3.5-397b-a17b 262K 66K textimage Up to 40 RPM Feb 16, 2026 Online Details
stepfun-ai/step-3.5-flash 262K 66K text Up to 40 RPM Online Details
z-ai/glm-5.1 203K 8K text Up to 40 RPM Apr 7, 2026 Unavailable Details

Pricing & Limits

Credit Card Not required
Free Tier Permanently free
Context Range 131K – 1.0M
Total Models 16 free
Rate Limits Up to 40 RPM
API Compatibility OpenAI SDK-compatible (Chat Completions)

Use Cases

What NVIDIA NIM's free models are best for, based on aggregated model capabilities:

Chat 16 models Reasoning 2 models Vision 1 model

Limitations & Caveats

  • ~40 RPM shared across all models, not per-model
  • Some models require additional registration per model family
  • Unavailable models listed in catalog but uncallable with standard key
See our FAQ for common questions about free LLM APIs