How to Get a Free NVIDIA NIM API Key (2026)
16 free models available — no credit card required. Get your NVIDIA NIM API key →
Overview
100+ open models from NVIDIA — no credit card, 40 RPM.
NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.
- 100+ open models available
- No daily token cap
- ~40 RPM free tier
- No credit card required
API Compatibility: OpenAI SDK-compatible (Chat Completions)
Quick Start Guide
- 1 Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
- 2 Go to Settings → API Keys
- 3 Generate an API key
- 4 Browse available models 100+ open models. Nemotron Super 49B recommended.
- 5 Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1
All Free NVIDIA NIM Models — Context Windows & Rate Limits
| Model | Context | Max Output | Modality | Rate Limit | Released | Status | |
|---|---|---|---|---|---|---|---|
| deepseek-ai/deepseek-v4-flash | 1.0M | 384K | Up to 40 RPM | — | Online | Details | |
| deepseek-ai/deepseek-v4-pro | 131K | 8K | Up to 40 RPM | — | Unavailable | Details | |
| meta/llama-3.1-70b-instruct | 131K | 16K | Up to 40 RPM | — | Online | Details | |
| meta/llama-3.2-11b-vision-instruct | 131K | 16K | Up to 40 RPM | — | Online | Details | |
| meta/llama-3.2-1b-instruct | 131K | 60K | Up to 40 RPM | — | Online | Details | |
| meta/llama-3.2-3b-instruct | 131K | 8K | Up to 40 RPM | — | Online | Details | |
| meta/llama-guard-4-12b | 164K | 16K | Up to 40 RPM | — | Online | Details | |
| minimaxai/minimax-m2.7 | 205K | 131K | Up to 40 RPM | — | Online | Details | |
| mistralai/mistral-large-2-instruct | 131K | 8K | Up to 40 RPM | Feb 26, 2024 | Unavailable | Details | |
| moonshotai/kimi-k2.6 | 262K | 8K | Up to 40 RPM | Apr 20, 2026 | Unavailable | Details | |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | 131K | 8K | Up to 40 RPM | — | Unavailable | Details | |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | 131K | 16K | Up to 40 RPM | Oct 10, 2025 | Online | Details | |
| qwen/qwen3.5-122b-a10b | 262K | 66K | Up to 40 RPM | Feb 25, 2026 | Online | Details | |
| qwen/qwen3.5-397b-a17b | 262K | 66K | Up to 40 RPM | Feb 16, 2026 | Online | Details | |
| stepfun-ai/step-3.5-flash | 262K | 66K | Up to 40 RPM | — | Online | Details | |
| z-ai/glm-5.1 | 203K | 8K | Up to 40 RPM | Apr 7, 2026 | Unavailable | Details |
Pricing & Limits
Credit Card Not required
Free Tier Permanently free
Context Range 131K – 1.0M
Total Models 16 free
Rate Limits Up to 40 RPM
API Compatibility OpenAI SDK-compatible (Chat Completions)
Use Cases
What NVIDIA NIM's free models are best for, based on aggregated model capabilities:
Limitations & Caveats
- ~40 RPM shared across all models, not per-model
- Some models require additional registration per model family
- Unavailable models listed in catalog but uncallable with standard key