GLM-4.6V-Flash — Free AI Model & API
z-ai-zhipu-ai/glm-4-6v-flash Overview
This Z AI model, GLM-4.6V-Flash, is a large language model with 128k tokens and 4k token output, ideal for text-based chat applications. It's openAI compatible and doesn't require a credit card, with a rate limit of one request at a time.
Quick Start
Integrate GLM-4.6V-Flash with 3 lines of code. See the config generator for Claude Code, Cursor, and more.
Other Free Models from Z AI (Zhipu AI)
Rate Limits & Constraints
Z AI (Zhipu AI) Platform Limitations
- 1 concurrent request only — unusable for multi-user apps
- China-hosted — high latency outside Asia
- Chinese phone number required for registration
Features & Use Cases
Best For
Modality Support
Z AI (Zhipu AI) Highlights
- GLM-4.7-Flash: 200K context
- No credit card required
- Chinese + English bilingual
- OpenAI-compatible endpoint
Playground — Test GLM-4.6V-Flash
Test GLM-4.6V-Flash directly in your browser. Your API key is sent directly to Z AI (Zhipu AI) — never stored.
🔒 Your key is never stored — sent directly to the model provider via our server proxy.
Ready to chat with GLM-4.6V-Flash.
Frequently Asked Questions
How do I get an API key for GLM-4.6V-Flash?
Sign up at Z AI (Zhipu AI) to get your API key. No credit card is required — just an email sign-up. Once you have the key, use the code snippets in the Quick Start section above.
Is GLM-4.6V-Flash really free?
Yes. GLM-4.6V-Flash is available on Z AI (Zhipu AI)'s free tier and has been free since May 10, 2026. Rate limits apply: 1 concurrent request. Always check the provider's terms for any changes to the free tier.
What are GLM-4.6V-Flash's rate limits?
1 concurrent request Context window: 128K. Max output: 4K. No credit card required.
What are the best free alternatives to GLM-4.6V-Flash?
Popular free alternatives include inclusionAI: Ring-2.6-1T, Baidu Qianfan: CoBuddy (free), Owl Alpha. You can also browse all 147+ free models on our site.
More questions? See our full FAQ →