Flashship — 2026-05-05 04:01

Company	Family	Variant	Model	Input $/1M	Output $/1M	Reasoning $/1M	Context	Max Output	Modalities	Supported Params
OpenAI	GPT-5.4	Standard	OpenAI: GPT-5.4 (openai/gpt-5.4). GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...	$2.500	$15.00	FREE	1.1M	128K	IN: text, image, file; OUT: text	include_reasoning, max_completion_tokens, max_tokens, reasoning, +5 more
OpenAI	GPT-5.4	Pro	OpenAI: GPT-5.4 Pro (openai/gpt-5.4-pro). GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922...	$30.00	$180.00	FREE	1.1M	128K	IN: text, image, file; OUT: text	include_reasoning, max_completion_tokens, max_tokens, reasoning, +5 more
Anthropic	Claude 4.6	Sonnet	Anthropic: Claude Sonnet 4.6 (anthropic/claude-sonnet-4.6). Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation,...	$3.000	$15.00	FREE	1M	128K	IN: text, image; OUT: text	include_reasoning, max_completion_tokens, max_tokens, reasoning, +9 more
Anthropic	Claude 4.6	Opus	Anthropic: Claude Opus 4.6 (anthropic/claude-opus-4.6). Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially eff...	$5.000	$25.00	FREE	1M	128K	IN: text, image; OUT: text	include_reasoning, max_completion_tokens, max_tokens, reasoning, +9 more
Google	Gemini 3.1 Pro	Preview	Google: Gemini 3.1 Pro Preview (google/gemini-3.1-pro-preview). Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows...	$2.000	$12.00	$12.00	1M	65.5K	IN: audio, file, image, text, video; OUT: text	include_reasoning, max_tokens, reasoning, response_format, +7 more
Grok	Grok 4.20	Beta	Unavailable	—	—	—	—	—	—	—
Grok	Grok 4.20	Multi-Agent Beta	Unavailable	—	—	—	—	—	—	—
DeepSeek	DeepSeek V3.2	Base	DeepSeek: DeepSeek V3.2 (deepseek/deepseek-v3.2). DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fin...	$0.2520	$0.3780	FREE	131.1K	65.5K	IN: text; OUT: text	frequency_penalty, include_reasoning, logit_bias, max_tokens, +13 more
DeepSeek	DeepSeek V3.2	Exp	DeepSeek: DeepSeek V3.2 Exp (deepseek/deepseek-v3.2-exp). DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grai...	$0.2700	$0.4100	FREE	163.8K	65.5K	IN: text; OUT: text	frequency_penalty, include_reasoning, logit_bias, max_tokens, +13 more
DeepSeek	DeepSeek V3.2	Speciale	DeepSeek: DeepSeek V3.2 Speciale (deepseek/deepseek-v3.2-speciale). DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context proce...	$0.4000	$1.200	FREE	163.8K	163.8K	IN: text; OUT: text	frequency_penalty, include_reasoning, logit_bias, max_tokens, +11 more
Qwen	Qwen 3.5	397B A17B	Qwen: Qwen3.5 397B A17B (qwen/qwen3.5-397b-a17b). The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher infere...	$0.3900	$2.340	FREE	262.1K	65.5K	IN: text, image, video; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +15 more
Qwen	Qwen 3.5	122B A10B	Qwen: Qwen3.5-122B-A10B (qwen/qwen3.5-122b-a10b). The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference eff...	$0.2600	$2.080	FREE	262.1K	65.5K	IN: text, image, video; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +15 more
Qwen	Qwen 3.5	35B A3B	Qwen: Qwen3.5-35B-A3B (qwen/qwen3.5-35b-a3b). The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inf...	$0.1500	$1.000	FREE	262.1K	262.1K	IN: text, image, video; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +15 more
Qwen	Qwen 3.5	27B	Qwen: Qwen3.5-27B (qwen/qwen3.5-27b). The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities a...	$0.1950	$1.560	FREE	262.1K	65.5K	IN: text, image, video; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +15 more
Qwen	Qwen 3.5	9B	Qwen: Qwen3.5-9B (qwen/qwen3.5-9b). Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified...	$0.1000	$0.1500	FREE	262.1K	—	IN: text, image, video; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +14 more
Qwen	Qwen 3.5	Plus	Qwen: Qwen3.5 Plus 2026-02-15 (qwen/qwen3.5-plus-02-15). The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference e...	$0.2600	$1.560	FREE	1M	65.5K	IN: text, image, video; OUT: text	include_reasoning, max_tokens, presence_penalty, reasoning, +7 more
Qwen	Qwen 3.5	Flash	Qwen: Qwen3.5-Flash (qwen/qwen3.5-flash-02-23). The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference effic...	$0.0650	$0.2600	FREE	1M	65.5K	IN: text, image, video; OUT: text	include_reasoning, max_tokens, presence_penalty, reasoning, +7 more
Kimi	Kimi K2.5	Standard	MoonshotAI: Kimi K2.5 (moonshotai/kimi-k2.5). Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over appr...	$0.4400	$2.000	FREE	262.1K	65.5K	IN: text, image; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +16 more
MiniMax	MiniMax M2.5	Standard	MiniMax: MiniMax M2.5 (minimax/minimax-m2.5). MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise o...	$0.1500	$1.150	FREE	196.6K	131.1K	IN: text; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +17 more
Z.ai	GLM-5	Standard	Z.ai: GLM 5 (z-ai/glm-5). GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on l...	$0.6000	$2.080	FREE	202.8K	16.4K	IN: text; OUT: text	frequency_penalty, include_reasoning, logit_bias, logprobs, +15 more
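
The per-1M-token prices in the table can be turned into a per-request cost estimate. The following is a minimal sketch, not an official billing formula: the `PRICES` mapping and the `request_cost` helper are hypothetical names introduced here, only three rows are copied in for illustration, and the Reasoning $/1M column is ignored because the table does not say how reasoning tokens are counted.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from three rows of the table.
# PRICES and request_cost are illustrative names, not part of any provider API.
PRICES = {
    "openai/gpt-5.4": (Decimal("2.500"), Decimal("15.00")),
    "anthropic/claude-opus-4.6": (Decimal("5.000"), Decimal("25.00")),
    "deepseek/deepseek-v3.2": (Decimal("0.2520"), Decimal("0.3780")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from the listed prices.

    Assumption: cost is linear in token counts and reasoning tokens
    are not billed separately.
    """
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 100K input tokens and 10K output tokens on each listed model.
    for slug in PRICES:
        print(slug, request_cost(slug, 100_000, 10_000))
```

`Decimal` is used instead of `float` so that small per-token amounts are not distorted by binary rounding when many requests are summed.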