Transparent model intelligence for creators. Live benchmarks, free model tracking, and production routing across 352 models from 56 providers. Updated hourly from OpenRouter.
Search, filter, and compare 352 models with real-time pricing from OpenRouter.
Showing 50 of 352 models | 30 free
ai21
Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...
aion-labs
Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...
aion-labs
Aion-1.0-Mini is a 32B-parameter model distilled from DeepSeek-R1, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...
aion-labs
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....
aion-labs
Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model...
alfredpros
A 7-billion-parameter Code Llama - Instruct model fine-tuned to generate Solidity smart contracts, using 4-bit QLoRA fine-tuning via the PEFT library.
allenai
OLMo-2 32B Instruct is a supervised instruction-finetuned variant of the OLMo-2 32B March 2025 base model. It excels in complex reasoning and instruction-following tasks across diverse benchmarks such as GSM8K,...
allenai
Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...
allenai
Olmo 3.1 32B Instruct is a large-scale, 32-billion-parameter instruction-tuned language model engineered for high-performance conversational AI, multi-turn dialogue, and practical instruction following. As part of the Olmo 3.1 family, this...
amazon
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...
amazon
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon, focused on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite...
amazon
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length...
amazon
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.
amazon
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December...
anthropic
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal
anthropic
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic...
anthropic
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and...
anthropic
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and...
anthropic
Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance...
anthropic
Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...
anthropic
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...
anthropic
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...
anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective...
anthropic
Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6): identical capabilities with higher output speed at a 6x pricing premium. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode
anthropic
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...
anthropic
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with...
anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...
arcee-ai
Coder-Large is a 32B-parameter offspring of Qwen 2.5-Instruct that has been further trained on permissively licensed GitHub, CodeSearchNet, and synthetic bug-fix corpora. It supports a 32k context window, enabling multi-file...
arcee-ai
Maestro Reasoning is Arcee's flagship analysis model: a 32B-parameter derivative of Qwen 2.5-32B tuned with DPO and chain-of-thought RL for step-by-step logic. Compared to the earlier 7B...
arcee-ai
Spotlight is a 7-billion-parameter vision-language model derived from Qwen 2.5-VL and fine-tuned by Arcee AI for tight image-text grounding tasks. It offers a 32k-token context window, enabling rich multimodal...
arcee-ai
Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels in creative writing,...
arcee-ai
Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7
arcee-ai
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...
arcee-ai
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...
arcee-ai
Virtuoso-Large is Arcee's top-tier general-purpose LLM at 72B parameters, tuned to tackle cross-domain reasoning, creative writing, and enterprise QA. Unlike many 70B peers, it retains the 128k...
openrouter
Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...
baidu
A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B activated per token, delivering exceptional multimodal understanding and generation through heterogeneous MoE structures and modality-isolated routing. Supporting an...
baidu
ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost reasoning depth and quality for top-tier performance in logical puzzles, math, science, coding, text generation, and expert-level academic benchmarks.
baidu
ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE 4.5 series. It activates 47B parameters per token and supports text generation in...
baidu
A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional text and vision understanding through its innovative heterogeneous MoE structure with modality-isolated routing....
baidu
ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...
openrouter
Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...
bytedance-seed
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.
bytedance-seed
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...
bytedance-seed
Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...
bytedance-seed
Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/medium/high), multimodal understanding,...
bytedance
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...
cohere
Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Compared to other leading proprietary...
cohere
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...
cohere
command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint...
30 free models available in current view
These models are available at zero cost through Zen routing. No API key required. Updated weekly.
MiniMax
Alibaba
Xiaomi
Moonshot
Zhipu
Zhipu
Meta
OpenAI
NVIDIA
Every model we track, sorted by SWE-Bench Verified score. Pricing is per million tokens.
| # | Model | Provider | Context | SWE-Bench | Input | Output | Speed | Category |
|---|---|---|---|---|---|---|---|---|
| 1 | 🟤 Claude Opus 4 | Anthropic | 200K | 90% | $15 | $75 | 40 t/s | frontier |
| 2 | 🟣 MiniMax M2.5 (free) | MiniMax | 200K | 80.2% | Free | Free | 75 t/s | free tier |
| 3 | 🟠 Qwen 3.6 Plus (free) | Alibaba | 1M | 78.8% | Free | Free | 85 t/s | free tier |
| 4 | 🔶 MiMo V2 Pro (free) | Xiaomi | 1M | 78% | Free | Free | 80 t/s | free tier |
| 5 | 🌙 Kimi K2.5 (free) | Moonshot | 260K | 76.8% | Free | Free | 80 t/s | free tier |
| 6 | 🟡 GLM 4.7 (free) | Zhipu | 200K | 73.8% | Free | Free | 70 t/s | free tier |
| 7 | 🟤 Claude Sonnet 4 | Anthropic | 200K | 72.7% | $3 | $15 | 80 t/s | frontier |
| 8 | 🟡 Big Pickle (GLM-4.6) (free) | Zhipu | 200K | 70% | Free | Free | 65 t/s | free tier |
| 9 | 🦙 Llama 4 Maverick (free) | Meta | 1M | 50% | Free | Free | 70 t/s | open source |
| 10 | 🔵 Gemini 2.0 Pro | Google | 1M | 48% | $1.25 | $5 | 70 t/s | frontier |
| 11 | 🐋 DeepSeek V3 | DeepSeek | 64K | 42% | $0.27 | $1.1 | 60 t/s | open source |
| 12 | 🟤 Claude Haiku 3.5 | Anthropic | 200K | 40.6% | $0.25 | $1.25 | 150 t/s | frontier |
| 13 | 🟢 GPT-4o | OpenAI | 128K | 38.4% | $2.5 | $10 | 90 t/s | frontier |
| 14 | 🔵 Gemini 2.0 Flash | Google | 1M | 33% | $0.075 | $0.3 | 160 t/s | frontier |
| 15 | 🟢 GPT-4o Mini | OpenAI | 128K | 23.7% | $0.15 | $0.6 | 130 t/s | frontier |
| 16 | 🟢 GPT-5 Nano (free) | OpenAI | 128K | -- | Free | Free | 200 t/s | free tier |
| 17 | 💚 Nemotron 3 Super (free) | NVIDIA | 1M | -- | Free | Free | 90 t/s | free tier |
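The ordering above (SWE-Bench Verified descending, with unscored "--" models pushed to the bottom) can be sketched in a few lines; the model list here is a small excerpt from the table:

```python
# Rank models by SWE-Bench Verified score, descending; models with no
# published score (None, shown as "--" in the table) sort last.
models = [
    ("Claude Opus 4", 90.0),
    ("GPT-5 Nano", None),      # no published score
    ("MiniMax M2.5", 80.2),
    ("Claude Sonnet 4", 72.7),
]

# Sort key: (has-no-score, negated score) -- False sorts before True,
# so scored models come first, highest score first.
ranked = sorted(models, key=lambda m: (m[1] is None, -(m[1] or 0)))
print([name for name, _ in ranked])
# → ['Claude Opus 4', 'MiniMax M2.5', 'Claude Sonnet 4', 'GPT-5 Nano']
```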
Estimate monthly costs for any use case with real pricing from OpenRouter. Choose a preset or enter custom token counts to compare models side by side.
~500 input + 200 output tokens per request
3,000 requests/month
30 free models available for this use case: Google: Gemma 4 26B A4B (free), Google: Gemma 4 31B (free), Qwen: Qwen3.6 Plus (free), Google: Lyria 3 Pro Preview, Google: Lyria 3 Clip Preview, NVIDIA: Nemotron 3 Super (free), MiniMax: MiniMax M2.5 (free), Free Models Router, StepFun: Step 3.5 Flash (free), Arcee AI: Trinity Large Preview (free), LiquidAI: LFM2.5-1.2B-Thinking (free), LiquidAI: LFM2.5-1.2B-Instruct (free), NVIDIA: Nemotron 3 Nano 30B A3B (free), Arcee AI: Trinity Mini (free), NVIDIA: Nemotron Nano 12B 2 VL (free), Qwen: Qwen3 Next 80B A3B Instruct (free), NVIDIA: Nemotron Nano 9B V2 (free), OpenAI: gpt-oss-120b (free), OpenAI: gpt-oss-20b (free), Z.ai: GLM 4.5 Air (free), Qwen: Qwen3 Coder 480B A35B (free), Venice: Uncensored (free), Google: Gemma 3n 2B (free), Google: Gemma 3n 4B (free), Google: Gemma 3 4B (free), Google: Gemma 3 12B (free), Google: Gemma 3 27B (free), Meta: Llama 3.3 70B Instruct (free), Meta: Llama 3.2 3B Instruct (free), Nous: Hermes 3 405B Instruct (free)
| # | Model | Provider | Monthly | Per Request |
|---|---|---|---|---|
| 1 | Body Builder (beta) | openrouter | <$0.01 | <$0.0001 |
| 2 | Auto Router | openrouter | <$0.01 | <$0.0001 |
| 3 | Google: Gemma 3n 4B | google | $0.05 | <$0.0001 |
| 4 | Mistral: Mistral Nemo | mistralai | $0.05 | <$0.0001 |
| 5 | Meta: Llama 3.1 8B Instruct | meta-llama | $0.06 | <$0.0001 |
| 6 | Llama Guard 3 8B | meta-llama | $0.07 | <$0.0001 |
| 7 | Meta: Llama 3 8B Instruct | meta-llama | $0.07 | <$0.0001 |
| 8 | Sao10K: Llama 3 8B Lunaris | sao10k | $0.09 | <$0.0001 |
| 9 | IBM: Granite 4.0 Micro | ibm-granite | $0.10 | <$0.0001 |
| 10 | Qwen: Qwen2.5 Coder 7B Instruct | qwen | $0.10 | <$0.0001 |
For 100 chat app requests/day, the cheapest paid model is Body Builder (beta) at <$0.01/month. 30 models are free.
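The calculator's arithmetic is linear in request volume and per-million-token prices. A minimal sketch of the estimate, using the chat-app preset above and GPT-4o Mini's listed $0.15/$0.60 rates (the `monthly_cost` helper is illustrative, not the site's actual code):

```python
def monthly_cost(requests_per_month, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Estimate monthly spend from per-million-token pricing."""
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_month

# Chat-app preset: ~500 input + 200 output tokens, 3,000 requests/month,
# at GPT-4o Mini's listed $0.15 / $0.60 per million tokens.
print(round(monthly_cost(3000, 500, 200, 0.15, 0.60), 4))  # → 0.585
```

At these volumes even a mid-priced model stays under a dollar a month, which is why the top of the ranking is dominated by sub-$0.10 options and free models.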
How Arcanea routes models to specialized agents in production. Each agent is assigned to an Arcanean Gate with a primary model and fallback chain.
Orchestrator needs relentless persistence and 1M context to hold the full project state. Qwen 3.6 Plus offers the best free agentic reasoning with massive context.
Coder needs the highest SWE-Bench score available. MiniMax M2.5 leads at 80.2% — the divine forge of code.
Architecture decisions require deep reasoning over complex systems. Big Pickle excels at slow, deliberate analysis — seeing the whole picture.
Research demands vast context for ingesting papers, docs, and codebases. 1M context + strong reasoning makes Qwen the fire-bringer of knowledge.
Strategy and planning need long-context reasoning to weigh trade-offs across the entire system. Qwen delivers wisdom at scale.
Code review requires deep comprehension of implementation patterns. M2.5 at 80.2% SWE-Bench catches what others miss — the honest critic.
Coordination and frontend integration need broad context and strong UI/UX understanding. Kimi K2.5 carries the world of integrations.
Documentation and research benefit from strong multilingual capabilities. GLM 4.7 excels at structured knowledge extraction.
Navigation and exploration need speed above all else. GPT-5 Nano is the fastest free model — instant wayfinding through the codebase.
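The primary-model-plus-fallback-chain pattern described above can be sketched as follows; the agent names, model IDs, and the `call_model` callback are illustrative assumptions, not Arcanea's actual implementation:

```python
# Hypothetical sketch of "primary model + fallback chain" routing.
# Chains list the primary model first, then fallbacks in preference order.
FALLBACK_CHAINS = {
    "orchestrator": ["qwen/qwen3.6-plus", "xiaomi/mimo-v2-pro"],
    "coder":        ["minimax/m2.5", "qwen/qwen3.6-plus", "zhipu/glm-4.7"],
    "scout":        ["openai/gpt-5-nano", "zhipu/glm-4.7"],
}

def route(agent: str, prompt: str, call_model) -> str:
    """Try the agent's primary model, then each fallback in order."""
    errors = []
    for model_id in FALLBACK_CHAINS[agent]:
        try:
            return call_model(model_id, prompt)
        except Exception as exc:  # rate limit, provider outage, timeout...
            errors.append((model_id, exc))
    raise RuntimeError(f"all models in the {agent} chain failed: {errors}")
```

The design choice here is simple sequential failover: an agent only degrades to a fallback when its primary model errors out, so routing stays deterministic and easy to audit.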
Strengths, weaknesses, and recommended use cases for the top models.
Track changes to the model roster, free tier availability, and routing decisions.
All Zen-routed models remain free this week. MiniMax M2.5 continues to lead SWE-Bench at 80.2%. Qwen 3.6 Plus remains the recommended orchestrator model for its 1M context + 78.8% SWE-Bench combination. MiMo V2 Pro is gaining traction as a strong 1M-context alternative. Image generation models are now tracked in the Arena — 8 models across frontier, open-source, and specialized categories.
Compare FLUX.2, Grok Image, DALL-E 3, Stable Diffusion, and more. Pricing, speed, text rendering quality, and Arcanea pipeline routing.
Every model in the Arena is available through Arcanea. Free models run on Zen routing. Premium models run through your own API keys.