# Model Selection Guide
New models drop constantly, so this guide focuses on what's working well with Caret right now. We'll keep it updated as the landscape shifts.
## Current Top Models

| Model | Context Window | Input Price* | Output Price* | Best For |
| --- | --- | --- | --- | --- |
| Claude Sonnet 4 | 1M tokens | $3-6 | $15-22.50 | Reliable tool usage, complex codebases |
| Qwen3 Coder | 256K tokens | $0.20 | $0.80 | Coding tasks, open source flexibility |
| Gemini 2.5 Pro | 1M+ tokens | TBD | TBD | Large codebases, document analysis |
| GPT-5 | 400K tokens | $1.25 | $10 | Latest OpenAI tech, three modes |
*Per million tokens
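To compare what a session will actually cost, multiply your expected token counts by each model's per-million rates. Here's a minimal TypeScript sketch; the `estimateCostUSD` helper and the example numbers are just for illustration, with prices mirroring the tables in this guide:

```typescript
// Estimate the cost of a single request given per-million-token pricing.
// Prices are in USD per 1M tokens, matching the tables in this guide.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCostUSD(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

// Example: a 50K-token prompt with a 2K-token reply on DeepSeek V3 ($0.14 / $0.28)
const deepseekV3: ModelPricing = { inputPerMillion: 0.14, outputPerMillion: 0.28 };
console.log(estimateCostUSD(deepseekV3, 50_000, 2_000).toFixed(4)); // ≈ 0.0076
```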
## Budget Options

| Model | Context Window | Input Price* | Output Price* | Notes |
| --- | --- | --- | --- | --- |
| DeepSeek V3 | 128K tokens | $0.14 | $0.28 | Great value for daily coding |
| DeepSeek R1 | 128K tokens | $0.55 | $2.19 | Budget reasoning champion |
| Qwen3 32B | 128K tokens | Varies | Varies | Open source, multiple providers |
| Z AI GLM 4.5 | 128K tokens | TBD | TBD | MIT licensed, hybrid reasoning |
*Per million tokens
## Context Window Guide

| Size | Word Count | Use Case |
| --- | --- | --- |
| 32K tokens | ~24,000 words | Single files, small projects |
| 128K tokens | ~96,000 words | Most coding projects |
| 200K tokens | ~150,000 words | Large codebases |
| 400K+ tokens | ~300,000+ words | Entire applications |
Performance note: Most models start dropping in quality around 400-500K tokens, even if they claim higher limits.
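The word counts above use the common rule of thumb of roughly 0.75 English words per token. A quick sketch of that conversion (the ratio is an approximation; real tokenizers vary by model and language):

```typescript
// Rough conversion between tokens and English words (~0.75 words per token).
// Actual tokenizers vary, so treat this as a planning estimate only.
const WORDS_PER_TOKEN = 0.75;

function tokensToWords(tokens: number): number {
  return Math.round(tokens * WORDS_PER_TOKEN);
}

function wordsToTokens(words: number): number {
  return Math.round(words / WORDS_PER_TOKEN);
}

console.log(tokensToWords(128_000)); // ≈ 96,000 words, matching the table above
console.log(wordsToTokens(24_000));  // ≈ 32,000 tokens
```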
## Open Source vs Closed Source
### Open Source Advantages
- Multiple providers compete to host them
- Cheaper pricing due to competition
- Provider choice - switch if one goes down
- Faster innovation cycles
### Open Source Models Available
- Qwen3 Coder (Apache 2.0)
- Z AI GLM 4.5 (MIT)
- Kimi K2 (Modified MIT)
- DeepSeek series (Various licenses)
## Quick Decision Matrix

| If you want... | Use this |
| --- | --- |
| Something that just works | Claude Sonnet 4 |
| To save money | DeepSeek V3 or Qwen3 variants |
| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4 |
| Open source | Qwen3 Coder, Z AI GLM 4.5, or Kimi K2 |
| Latest tech | GPT-5 |
| Speed | Qwen3 Coder on Cerebras (fastest available) |
## What Others Are Using
Check OpenRouter's Caret usage stats to see real usage patterns from the community.
## Context Management
Caret automatically handles context limits with auto-compact. When you approach your model's limit, Caret summarizes the conversation to keep working. You don't need to micromanage this.
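Conceptually, auto-compact follows the pattern sketched below. This is an illustration of the general technique, not Caret's actual implementation; the 80% threshold, the helper names, and the summarization call are all assumptions.

```typescript
// Illustrative sketch of a compaction loop (not Caret's real code).
// When the conversation nears the model's context limit, older messages
// are replaced with a model-generated summary so the session can continue.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

const COMPACTION_THRESHOLD = 0.8; // assumed: compact at 80% of the window

async function maybeCompact(
  messages: Message[],
  contextWindow: number,
  countTokens: (msgs: Message[]) => number,
  summarize: (msgs: Message[]) => Promise<string>
): Promise<Message[]> {
  if (countTokens(messages) < contextWindow * COMPACTION_THRESHOLD) {
    return messages; // plenty of room left, nothing to do
  }
  // Summarize everything except the most recent exchanges,
  // then continue with the summary plus the recent tail.
  const tail = messages.slice(-4);
  const summary = await summarize(messages.slice(0, -4));
  return [
    { role: "system", content: `Summary of earlier conversation: ${summary}` },
    ...tail,
  ];
}
```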
## The Bottom Line
Start with Claude Sonnet 4 if you want reliability. Once you're comfortable, experiment with the open source options to find the best fit for your workflow and budget.
The landscape moves fast - these recommendations reflect what's working now, but keep an eye on new releases.