Model Selection Guide
New models drop constantly, so this guide focuses on what's working well with Caret right now. We'll keep it updated as the landscape shifts.
Current Top Models
| Model | Context Window | Input Price* | Output Price* | Best For |
|---|---|---|---|---|
| Claude Sonnet 4 | 1M tokens | $3-6 | $15-22.50 | Reliable tool usage, complex codebases |
| Qwen3 Coder | 256K tokens | $0.20 | $0.80 | Coding tasks, open source flexibility |
| Gemini 2.5 Pro | 1M+ tokens | TBD | TBD | Large codebases, document analysis |
| GPT-5 | 400K tokens | $1.25 | $10 | Latest OpenAI tech, three modes |
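*Per million tokens. Where a range is shown, pricing is tiered: for Claude Sonnet 4, the higher rate applies to prompts over 200K tokens.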
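To make the per-million-token prices concrete, here is a minimal sketch of how a session's cost could be estimated. The function name and the sample token counts are illustrative assumptions. Only the prices come from the table above.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Estimate cost in dollars. Prices are dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical session: 200K input tokens and 20K output tokens.
gpt5 = estimate_cost(200_000, 20_000, input_price=1.25, output_price=10.0)
qwen = estimate_cost(200_000, 20_000, input_price=0.20, output_price=0.80)

print(f"GPT-5: ${gpt5:.2f}, Qwen3 Coder: ${qwen:.3f}")
# prints "GPT-5: $0.45, Qwen3 Coder: $0.056"
```

Input tokens usually dominate in coding sessions because the model rereads files and history on every turn, so the input price often matters more than the output price.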
Budget Options
| Model | Context Window | Input Price* | Output Price* | Notes |
|---|---|---|---|---|
| DeepSeek V3 | 128K tokens | $0.14 | $0.28 | Great value for daily coding |
| DeepSeek R1 | 128K tokens | $0.55 | $2.19 | Budget reasoning champion |
| Qwen3 32B | 128K tokens | Varies | Varies | Open source, multiple providers |
| Z AI GLM 4.5 | 128K tokens | TBD | TBD | MIT licensed, hybrid reasoning |
*Per million tokens
Context Window Guide
| Size | Word Count | Use Case |
|---|---|---|
| 32K tokens | ~24,000 words | Single files, small projects |
| 128K tokens | ~96,000 words | Most coding projects |
| 200K tokens | ~150,000 words | Large codebases |
| 400K+ tokens | ~300,000+ words | Entire applications |
Performance note: Output quality tends to degrade once the context reaches roughly 400-500K tokens, even on models that advertise higher limits.
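The word counts in the table work out to about 0.75 words per token. As a rough sketch, that ratio can be used to check whether a body of text is likely to fit in a given window. The helper names and the 20% headroom are assumptions for illustration, and real token counts vary by tokenizer and by how much of the text is code.

```python
import math

WORDS_PER_TOKEN = 0.75  # ratio implied by the table; an approximation only


def approx_words(tokens: int) -> int:
    """Approximate how many words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def approx_tokens(words: int) -> int:
    """Approximate how many tokens a given number of words needs."""
    return math.ceil(words / WORDS_PER_TOKEN)


def likely_fits(words: int, context_tokens: int, headroom: float = 0.2) -> bool:
    """Check fit while reserving a share of the window for replies (assumed 20%)."""
    return approx_tokens(words) <= context_tokens * (1 - headroom)


print(approx_words(128_000))          # 96000, matching the table row
print(likely_fits(90_000, 128_000))   # False: 120K tokens exceeds the 102.4K budget
```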
Open Source vs Closed Source
Open Source Advantages
- Multiple providers compete to host them
- Lower prices due to competition
- Provider choice - switch if one goes down
- Faster innovation cycles
Open Source Models Available
- Qwen3 Coder (Apache 2.0)
- Z AI GLM 4.5 (MIT)
- Kimi K2 (Modified MIT)
- DeepSeek series (Various licenses)
Quick Decision Matrix
| If you want... | Use this |
|---|---|
| Something that just works | Claude Sonnet 4 |
| To save money | DeepSeek V3 or Qwen3 variants |
| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4 |
| Open source | Qwen3 Coder, Z AI GLM 4.5, or Kimi K2 |
| Latest tech | GPT-5 |
| Speed | Qwen3 Coder on Cerebras (fastest available) |
What Others Are Using
Check OpenRouter's Caret usage stats to see which models the community actually runs.
Context Management
Caret handles context limits automatically with auto-compact. When a conversation approaches your model's limit, Caret summarizes it so you can keep working. You don't need to micromanage this.
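For intuition, the general idea behind this kind of compaction can be sketched as below. This is not Caret's actual implementation: the function names, the 80% threshold, the number of recent messages kept, and the word-based token estimate are all assumptions made for illustration, and `summarize` stands in for a call to a model.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Crude token estimate assuming about 0.75 words per token."""
    return int(len(text.split()) / 0.75) + 1


def auto_compact(messages: List[str], limit: int,
                 summarize: Callable[[List[str]], str],
                 threshold: float = 0.8, keep_recent: int = 2) -> List[str]:
    """Replace older messages with a summary once usage nears the limit."""
    total = sum(estimate_tokens(m) for m in messages)
    if total < threshold * limit or len(messages) <= keep_recent:
        return messages  # still comfortably within the window
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent


# Hypothetical usage with a placeholder summarizer.
history = ["word " * 60, "word " * 60, "short question", "short answer"]
compacted = auto_compact(
    history, limit=200,
    summarize=lambda msgs: f"[summary of {len(msgs)} messages]",
)
print(compacted)
# prints ['[summary of 2 messages]', 'short question', 'short answer']
```

Keeping the most recent messages verbatim preserves the immediate working context, while the summary stands in for everything older.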
The Bottom Line
Start with Claude Sonnet 4 if you want reliability. Once you're comfortable, experiment with open source options to find the best fit for your workflow and budget.
The landscape moves fast. These recommendations reflect what's working now, so keep an eye on new releases.