Vercel AI Gateway
Vercel AI Gateway gives you a single API to access models from many providers. You switch providers by changing the model id, without swapping SDKs or juggling multiple keys. Caret integrates directly, so you can pick a Gateway model in the dropdown, use it like any other provider, and see token and cache usage in the stream.
Useful links:
- Team dashboard: https://vercel.com/d?to=%2F%5Bteam%5D%2F%7E%2Fai
- Models catalog: https://vercel.com/ai-gateway/models
- Docs: https://vercel.com/docs/ai-gateway
What you get
- One endpoint for 100+ models with a single key
- Automatic retries and fallbacks that you configure on the dashboard
- Spend monitoring with requests by model, token counts, cache usage, latency percentiles, and cost
- OpenAI-compatible surface so existing clients work
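Because the surface is OpenAI-compatible, an existing client only needs the Gateway base URL and key. A minimal sketch, assuming the `https://ai-gateway.vercel.sh/v1` endpoint (verify the exact URL in the Gateway docs); the key placeholder and helper name are illustrative:

```typescript
// Assumed Gateway endpoint; confirm against the Gateway docs.
const GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request an OpenAI-style client would send. Switching
// providers means changing only the model id string.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage: pass the result to fetch(req.url, req.init) with a real key.
const req = buildChatRequest("YOUR_GATEWAY_KEY", "anthropic/claude-sonnet-4", [
  { role: "user", content: "Hello" },
]);
console.log(req.url); // https://ai-gateway.vercel.sh/v1/chat/completions
```

Only the `model` field changes when you move between providers; the request shape, headers, and parsing stay the same.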
Getting an API Key
- Sign in at https://vercel.com
- Dashboard → AI Gateway → API Keys → Create key
- Copy the key
For more on authentication and OIDC options, see https://vercel.com/docs/ai-gateway/authentication
Configuration in Caret
- Open Caret settings
- Select Vercel AI Gateway as the API Provider
- Paste your Gateway API Key
- Pick a model from the list. Caret fetches the catalog automatically. You can also paste an exact id
Notes:
- Model ids often follow the provider/model pattern. Copy the exact id from the catalog
Examples:
- openai/gpt-5
- anthropic/claude-sonnet-4
- google/gemini-2.5-pro
- groq/llama-3.1-70b
- deepseek/deepseek-v3
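A 404 on a request usually means the id does not match this shape. A hypothetical helper (not part of Caret) to sanity-check an id before pasting it into settings:

```typescript
// Illustrative check: a Gateway model id should have a provider
// segment and a model segment separated by "/", both non-empty.
function looksLikeGatewayModelId(id: string): boolean {
  const parts = id.split("/");
  return parts.length >= 2 && parts.every((p) => p.trim().length > 0);
}

console.log(looksLikeGatewayModelId("anthropic/claude-sonnet-4")); // true
console.log(looksLikeGatewayModelId("gpt-5")); // false: no provider prefix
```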
Observability you can act on
What to watch:
- Requests by model - confirm routing and adoption
- Tokens - input vs output, including reasoning tokens when the provider exposes them
- Cache - cached input and cache creation tokens
- Latency - p75 duration and p75 time to first token
- Cost - per project and per model
Use it to:
- Compare output tokens per request before and after a model change
- Validate cache strategy by tracking cache reads and cache-creation writes
- Catch TTFT regressions during experiments
- Align budgets with real usage
Supported models
The gateway supports a large and changing set of models. Caret pulls the list from the Gateway API and caches it locally. For the current catalog, see https://vercel.com/ai-gateway/models
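A sketch of catalog caching in the spirit of what Caret does, assuming an OpenAI-style list-models endpoint (an assumption; the exact route and TTL Caret uses are not documented here). The fetcher is injected so the cache logic works without network access:

```typescript
// The real fetcher would GET <gateway base>/models with the Gateway
// key and read the model ids from the response (assumed shape).
type CatalogFetcher = () => Promise<string[]>;

class ModelCatalogCache {
  private models: string[] | null = null;
  private fetchedAt = 0;

  constructor(
    private fetcher: CatalogFetcher,
    private ttlMs: number = 60 * 60 * 1000, // refresh at most hourly (illustrative TTL)
  ) {}

  async get(now: number = Date.now()): Promise<string[]> {
    // Serve the cached list while it is still fresh.
    if (this.models !== null && now - this.fetchedAt < this.ttlMs) {
      return this.models;
    }
    this.models = await this.fetcher();
    this.fetchedAt = now;
    return this.models;
  }
}
```

Injecting the fetcher keeps the refresh policy separate from transport, which is why the model list stays usable offline once it has been fetched.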
Tips
Tip: Use separate gateway keys per environment (dev, staging, prod). It keeps dashboards clean and budgets isolated.
Note: Pricing is pass-through at provider list price. Bring-your-own key has 0% markup. You still pay provider and processing fees.
Info: Vercel does not add rate limits. Upstream providers may. New accounts receive $5 in credits every 30 days until the first payment.
Troubleshooting
- 401 Unauthorized - make sure the Gateway key is sent to the Gateway endpoint, not an upstream provider URL
- 404 model not found - copy the exact id from the Vercel catalog
- Slow first token - check p75 TTFT in the dashboard and try a model optimized for streaming
- Cost spikes - break down cost by model in the dashboard and cap or reroute traffic
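The first two items map cleanly onto response status codes. A hypothetical helper (illustrative, not part of Caret) that turns common failures into the fixes above:

```typescript
// Map common Gateway error statuses to troubleshooting hints.
// The status codes mirror the list above; the function is a sketch.
function gatewayErrorHint(status: number): string {
  switch (status) {
    case 401:
      return "Send the Gateway key to the Gateway endpoint, not an upstream provider URL.";
    case 404:
      return "Unknown model id: copy the exact provider/model id from the Vercel catalog.";
    case 429:
      return "Rate limited: Vercel adds no limits, so this comes from the upstream provider.";
    default:
      return `Unexpected status ${status}; check the dashboard for details.`;
  }
}

console.log(gatewayErrorHint(401));
```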
Inspiration
- Multi-model evals - swap only the model id in Caret and compare latency and output tokens
- Progressive rollout - route a small percent to a new model in the dashboard and ramp with metrics
- Budget enforcement - set per-project limits without code changes
Crosslinks
- OpenAI-Compatible setup: /provider-config/openai-compatible
- Model Selection Guide: /getting-started/model-selection-guide
- Understanding Context Management: /getting-started/understanding-context-management