Vercel AI Gateway
Vercel AI Gateway gives you a single API to access models from many providers. You switch providers by changing the model id, without swapping SDKs or juggling multiple keys. Caret integrates directly, so you can pick a Gateway model in the dropdown, use it like any other provider, and see token and cache usage in the stream.
Useful links:
- Team dashboard: https://vercel.com/d?to=%2F%5Bteam%5D%2F%7E%2Fai
- Models catalog: https://vercel.com/ai-gateway/models
- Docs: https://vercel.com/docs/ai-gateway
What you get
- One endpoint for 100+ models with a single key
- Automatic retries and fallbacks that you configure on the dashboard
- Spend monitoring with requests by model, token counts, cache usage, latency percentiles, and cost
- OpenAI-compatible surface so existing clients work
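Because the surface is OpenAI-compatible, an existing client only needs the Gateway base URL and key. A minimal sketch, assuming the `https://ai-gateway.vercel.sh/v1` endpoint (verify the exact URL in the Gateway docs); the key placeholder and helper name are illustrative:

```typescript
// Assumed Gateway endpoint; confirm against the Gateway docs.
const GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request an OpenAI-style client would send. Switching
// providers means changing only the model id string.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage: pass the result to fetch(req.url, req.init) with a real key.
const req = buildChatRequest("YOUR_GATEWAY_KEY", "anthropic/claude-sonnet-4", [
  { role: "user", content: "Hello" },
]);
console.log(req.url); // https://ai-gateway.vercel.sh/v1/chat/completions
```

Only the `model` field changes when you move between providers; the request shape, headers, and parsing stay the same.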
Getting an API Key
- Sign in at https://vercel.com
- Dashboard → AI Gateway → API Keys → Create key
- Copy the key
For more on authentication and OIDC options, see https://vercel.com/docs/ai-gateway/authentication
Configuration in Caret
- Open Caret settings
- Select Vercel AI Gateway as the API Provider
- Paste your Gateway API Key
- Pick a model from the list. Caret fetches the catalog automatically. You can also paste an exact id
Notes:
- Model ids often follow the provider/model pattern. Copy the exact id from the catalog
Examples:
- openai/gpt-5
- anthropic/claude-sonnet-4
- google/gemini-2.5-pro
- groq/llama-3.1-70b
- deepseek/deepseek-v3
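A 404 on a request usually means the id does not match this shape. A hypothetical helper (not part of Caret) to sanity-check an id before pasting it into settings:

```typescript
// Illustrative check: a Gateway model id should have a provider
// segment and a model segment separated by "/", both non-empty.
function looksLikeGatewayModelId(id: string): boolean {
  const parts = id.split("/");
  return parts.length >= 2 && parts.every((p) => p.trim().length > 0);
}

console.log(looksLikeGatewayModelId("anthropic/claude-sonnet-4")); // true
console.log(looksLikeGatewayModelId("gpt-5")); // false: no provider prefix
```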
Observability you can act on
What to watch:
- Requests by model - confirm routing and adoption
- Tokens - input vs output, including reasoning tokens when the provider exposes them
- Cache - cached input and cache creation tokens
- Latency - p75 duration and p75 time to first token
- Cost - per project and per model
Use it to:
- Compare output tokens per request before and after a model change
- Validate cache strategy by tracking cache reads and cache-creation writes
- Catch TTFT regressions during experiments
- Align budgets with real usage
Supported models
The gateway supports a large and changing set of models. Caret pulls the list from the Gateway API and caches it locally. For the current catalog, see https://vercel.com/ai-gateway/models
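A sketch of catalog caching in the spirit of what Caret does, assuming an OpenAI-style list-models endpoint (an assumption; the exact route and TTL Caret uses are not documented here). The fetcher is injected so the cache logic works without network access:

```typescript
// The real fetcher would GET <gateway base>/models with the Gateway
// key and read the model ids from the response (assumed shape).
type CatalogFetcher = () => Promise<string[]>;

class ModelCatalogCache {
  private models: string[] | null = null;
  private fetchedAt = 0;

  constructor(
    private fetcher: CatalogFetcher,
    private ttlMs: number = 60 * 60 * 1000, // refresh at most hourly (illustrative TTL)
  ) {}

  async get(now: number = Date.now()): Promise<string[]> {
    // Serve the cached list while it is still fresh.
    if (this.models !== null && now - this.fetchedAt < this.ttlMs) {
      return this.models;
    }
    this.models = await this.fetcher();
    this.fetchedAt = now;
    return this.models;
  }
}
```

Injecting the fetcher keeps the refresh policy separate from transport, which is why the model list stays usable offline once it has been fetched.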
Tips
Tip: Use separate gateway keys per environment (dev, staging, prod). It keeps dashboards clean and budgets isolated.
Note: Pricing is pass-through at provider list price. Bring-your-own key has 0% markup. You still pay provider and processing fees.
Info: Vercel does not add rate limits. Upstream providers may. New accounts receive $5 in credits every 30 days until the first payment.
Troubleshooting
- 401 Unauthorized - make sure the Gateway key is sent to the Gateway endpoint, not an upstream provider URL
- 404 model not found - copy the exact id from the Vercel catalog
- Slow first token - check p75 TTFT in the dashboard and try a model optimized for streaming
- Cost spikes - break down cost by model in the dashboard and cap or reroute traffic
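The first two items map cleanly onto response status codes. A hypothetical helper (illustrative, not part of Caret) that turns common failures into the fixes above:

```typescript
// Map common Gateway error statuses to troubleshooting hints.
// The status codes mirror the list above; the function is a sketch.
function gatewayErrorHint(status: number): string {
  switch (status) {
    case 401:
      return "Send the Gateway key to the Gateway endpoint, not an upstream provider URL.";
    case 404:
      return "Unknown model id: copy the exact provider/model id from the Vercel catalog.";
    case 429:
      return "Rate limited: Vercel adds no limits, so this comes from the upstream provider.";
    default:
      return `Unexpected status ${status}; check the dashboard for details.`;
  }
}

console.log(gatewayErrorHint(401));
```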
Inspiration
- Multi-model evals - swap only the model id in Caret and compare latency and output tokens
- Progressive rollout - route a small percent to a new model in the dashboard and ramp with metrics
- Budget enforcement - set per-project limits without code changes
Crosslinks
- OpenAI-Compatible setup: /provider-config/openai-compatible
- Model Selection Guide: /getting-started/model-selection-guide
- Understanding Context Management: /getting-started/understanding-context-management