Models

Pass any model id in the `model` field of your request. The Models page has the full, filterable catalog with live prices.

Listing models

curl https://api.relay.com/v1/models \
  -H "Authorization: Bearer $RELAY_KEY"
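
The same request can be made from code. The sketch below is a minimal Python equivalent of the curl command, using only the standard library. The names `build_models_request` and `list_models` are illustrative and not part of any official SDK. The response schema is not described on this page, so the sketch assumes only that the body is JSON and returns it unchanged.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
MODELS_URL = "https://api.relay.com/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the same GET request that the curl command sends."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def list_models(api_key: str, timeout: float = 30.0):
    """Send the request and return the decoded body.

    Assumption: the endpoint responds with JSON. No field names are
    assumed, so the decoded value is returned as-is.
    """
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `list_models(os.environ["RELAY_KEY"])` (after `import os`) would reuse the same environment variable as the curl example.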

Common model ids

Model id	Good for	Approx. USD per 1M tokens (input · output)
`deepseek-v4-flash`	Cheap general chat	0.10 · 0.20
`qwen3-235b`	High-volume chat	0.09 · 0.10
`llama-31-8b`	Cheapest small tasks	0.02 · 0.05
`deepseek-v4-pro`	Code & reasoning	1.30 · 2.60
`qwen-coder`	Coding	0.30 · 1.20
`gemini-31-flash-lite`	Cheap closed model	0.10 · 0.40

Prices are an illustrative snapshot from June 2026; see the Models page for current rates.
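
To turn the table into a rough budget, multiply token counts by the listed rates. The sketch below assumes the price column is USD per one million tokens (input, then output) and hard-codes the illustrative snapshot from the table. `PRICES` and `estimate_cost` are hypothetical names, and actual charges depend on current rates.

```python
# Illustrative snapshot from the table above.
# Assumed unit: USD per 1,000,000 tokens, as (input, output).
PRICES = {
    "deepseek-v4-flash": (0.10, 0.20),
    "qwen3-235b": (0.09, 0.10),
    "llama-31-8b": (0.02, 0.05),
    "deepseek-v4-pro": (1.30, 2.60),
    "qwen-coder": (0.30, 1.20),
    "gemini-31-flash-lite": (0.10, 0.40),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload."""
    price_in, price_out = PRICES[model_id]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

For example, `estimate_cost("deepseek-v4-flash", 2_000_000, 500_000)` comes to about $0.30 at the snapshot rates: 2 × 0.10 for input plus 0.5 × 0.20 for output.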

Open vs closed

Open-weight models run on Western inference providers (DeepInfra, Novita, Together, etc.). A few cheap closed models (e.g. Gemini Flash-Lite, listed as `gemini-31-flash-lite`) are included for an honest comparison.

Choosing a model

Cost-first: start with `deepseek-v4-flash` or `qwen3-235b`.
Code: `deepseek-v4-pro` or `qwen-coder`.
Tiny/cheap: `llama-31-8b` for classification and extraction.
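
The guidance above can be encoded as a small lookup, for instance to pick a default value for the `model` field per workload. This is only a sketch: the priority labels (`"cost"`, `"code"`, `"tiny"`) and the `pick_model` helper are made up for illustration and are not part of the API.

```python
# Hypothetical priority labels mapped to the model ids recommended above,
# in the order they are listed.
RECOMMENDED = {
    "cost": ["deepseek-v4-flash", "qwen3-235b"],
    "code": ["deepseek-v4-pro", "qwen-coder"],
    "tiny": ["llama-31-8b"],
}


def pick_model(priority: str) -> str:
    """Return the first recommended model id for the given priority."""
    try:
        return RECOMMENDED[priority][0]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {known}") from None
```

The remaining entries in each list can serve as alternatives if the first choice does not suit a workload.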
