Models

Pass any model id in the model field. Browse the full, filterable catalog with live prices on the Models page.

Listing models

curl https://api.relay.com/v1/models \
  -H "Authorization: Bearer $RELAY_KEY"
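
The same request can be sketched in Python using only the standard library. The URL and the Authorization header come from the curl command above. The response shape, a JSON object with a "data" list whose entries each carry an "id", is an assumption made for illustration and is not confirmed here, so adjust extract_model_ids to match the real payload. All function names are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
MODELS_URL = "https://api.relay.com/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the same GET request as the curl command."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list:
    """Pull model ids out of a decoded response body.

    ASSUMPTION: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(api_key: str) -> list:
    """Send the request and return the model ids (performs a network call)."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return extract_model_ids(json.load(resp))
```

To try it, read the key from the environment as the curl example does, for instance list_model_ids(os.environ["RELAY_KEY"]) after importing os.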

Common model ids

Model id               Good for               ~$ per 1M tokens (in · out)
deepseek-v4-flash      Cheap general chat     0.10 · 0.20
qwen3-235b             High-volume chat       0.09 · 0.10
llama-31-8b            Cheapest small tasks   0.02 · 0.05
deepseek-v4-pro        Code & reasoning       1.30 · 2.60
qwen-coder             Coding                 0.30 · 1.20
gemini-31-flash-lite   Cheap closed model     0.10 · 0.40

Prices are an illustrative June 2026 snapshot; see the Models page for current rates.
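
To make the price columns concrete, here is a minimal cost-estimate sketch. It hard-codes the snapshot prices from the table above and assumes "$/M" means US dollars per one million tokens, billed separately for input and output. The function name estimate_cost is hypothetical, and real bills depend on the current rates on the Models page.

```python
# Snapshot prices from the table above: (input, output) in USD.
# ASSUMPTION: "$/M" means dollars per 1,000,000 tokens.
PRICES_PER_M = {
    "deepseek-v4-flash": (0.10, 0.20),
    "qwen3-235b": (0.09, 0.10),
    "llama-31-8b": (0.02, 0.05),
    "deepseek-v4-pro": (1.30, 2.60),
    "qwen-coder": (0.30, 1.20),
    "gemini-31-flash-lite": (0.10, 0.40),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the snapshot prices."""
    price_in, price_out = PRICES_PER_M[model_id]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on deepseek-v4-flash:
    # 2 * 0.10 + 0.5 * 0.20 = 0.30
    print(f"{estimate_cost('deepseek-v4-flash', 2_000_000, 500_000):.2f}")
```

The same workload on deepseek-v4-pro would come to 2 × 1.30 + 0.5 × 2.60 = 3.90, which is why the cost-first picks sit at the top of the table.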

Open vs closed

Open-weight models run on Western inference providers such as DeepInfra, Novita, and Together. A few cheap closed models (e.g. Gemini Flash-Lite) are included for a fair comparison.

Choosing a model

  • Cost-first: start with deepseek-v4-flash or qwen3-235b.
  • Code: deepseek-v4-pro or qwen-coder.
  • Tiny/cheap: llama-31-8b for classification and extraction.
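
The guidance above can be encoded as a small lookup, sketched here under a few assumptions: the priority labels "cost", "code", and "tiny" and the function pick_model are hypothetical names for this example, and the order within each list simply follows the order in which the models are mentioned. The optional available argument lets you restrict the choice to ids you have confirmed are in the catalog.

```python
from typing import Iterable, Optional

# Recommendations copied from the list above; the keys are hypothetical labels.
RECOMMENDED = {
    "cost": ["deepseek-v4-flash", "qwen3-235b"],
    "code": ["deepseek-v4-pro", "qwen-coder"],
    "tiny": ["llama-31-8b"],
}


def pick_model(priority: str, available: Optional[Iterable[str]] = None) -> str:
    """Return the first recommended model id for a priority.

    If `available` is given, skip recommendations that are not in it.
    """
    candidates = RECOMMENDED.get(priority)
    if candidates is None:
        raise ValueError(f"unknown priority: {priority!r}")
    if available is None:
        return candidates[0]
    available_ids = set(available)
    for model_id in candidates:
        if model_id in available_ids:
            return model_id
    raise LookupError(f"no recommended model for {priority!r} is available")
```

For example, pick_model("code", ["qwen-coder", "llama-31-8b"]) falls back to qwen-coder because deepseek-v4-pro is not in the supplied list.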