ONE API ยท CHEAPEST OPEN & AFFORDABLE MODELS

The lowest effective cost
for open & affordable AI models

DeepSeek, Qwen, Kimi, GLM, Llama, Mistral and more โ€” through a single OpenAI-compatible API. Built-in caching, batch and smart routing cut your real bill far below raw rates. Transparent pricing. No silent model downgrades.

0tokens served today
0curated models
0upstream providers
0routing uptime
0avg effective savings*

*illustrative demo figures โ€” replace with real metrics before launch.

The Savings Engine

Raw token price is only half the story. We lower the bill you actually pay โ€” automatically.

๐Ÿ—‚๏ธ

Prompt Caching

Repeated prefixes cost up to 90% less on cache reads. On by default.

๐Ÿ“ฆ

Batch Lane

Async jobs run at ~50% off. Perfect for evals, labeling, nightly pipelines.

๐Ÿงญ

Cheapest-First Routing

Every request goes to the cheapest healthy provider โ€” and we show you which.

๐Ÿชœ

Smart Cascading

Easy tasks fall to smaller models; only hard ones hit the big ones.

Stack caching + batch and effective cost drops to roughly 25% of on-demand rates on eligible traffic.

Featured models

A taste of the catalog. Click a card for providers, latency and sample code.

Browse all models โ†’

Why Relay

๐Ÿ’ธ Lowest effective cost

Caching, batch and routing cut your real bill โ€” not just the sticker price.

๐Ÿ” Radically transparent

See the cheapest provider, the price, and exactly what each call cost. No hidden markup, no silent downgrades.

๐Ÿค Built for builders

OpenAI-compatible, one-line switch, free credits to start. Plus custom agents for business.

Start cutting your AI bill today

Get an API key, claim free credits, and switch from OpenAI in one line.

We'll email your key and onboarding steps.