Documentation

What is EzAI API?

EzAI API is a unified AI gateway that gives you access to 20+ models from Anthropic, OpenAI, Google, and xAI through a single endpoint. Fully compatible with Claude Code, Cursor, Cline, and any Anthropic or OpenAI-compatible tool.

20+ AI Models

Claude, GPT, Gemini, Grok — one API

Real-time Dashboard

Live usage, costs & request tracking

How We Keep Costs Low

  • Smart caching — Maximizing prompt cache hits to cut token costs
  • Infrastructure optimization — Efficient routing reduces overhead
  • Transparent pricing — Pay per token, no hidden fees, no subscriptions required

Quickstart

Up and running in under a minute. Sign up to get your API key.

1. Get your API key

   Sign in and copy your API key from the dashboard.

2. Run the install command

   Sets ANTHROPIC_BASE_URL, ANTHROPIC_API_KEY, and configures ~/.claude/settings.json.

   View source: install.sh · install.ps1

3. Restart your terminal

   source ~/.bashrc  # or ~/.zshrc

4. Start using Claude Code

   claude

That's it! Claude Code routes through EzAI automatically.

Manual Installation

Prefer manual setup? Two steps:

1. Set environment variables

Add to ~/.bashrc, ~/.zshrc, or your shell config:

export ANTHROPIC_BASE_URL="https://ezaiapi.com"
export ANTHROPIC_API_KEY="YOUR_KEY"

2. Update settings.json

Create or edit ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://ezaiapi.com",
    "ANTHROPIC_API_KEY": "YOUR_KEY"
  },
  "disableLoginPrompt": true
}
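The two manual steps above can also be scripted. Below is a minimal Python sketch (the helper name is ours, not part of the install script) that writes or updates a settings.json with the proxy env block while preserving any other keys already in the file:

```python
import json
from pathlib import Path

def write_claude_settings(path: Path, base_url: str, api_key: str) -> dict:
    """Create or update a Claude Code settings.json with the proxy env block,
    preserving any keys already present in the file."""
    settings = json.loads(path.read_text()) if path.exists() else {}
    env = settings.setdefault("env", {})
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_API_KEY"] = api_key
    settings["disableLoginPrompt"] = True
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2))
    return settings

# Typical usage (writes to the path Claude Code reads):
# write_claude_settings(Path.home() / ".claude" / "settings.json",
#                       "https://ezaiapi.com", "YOUR_KEY")
```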

Uninstall

Remove the proxy configuration:

# Delete the export lines from ~/.bashrc / ~/.zshrc, then clear the current session
unset ANTHROPIC_BASE_URL
unset ANTHROPIC_API_KEY

# Remove Claude Code settings
rm ~/.claude/settings.json

Authentication

Include your API key in every request:

# Required headers
x-api-key: sk-your-api-key
content-type: application/json
anthropic-version: 2023-06-01

Base URL: https://ezaiapi.com — all requests go here. Get your key from the dashboard.
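For illustration, the required headers can be assembled in one place; a minimal Python sketch (the helper name is ours, not part of any SDK):

```python
def build_headers(api_key: str) -> dict:
    """Return the three headers required on every EzAI API request."""
    return {
        "x-api-key": api_key,
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",
    }

headers = build_headers("sk-your-api-key")
print(headers["anthropic-version"])  # → 2023-06-01
```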

Create Message

POST /v1/messages

Send a message to any model. The proxy auto-converts between formats upstream — you always use the same Anthropic Messages format.

curl https://ezaiapi.com/v1/messages \
  -H "x-api-key: sk-your-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello, Claude!" }
    ]
  }'

curl https://ezaiapi.com/v1/messages \
  -H "x-api-key: sk-your-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello, GPT!" }
    ]
  }'

Note: Request uses Anthropic format. The proxy auto-converts to OpenAI format upstream and normalizes the response back.

curl https://ezaiapi.com/v1/messages \
  -H "x-api-key: sk-your-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello, Gemini!" }
    ]
  }'

Response follows the standard Anthropic Messages API format for all models.
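Because every model answers in the same shape, one extractor works across providers. A small illustrative helper (not from any SDK), assuming the standard Anthropic content-block layout:

```python
def extract_text(response: dict) -> str:
    """Concatenate all text blocks from an Anthropic-format Messages response."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )

resp = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello from Gemini!"}],
    "stop_reason": "end_turn",
}
print(extract_text(resp))  # → Hello from Gemini!
```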

Streaming

Add "stream": true to receive Server-Sent Events:

curl https://ezaiapi.com/v1/messages \
  -H "x-api-key: sk-your-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a haiku" }
    ]
  }'

SSE events: message_start → content_block_delta → message_stop. See Anthropic streaming docs.
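For illustration, the delta events can be assembled into the final text like this (the official SDKs do this for you; payload shapes follow Anthropic's documented streaming format):

```python
import json

def collect_stream_text(sse_lines) -> str:
    """Accumulate text from content_block_delta events in an SSE stream."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and keep-alives
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)

stream = [
    'event: message_start',
    'data: {"type": "message_start"}',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Old pond..."}}',
    'event: message_stop',
    'data: {"type": "message_stop"}',
]
print(collect_stream_text(stream))  # → Old pond...
```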

Extended Thinking

Enable Claude's chain-of-thought reasoning for complex tasks:

{
  "model": "claude-opus-4-6",
  "max_tokens": 16000,
  "stream": true,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  },
  "messages": [
    { "role": "user", "content": "Solve this step by step..." }
  ]
}

Supported Models

claude-opus-4-6 · claude-sonnet-4-5 · claude-sonnet-4

Best Practices

budget_tokens: 5K–10K for most tasks. Max: 32,000. Always use with streaming.
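The guidance above can be encoded in a small request builder; an illustrative Python sketch (the helper name and defaults are ours, not part of any SDK):

```python
def thinking_request(prompt: str, model: str = "claude-opus-4-6",
                     budget_tokens: int = 10_000, max_tokens: int = 16_000) -> dict:
    """Build an extended-thinking request body following the best practices above."""
    budget = min(budget_tokens, 32_000)  # hard cap from the docs
    if budget >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "stream": True,  # always use streaming with extended thinking
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }

body = thinking_request("Solve this step by step...")
print(body["thinking"]["budget_tokens"])  # → 10000
```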

Count Tokens

POST /v1/messages/count_tokens

Free — no credits charged.

curl https://ezaiapi.com/v1/messages/count_tokens \
  -H "x-api-key: sk-your-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{ "model": "claude-sonnet-4-5", "messages": [{ "role": "user", "content": "Hello!" }] }'

# → { "input_tokens": 12 }

OpenAI Compatible Endpoint

POST /v1/chat/completions

Native OpenAI Chat Completions format. Use with OpenAI SDK, LiteLLM, or any OpenAI-compatible tool.

curl https://ezaiapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      { "role": "system", "content": "You are helpful." },
      { "role": "user", "content": "Hello!" }
    ],
    "max_tokens": 1024
  }'

Response (OpenAI format):

{
  "id": "req_abc123",
  "object": "chat.completion",
  "model": "claude-sonnet-4-5",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help?" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 15, "completion_tokens": 9, "total_tokens": 24 }
}

GET /v1/models

Returns all available models in OpenAI format. Useful for model discovery in tools like LiteLLM.

POST /v1/responses NEW

OpenAI Responses API format. Compatible with n8n, LangChain, and tools using the newer OpenAI SDK.

curl https://ezaiapi.com/v1/responses \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{ "model": "claude-sonnet-4-5", "input": "Hello!", "max_output_tokens": 1024 }'

Accepts input (string or message array) + instructions (system prompt). Auto-converted to Anthropic format upstream.
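Roughly, the conversion can be pictured like this; an illustrative Python sketch of the mapping, not the proxy's actual code:

```python
def responses_to_messages(input, instructions=None) -> dict:
    """Map Responses-API style input/instructions onto an Anthropic Messages payload."""
    if isinstance(input, str):
        messages = [{"role": "user", "content": input}]
    else:
        messages = list(input)  # already a message array
    body = {"messages": messages}
    if instructions:
        # Anthropic carries the system prompt in a top-level "system" field
        body["system"] = instructions
    return body

print(responses_to_messages("Hello!", "Be concise"))
# → {'messages': [{'role': 'user', 'content': 'Hello!'}], 'system': 'Be concise'}
```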

Supported Models

Claude
  • claude-opus-4-6
  • claude-opus-4-5
  • claude-sonnet-4-6 new
  • claude-sonnet-4-5
  • claude-sonnet-4
  • claude-haiku-4-5
GPT / OpenAI
  • gpt-5.4 new ★
  • gpt-5.3-codex
  • gpt-5.2 / gpt-5.2-codex
  • gpt-5.1 / gpt-5.1-codex
  • gpt-5.1-codex-mini / max
  • gpt-5 / gpt-5-mini
  • gpt-4.1 / gpt-4.1-mini / nano
  • gpt-4o / gpt-4o-mini
  • o3 / o3-mini / o4-mini
Gemini
  • gemini-3-pro preview
  • gemini-3-flash preview
  • gemini-2.5-pro
  • gemini-2.5-flash
xAI
  • grok-code-fast-1

All models accessible via /v1/messages (Anthropic), /v1/chat/completions (OpenAI) and /v1/responses (Responses API) formats.

SDK Examples

# pip install anthropic
import anthropic

client = anthropic.Anthropic(
    api_key="sk-your-key",
    base_url="https://ezaiapi.com"
)

# Claude
msg = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(msg.content[0].text)

# GPT — same client, just change model
msg = client.messages.create(
    model="gpt-4.1",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello GPT!"}]
)

# Gemini
msg = client.messages.create(
    model="gemini-2.5-pro",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello Gemini!"}]
)

// npm install @anthropic-ai/sdk
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'sk-your-key',
  baseURL: 'https://ezaiapi.com'
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(msg.content[0].text);

OpenAI SDK Integration

Use the official OpenAI SDK to access all EzAI models via the /v1/chat/completions endpoint.

⚠️ Important: Cloudflare blocks the default User-Agent: OpenAI/Python header with a 403 error. You must set a custom User-Agent as shown below.

# pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://ezaiapi.com/v1",
    api_key="sk-your-key",
    default_headers={"User-Agent": "EzAI/1.0"}  # Required!
)

response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024
)
print(response.choices[0].message.content)
// npm install openai
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://ezaiapi.com/v1',
  apiKey: 'sk-your-key',
  defaultHeaders: { 'User-Agent': 'EzAI/1.0' }  // Required!
});

const resp = await client.chat.completions.create({
  model: 'claude-sonnet-4-5',
  messages: [
    { role: 'system', content: 'You are helpful.' },
    { role: 'user', content: 'Hello!' }
  ],
  max_tokens: 1024
});
console.log(resp.choices[0].message.content);

Compatibility

EzAI works as a drop-in replacement for any Anthropic or OpenAI-compatible tool.

| Tool | Status | Setup |
|---|---|---|
| Claude Code | ✅ Native | Set env vars or run install script |
| Cursor | ✅ Works | Settings → Models → Custom API provider |
| Cline | ✅ Works | Extension settings → Anthropic → Base URL |
| Continue | ✅ Works | config.json → custom provider |
| Aider | ✅ Works | Set ANTHROPIC_BASE_URL env var |
| OpenAI SDK | ✅ Works | Set custom User-Agent (see above) |
| LiteLLM | ✅ Works | Set api_base + api_key |
| Any Anthropic tool | ✅ Works | Change base URL + API key |

Rate Limits

Rate limits are applied per account based on your tier. Two limits apply simultaneously:

  • RPM — requests per minute
  • Concurrent — max parallel requests
  • 5 min — IP blocks auto-expire

| Tier | RPM | Concurrent | Daily Requests | Condition |
|---|---|---|---|---|
| Free | 5 | 1 | 50 | $0 balance, no plan |
| Top-up | 30 | 2 | 500 | Has credit balance |
| Starter | 30 | 2 | 1,000 | $10/month plan |
| Pro | 60 | 3 | 2,000 | $20/month plan |
| Max | 90 | 6 | 5,000 | $40/month plan |
| Ultra | 120 | 8 | 10,000 | $80/month plan |

💎 Monthly Plans — Get free credits that reset every 5 hours + higher rate limits + daily request quota. Multiple plans stack daily limits. View plans →

Response Headers

Every API response includes rate limit headers so your client can track usage:

x-ratelimit-tier: topup
x-ratelimit-limit-requests: 30          # RPM limit for your tier
x-ratelimit-remaining-requests: 28      # Remaining requests this minute
x-ratelimit-reset-requests: 2026-03-03T09:00:00.000Z
x-concurrent-limit: 5                   # Max parallel requests
x-concurrent-remaining: 3               # Available parallel slots
  • Exceeding any limit returns 429 with a Retry-After header
  • 50+ consecutive failed auth attempts from the same IP triggers auto-block (403)
  • IP blocks expire automatically after 5 minutes
  • Balance exhaustion returns 402 — top up
  • Your current tier is shown in your dashboard — upgrade plan for higher limits
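A client can read these headers to throttle itself before hitting 429; a minimal Python sketch (the helper name is ours, header names are from the docs above):

```python
def rate_limit_status(headers: dict) -> dict:
    """Summarize the rate-limit headers returned on every EzAI response."""
    return {
        "tier": headers.get("x-ratelimit-tier"),
        "rpm_remaining": int(headers.get("x-ratelimit-remaining-requests", 0)),
        "concurrent_remaining": int(headers.get("x-concurrent-remaining", 0)),
    }

status = rate_limit_status({
    "x-ratelimit-tier": "topup",
    "x-ratelimit-remaining-requests": "28",
    "x-concurrent-remaining": "3",
})
print(status["rpm_remaining"])  # → 28
```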

Error Codes

| Code | Meaning | What to Do |
|---|---|---|
| 400 | Bad request / invalid body | Check your JSON payload and required fields |
| 401 | Invalid or missing API key | Check x-api-key header value |
| 402 | Insufficient balance | Top up credits in dashboard |
| 403 | IP blocked or Cloudflare WAF | Wait 5 min (IP block) or set custom User-Agent (SDK) |
| 405 | Method not allowed | Use POST for /v1/messages |
| 408 | Request timeout | Retry — upstream model was slow to respond |
| 413 | Payload too large | Reduce input tokens or message count |
| 429 | Rate limit exceeded (RPM or concurrent) | Check Retry-After header; reduce request frequency |
| 502 | Upstream error | Retry after a few seconds |
| 503 | All upstream providers busy | Retry with exponential backoff |
| 529 | Upstream overloaded | Wait and resend — automatic retry recommended |
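The table above suggests a simple retry policy: back off on 408/429/502/503/529, honor Retry-After, and fail fast on other client errors. An illustrative Python sketch under those assumptions (the send callable and helper name are ours):

```python
import random
import time

RETRYABLE = {408, 429, 502, 503, 529}

def retry_request(send, max_attempts: int = 5, base_delay: float = 1.0):
    """Call send() until success or a non-retryable status.
    send() must return (status_code, headers, body). Backs off exponentially
    with jitter, honoring a Retry-After header when present."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE:
            raise RuntimeError(f"non-retryable status {status}")
        delay = float(headers.get("Retry-After", base_delay * 2 ** attempt))
        time.sleep(delay + random.uniform(0, 0.1))  # jitter avoids thundering herd
    raise RuntimeError("max attempts exceeded")
```

Wire `send` to whatever HTTP client you use; the policy is independent of the transport.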

Troubleshooting

403 Forbidden with OpenAI SDK

Cloudflare WAF blocks the default User-Agent: OpenAI/Python header. Set a custom User-Agent:

client = OpenAI(
    base_url="https://ezaiapi.com/v1",
    api_key="sk-your-key",
    default_headers={"User-Agent": "EzAI/1.0"}
)

403 Forbidden — IP Blocked

Your IP was auto-blocked after 50+ failed auth attempts. Wait 5 minutes for the block to expire, then fix your API key. Check the key in your dashboard.

429 Too Many Requests

You've exceeded your tier's rate limit (RPM or concurrent). Check the Retry-After header for when to retry. If your balance is exhausted (402), top up in the dashboard. See Rate Limits for your tier's limits.

Connection Timeout / Reset

Intermittent upstream connection issues. The proxy automatically retries. If persistent, try again after a few seconds — the upstream model may be under heavy load.

Claude Code not using proxy

Verify your environment variables are set:

echo $ANTHROPIC_BASE_URL    # Should show https://ezaiapi.com
echo $ANTHROPIC_API_KEY     # Should show your API key
cat ~/.claude/settings.json # Should contain env block

Changelog

Mar 2026
  • +GPT-5.3 Codex, GPT-4.1 Mini/Nano, o3/o3-mini/o4-mini
  • +Streaming usage fix: input_tokens now reported correctly in message_delta
  • +Cache token tracking in Usage tab
  • +Responses API endpoint /v1/responses (n8n, LangChain)
  • +Legacy model auto-mapping (Claude 3.x → 4.x)
  • +GPT-5.1 Codex Mini/Max, GPT-5.2 Codex
  • +LZT Market payment (Invoice + Transfer)
  • +Crypto payment via NOWPayments (350+ coins)
  • +Per-user rate limiting (RPM + Concurrent, 6 tiers)
  • +Reseller system (reseller panel, end-user management, API)
  • +Database optimization (daily aggregation, auto-cleanup)
  • +Standalone reseller panel at /reseller
Feb 2026
  • +Free models: Step 3.5 Flash, GLM 4.5 Air, Nemotron 3 Nano
  • +OpenAI-compatible endpoint /v1/chat/completions
  • +Claude Sonnet 4.6 support
  • +Model discovery endpoint GET /v1/models
Jan 2026
  • +GPT-5.2, GPT-5.2-codex
  • +Gemini 3.1 Pro
  • +Extended thinking support
  • +Real-time usage dashboard
Dec 2025
  • 🚀 EzAI API launch