Token Station

One API key, one format — access every model across 12 AI providers.

4.5B tokens served to date: OpenAI 1.2B · Anthropic 980M · Gemini 870M · Groq 650M · xAI 420M · Bailian 380M
🔌 OpenAI Compatible API
🧪 12 Model Labs
🤖 100+ Models
💬 Chat & Reasoning
👨‍💻 Coding
🛠️ Agentic Tools & Skills
🔗 MCPs
👁️ Image Understanding
🎨 Image Creation
🎬 Video Creation
🎤 Speech to Text
🔊 Text to Speech

Supported APIs

Chat Completions — OpenAI-compatible universal LLM API /v1/chat/completions

Works with every LLM provider. Send OpenAI-format requests — the gateway translates to each provider's native format when needed, and preserves raw OpenAI request bytes for native OpenAI traffic. Supports text, images, streaming, and tool use.

curl -X POST http://GATEWAY/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      {"role": "system", "content": "You are helpful."},
      {"role": "user", "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]}
    ],
    "max_completion_tokens": 1024,
    "stream": true
  }'

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

response = client.chat.completions.create(
    model="openai/gpt-5.4",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
        ]}
    ],
    max_completion_tokens=1024,
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
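
Tool use rides on the same wire format. A minimal sketch of a tool-calling request body in the OpenAI function-calling schema; the get_weather function here is illustrative, defined by your application, not a gateway built-in:

```python
import json

# Illustrative tool definition in the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, implemented by your app
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

request_body = {
    "model": "openai/gpt-5.4",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

# POST this to http://GATEWAY/v1/chat/completions; the gateway translates
# the tools array for providers that use a different tool-call wire format.
print(json.dumps(request_body, indent=2))
```

When the model decides to call the tool, the response carries a tool_calls entry; you run the function and send the result back as a "role": "tool" message, exactly as with OpenAI upstream.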

Responses API — OpenAI-compatible universal LLM API /v1/responses

Stateless /v1/responses works with every LLM provider. The gateway uses the native Responses endpoint where available (OpenAI, Groq, xAI, Bailian, Codex) and translates to Anthropic Messages or OpenAI Chat Completions behind the scenes for the rest. Stateful usage — threading reasoning continuity across turns via encrypted_content or Anthropic thinking blocks — is only preserved on providers that natively support it (OpenAI, OpenAI Codex, Anthropic, Claude Code); see the model list for the Stateful badge.

curl -X POST http://GATEWAY/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "input": [
      {
        "role": "user",
        "content": [
          {"type": "input_text", "text": "Describe this image in detail."},
          {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
        ]
      }
    ]
  }'

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

response = client.responses.create(
    model="anthropic/claude-sonnet-4-6",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Describe this image in detail."},
                {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
            ]
        }
    ]
)

print(response.output_text)
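
On providers with the Stateful badge, reasoning continuity across turns uses the standard Responses-API chaining field. A sketch assuming the gateway accepts previous_response_id the way OpenAI does; the response id below is a placeholder:

```python
import json

# First-turn request, as above; the gateway returns a response with an id.
first_turn = {
    "model": "openai/gpt-5.4",
    "input": "Think through 17 * 24 step by step.",
}

# Follow-up turn: chain to the previous response so the provider can reuse
# its reasoning state. Only providers with the Stateful badge honor this;
# elsewhere, resend the prior turns in `input` instead.
follow_up = {
    "model": "openai/gpt-5.4",
    "previous_response_id": "resp_abc123",  # placeholder id from the first response
    "input": "Now divide that result by 8.",
}

print(json.dumps(follow_up, indent=2))
```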

Anthropic Messages API — Anthropic-compatible universal LLM API /v1/messages

Works with every LLM provider. Native Anthropic and Claude Code requests pass through byte-for-byte (preserving signed thinking blocks); everything else is translated into the provider's native format and streamed back as Anthropic SSE. Use this when your client speaks the Anthropic contract.

curl -X POST http://GATEWAY/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "system": "You are helpful.",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain what this gateway does."}
    ],
    "stream": true
  }'

import httpx

resp = httpx.post(
    "http://GATEWAY/v1/messages",
    headers={
        "Authorization": "Bearer gw-YOUR_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "anthropic/claude-sonnet-4-6",
        "system": "You are helpful.",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": "Explain what this gateway does."}
        ]
    },
)

print(resp.json())
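
With "stream": true, every provider's output comes back as Anthropic-style SSE, where text arrives in content_block_delta events. A minimal accumulator over the raw event stream; the sample lines below are hand-written to match the Anthropic streaming shape, not captured gateway output:

```python
import json

# Hand-written sample of Anthropic-style SSE lines, as the gateway streams them.
sse_lines = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}',
    'event: message_stop',
    'data: {"type": "message_stop"}',
]

text = []
for line in sse_lines:
    if not line.startswith("data: "):
        continue  # skip the `event:` framing lines
    event = json.loads(line[len("data: "):])
    # Accumulate only text deltas; other event types carry metadata.
    if event.get("type") == "content_block_delta" and event["delta"].get("type") == "text_delta":
        text.append(event["delta"]["text"])

print("".join(text))  # → Hello, world
```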

Speech to Text /v1/audio/transcriptions

Transcribe audio files with OpenAI speech-to-text models such as Whisper and GPT-4o Transcribe. Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, and webm.

curl -X POST http://GATEWAY/v1/audio/transcriptions \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -F file=@audio.mp3 \
  -F model=openai/gpt-4o-transcribe

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="openai/gpt-4o-transcribe",
        file=f
    )

print(transcript.text)

Text to Speech /v1/audio/speech

Convert text to natural-sounding speech using OpenAI TTS models. Voices: alloy, echo, fable, onyx, nova, shimmer.

curl -X POST http://GATEWAY/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "openai/tts-1",
    "input": "Hello, world! Welcome to Token Station.",
    "voice": "alloy"
  }' --output speech.mp3

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

# stream_to_file on the plain response is deprecated in recent SDK versions;
# use the streaming-response variant instead.
with client.audio.speech.with_streaming_response.create(
    model="openai/tts-1",
    input="Hello, world! Welcome to Token Station.",
    voice="alloy"
) as response:
    response.stream_to_file("speech.mp3")

Image Generation /v1/images/generations

Generate images from text prompts. Supports OpenAI DALL-E / GPT Image, Google Imagen, and xAI Grok.

curl -X POST http://GATEWAY/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "openai/gpt-image-1.5",
    "prompt": "A sunset over mountains, oil painting style",
    "n": 1,
    "size": "1024x1024"
  }'

import base64

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

response = client.images.generate(
    model="openai/gpt-image-1.5",
    prompt="A sunset over mountains, oil painting style",
    n=1,
    size="1024x1024"
)

image = response.data[0]
if image.url:  # DALL-E-style models return a hosted URL
    print(image.url)
else:  # GPT Image models return base64-encoded data
    with open("sunset.png", "wb") as f:
        f.write(base64.b64decode(image.b64_json))

Video Generation /v1/video/generations

Generate videos from text or images. Supports Gemini Veo, Kling, Bailian Wan, BytePlus Seedance, xAI Grok, and OpenAI Sora. The gateway handles async polling internally.

# Text-to-video
curl -X POST http://GATEWAY/v1/video/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "gemini/veo-3.1-generate-preview",
    "prompt": "A timelapse of a flower blooming in a garden",
    "aspect_ratio": "16:9"
  }'

# Image-to-video
curl -X POST http://GATEWAY/v1/video/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gw-YOUR_KEY" \
  -d '{
    "model": "kling/kling-v3-0",
    "prompt": "The character turns and walks away",
    "image": "https://example.com/photo.jpg",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'

from openai import OpenAI

client = OpenAI(
    base_url="http://GATEWAY/v1",
    api_key="gw-YOUR_KEY"
)

# Text-to-video (custom endpoint)
response = client.post(
    "/v1/video/generations",
    body={
        "model": "gemini/veo-3.1-generate-preview",
        "prompt": "A timelapse of a flower blooming in a garden",
        "aspect_ratio": "16:9"
    },
    cast_to=object
)

print(response)

Claude Code via Token Station

Claude Code speaks the Anthropic Messages API. Point it at Token Station's Anthropic-compatible /v1/messages endpoint, then use environment variables to route it to real Anthropic models or to translated OpenAI models.

export ANTHROPIC_BASE_URL="https://models.bytefuture.ai"
export ANTHROPIC_AUTH_TOKEN="YOUR TOKEN AT TOKEN STATION"

export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4-7"
export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4-6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4-5"
export CLAUDE_CODE_SUBAGENT_MODEL="anthropic/claude-sonnet-4-6"

claude -p "Respond with exactly the word: pong"

# Or route Claude Code's model tiers to OpenAI models instead:
export ANTHROPIC_BASE_URL="https://models.bytefuture.ai"
export ANTHROPIC_AUTH_TOKEN="YOUR TOKEN AT TOKEN STATION"

export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-5.4"
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5.4-mini"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-5.4-nano"
export CLAUDE_CODE_SUBAGENT_MODEL="openai/gpt-5.4"

claude -p "Respond with exactly the word: pong"

Codex via Token Station

Codex uses its own ~/.codex/config.toml provider configuration. For a custom provider, current Codex expects wire_api = "responses". Point Codex at Token Station's OpenAI-compatible base URL and supply your gateway token through the configured environment variable.

mkdir -p ~/.codex
cat > ~/.codex/config.toml <<'EOF'
model = "openai/gpt-5.4"
model_provider = "token_station"

[model_providers.token_station]
name = "Token Station"
base_url = "https://models.bytefuture.ai/v1"
env_key = "TOKEN_STATION_API_KEY"
wire_api = "responses"
EOF

export TOKEN_STATION_API_KEY="YOUR TOKEN AT TOKEN STATION"

codex exec "Respond with exactly the word: pong"

# Or run Codex against an Anthropic model through the same gateway:
mkdir -p ~/.codex
cat > ~/.codex/config.toml <<'EOF'
model = "anthropic/claude-sonnet-4-6"
model_provider = "token_station"

[model_providers.token_station]
name = "Token Station"
base_url = "https://models.bytefuture.ai/v1"
env_key = "TOKEN_STATION_API_KEY"
wire_api = "responses"
EOF

export TOKEN_STATION_API_KEY="YOUR TOKEN AT TOKEN STATION"

codex exec "Respond with exactly the word: pong"