Quickstart — KodaAPI Docs

Introduction

KodaAPI is an AI API gateway that gives you unified access to 100+ large language models from multiple providers — Anthropic Claude, OpenAI GPT, Google Gemini, DeepSeek, xAI Grok, Moonshot Kimi, Alibaba Qwen, Z.AI GLM and more — through a single endpoint.

One key — a single API key works for every model
One endpoint — https://kodaapi.com/v1, fully compatible with the OpenAI SDK
One balance — points deducted per token, no provider-by-provider billing
Smart aliases — use best, fast, smart to always get the right model

OpenAI-compatible

KodaAPI implements the OpenAI Chat Completions API (/v1/chat/completions) and Images API (/v1/images/generations). Any library that supports a custom base_url works out of the box.

Prerequisites

Create an account

Go to kodaapi.com/portal and register with your email, or sign in with Google. New accounts receive 500 free points to get started.

Create an API key

In the dashboard, go to API Keys → Create key. Give it a name (e.g. Development) and click Create. Copy the key immediately — it is only shown once.

Keep your key secret

Never commit API keys to source control or expose them in client-side code. Store them as environment variables.

Set the environment variable

Shell

export KODA_API_KEY="your-api-key-here"
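
To fail early with a clear message when the variable is missing, a script can check for it at startup. The following is a minimal sketch: the helper name `load_api_key` is hypothetical and not part of any KodaAPI SDK; only the `KODA_API_KEY` variable name comes from this guide.

```python
import os


def load_api_key(env=os.environ, name="KODA_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running this script")
    return key
```

Passing the mapping in as `env` keeps the helper easy to test without touching the real environment.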

First request

The simplest way to test your key is with curl:

cURL

curl https://kodaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KODA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smart",
    "messages": [
      { "role": "user", "content": "Hello! What can you do?" }
    ]
  }'

You'll get a JSON response with the model's reply. The model field in the response shows which model the alias resolved to.
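
As a rough illustration of reading that response, the sketch below pulls the resolved model and the reply text out of a payload in the OpenAI chat completion shape. The sample payload is made up for illustration, and the helper name `summarize` is hypothetical; a real response will carry additional fields.

```python
import json

# Illustrative payload in the OpenAI chat.completion shape; the values are made up.
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hi! I can answer questions."},
      "finish_reason": "stop"
    }
  ]
}
"""


def summarize(raw):
    """Return (resolved_model, reply_text) from a chat completion JSON string."""
    data = json.loads(raw)
    reply = data["choices"][0]["message"]["content"]
    return data["model"], reply


model, reply = summarize(SAMPLE_RESPONSE)
print(f"{model}: {reply}")
```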

Python

Install the official OpenAI Python library:

Shell

pip install openai

Basic completion

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["KODA_API_KEY"],
    base_url="https://kodaapi.com/v1",
)

response = client.chat.completions.create(
    model="smart",  # or "claude-sonnet-4-6", "gpt-4o", etc.
    messages=[
        {"role": "user", "content": "Explain quantum entanglement simply."}
    ],
)

print(response.choices[0].message.content)

Streaming

Python

stream = client.chat.completions.create(
    model="fast",
    messages=[{"role": "user", "content": "Write a haiku about APIs."}],
    stream=True,
)

for chunk in stream:
    # Some chunks can arrive with an empty choices list, so guard before indexing.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

With a system prompt

Python

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "system", "content": "You are a concise assistant. Reply in 1-2 sentences."},
        {"role": "user",   "content": "What is the capital of Vietnam?"},
    ],
    max_tokens=100,
    temperature=0.7,
)

print(response.choices[0].message.content)

JavaScript / Node.js

Install the OpenAI Node.js library:

Shell

npm install openai

Basic completion

JavaScript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.KODA_API_KEY,
  baseURL: "https://kodaapi.com/v1",
});

const response = await client.chat.completions.create({
  model: "smart",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

Streaming

JavaScript

const stream = await client.chat.completions.create({
  model: "fast",
  messages: [{ role: "user", content: "Tell me a short story." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Multi-turn conversation

JavaScript

const messages = [
  { role: "system",    content: "You are a helpful coding assistant." },
  { role: "user",      content: "Write a function that reverses a string in Python." },
];

const first = await client.chat.completions.create({ model: "smart", messages });
messages.push(first.choices[0].message);

// Continue the conversation
messages.push({ role: "user", content: "Now add type hints." });
const second = await client.chat.completions.create({ model: "smart", messages });
console.log(second.choices[0].message.content);

cURL

cURL is useful for quick tests and shell scripts.

Chat completion

cURL

curl https://kodaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KODA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{ "role": "user", "content": "What is 2+2?" }],
    "max_tokens": 50
  }'

Streaming with cURL

cURL

curl https://kodaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KODA_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "fast",
    "messages": [{ "role": "user", "content": "Count to 5." }],
    "stream": true
  }'

Any OpenAI-compatible client

Any tool or library that lets you set a custom base_url works with KodaAPI. Just point it at https://kodaapi.com/v1.

Claude Code

Shell

# Set environment variables before launching Claude Code
export ANTHROPIC_BASE_URL="https://kodaapi.com/v1"
export ANTHROPIC_API_KEY="your-koda-api-key"
claude

Cursor / Windsurf

Settings

# Cursor / Windsurf → Settings → AI → Override OpenAI Base URL

Base URL: https://kodaapi.com/v1
API Key:  your-koda-api-key

LangChain

Python

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="smart",
    openai_api_key=os.environ["KODA_API_KEY"],  # read from the environment instead of hard-coding
    openai_api_base="https://kodaapi.com/v1",
)

result = llm.invoke("Explain RAG in one paragraph.")
print(result.content)

Plain fetch (no SDK)

JavaScript

const res = await fetch("https://kodaapi.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.KODA_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "fast",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});

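if (!res.ok) {
  // Non-2xx responses carry an error object instead of choices.
  throw new Error(`KodaAPI request failed with HTTP ${res.status}`);
}

const data = await res.json();
console.log(data.choices[0].message.content);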

Models & aliases

You can use any full model ID, or one of the smart aliases that automatically resolve to the best available model for that use case.

Smart aliases

Alias	Resolves to	Best for
`best`	`claude-opus-4-8`	Most capable
`smart`	`claude-sonnet-4-6`	Balanced
`fast`	`gemini-3.1-flash-lite`	Speed & cost
`mini`	`gemini-2.0-flash-lite`	Cheapest

Supported providers

Anthropic Claude Opus · Sonnet · Haiku

OpenAI GPT-4o · o3 · o4-mini

Google Gemini 2.5 Pro · 3.5 Flash

DeepSeek DeepSeek V4 Pro · Reasoner

xAI Grok 4.3 · Grok Build

Moonshot Kimi K2.5 · K2.6 · K2.7

Alibaba Qwen Max · Plus · VL

Z.AI GLM-5.1 · GLM-5V-Turbo

BytePlus Seed 2.0 Pro · Lite · Mini

MiniMax MiniMax M3 · Image-01

Meta & Mistral Llama 4 · Mistral Large

Popular model IDs

Model ID	Provider	Context
`claude-opus-4-8`	Anthropic	200k
`claude-sonnet-4-6`	Anthropic	200k
`gpt-4o`	OpenAI	128k
`o3`	OpenAI	200k
`gemini-2.5-pro`	Google	1M
`gemini-3.5-flash`	Google	1M
`deepseek-chat`	DeepSeek	64k
`deepseek-reasoner`	DeepSeek	64k
`grok-4.3`	xAI	131k
`kimi-k2.5`	Moonshot	128k
`qwen-max`	Alibaba	32k
`glm-5.1`	Z.AI	128k
`seed-2-0-pro-260328`	BytePlus	32k
`llama-4-maverick`	Meta	128k
`MiniMax-M3`	MiniMax	1M
`mistral-large-latest`	Mistral	128k

See the full list at kodaapi.com/models.
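
When a prompt is long, the context column is usually the first filter. The sketch below copies the table into a dictionary and lists the models whose window is large enough. It assumes "k" means 1,000 tokens and "1M" means 1,000,000; the helper name `models_with_context` is hypothetical, and the figures are a snapshot of this page rather than a live lookup.

```python
# Context windows copied from the table above; "k" is assumed to mean 1,000 tokens.
CONTEXT_TOKENS = {
    "claude-opus-4-8": 200_000,
    "claude-sonnet-4-6": 200_000,
    "gpt-4o": 128_000,
    "o3": 200_000,
    "gemini-2.5-pro": 1_000_000,
    "gemini-3.5-flash": 1_000_000,
    "deepseek-chat": 64_000,
    "deepseek-reasoner": 64_000,
    "grok-4.3": 131_000,
    "kimi-k2.5": 128_000,
    "qwen-max": 32_000,
    "glm-5.1": 128_000,
    "seed-2-0-pro-260328": 32_000,
    "llama-4-maverick": 128_000,
    "MiniMax-M3": 1_000_000,
    "mistral-large-latest": 128_000,
}


def models_with_context(min_tokens):
    """Return model IDs whose listed context window is at least min_tokens, sorted by ID."""
    return sorted(m for m, size in CONTEXT_TOKENS.items() if size >= min_tokens)


print(models_with_context(500_000))
```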

Streaming

Add "stream": true to receive server-sent events (SSE) as the model generates tokens, instead of waiting for the full response.

The response format follows the OpenAI streaming spec: each SSE event contains a delta with a partial content string. The stream ends with data: [DONE].
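
For clients that read the raw stream without an SDK, the event format described above can be parsed roughly as follows. This is a sketch under assumptions: the sample events are hand-written in the OpenAI streaming shape rather than captured from a real call, and the helper name `iter_deltas` is hypothetical.

```python
import json


def iter_deltas(lines):
    """Yield partial content strings from an iterable of SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Illustrative events in the OpenAI streaming shape; not captured from a real call.
SAMPLE_STREAM = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_deltas(SAMPLE_STREAM)))
```

In a real client, `lines` would be the decoded lines of the HTTP response body; the OpenAI SDKs shown earlier do this parsing for you.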

Tip

Streaming is especially valuable for long responses or interactive UI — users see output immediately instead of waiting several seconds for the full reply.

Error codes

HTTP status	Error type	Meaning
`401`	`auth_error`	Missing or invalid API key
`402`	`payment_error`	Insufficient balance — top up your account
`400`	`validation_error`	Malformed request body
`404`	`not_found`	Model ID not found
`429`	`rate_limit`	Too many requests — slow down
`502`	`provider_error`	Upstream model provider returned an error
`500`	`server_error`	Internal error — try again

All errors follow this JSON shape:

JSON

{
  "error": {
    "message": "Insufficient balance",
    "type": "payment_error"
  }
}
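
One way to act on these errors is to read the `type` field and retry only the failures that are likely to be transient. The sketch below is an assumption-laden illustration: the retryable set is a judgment call based on the table above, and the helper names (`parse_error`, `should_retry`, `backoff_delays`) are hypothetical.

```python
import json

# Error types from the table above that may succeed on retry; this split is a judgment call.
RETRYABLE = {"rate_limit", "provider_error", "server_error"}


def parse_error(body):
    """Return (type, message) from an error response body in the documented shape."""
    error = json.loads(body).get("error", {})
    return error.get("type", "unknown"), error.get("message", "")


def should_retry(error_type):
    """Return True for error types that are worth retrying after a delay."""
    return error_type in RETRYABLE


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Return exponential backoff delays in seconds: base, 2*base, 4*base, ... capped."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


body = '{"error": {"message": "Insufficient balance", "type": "payment_error"}}'
error_type, message = parse_error(body)
print(error_type, should_retry(error_type))
```

Authentication, payment, validation, and not-found errors are left out of the retryable set because repeating the same request would not change the outcome.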

Next steps

Browse all models

100+ models from Anthropic, OpenAI, Google, DeepSeek, xAI, Moonshot and more.

Manage API keys

Create, revoke, and monitor usage in the dashboard.

Usage & billing

Track token usage and points balance by key and model.

Get support

Email us at hello@kodaapi.com — we reply within 24 hours.