Documentation

Guides

Practical, code-first guides for the most common tasks — from your first API call to production-ready patterns.

⚡

Start here

Make your first API call in 2 minutes

Install the OpenAI SDK, point it at KodaAPI, and get a response from Claude, GPT-4o or Gemini — your choice — with a single configuration change.

1. Create account 2. Get API key 3. Set base URL 4. Call any model

Read Quickstart →

Core API

6 guides

💬

Chat completions

Send messages and receive model responses. Covers system prompts, multi-turn context, and the full request schema.

Beginner 5 min

🌊

Streaming responses

Receive tokens as they're generated using server-sent events. Eliminates the wait for long outputs and improves perceived latency.

Beginner 5 min

👁

Vision & image input

Send images alongside text to vision-capable models like Claude Sonnet, GPT-4o, and Gemini 2.5 Pro via URL or base64.

Beginner 8 min

🔄

Multi-turn conversations

Build a stateful chatbot by appending messages to a history array. Each API call is stateless — you manage the context window.

Intermediate 10 min

⚙️

Choosing a model

How to pick the right model for speed, cost, or capability. Use smart aliases like best and fast to always route to the optimal model.

Beginner 5 min

🎛️

Temperature & sampling

Control output randomness with temperature and top_p. When to use each, and recommended values for different task types.

Intermediate 5 min

Prompting

4 guides

📝

Writing effective system prompts

Set the model's persona, scope, and tone. How to constrain outputs and define task boundaries with precision.

Beginner 8 min Coming soon

🔢

Structured output & JSON mode

Get reliably formatted JSON from any model. Prompt patterns that work across providers without vendor-specific features.

Intermediate 10 min Coming soon

📚

Long context & document analysis

Process PDFs, codebases, or large documents with 1M+ context models like Gemini 2.5 Pro and Claude.

Intermediate 12 min Coming soon

🧠

Reasoning models (o3, o4-mini)

When to use reasoning models versus standard models. How thinking tokens work and how to interpret verbose chain-of-thought output.

Advanced 15 min Coming soon

Reliability & Production

4 guides

🔁

Error handling & retries

Handle 429, 502, and 500 errors gracefully with exponential backoff. Production-ready retry logic in Python and JavaScript.

Intermediate 10 min

💰

Cost optimisation

Reduce token spend with prompt compression, caching strategies, and smart model routing between cheap and capable models.

Intermediate 12 min Coming soon

🔀

Model fallback & routing

Automatically fall back to a different model when a provider is down. Use aliases to decouple your code from specific model versions.

Advanced 15 min Coming soon

🔒

API key security

Store keys safely in environment variables, rotate them on a schedule, and monitor for anomalous usage via the dashboard.

Beginner 8 min Coming soon

Integrations

6 guides

🤖

Claude Code

Route Claude Code's API calls through KodaAPI by setting ANTHROPIC_BASE_URL. Access all models from the CLI you already use.

Beginner 3 min

✏️

Cursor & Windsurf

Set a custom OpenAI base URL in your AI coding editor settings to use any KodaAPI model as your coding assistant.

Beginner 3 min

🦜

LangChain

Use ChatOpenAI with a custom openai_api_base to power chains, agents, and RAG pipelines with any model on KodaAPI.

Intermediate 8 min

🔗

Vercel AI SDK

Use the openai provider with a custom baseURL in Next.js, SvelteKit, or any framework supported by the Vercel AI SDK.

Intermediate 10 min Coming soon

🐍

LlamaIndex

Configure LlamaIndex's OpenAI LLM class with KodaAPI's base URL to build document-aware agents and RAG pipelines.

Advanced 12 min Coming soon

⚡

Edge & serverless

Deploy AI-powered endpoints on Cloudflare Workers, Vercel Edge Functions, or AWS Lambda with streaming support.

Advanced 15 min Coming soon

Quick reference

◈

All models

25+ models with pricing, context windows, and capabilities

→

⌗

API reference

Full parameter reference for POST /v1/chat/completions

→

⚠

Error codes

HTTP status codes, error types, and how to handle them

→

≈

Model aliases

best · smart · fast · mini — always resolves to the right model

→

Can't find what you need?

Send us an email and we'll reply within one business day.

hello@kodaapi.com →