Introduction

SharedLLM is an OpenAI- and Anthropic-compatible gateway. Point your existing SDK at our base URL, add one header, and your requests are routed through the cheapest available key — your own contributed provider keys first, then the shared community pool.

You don't change your code beyond the base URL and a single header. Request and response shapes are exactly those of each provider's native API.

Base URL: https://api.sharedllm.com

Quickstart

Create a virtual key on the API Keys page (it looks like sk-sharedllm-…), then call any supported endpoint.

cURL — OpenAI-compatible
curl https://api.sharedllm.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-SharedLLM-Key: sk-sharedllm-YOUR_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Python — OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sharedllm.com/openai/v1",
    api_key="unused",  # provider key is injected by the gateway
    default_headers={"X-SharedLLM-Key": "sk-sharedllm-YOUR_KEY"},
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

Authentication

Every request carries your virtual key in the X-SharedLLM-Key header. This identifies you for rate limiting, billing, and key routing — it is never your provider key.

The provider's own Authorization header is optional. If you omit it, the gateway injects a contributed or pooled key on your behalf. If you supply your own, it's used as-is (handy for the CLI and provider-native tools).
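
The two modes described above can be sketched as a small header builder. This is an illustrative helper, not part of any SDK: the name build_headers is hypothetical, and the Bearer scheme is an assumption that fits OpenAI-style APIs (other providers may expect a different header).

```python
from typing import Dict, Optional


def build_headers(virtual_key: str, provider_key: Optional[str] = None) -> Dict[str, str]:
    """Build gateway request headers (hypothetical helper for illustration)."""
    headers = {
        "Content-Type": "application/json",
        "X-SharedLLM-Key": virtual_key,  # the virtual key is sent on every request
    }
    if provider_key is not None:
        # Assumption: OpenAI-style bearer token; used as-is by the gateway.
        headers["Authorization"] = f"Bearer {provider_key}"
    return headers


# No provider key: the gateway injects a contributed or pooled key.
pooled = build_headers("sk-sharedllm-YOUR_KEY")

# Own provider key supplied alongside the virtual key.
own = build_headers("sk-sharedllm-YOUR_KEY", provider_key="sk-YOUR_PROVIDER_KEY")
```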

Keep your virtual key secret. Revoke and rotate it any time from the API Keys page.

Endpoints

Append the provider-native path to the base URL:

OpenAI-compatible: https://api.sharedllm.com/openai/v1/chat/completions
Anthropic Messages: https://api.sharedllm.com/anthropic/v1/messages
Ollama: https://api.sharedllm.com/ollama/api/chat

The path after the provider segment is forwarded verbatim, so anything the upstream API supports (streaming, tools, vision, embeddings) works unchanged.
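
Because the path is forwarded verbatim, any other upstream route can be reached by swapping the native path. A minimal sketch, assuming the three provider segments listed above; gateway_url is a hypothetical helper name, and /v1/embeddings is OpenAI's own embeddings path.

```python
BASE_URL = "https://api.sharedllm.com"

# Provider segments taken from the endpoint list above.
PROVIDER_SEGMENTS = {"openai", "anthropic", "ollama"}


def gateway_url(provider: str, native_path: str) -> str:
    """Prefix a provider-native path with the gateway base URL and provider segment."""
    if provider not in PROVIDER_SEGMENTS:
        raise ValueError(f"unknown provider segment: {provider!r}")
    return f"{BASE_URL}/{provider}/{native_path.lstrip('/')}"


print(gateway_url("openai", "/v1/embeddings"))
# prints https://api.sharedllm.com/openai/v1/embeddings
```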

cURL — Anthropic Messages
curl https://api.sharedllm.com/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "X-SharedLLM-Key: sk-sharedllm-YOUR_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
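
The same call can be made from Python without a provider SDK. The sketch below mirrors the cURL request using only the standard library; the send helper is hypothetical, and the commented-out lines assume the upstream Anthropic response shape (a content list whose first item has a text field).

```python
import json
import urllib.request

payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Build the request; nothing is sent until urlopen() is called.
req = urllib.request.Request(
    "https://api.sharedllm.com/anthropic/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",
        "X-SharedLLM-Key": "sk-sharedllm-YOUR_KEY",
    },
    method="POST",
)


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs network access and a valid key)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


# reply = send(req)
# print(reply["content"][0]["text"])
```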

Custom endpoints

Already running an OpenAI- or Anthropic-compatible server somewhere else? Add it to your Resource Pool as a custom endpoint and route through it just like a built-in provider.

Custom OpenAI-compatible: https://api.sharedllm.com/custom-openai/v1/chat/completions
Custom Anthropic-compatible: https://api.sharedllm.com/custom-anthropic/v1/messages

When contributing a custom endpoint you provide its base URL (the root, e.g. https://my-host.example.com) and your key. The gateway appends the incoming path to that root at request time.
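
To make the path handling concrete, here is a rough sketch of how a gateway path could map to an upstream URL. It assumes the "incoming path" is whatever follows the provider segment; the function name upstream_url and the exact join rules are illustrative, not the gateway's actual code.

```python
def upstream_url(custom_root: str, gateway_path: str, segment: str) -> str:
    """Map a gateway path to the upstream URL under a contributed root (illustrative)."""
    prefix = f"/{segment}"
    if not gateway_path.startswith(prefix + "/"):
        raise ValueError(f"path does not start with {prefix}/")
    forwarded = gateway_path[len(prefix):]  # e.g. "/v1/chat/completions"
    return custom_root.rstrip("/") + forwarded


print(upstream_url(
    "https://my-host.example.com",
    "/custom-openai/v1/chat/completions",
    "custom-openai",
))
# prints https://my-host.example.com/v1/chat/completions
```

This is why the contributed base URL should be the root of the server: a root that already ends in /v1 would produce a doubled /v1/v1/... path under this scheme.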

Base URLs are validated against SSRF: localhost, cloud metadata addresses, and private network ranges are rejected. You also choose, per key, whether to share it with the pool (earning credits) or keep it private to your account.
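
If you want to pre-check a base URL before submitting it, a check along these lines catches the categories named above. It is only a sketch of the idea: the gateway's real validation rules are not documented here, is_allowed_base_url is a hypothetical name, and this version does not resolve DNS names, which a production validator would also need to do.

```python
import ipaddress
from urllib.parse import urlsplit


def is_allowed_base_url(url: str) -> bool:
    """Reject localhost, link-local (cloud metadata), and private-range hosts."""
    parts = urlsplit(url)
    host = parts.hostname  # lowercased, brackets stripped for IPv6
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Assumption: DNS names pass here; a real validator
        # would resolve the name and check every resulting address as well.
        return True
    return not (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local  # covers 169.254.169.254, a common metadata address
        or ip.is_reserved
        or ip.is_unspecified
    )
```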

Models

The Models catalog is synced hourly from each provider's models endpoint. Each entry shows input/output price per 1M tokens, context window, max output, and how many pool keys are currently serving that model.

Pass the model id exactly as the provider names it (e.g. gpt-4o-mini, claude-3-5-sonnet-20241022) in the request body.
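
The per-1M-token prices in the catalog translate into a per-request cost with simple arithmetic. The sketch below shows the calculation; the prices used in the example are placeholders, not values from the catalog, and estimate_cost_usd is a hypothetical helper.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    total = input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    return total / 1_000_000


# Placeholder prices: 0.15 USD per 1M input tokens, 0.60 USD per 1M output tokens.
cost = estimate_cost_usd(2_000, 500, input_price_per_1m=0.15, output_price_per_1m=0.60)
print(f"{cost:.6f} USD")  # roughly 0.0006 USD
```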

Resource pool & credits

The pool is the heart of SharedLLM. Routing prefers, in order:

  1. Your own contributed keys (no balance charge).
  2. The shared community pool (debited from your USD balance).

When someone else's request is served by a key you shared, you earn credits. If an upstream key returns 401, 402, 403, or 429, the gateway transparently retries with the next available key, so your request still succeeds as long as another key can serve it.
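
The routing order and failover behaviour can be pictured with a short sketch. This is a simplified model under stated assumptions, not the gateway's implementation: route and the send callback are hypothetical, and real routing also weighs price and availability.

```python
from typing import Callable, Iterable, Optional, Tuple

# Upstream statuses that trigger a retry on the next key (from the paragraph above).
RETRYABLE_STATUSES = {401, 402, 403, 429}


def route(
    own_keys: Iterable[str],
    pool_keys: Iterable[str],
    send: Callable[[str], Tuple[int, str]],
) -> Tuple[int, str, str]:
    """Try own keys first, then pool keys; return (status, body, source)."""
    last: Optional[Tuple[int, str, str]] = None
    for source, keys in (("own", own_keys), ("pool", pool_keys)):
        for key in keys:
            status, body = send(key)
            if status not in RETRYABLE_STATUSES:
                return status, body, source  # served by this key
            last = (status, body, source)  # remember the failure, try the next key
    if last is None:
        raise RuntimeError("no keys available")
    return last  # every key failed; surface the last upstream error


def fake_send(key: str) -> Tuple[int, str]:
    """Stand-in for an upstream call: the caller's own key is rate limited."""
    return (429, "rate limited") if key == "own-1" else (200, "ok")


print(route(["own-1"], ["pool-1"], fake_send))
# prints (200, 'ok', 'pool')
```

In this model a request is billed only when the returned source is "pool", matching the rule that requests served by your own keys carry no balance charge.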

Billing & tiers

Balances are held in USD and topped up from the Usage & Billing page. You're only debited when a request is served by a pool key; requests served by your own contributed keys are free.

Tier        Daily requests  Max keys  Contributed keys
Free        1,000           2         1
Pro         50,000          10        5
Team        200,000         50        20
Enterprise  Unlimited       200       100

Rate limits

Each virtual key has per-minute request (RPM) and token (TPM) limits, plus a daily request cap set by your tier. Live consumption is shown as quota bars on the Usage page; the daily counters reset at 00:00 UTC.
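
If a client has hit its daily cap, it can compute how long to wait for the 00:00 UTC reset. A minimal sketch; seconds_until_daily_reset is a hypothetical helper, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_daily_reset(now: datetime) -> float:
    """Seconds from `now` (timezone-aware) until the next 00:00 UTC."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_utc).total_seconds()


example = datetime(2025, 1, 1, 23, 30, tzinfo=timezone.utc)
print(seconds_until_daily_reset(example))
# prints 1800.0
```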

Exceeding a limit returns HTTP 429 with a descriptive body — back off and retry, or upgrade your tier for higher ceilings.
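
One way to back off and retry is exponential delay between attempts. The sketch below is a generic pattern, not a documented client: call_with_backoff, the send callback, and the delay values are all assumptions, and it does not rely on any particular header in the 429 response.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it stops returning 429 or attempts run out (max_attempts >= 1)."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    return status, body  # still rate limited after the final attempt


# Simulated upstream: two 429 responses, then success. Delays are recorded, not slept.
responses = iter([(429, "slow down"), (429, "slow down"), (200, "ok")])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
# prints (200, 'ok') [1.0, 2.0]
```

Retrying helps with the per-minute limits; once the daily cap is reached, retries will keep failing until the counters reset.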