Introduction
SharedLLM is an OpenAI- and Anthropic-compatible gateway. Point your existing SDK at our base URL, add one header, and your requests are routed through the cheapest available key — your own contributed provider keys first, then the shared community pool.
You don't change your code beyond the base URL and a single header. Request and response shapes are exactly those of each provider's native API.
Base URL: https://api.sharedllm.com
Quickstart
Create a virtual key in the API Keys page (it looks like sk-sharedllm-…), then call any supported endpoint.
curl https://api.sharedllm.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-SharedLLM-Key: sk-sharedllm-YOUR_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from openai import OpenAI
client = OpenAI(
    base_url="https://api.sharedllm.com/openai/v1",
    api_key="unused",  # provider key is injected by the gateway
    default_headers={"X-SharedLLM-Key": "sk-sharedllm-YOUR_KEY"},
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
Authentication
Every request carries your virtual key in the X-SharedLLM-Key header. This identifies you for rate limiting, billing, and key routing — it is never your provider key.
The provider's own Authorization header is optional. If you omit it, the gateway injects a contributed or pooled key on your behalf. If you supply your own, it's used as-is (handy for the CLI and provider-native tools).
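The two header modes described above can be sketched as a small helper. This is only an illustration: `build_headers` is a hypothetical name, and the `Bearer` scheme for the optional provider key is an assumption based on how OpenAI-style APIs authenticate; other providers may expect a different header.

```python
from typing import Dict, Optional


def build_headers(virtual_key: str, provider_key: Optional[str] = None) -> Dict[str, str]:
    """Build gateway request headers (hypothetical helper, not part of any SDK).

    virtual_key  -- your SharedLLM virtual key, always sent as X-SharedLLM-Key
    provider_key -- optional provider key; when omitted, the gateway is
                    expected to inject a contributed or pooled key
    """
    headers = {
        "Content-Type": "application/json",
        "X-SharedLLM-Key": virtual_key,
    }
    if provider_key is not None:
        # Assumption: Bearer scheme, as used by OpenAI-style APIs.
        headers["Authorization"] = f"Bearer {provider_key}"
    return headers


# Gateway-injected key: only the virtual key is sent.
pooled = build_headers("sk-sharedllm-YOUR_KEY")

# Bring-your-own key: the provider key travels alongside the virtual key.
own = build_headers("sk-sharedllm-YOUR_KEY", provider_key="YOUR_PROVIDER_KEY")
```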
Endpoints
Append the provider-native path to the base URL:
| Provider | Endpoint |
|---|---|
| OpenAI-compatible | https://api.sharedllm.com/openai/v1/chat/completions |
| Anthropic Messages | https://api.sharedllm.com/anthropic/v1/messages |
| Ollama | https://api.sharedllm.com/ollama/api/chat |
The path after the provider segment is forwarded verbatim, so anything the upstream API supports (streaming, tools, vision, embeddings) works unchanged.
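As a client-side illustration of that rule, the sketch below joins the base URL, a provider segment, and a provider-native path. `gateway_url` and `PROVIDER_SEGMENTS` are hypothetical names; the segments themselves are the ones that appear in this page's endpoint tables.

```python
BASE_URL = "https://api.sharedllm.com"

# Provider segments as they appear in this page's endpoint tables.
PROVIDER_SEGMENTS = frozenset(
    {"openai", "anthropic", "ollama", "custom-openai", "custom-anthropic"}
)


def gateway_url(provider: str, native_path: str) -> str:
    """Return the gateway URL for a provider-native path (hypothetical helper)."""
    if provider not in PROVIDER_SEGMENTS:
        raise ValueError(f"unknown provider segment: {provider!r}")
    # The native path is appended unchanged after the provider segment.
    return f"{BASE_URL}/{provider}/{native_path.lstrip('/')}"


chat_url = gateway_url("openai", "/v1/chat/completions")
embeddings_url = gateway_url("openai", "/v1/embeddings")
```

Because the path is forwarded as-is, any other upstream route should be reachable the same way, by swapping only the native path.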
curl https://api.sharedllm.com/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "X-SharedLLM-Key: sk-sharedllm-YOUR_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Custom endpoints
Already running an OpenAI- or Anthropic-compatible server somewhere else? Add it to your Resource Pool as a custom endpoint and route through it just like a built-in provider.
| Provider | Endpoint |
|---|---|
| Custom OpenAI-compatible | https://api.sharedllm.com/custom-openai/v1/chat/completions |
| Custom Anthropic-compatible | https://api.sharedllm.com/custom-anthropic/v1/messages |
When contributing a custom endpoint you provide its base URL (the root, e.g. https://my-host.example.com) and your key. The gateway appends the incoming path to that root at request time.
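One plausible reading of "appends the incoming path to that root" is sketched below. This is not the gateway's actual code: `resolve_upstream_url` is a hypothetical function, and it assumes the `custom-*` provider segment is stripped before the rest of the path is appended.

```python
def resolve_upstream_url(root: str, incoming_path: str) -> str:
    """Map a gateway path onto a contributed root URL (illustrative sketch).

    root          -- the contributed base URL, e.g. https://my-host.example.com
    incoming_path -- the path received by the gateway,
                     e.g. /custom-openai/v1/chat/completions
    """
    segment, _, forwarded = incoming_path.lstrip("/").partition("/")
    if not segment.startswith("custom-"):
        raise ValueError(f"not a custom endpoint path: {incoming_path!r}")
    # Assumption: the provider segment is dropped and the remainder is appended.
    return f"{root.rstrip('/')}/{forwarded}"


upstream = resolve_upstream_url(
    "https://my-host.example.com",
    "/custom-openai/v1/chat/completions",
)
```

If this reading is right, a root that already ends in `/v1` would yield a doubled `/v1/v1/...` path, which is why the root rather than a versioned URL is what you contribute.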
Models
The Models catalog is synced hourly from each provider's models endpoint. Each entry shows input/output price per 1M tokens, context window, max output, and how many pool keys are currently serving that model.
Pass the model id exactly as the provider names it (e.g. gpt-4o-mini, claude-3-5-sonnet-20241022) in the request body.
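The per-1M-token prices in the catalog translate into a per-request estimate as sketched below. The function name and the example prices are illustrative placeholders, not values from the catalog, and the sketch assumes a pool-served request is billed at exactly the listed price. Requests served by your own contributed keys are not charged, which the `served_by_pool` flag models.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
    served_by_pool: bool = True,
) -> float:
    """Estimate the balance debit for one request (illustrative sketch)."""
    if not served_by_pool:
        # Served by one of your own contributed keys: no balance charge.
        return 0.0
    total = input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    return total / 1_000_000


# Placeholder prices (USD per 1M tokens); read real values from the Models catalog.
cost = estimate_cost_usd(2_000, 500, input_price_per_1m=0.15, output_price_per_1m=0.60)
print(f"{cost:.6f} USD")
```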
Resource pool & credits
The pool is the heart of SharedLLM. Routing prefers, in order:
- Your own contributed keys (no balance charge).
- The shared community pool (debited from your USD balance).
When another user's request is served by a key you shared, you earn credits. If an upstream key returns 401, 402, 403, or 429, the gateway transparently retries with the next available key, so your request still succeeds as long as a working key remains.
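The routing order and failover behaviour can be pictured with the sketch below. It is a simplified model rather than the gateway's implementation: `route_with_failover` and the `send` callback are hypothetical, and only the four status codes named above are treated as a reason to move on to the next key.

```python
from typing import Callable, Optional, Sequence, Tuple

# Upstream statuses that trigger a retry with the next key, per the text above.
RETRYABLE_STATUSES = frozenset({401, 402, 403, 429})

Response = Tuple[int, str]  # (HTTP status, body)


def route_with_failover(
    own_keys: Sequence[str],
    pool_keys: Sequence[str],
    send: Callable[[str], Response],
) -> Tuple[str, int, str]:
    """Try own keys first, then pool keys; return (source, status, body).

    `send` is a hypothetical callback that performs the upstream call with
    the given key. If every key fails, the last failure is returned.
    """
    last_failure: Optional[Tuple[str, int, str]] = None
    for source, keys in (("own", own_keys), ("pool", pool_keys)):
        for key in keys:
            status, body = send(key)
            if status not in RETRYABLE_STATUSES:
                return source, status, body
            last_failure = (source, status, body)
    if last_failure is None:
        raise RuntimeError("no keys available")
    return last_failure
```

In this model the returned `source` is what would decide billing: a result from `"own"` costs nothing, while a result from `"pool"` is debited from the balance.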
Billing & tiers
Balances are held in USD and topped up from the Usage & Billing page. You're only debited when a request is served by a pool key; requests served by your own contributed keys are free.
| Tier | Daily requests | Max keys | Contributed keys |
|---|---|---|---|
| Free | 1,000 | 2 | 1 |
| Pro | 50,000 | 10 | 5 |
| Team | 200,000 | 50 | 20 |
| Enterprise | Unlimited | 200 | 100 |
Rate limits
Each virtual key has per-minute request (RPM) and token (TPM) limits, plus a daily request cap set by your tier. Live consumption is shown as quota bars on the Usage page; the daily counters reset at 00:00 UTC.
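If a client needs to know how long until the daily counters reset, the 00:00 UTC boundary can be computed locally. A minimal sketch, with `seconds_until_daily_reset` as a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_daily_reset(now: Optional[datetime] = None) -> float:
    """Seconds from `now` until the next 00:00 UTC.

    Pass a timezone-aware datetime; defaults to the current UTC time.
    """
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (current + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - current).total_seconds()


remaining = seconds_until_daily_reset(datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc))
```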
Exceeding a limit returns HTTP 429 with a descriptive body — back off and retry, or upgrade your tier for higher ceilings.
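A common way to "back off and retry" is exponential backoff with jitter. The sketch below is one such approach, not a prescribed client: `call_with_backoff` and the `send` callback are hypothetical, and the delay parameters are arbitrary defaults to tune for your workload.

```python
import random
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    # Still rate limited after the final attempt: return the last 429.
    return status, body
```

The `sleep` parameter exists so the waiting can be stubbed out in tests; in normal use the default `time.sleep` applies.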