Raw API
Direct HTTP against the OpenAI-compatible endpoint. Works with any language.
Overview
The base URL is https://api.beansai.dev/v1. Authenticate with a Bearer token in the Authorization header. Endpoints mirror OpenAI:
- POST /chat/completions — chat, streaming, tool calling
- GET /models — list available models
curl
shell
curl https://api.beansai.dev/v1/chat/completions \
-H "Authorization: Bearer sk-beans-..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [
{"role": "user", "content": "Hello, world!"}
],
"stream": false
}'
Basic usage
Use the official OpenAI SDKs and just swap the base URL.
python
from openai import OpenAI
client = OpenAI(
api_key="sk-beans-...",
base_url="https://api.beansai.dev/v1",
)
res = client.chat.completions.create(
model="gpt-5.5",
messages=[{"role": "user", "content": "Hello!"}],
)
print(res.choices[0].message.content)
javascript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "sk-beans-...",
baseURL: "https://api.beansai.dev/v1",
});
const res = await client.chat.completions.create({
model: "gpt-5.5",
messages: [{ role: "user", content: "Hello!" }],
});
console.log(res.choices[0]?.message?.content);
Streaming
shell
curl https://api.beansai.dev/v1/chat/completions \
-N \
-H "Authorization: Bearer sk-beans-..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "Count to 5"}],
"stream": true
}'
SSE events follow the data: {json}\n\n format; the final event is data: [DONE].
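The framing above takes only a few lines to parse. A minimal sketch, assuming each data: payload is plain JSON and the terminal event is the literal [DONE] sentinel as described; the choices[0].delta.content shape in the example is the OpenAI-style streaming chunk layout, which this endpoint mirrors:

```python
import json

def parse_sse_line(line):
    """Parse one line of an SSE stream.

    Returns the decoded JSON event, the string "DONE" for the terminal
    data: [DONE] event, or None for blank/keep-alive/non-data lines.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # comments, blank separators between events, etc.
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return "DONE"
    return json.loads(payload)

# Example: pull the text delta out of one streaming chunk.
chunk = parse_sse_line('data: {"choices": [{"delta": {"content": "Hi"}}]}')
delta = chunk["choices"][0]["delta"].get("content", "")
```

In practice you would feed each line of the curl -N output (or an HTTP client's line iterator) through this function and append the deltas as they arrive.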
List models
shell
curl https://api.beansai.dev/v1/models \
-H "Authorization: Bearer sk-beans-..."
Tips
- Rate limits are per-key and returned in X-RateLimit-* headers.
- On 429 or 5xx, retry with exponential backoff. BeansAI's load balancer will pick a healthy upstream.
- Cost per request is returned in X-Request-Cost-Micro-Usd. Great for real-time billing UIs.
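The retry advice above can be sketched as a small wrapper. The status codes follow the "429 or 5xx" rule from the tips; the attempt count, base delay, and cap are illustrative choices, not documented limits:

```python
import random
import time

# 429 plus common 5xx codes, per the retry guidance above.
RETRYABLE = {429, 500, 502, 503, 504}

def with_backoff(call, max_attempts=5, base=0.5, cap=8.0, sleep=time.sleep):
    """Retry `call` on retryable statuses with jittered exponential backoff.

    `call` is a zero-argument function returning (status, body). Any
    status outside RETRYABLE is returned immediately. Backoff defaults
    are illustrative, not values specified by the API.
    """
    for attempt in range(max_attempts):
        status, body = call()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            delay = min(cap, base * (2 ** attempt))  # 0.5s, 1s, 2s, ... capped
            sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids thundering herds
    return status, body
```

Wrap your actual HTTP request in the `call` argument; on a retryable response, each new attempt reaches the load balancer fresh, so it can route to a healthy upstream.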