Claude Models
Use Claude Opus, Sonnet, Haiku, and Thinking models through BeansAI.
Overview
Claude models work through the OpenAI-compatible POST /chat/completions endpoint and the Anthropic-native POST /messages endpoint. Use the same BeansAI API key in either style.
For most apps, start with claude-sonnet-4-6. Move to Fable or Opus for harder reasoning or longer documents, and use Haiku for fast background work.
Model guide
| Model | ID | Best for | Context / output |
|---|---|---|---|
| Claude Fable 5 | claude-fable-5 | Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work. | 1M / 128K |
| Claude Opus 4.8 | claude-opus-4-8 | Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews. | 1M / 128K |
| Claude Opus 4.7 | claude-opus-4-7 | High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work. | 1M / 128K |
| Claude Opus 4.6 | claude-opus-4-6 | Premium reasoning and writing when you want Opus quality with a stable general-purpose profile. | 1M / 128K |
| Claude Opus 4.6 (Thinking) | claude-opus-4-6-thinking | Hard reasoning tasks that benefit from explicit thinking budget, such as debugging, planning, and multi-step analysis. | 200K / 64K |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost. | 200K / 64K |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs. | 200K / 64K |
OpenAI-compatible
Use the official OpenAI SDK or raw HTTP. Pick a Claude model by changing only the model field.
curl https://api.beansai.dev/v1/chat/completions \
-H "Authorization: Bearer sk-beans-..." \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"messages": [
{"role": "user", "content": "Write a small TypeScript utility and tests."}
],
"stream": false
}'from openai import OpenAI
client = OpenAI(
api_key="sk-beans-...",
base_url="https://api.beansai.dev/v1",
)
response = client.chat.completions.create(
model="claude-sonnet-4-6",
messages=[
{"role": "user", "content": "Explain this error and propose a fix."}
],
max_tokens=4096,
)
print(response.choices[0].message.content)Anthropic Messages
Anthropic-native clients can call POST /messages. Use x-api-key for the BeansAI key; Authorization: Bearer also works.
curl https://api.beansai.dev/v1/messages \
-H "x-api-key: sk-beans-..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-fable-5",
"max_tokens": 4096,
"messages": [
{"role": "user", "content": "Analyze this design and list the top risks."}
]
}'Thinking
Use claude-opus-4-6-thinking when you need explicit reasoning budget. In OpenAI-compatible requests, send reasoning_effort. In Anthropic-native requests, send a thinking block.
curl https://api.beansai.dev/v1/messages \
-H "x-api-key: sk-beans-..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-opus-4-6-thinking",
"max_tokens": 12000,
"thinking": {
"type": "enabled",
"budget_tokens": 8192
},
"messages": [
{"role": "user", "content": "Debug this incident from logs and propose the safest rollback."}
]
}'Model examples
These JSON bodies can be pasted into the curl example above.
Claude Fable 5
claude-fable-5Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work.
{
"model": "claude-fable-5",
"messages": [
{
"role": "user",
"content": "Analyze this multi-month migration, identify hidden risks, and produce an execution plan with fallback points."
}
],
"max_tokens": 4096,
"stream": false
}Claude Opus 4.8
claude-opus-4-8Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews.
{
"model": "claude-opus-4-8",
"messages": [
{
"role": "user",
"content": "Review this architecture, identify failure modes, and propose a phased migration plan."
}
],
"max_tokens": 4096,
"stream": false
}Claude Opus 4.7
claude-opus-4-7High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work.
{
"model": "claude-opus-4-7",
"messages": [
{
"role": "user",
"content": "Compare these migration options, identify risks, and recommend the safest sequence."
}
],
"max_tokens": 4096,
"stream": false
}Claude Opus 4.6
claude-opus-4-6Premium reasoning and writing when you want Opus quality with a stable general-purpose profile.
{
"model": "claude-opus-4-6",
"messages": [
{
"role": "user",
"content": "Turn these product notes into a precise technical spec with risks and acceptance criteria."
}
],
"max_tokens": 4096,
"stream": false
}Claude Opus 4.6 (Thinking)
claude-opus-4-6-thinkingHard reasoning tasks that benefit from explicit thinking budget, such as debugging, planning, and multi-step analysis.
{
"model": "claude-opus-4-6-thinking",
"messages": [
{
"role": "user",
"content": "Trace this production bug from symptoms to root cause. Show assumptions, tests, and the smallest safe fix."
}
],
"max_tokens": 4096,
"stream": false,
"reasoning_effort": "high"
}Claude Sonnet 4.6
claude-sonnet-4-6Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost.
{
"model": "claude-sonnet-4-6",
"messages": [
{
"role": "user",
"content": "Implement this feature, keep the API contract stable, and summarize the changed files."
}
],
"max_tokens": 4096,
"stream": false
}Claude Haiku 4.5
claude-haiku-4-5-20251001Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs.
{
"model": "claude-haiku-4-5-20251001",
"messages": [
{
"role": "user",
"content": "Extract company name, contact email, urgency, and requested action from this message."
}
],
"max_tokens": 1024,
"stream": false
}Claude Code
After configuring Claude Code for BeansAI, switch models inside the CLI with /model.
/model claude-fable-5
/model claude-opus-4-8
/model claude-opus-4-7
/model claude-opus-4-6
/model claude-opus-4-6-thinking
/model claude-sonnet-4-6
/model claude-haiku-4-5-20251001Tips
- Keep
stream: truefor long Claude responses so clients receive tokens as they arrive. - Claude Fable 5 uses always-on adaptive thinking; control depth with effort rather than a manual thinking budget.
- Request cost is returned in
X-Request-Cost-Micro-Usd, and Anthropic rate-limit headers are forwarded when upstream provides them.