Claude Models

Use Claude Opus, Sonnet, Haiku, and Thinking models through BeansAI.

Overview

Claude models work through the OpenAI-compatible POST /chat/completions endpoint and the Anthropic-native POST /messages endpoint. Use the same BeansAI API key in either style.
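As a rough sketch of how the two styles differ on the wire, the helper below builds (but does not send) a request for either endpoint using only the Python standard library. The base URL, header names, and anthropic-version value are copied from the curl examples on this page; the function name build_request and the style labels "openai" and "anthropic" are hypothetical and exist only for illustration.

```python
import json
import urllib.request

BASE_URL = "https://api.beansai.dev/v1"  # taken from the curl examples on this page


def build_request(style: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build, without sending, a POST request for one of the two API styles.

    `style` is a hypothetical label: "openai" or "anthropic".
    """
    if style == "openai":
        url = f"{BASE_URL}/chat/completions"
        headers = {"Authorization": f"Bearer {api_key}"}
    elif style == "anthropic":
        url = f"{BASE_URL}/messages"
        headers = {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    else:
        raise ValueError(f"unknown style: {style!r}")
    headers["Content-Type"] = "application/json"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would perform the call; the same key string is used in both branches, only the header that carries it changes.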

For most apps, start with claude-sonnet-4-6. Move to claude-fable-5 or an Opus model for harder reasoning or longer documents, and use claude-haiku-4-5-20251001 for fast background work.

Model guide

| Model | ID | Best for | Context / output |
|---|---|---|---|
| Claude Fable 5 | claude-fable-5 | Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work. | 1M / 128K |
| Claude Opus 4.8 | claude-opus-4-8 | Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews. | 1M / 128K |
| Claude Opus 4.7 | claude-opus-4-7 | High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work. | 1M / 128K |
| Claude Opus 4.6 | claude-opus-4-6 | Premium reasoning and writing when you want Opus quality with a stable general-purpose profile. | 1M / 128K |
| Claude Opus 4.6 (Thinking) | claude-opus-4-6-thinking | Hard reasoning tasks that benefit from an explicit thinking budget, such as debugging, planning, and multi-step analysis. | 200K / 64K |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost. | 200K / 64K |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs. | 200K / 64K |
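The limits in the model guide can be kept next to your request code so that max_tokens never exceeds what a model lists. The sketch below is one way to do that. The token counts assume "1M", "200K", "128K", and "64K" mean round decimal values, which this page does not confirm, and the names MODEL_LIMITS and clamp_max_tokens are hypothetical.

```python
# (context window, max output) per model ID, transcribed from the model guide.
# Assumption: "1M" = 1,000,000 tokens, "200K" = 200,000, and so on.
MODEL_LIMITS = {
    "claude-fable-5": (1_000_000, 128_000),
    "claude-opus-4-8": (1_000_000, 128_000),
    "claude-opus-4-7": (1_000_000, 128_000),
    "claude-opus-4-6": (1_000_000, 128_000),
    "claude-opus-4-6-thinking": (200_000, 64_000),
    "claude-sonnet-4-6": (200_000, 64_000),
    "claude-haiku-4-5-20251001": (200_000, 64_000),
}


def clamp_max_tokens(model: str, requested: int) -> int:
    """Return `requested` capped at the output limit listed for `model`."""
    if model not in MODEL_LIMITS:
        raise ValueError(f"unknown model ID: {model!r}")
    if requested < 1:
        raise ValueError("max_tokens must be positive")
    _context, max_output = MODEL_LIMITS[model]
    return min(requested, max_output)
```

For example, clamp_max_tokens("claude-sonnet-4-6", 100_000) returns 64_000 under the assumption above, while a request for 4096 tokens passes through unchanged.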

OpenAI-compatible

Use the official OpenAI SDK or raw HTTP. Pick a Claude model by changing only the model field.

shell
curl https://api.beansai.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-beans-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "user", "content": "Write a small TypeScript utility and tests."}
    ],
    "stream": false
  }'
python
from openai import OpenAI

client = OpenAI(
    api_key="sk-beans-...",
    base_url="https://api.beansai.dev/v1",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "user", "content": "Explain this error and propose a fix."}
    ],
    max_tokens=4096,
)

print(response.choices[0].message.content)

Anthropic Messages

Anthropic-native clients can call POST /messages. Use x-api-key for the BeansAI key; Authorization: Bearer also works.

shell
curl https://api.beansai.dev/v1/messages \
  -H "x-api-key: sk-beans-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 4096,
    "messages": [
      {"role": "user", "content": "Analyze this design and list the top risks."}
    ]
  }'
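The same call can be made from Python without an SDK. This is a minimal sketch using only the standard library. It assumes BeansAI returns the usual Anthropic Messages response shape (a content list of typed blocks), which this page does not spell out; send_message and extract_text are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.beansai.dev/v1/messages"  # from the curl example above


def extract_text(message: dict) -> str:
    """Join the text blocks of a Messages-style response, skipping other block types.

    Assumes the response has a "content" list of {"type": ..., ...} blocks.
    """
    blocks = message.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def send_message(api_key: str, model: str, prompt: str, max_tokens: int = 4096) -> str:
    """POST one user message and return the reply text (performs a network call)."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_text(json.load(response))
```

Calling send_message("sk-beans-...", "claude-fable-5", "Analyze this design and list the top risks.") would mirror the curl request; extract_text can also be used on its own with any already-parsed response.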

Thinking

Use claude-opus-4-6-thinking when you need an explicit reasoning budget. In OpenAI-compatible requests, send reasoning_effort. In Anthropic-native requests, send a thinking object with type and budget_tokens.

shell
curl https://api.beansai.dev/v1/messages \
  -H "x-api-key: sk-beans-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6-thinking",
    "max_tokens": 12000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 8192
    },
    "messages": [
      {"role": "user", "content": "Debug this incident from logs and propose the safest rollback."}
    ]
  }'
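To make the two request shapes concrete, the sketch below builds a thinking request body for each style. The check that budget_tokens stays below max_tokens follows Anthropic's documented rule for the thinking parameter; whether BeansAI enforces it the same way is an assumption. The function names are hypothetical, and "high" is the only reasoning_effort value shown on this page.

```python
def anthropic_thinking_body(model: str, prompt: str, max_tokens: int, budget_tokens: int) -> dict:
    """Body for POST /messages with an explicit thinking budget."""
    # Assumption: the upstream rule "budget_tokens < max_tokens" applies here too.
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


def openai_reasoning_body(model: str, prompt: str, max_tokens: int, effort: str = "high") -> dict:
    """Body for POST /chat/completions using reasoning_effort instead of a budget."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }
```

With max_tokens=12000 and budget_tokens=8192, the first helper reproduces the JSON body of the curl example above; the remaining tokens are what is left for the visible answer.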

Model examples

These JSON bodies can be pasted into the OpenAI-compatible curl example above as the -d payload.

Claude Fable 5

claude-fable-5

Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work.

json
{
  "model": "claude-fable-5",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this multi-month migration, identify hidden risks, and produce an execution plan with fallback points."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.8

claude-opus-4-8

Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews.

json
{
  "model": "claude-opus-4-8",
  "messages": [
    {
      "role": "user",
      "content": "Review this architecture, identify failure modes, and propose a phased migration plan."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.7

claude-opus-4-7

High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work.

json
{
  "model": "claude-opus-4-7",
  "messages": [
    {
      "role": "user",
      "content": "Compare these migration options, identify risks, and recommend the safest sequence."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.6

claude-opus-4-6

Premium reasoning and writing when you want Opus quality with a stable general-purpose profile.

json
{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Turn these product notes into a precise technical spec with risks and acceptance criteria."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.6 (Thinking)

claude-opus-4-6-thinking

Hard reasoning tasks that benefit from an explicit thinking budget, such as debugging, planning, and multi-step analysis.

json
{
  "model": "claude-opus-4-6-thinking",
  "messages": [
    {
      "role": "user",
      "content": "Trace this production bug from symptoms to root cause. Show assumptions, tests, and the smallest safe fix."
    }
  ],
  "max_tokens": 4096,
  "stream": false,
  "reasoning_effort": "high"
}

Claude Sonnet 4.6

claude-sonnet-4-6

Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost.

json
{
  "model": "claude-sonnet-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Implement this feature, keep the API contract stable, and summarize the changed files."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Haiku 4.5

claude-haiku-4-5-20251001

Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs.

json
{
  "model": "claude-haiku-4-5-20251001",
  "messages": [
    {
      "role": "user",
      "content": "Extract company name, contact email, urgency, and requested action from this message."
    }
  ],
  "max_tokens": 1024,
  "stream": false
}

Claude Code

After configuring Claude Code for BeansAI, switch models inside the CLI with /model.

Claude Code
/model claude-fable-5
/model claude-opus-4-8
/model claude-opus-4-7
/model claude-opus-4-6
/model claude-opus-4-6-thinking
/model claude-sonnet-4-6
/model claude-haiku-4-5-20251001

Tips

  • Set stream: true for long Claude responses so clients receive tokens as they arrive.
  • Claude Fable 5 uses always-on adaptive thinking; control depth with effort rather than a manual thinking budget.
  • Request cost is returned in the X-Request-Cost-Micro-Usd response header, and Anthropic rate-limit headers are forwarded when upstream provides them.
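Two of the tips above can be sketched in a few lines of Python. The first helper reads text deltas from a stream, assuming BeansAI emits the standard OpenAI server-sent-event format ("data: {...}" lines ending with "data: [DONE]"), which this page does not confirm. The second converts the cost header to dollars, assuming the value is a plain number of micro-USD (millionths of a dollar). Both function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Mapping, Optional


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines until the [DONE] marker."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def request_cost_usd(headers: Mapping[str, str]) -> Optional[float]:
    """Convert X-Request-Cost-Micro-Usd to USD, or return None if it is absent."""
    for name, value in headers.items():
        if name.lower() == "x-request-cost-micro-usd":
            return float(value) / 1_000_000
    return None
```

In practice the lines would come from the decoded response body of a request sent with stream: true, and the headers mapping from the same HTTP response. Header names are matched case-insensitively because HTTP clients differ in how they normalize them.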