Claude Models

Use Claude Fable, Opus, Sonnet, Haiku, and Thinking models through BeansAI.

Overview

Claude models work through the OpenAI-compatible POST /chat/completions endpoint and the Anthropic-native POST /messages endpoint. Use the same BeansAI API key in either style.

For most apps, start with claude-sonnet-4-6. Move to Fable or Opus for harder reasoning or longer documents, and use Haiku for fast background work.

Model guide

Model	ID	Best for	Context / output
Claude Fable 5	`claude-fable-5`	Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work.	1M / 128K
Claude Opus 4.8	`claude-opus-4-8`	Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews.	1M / 128K
Claude Opus 4.7	`claude-opus-4-7`	High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work.	1M / 128K
Claude Opus 4.6	`claude-opus-4-6`	Premium reasoning and writing when you want Opus quality with a stable general-purpose profile.	1M / 128K
Claude Opus 4.6 (Thinking)	`claude-opus-4-6-thinking`	Hard reasoning tasks that benefit from an explicit thinking budget, such as debugging, planning, and multi-step analysis.	200K / 64K
Claude Sonnet 4.6	`claude-sonnet-4-6`	Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost.	200K / 64K
Claude Haiku 4.5	`claude-haiku-4-5-20251001`	Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs.	200K / 64K
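
The limits in the table can be checked before a request is sent. The sketch below is only an illustration: the helper names (`ModelLimits`, `fits`, `candidates`) are hypothetical, "1M" and "K" are read as decimal token counts, and it assumes that prompt and output tokens share the context window. None of these points are confirmed by this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelLimits:
    context_tokens: int  # "Context" column of the model guide
    output_tokens: int   # "output" column of the model guide


# Transcribed from the model guide. Decimal units are an assumption.
MODEL_LIMITS = {
    "claude-fable-5": ModelLimits(1_000_000, 128_000),
    "claude-opus-4-8": ModelLimits(1_000_000, 128_000),
    "claude-opus-4-7": ModelLimits(1_000_000, 128_000),
    "claude-opus-4-6": ModelLimits(1_000_000, 128_000),
    "claude-opus-4-6-thinking": ModelLimits(200_000, 64_000),
    "claude-sonnet-4-6": ModelLimits(200_000, 64_000),
    "claude-haiku-4-5-20251001": ModelLimits(200_000, 64_000),
}


def fits(model: str, prompt_tokens: int, max_tokens: int) -> bool:
    """Return True if the request stays within the listed limits."""
    limits = MODEL_LIMITS[model]
    return (
        max_tokens <= limits.output_tokens
        # Assumption: prompt and output tokens share one context window.
        and prompt_tokens + max_tokens <= limits.context_tokens
    )


def candidates(prompt_tokens: int, max_tokens: int) -> List[str]:
    """List the model IDs, in table order, that can hold the request."""
    return [m for m in MODEL_LIMITS if fits(m, prompt_tokens, max_tokens)]
```

For example, `candidates(500_000, 4096)` leaves only the 1M-context models, while a short prompt leaves every ID in the table.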

OpenAI-compatible

Use the official OpenAI SDK or raw HTTP. Pick a Claude model by changing only the model field.

shell

curl https://api.beansai.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-beans-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "user", "content": "Write a small TypeScript utility and tests."}
    ],
    "stream": false
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-beans-...",
    base_url="https://api.beansai.dev/v1",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "user", "content": "Explain this error and propose a fix."}
    ],
    max_tokens=4096,
)

print(response.choices[0].message.content)
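
For long responses, the same client can stream the reply instead of waiting for the whole message. The following is a minimal sketch, assuming BeansAI's streamed chunks follow the OpenAI chat-completions chunk shape (`choices[0].delta.content`). The page implies this by calling the endpoint OpenAI-compatible but does not spell it out. `iter_text` and `stream_reply` are hypothetical helper names.

```python
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text pieces carried by OpenAI-style streaming chunks."""
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) have no choices.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        # Role-only or finish chunks carry no text.
        if piece:
            yield piece


def stream_reply(client, model: str, prompt: str, max_tokens: int = 4096) -> str:
    """Print text as it arrives and return the full reply."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        stream=True,  # ask for incremental chunks instead of one response
    )
    parts = []
    for piece in iter_text(stream):
        print(piece, end="", flush=True)
        parts.append(piece)
    print()
    return "".join(parts)
```

With the `client` from the example above, a call would look like `stream_reply(client, "claude-sonnet-4-6", "Explain this error and propose a fix.")`.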

Anthropic Messages

Anthropic-native clients can call POST /messages. Use x-api-key for the BeansAI key; Authorization: Bearer also works.

shell

curl https://api.beansai.dev/v1/messages \
  -H "x-api-key: sk-beans-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 4096,
    "messages": [
      {"role": "user", "content": "Analyze this design and list the top risks."}
    ]
  }'
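
The same request can be sketched in Python with only the standard library. This is an illustration under assumptions rather than an official client: the helper names are hypothetical, and `first_text` assumes the response follows the usual Anthropic Messages shape (a `content` list of typed blocks), which this page does not show.

```python
import json
import urllib.request

BASE_URL = "https://api.beansai.dev/v1"


def build_messages_request(api_key, model, prompt, max_tokens=4096, use_bearer=False):
    """Return (url, headers, body_bytes) for POST /messages."""
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    # The page says both header styles are accepted for the BeansAI key.
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{BASE_URL}/messages", headers, json.dumps(body).encode("utf-8")


def send(url, headers, body):
    """POST the request and return the decoded JSON response."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def first_text(response_json):
    """Return the first text block (assumed Anthropic-style response shape)."""
    for block in response_json.get("content", []):
        if block.get("type") == "text":
            return block["text"]
    return ""
```

A call would then read `first_text(send(*build_messages_request("sk-beans-...", "claude-fable-5", "Analyze this design and list the top risks.")))`.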

Thinking

Use claude-opus-4-6-thinking when you need an explicit reasoning budget. In OpenAI-compatible requests, send reasoning_effort. In Anthropic-native requests, send a thinking block.

shell

curl https://api.beansai.dev/v1/messages \
  -H "x-api-key: sk-beans-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6-thinking",
    "max_tokens": 12000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 8192
    },
    "messages": [
      {"role": "user", "content": "Debug this incident from logs and propose the safest rollback."}
    ]
  }'
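
The two request styles differ only in how the thinking budget is expressed. The sketch below builds both bodies as plain dictionaries, and the function names are hypothetical. The check that `budget_tokens` stays below `max_tokens` follows Anthropic's documented rule for extended thinking. Whether BeansAI enforces it the same way is an assumption, as is the set of `reasoning_effort` values it accepts, since only "high" appears on this page.

```python
THINKING_MODEL = "claude-opus-4-6-thinking"


def anthropic_thinking_body(prompt, budget_tokens=8192, max_tokens=12000,
                            model=THINKING_MODEL):
    """Body for POST /messages with an explicit thinking budget."""
    # Thinking tokens count toward max_tokens, so the budget must leave
    # room for the visible answer.
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


def openai_thinking_body(prompt, effort="high", max_tokens=4096,
                         model=THINKING_MODEL):
    """Body for POST /chat/completions using reasoning_effort."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "reasoning_effort": effort,  # accepted values are an assumption
        "messages": [{"role": "user", "content": prompt}],
    }
```

Serializing either dictionary with `json.dumps` gives a payload for the `-d` argument of the matching curl command.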

Model examples

These JSON bodies can be pasted into the OpenAI-compatible curl example above as the -d payload.

Claude Fable 5

claude-fable-5

Anthropic's most capable widely released model for demanding reasoning, long-horizon agents, and large-context work.

json

{
  "model": "claude-fable-5",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this multi-month migration, identify hidden risks, and produce an execution plan with fallback points."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.8

claude-opus-4-8

Highest-capability Claude model for deep research, large codebases, long documents, and high-stakes reviews.

json

{
  "model": "claude-opus-4-8",
  "messages": [
    {
      "role": "user",
      "content": "Review this architecture, identify failure modes, and propose a phased migration plan."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.7

claude-opus-4-7

High-capability Opus for complex reasoning, adaptive thinking, and long-context agent work.

json

{
  "model": "claude-opus-4-7",
  "messages": [
    {
      "role": "user",
      "content": "Compare these migration options, identify risks, and recommend the safest sequence."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.6

claude-opus-4-6

Premium reasoning and writing when you want Opus quality with a stable general-purpose profile.

json

{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Turn these product notes into a precise technical spec with risks and acceptance criteria."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Opus 4.6 (Thinking)

claude-opus-4-6-thinking

Hard reasoning tasks that benefit from an explicit thinking budget, such as debugging, planning, and multi-step analysis.

json

{
  "model": "claude-opus-4-6-thinking",
  "messages": [
    {
      "role": "user",
      "content": "Trace this production bug from symptoms to root cause. Show assumptions, tests, and the smallest safe fix."
    }
  ],
  "max_tokens": 4096,
  "stream": false,
  "reasoning_effort": "high"
}

Claude Sonnet 4.6

claude-sonnet-4-6

Default Claude choice for coding, agents, chat products, tool use, and balanced latency/cost.

json

{
  "model": "claude-sonnet-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Implement this feature, keep the API contract stable, and summarize the changed files."
    }
  ],
  "max_tokens": 4096,
  "stream": false
}

Claude Haiku 4.5

claude-haiku-4-5-20251001

Fast, lower-cost Claude for classification, extraction, lightweight support replies, and background jobs.

json

{
  "model": "claude-haiku-4-5-20251001",
  "messages": [
    {
      "role": "user",
      "content": "Extract company name, contact email, urgency, and requested action from this message."
    }
  ],
  "max_tokens": 1024,
  "stream": false
}

Claude Code

After configuring Claude Code for BeansAI, switch models inside the CLI with /model.

Claude Code

/model claude-fable-5
/model claude-opus-4-8
/model claude-opus-4-7
/model claude-opus-4-6
/model claude-opus-4-6-thinking
/model claude-sonnet-4-6
/model claude-haiku-4-5-20251001

Tips

Set stream: true for long Claude responses so clients receive tokens as they arrive.
Claude Fable 5 uses always-on adaptive thinking; control depth with effort rather than a manual thinking budget.
Request cost is returned in X-Request-Cost-Micro-Usd, and Anthropic rate-limit headers are forwarded when upstream provides them.
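
The cost header from the last tip can be turned into a dollar amount with a small helper. This is a sketch under one assumption taken from the header name alone: the value is a count of micro-USD, that is, millionths of a dollar. `request_cost_usd` is a hypothetical name.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional

COST_HEADER = "X-Request-Cost-Micro-Usd"


def request_cost_usd(headers: Mapping[str, str]) -> Optional[Decimal]:
    """Return the request cost in USD, or None if the header is absent or invalid."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == COST_HEADER.lower():
            try:
                # Assumption: the value counts millionths of a dollar.
                return Decimal(value.strip()) / Decimal(1_000_000)
            except InvalidOperation:
                return None
    return None
```

With the OpenAI Python SDK, the headers are available through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and `.parse()`.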
