🤖 feat: Claude Opus 4.6 - 1M Context, Premium Pricing, Adaptive Thinking (#11670)

* feat: Implement new features for Claude Opus 4.6 model

- Added support for tiered pricing based on input token count for the Claude Opus 4.6 model.
- Updated token value calculations to include inputTokenCount for accurate pricing.
- Enhanced transaction handling to apply premium rates when input tokens exceed defined thresholds.
- Introduced comprehensive tests to validate pricing logic for both standard and premium rates across various scenarios.
- Updated related utility functions and models to accommodate new pricing structure.

This change improves the flexibility and accuracy of token pricing for the Claude Opus 4.6 model, ensuring users are charged appropriately based on their usage.
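
For orientation, a minimal sketch of the tiered-rate rule described above; the threshold and rates come from the `premiumTokenValues` entry added in `tx.js` further down, while `resolveRate` is an illustrative name rather than this PR's API:

// Premium rates apply strictly above the threshold; at or below it, standard
// rates apply (the boundary tests added in this PR assert exactly this).
const tokenValues = { 'claude-opus-4-6': { prompt: 5, completion: 25 } };
const premiumTokenValues = {
  'claude-opus-4-6': { threshold: 200000, prompt: 10, completion: 37.5 },
};

function resolveRate(valueKey, tokenType, inputTokenCount) {
  const premium = premiumTokenValues[valueKey];
  if (premium && inputTokenCount != null && inputTokenCount > premium.threshold) {
    return premium[tokenType];
  }
  return tokenValues[valueKey][tokenType];
}

resolveRate('claude-opus-4-6', 'prompt', 200000); // 5  (standard: exactly at the threshold)
resolveRate('claude-opus-4-6', 'prompt', 200001); // 10 (premium: above the threshold)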

* feat: Add effort field to conversation and preset schemas

- Introduced a new optional `effort` field of type `String` in both the `IPreset` and `IConversation` interfaces.
- Updated the `conversationPreset` schema to include the `effort` field so the setting persists with saved presets and conversations (a minimal sketch follows).
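
A minimal sketch of the schema side of this change, assuming Mongoose and eliding the surrounding fields; only the `effort` field itself is from this commit:

const { Schema } = require('mongoose');

// Sketch: the new optional field as it would sit inside conversationPreset.
const conversationPreset = new Schema({
  // ...existing preset fields elided...
  effort: {
    type: String,
  },
});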

* chore: Clean up unused variable and comments in initialize function

* chore: update dependencies and SDK versions

- Updated @anthropic-ai/sdk to version 0.73.0 in package.json and overrides.
- Updated @anthropic-ai/vertex-sdk to version 0.14.3 in packages/api/package.json.
- Updated @librechat/agents to version 3.1.34 in packages/api/package.json.
- Refactored imports in packages/api/src/endpoints/anthropic/vertex.ts for consistency.

* chore: remove postcss-loader from dependencies

* feat: Bedrock model support for adaptive thinking configuration

- Updated .env.example to include new Bedrock model IDs for Claude Opus 4.6.
- Refactored bedrockInputParser to support adaptive thinking for Opus models, allowing for dynamic thinking configurations.
- Introduced a new function to check model compatibility with adaptive thinking (a sketch follows this list).
- Added an optional `effort` field to the input schemas and updated related configurations.
- Enhanced tests to validate the new adaptive thinking logic and model configurations.
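
A hedged sketch of such a compatibility check; `supportsAdaptiveThinking` is named elsewhere in this PR, but the version parsing below is an illustrative reconstruction, not the PR's exact code:

// Illustrative: extract family and version from Claude model IDs, including
// Bedrock forms such as 'us.anthropic.claude-opus-4-6-v1'. The {1,2} bound on
// the minor version keeps date suffixes like '-20250514' from matching.
function parseClaudeVersion(model) {
  const match = model.match(/claude-(opus|sonnet|haiku)-(\d+)(?:-(\d{1,2}))?(?!\d)/);
  if (!match) {
    return null;
  }
  return { family: match[1], major: Number(match[2]), minor: Number(match[3] ?? 0) };
}

// Assumption for this sketch: adaptive thinking applies to Opus 4.6 and later.
function supportsAdaptiveThinking(model) {
  const v = parseClaudeVersion(model);
  if (!v || v.family !== 'opus') {
    return false;
  }
  return v.major > 4 || (v.major === 4 && v.minor >= 6);
}

supportsAdaptiveThinking('us.anthropic.claude-opus-4-6-v1'); // true
supportsAdaptiveThinking('claude-opus-4-5'); // false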

* feat: Add tests for Opus 4.6 adaptive thinking configuration

* feat: Update model references for Opus 4.6 by removing version suffix

* feat: Update @librechat/agents to version 3.1.35 in package.json and package-lock.json

* chore: update @librechat/agents to version 3.1.36 in package.json and package-lock.json

* feat: Normalize inputTokenCount for spendTokens and enhance transaction handling

- Introduced normalization for promptTokens to ensure inputTokenCount does not go negative.
- Updated transaction logic to reflect the normalized inputTokenCount in pricing calculations (illustrated after this list).
- Added comprehensive tests to validate the new normalization logic and its impact on transaction rates for both standard and premium models.
- Refactored related functions to improve clarity and maintainability of token value calculations.
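
The normalization itself is a single clamp, applied before the count feeds the premium-threshold check; the exact lines are in the `spendTokens.js` diff below:

// As in spendTokens: missing or negative prompt counts clamp to zero.
const normalize = (promptTokens) => Math.max(promptTokens ?? 0, 0);

normalize(-500); // 0 -> a negative count can never trigger premium pricing
normalize(undefined); // 0
normalize(250000); // 250000 -> above claude-opus-4-6's 200k threshold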

* chore: Simplify adaptive thinking configuration in helpers.ts

- Removed unnecessary type casting for the thinking property in updatedOptions.
- Ensured that adaptive thinking is directly assigned when conditions are met, improving code clarity.

* refactor: Replace hard-coded token values with dynamic retrieval from maxTokensMap in model tests

* fix: Ensure non-negative token values in spendTokens calculations

- Updated token value retrieval to use Math.max for prompt and completion tokens, preventing negative values.
- Enhanced clarity in token calculations for both prompt and completion transactions.

* test: Add test for normalization of negative structured token values in spendStructuredTokens

- Implemented a test to ensure that negative structured token values are normalized to zero during token spending.
- Verified that the transaction rates remain consistent with the expected standard values after normalization.

* refactor: Bedrock model support for adaptive thinking and context handling

- Added tests for various alternate naming conventions of Claude models to validate adaptive thinking and context support.
- Refactored `supportsAdaptiveThinking` and `supportsContext1m` functions to utilize new parsing methods for model version extraction.
- Updated `bedrockInputParser` to handle effort configurations more effectively and strip unnecessary fields for non-adaptive models (sketched after this list).
- Improved handling of anthropic model configurations in the input parser.
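
A hedged sketch of the stripping behavior from the last two bullets; `thinking` and `effort` are the fields named in this PR, while the function shape and name are illustrative:

// Illustrative: models without adaptive-thinking support should not receive
// adaptive-only fields in their request options.
function stripAdaptiveFields(options, supportsAdaptive) {
  if (supportsAdaptive) {
    return options;
  }
  const { thinking, effort, ...rest } = options;
  return rest;
}

stripAdaptiveFields({ maxTokens: 1024, effort: 'high' }, false); // { maxTokens: 1024 }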

* fix: Improve token value retrieval in getMultiplier function

- Updated the token value retrieval logic to use optional chaining for better safety against undefined values (see the example after this list).
- Added a test case to ensure that the function returns the default rate when the provided valueKey does not exist in tokenValues.
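
The practical effect, shown with illustrative values (the real `tokenValues` and `defaultRate` live in `tx.js`):

// Before: tokenValues[valueKey][tokenType] throws a TypeError for unknown keys.
// After: optional chaining yields undefined and ?? falls back to the default.
const defaultRate = 6; // illustrative value
const tokenValues = { '8k': { prompt: 30, completion: 60 } }; // illustrative subset
const rate = tokenValues['non-existent-model']?.['prompt'] ?? defaultRate; // 6
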
Commit 41e2348d47 (parent 1d5f2eb04b)
Danny Avila, 2026-02-06 18:35:36 -05:00, committed by GitHub
32 changed files with 2902 additions and 1087 deletions

@@ -131,7 +131,7 @@ PROXY=
#============#
ANTHROPIC_API_KEY=user_provided
-# ANTHROPIC_MODELS=claude-opus-4-20250514,claude-sonnet-4-20250514,claude-3-7-sonnet-20250219,claude-3-5-sonnet-20241022,claude-3-5-haiku-20241022,claude-3-opus-20240229,claude-3-sonnet-20240229,claude-3-haiku-20240307
+# ANTHROPIC_MODELS=claude-opus-4-6,claude-opus-4-20250514,claude-sonnet-4-20250514,claude-3-7-sonnet-20250219,claude-3-5-sonnet-20241022,claude-3-5-haiku-20241022,claude-3-opus-20240229,claude-3-sonnet-20240229,claude-3-haiku-20240307
# ANTHROPIC_REVERSE_PROXY=
# Set to true to use Anthropic models through Google Vertex AI instead of direct API
@@ -166,7 +166,8 @@ ANTHROPIC_API_KEY=user_provided
# BEDROCK_AWS_SESSION_TOKEN=someSessionToken
# Note: This example list is not meant to be exhaustive. If omitted, all known, supported model IDs will be included for you.
-# BEDROCK_AWS_MODELS=anthropic.claude-3-5-sonnet-20240620-v1:0,meta.llama3-1-8b-instruct-v1:0
+# BEDROCK_AWS_MODELS=anthropic.claude-opus-4-6-v1,anthropic.claude-3-5-sonnet-20240620-v1:0,meta.llama3-1-8b-instruct-v1:0
+# Cross-region inference model IDs: us.anthropic.claude-opus-4-6-v1,global.anthropic.claude-opus-4-6-v1
# See all Bedrock model IDs here: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns

@@ -138,11 +138,10 @@ const updateBalance = async ({ user, incrementValue, setValues }) => {
/** Method to calculate and set the tokenValue for a transaction */
function calculateTokenValue(txn) {
if (!txn.valueKey || !txn.tokenType) {
txn.tokenValue = txn.rawAmount;
}
-const { valueKey, tokenType, model, endpointTokenConfig } = txn;
-const multiplier = Math.abs(getMultiplier({ valueKey, tokenType, model, endpointTokenConfig }));
+const { valueKey, tokenType, model, endpointTokenConfig, inputTokenCount } = txn;
+const multiplier = Math.abs(
+getMultiplier({ valueKey, tokenType, model, endpointTokenConfig, inputTokenCount }),
+);
txn.rate = multiplier;
txn.tokenValue = txn.rawAmount * multiplier;
if (txn.context && txn.tokenType === 'completion' && txn.context === 'incomplete') {
@@ -166,6 +165,7 @@ async function createAutoRefillTransaction(txData) {
}
const transaction = new Transaction(txData);
transaction.endpointTokenConfig = txData.endpointTokenConfig;
transaction.inputTokenCount = txData.inputTokenCount;
calculateTokenValue(transaction);
await transaction.save();
@@ -200,6 +200,7 @@ async function createTransaction(_txData) {
const transaction = new Transaction(txData);
transaction.endpointTokenConfig = txData.endpointTokenConfig;
transaction.inputTokenCount = txData.inputTokenCount;
calculateTokenValue(transaction);
await transaction.save();
@@ -231,10 +232,9 @@ async function createStructuredTransaction(_txData) {
return;
}
-const transaction = new Transaction({
-...txData,
-endpointTokenConfig: txData.endpointTokenConfig,
-});
+const transaction = new Transaction(txData);
+transaction.endpointTokenConfig = txData.endpointTokenConfig;
+transaction.inputTokenCount = txData.inputTokenCount;
calculateStructuredTokenValue(transaction);
@@ -266,10 +266,15 @@ function calculateStructuredTokenValue(txn) {
return;
}
-const { model, endpointTokenConfig } = txn;
+const { model, endpointTokenConfig, inputTokenCount } = txn;
if (txn.tokenType === 'prompt') {
-const inputMultiplier = getMultiplier({ tokenType: 'prompt', model, endpointTokenConfig });
+const inputMultiplier = getMultiplier({
+tokenType: 'prompt',
+model,
+endpointTokenConfig,
+inputTokenCount,
+});
const writeMultiplier =
getCacheMultiplier({ cacheType: 'write', model, endpointTokenConfig }) ?? inputMultiplier;
const readMultiplier =
@@ -304,7 +309,12 @@ function calculateStructuredTokenValue(txn) {
txn.rawAmount = -totalPromptTokens;
} else if (txn.tokenType === 'completion') {
-const multiplier = getMultiplier({ tokenType: txn.tokenType, model, endpointTokenConfig });
+const multiplier = getMultiplier({
+tokenType: txn.tokenType,
+model,
+endpointTokenConfig,
+inputTokenCount,
+});
txn.rate = Math.abs(multiplier);
txn.tokenValue = -Math.abs(txn.rawAmount) * multiplier;
txn.rawAmount = -Math.abs(txn.rawAmount);

@@ -1,7 +1,7 @@
const mongoose = require('mongoose');
const { MongoMemoryServer } = require('mongodb-memory-server');
const { spendTokens, spendStructuredTokens } = require('./spendTokens');
-const { getMultiplier, getCacheMultiplier } = require('./tx');
+const { getMultiplier, getCacheMultiplier, premiumTokenValues, tokenValues } = require('./tx');
const { createTransaction, createStructuredTransaction } = require('./Transaction');
const { Balance, Transaction } = require('~/db/models');
@@ -564,3 +564,291 @@ describe('Transactions Config Tests', () => {
expect(balance.tokenCredits).toBe(initialBalance);
});
});
describe('calculateTokenValue Edge Cases', () => {
test('should derive multiplier from model when valueKey is not provided', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'gpt-4';
const promptTokens = 1000;
const result = await createTransaction({
user: userId,
conversationId: 'test-no-valuekey',
model,
tokenType: 'prompt',
rawAmount: -promptTokens,
context: 'test',
balance: { enabled: true },
});
const expectedRate = getMultiplier({ model, tokenType: 'prompt' });
expect(result.rate).toBe(expectedRate);
const tx = await Transaction.findOne({ user: userId });
expect(tx.tokenValue).toBe(-promptTokens * expectedRate);
expect(tx.rate).toBe(expectedRate);
});
test('should derive valueKey and apply correct rate for an unknown model with tokenType', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
await createTransaction({
user: userId,
conversationId: 'test-unknown-model',
model: 'some-unrecognized-model-xyz',
tokenType: 'prompt',
rawAmount: -500,
context: 'test',
balance: { enabled: true },
});
const tx = await Transaction.findOne({ user: userId });
expect(tx.rate).toBeDefined();
expect(tx.rate).toBeGreaterThan(0);
expect(tx.tokenValue).toBe(tx.rawAmount * tx.rate);
});
test('should correctly apply model-derived multiplier without valueKey for completion', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const completionTokens = 500;
const result = await createTransaction({
user: userId,
conversationId: 'test-completion-no-valuekey',
model,
tokenType: 'completion',
rawAmount: -completionTokens,
context: 'test',
balance: { enabled: true },
});
const expectedRate = getMultiplier({ model, tokenType: 'completion' });
expect(expectedRate).toBe(tokenValues[model].completion);
expect(result.rate).toBe(expectedRate);
const updatedBalance = await Balance.findOne({ user: userId });
expect(updatedBalance.tokenCredits).toBeCloseTo(
initialBalance - completionTokens * expectedRate,
0,
);
});
});
describe('Premium Token Pricing Integration Tests', () => {
test('spendTokens should apply standard pricing when prompt tokens are below premium threshold', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const promptTokens = 100000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-premium-below',
model,
context: 'test',
endpointTokenConfig: null,
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const standardPromptRate = tokenValues[model].prompt;
const standardCompletionRate = tokenValues[model].completion;
const expectedCost =
promptTokens * standardPromptRate + completionTokens * standardCompletionRate;
const updatedBalance = await Balance.findOne({ user: userId });
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
test('spendTokens should apply premium pricing when prompt tokens exceed premium threshold', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const promptTokens = 250000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-premium-above',
model,
context: 'test',
endpointTokenConfig: null,
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const premiumPromptRate = premiumTokenValues[model].prompt;
const premiumCompletionRate = premiumTokenValues[model].completion;
const expectedCost =
promptTokens * premiumPromptRate + completionTokens * premiumCompletionRate;
const updatedBalance = await Balance.findOne({ user: userId });
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
test('spendTokens should apply standard pricing at exactly the premium threshold', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const promptTokens = premiumTokenValues[model].threshold;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-premium-exact',
model,
context: 'test',
endpointTokenConfig: null,
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const standardPromptRate = tokenValues[model].prompt;
const standardCompletionRate = tokenValues[model].completion;
const expectedCost =
promptTokens * standardPromptRate + completionTokens * standardCompletionRate;
const updatedBalance = await Balance.findOne({ user: userId });
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
test('spendStructuredTokens should apply premium pricing when total input tokens exceed threshold', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-structured-premium',
model,
context: 'message',
endpointTokenConfig: null,
balance: { enabled: true },
};
const tokenUsage = {
promptTokens: {
input: 200000,
write: 10000,
read: 5000,
},
completionTokens: 1000,
};
const totalInput =
tokenUsage.promptTokens.input + tokenUsage.promptTokens.write + tokenUsage.promptTokens.read;
await spendStructuredTokens(txData, tokenUsage);
const premiumPromptRate = premiumTokenValues[model].prompt;
const premiumCompletionRate = premiumTokenValues[model].completion;
const writeMultiplier = getCacheMultiplier({ model, cacheType: 'write' });
const readMultiplier = getCacheMultiplier({ model, cacheType: 'read' });
const expectedPromptCost =
tokenUsage.promptTokens.input * premiumPromptRate +
tokenUsage.promptTokens.write * writeMultiplier +
tokenUsage.promptTokens.read * readMultiplier;
const expectedCompletionCost = tokenUsage.completionTokens * premiumCompletionRate;
const expectedTotalCost = expectedPromptCost + expectedCompletionCost;
const updatedBalance = await Balance.findOne({ user: userId });
expect(totalInput).toBeGreaterThan(premiumTokenValues[model].threshold);
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedTotalCost, 0);
});
test('spendStructuredTokens should apply standard pricing when total input tokens are below threshold', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-structured-standard',
model,
context: 'message',
endpointTokenConfig: null,
balance: { enabled: true },
};
const tokenUsage = {
promptTokens: {
input: 50000,
write: 10000,
read: 5000,
},
completionTokens: 1000,
};
const totalInput =
tokenUsage.promptTokens.input + tokenUsage.promptTokens.write + tokenUsage.promptTokens.read;
await spendStructuredTokens(txData, tokenUsage);
const standardPromptRate = tokenValues[model].prompt;
const standardCompletionRate = tokenValues[model].completion;
const writeMultiplier = getCacheMultiplier({ model, cacheType: 'write' });
const readMultiplier = getCacheMultiplier({ model, cacheType: 'read' });
const expectedPromptCost =
tokenUsage.promptTokens.input * standardPromptRate +
tokenUsage.promptTokens.write * writeMultiplier +
tokenUsage.promptTokens.read * readMultiplier;
const expectedCompletionCost = tokenUsage.completionTokens * standardCompletionRate;
const expectedTotalCost = expectedPromptCost + expectedCompletionCost;
const updatedBalance = await Balance.findOne({ user: userId });
expect(totalInput).toBeLessThanOrEqual(premiumTokenValues[model].threshold);
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedTotalCost, 0);
});
test('non-premium models should not be affected by inputTokenCount regardless of prompt size', async () => {
const userId = new mongoose.Types.ObjectId();
const initialBalance = 100000000;
await Balance.create({ user: userId, tokenCredits: initialBalance });
const model = 'claude-opus-4-5';
const promptTokens = 300000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-no-premium',
model,
context: 'test',
endpointTokenConfig: null,
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const standardPromptRate = getMultiplier({ model, tokenType: 'prompt' });
const standardCompletionRate = getMultiplier({ model, tokenType: 'completion' });
const expectedCost =
promptTokens * standardPromptRate + completionTokens * standardCompletionRate;
const updatedBalance = await Balance.findOne({ user: userId });
expect(updatedBalance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
});

@@ -24,12 +24,14 @@ const spendTokens = async (txData, tokenUsage) => {
},
);
let prompt, completion;
+const normalizedPromptTokens = Math.max(promptTokens ?? 0, 0);
try {
if (promptTokens !== undefined) {
prompt = await createTransaction({
...txData,
tokenType: 'prompt',
-rawAmount: promptTokens === 0 ? 0 : -Math.max(promptTokens, 0),
+rawAmount: promptTokens === 0 ? 0 : -normalizedPromptTokens,
+inputTokenCount: normalizedPromptTokens,
});
}
@@ -38,6 +40,7 @@ const spendTokens = async (txData, tokenUsage) => {
...txData,
tokenType: 'completion',
rawAmount: completionTokens === 0 ? 0 : -Math.max(completionTokens, 0),
+inputTokenCount: normalizedPromptTokens,
});
}
@@ -87,21 +90,31 @@ const spendStructuredTokens = async (txData, tokenUsage) => {
let prompt, completion;
try {
if (promptTokens) {
-const { input = 0, write = 0, read = 0 } = promptTokens;
+const input = Math.max(promptTokens.input ?? 0, 0);
+const write = Math.max(promptTokens.write ?? 0, 0);
+const read = Math.max(promptTokens.read ?? 0, 0);
+const totalInputTokens = input + write + read;
prompt = await createStructuredTransaction({
...txData,
tokenType: 'prompt',
inputTokens: -input,
writeTokens: -write,
readTokens: -read,
+inputTokenCount: totalInputTokens,
});
}
if (completionTokens) {
+const totalInputTokens = promptTokens
+? Math.max(promptTokens.input ?? 0, 0) +
+Math.max(promptTokens.write ?? 0, 0) +
+Math.max(promptTokens.read ?? 0, 0)
+: undefined;
completion = await createTransaction({
...txData,
tokenType: 'completion',
-rawAmount: -completionTokens,
+rawAmount: -Math.max(completionTokens, 0),
+inputTokenCount: totalInputTokens,
});
}

@@ -1,7 +1,8 @@
const mongoose = require('mongoose');
const { MongoMemoryServer } = require('mongodb-memory-server');
-const { spendTokens, spendStructuredTokens } = require('./spendTokens');
const { createTransaction, createAutoRefillTransaction } = require('./Transaction');
+const { tokenValues, premiumTokenValues, getCacheMultiplier } = require('./tx');
+const { spendTokens, spendStructuredTokens } = require('./spendTokens');
require('~/db/models');
@@ -734,4 +735,328 @@ describe('spendTokens', () => {
expect(balance).toBeDefined();
expect(balance.tokenCredits).toBeLessThan(10000); // Balance should be reduced
});
describe('premium token pricing', () => {
it('should charge standard rates for claude-opus-4-6 when prompt tokens are below threshold', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const promptTokens = 100000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-standard-pricing',
model,
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const expectedCost =
promptTokens * tokenValues[model].prompt + completionTokens * tokenValues[model].completion;
const balance = await Balance.findOne({ user: userId });
expect(balance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
it('should charge premium rates for claude-opus-4-6 when prompt tokens exceed threshold', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const promptTokens = 250000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-premium-pricing',
model,
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const expectedCost =
promptTokens * premiumTokenValues[model].prompt +
completionTokens * premiumTokenValues[model].completion;
const balance = await Balance.findOne({ user: userId });
expect(balance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
it('should charge premium rates for both prompt and completion in structured tokens when above threshold', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-structured-premium',
model,
context: 'test',
balance: { enabled: true },
};
const tokenUsage = {
promptTokens: {
input: 200000,
write: 10000,
read: 5000,
},
completionTokens: 1000,
};
const result = await spendStructuredTokens(txData, tokenUsage);
const premiumPromptRate = premiumTokenValues[model].prompt;
const premiumCompletionRate = premiumTokenValues[model].completion;
const writeRate = getCacheMultiplier({ model, cacheType: 'write' });
const readRate = getCacheMultiplier({ model, cacheType: 'read' });
const expectedPromptCost =
tokenUsage.promptTokens.input * premiumPromptRate +
tokenUsage.promptTokens.write * writeRate +
tokenUsage.promptTokens.read * readRate;
const expectedCompletionCost = tokenUsage.completionTokens * premiumCompletionRate;
expect(result.prompt.prompt).toBeCloseTo(-expectedPromptCost, 0);
expect(result.completion.completion).toBeCloseTo(-expectedCompletionCost, 0);
});
it('should charge standard rates for structured tokens when below threshold', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-structured-standard',
model,
context: 'test',
balance: { enabled: true },
};
const tokenUsage = {
promptTokens: {
input: 50000,
write: 10000,
read: 5000,
},
completionTokens: 1000,
};
const result = await spendStructuredTokens(txData, tokenUsage);
const standardPromptRate = tokenValues[model].prompt;
const standardCompletionRate = tokenValues[model].completion;
const writeRate = getCacheMultiplier({ model, cacheType: 'write' });
const readRate = getCacheMultiplier({ model, cacheType: 'read' });
const expectedPromptCost =
tokenUsage.promptTokens.input * standardPromptRate +
tokenUsage.promptTokens.write * writeRate +
tokenUsage.promptTokens.read * readRate;
const expectedCompletionCost = tokenUsage.completionTokens * standardCompletionRate;
expect(result.prompt.prompt).toBeCloseTo(-expectedPromptCost, 0);
expect(result.completion.completion).toBeCloseTo(-expectedCompletionCost, 0);
});
it('should not apply premium pricing to non-premium models regardless of prompt size', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-5';
const promptTokens = 300000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-no-premium',
model,
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const expectedCost =
promptTokens * tokenValues[model].prompt + completionTokens * tokenValues[model].completion;
const balance = await Balance.findOne({ user: userId });
expect(balance.tokenCredits).toBeCloseTo(initialBalance - expectedCost, 0);
});
});
describe('inputTokenCount Normalization', () => {
it('should normalize negative promptTokens to zero for inputTokenCount', async () => {
await Balance.create({
user: userId,
tokenCredits: 100000000,
});
const txData = {
user: userId,
conversationId: 'test-negative-prompt',
model: 'claude-opus-4-6',
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens: -500, completionTokens: 100 });
const transactions = await Transaction.find({ user: userId }).sort({ tokenType: 1 });
const completionTx = transactions.find((t) => t.tokenType === 'completion');
const promptTx = transactions.find((t) => t.tokenType === 'prompt');
expect(Math.abs(promptTx.rawAmount)).toBe(0);
expect(completionTx.rawAmount).toBe(-100);
const standardCompletionRate = tokenValues['claude-opus-4-6'].completion;
expect(completionTx.rate).toBe(standardCompletionRate);
});
it('should use normalized inputTokenCount for premium threshold check on completion', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const promptTokens = 250000;
const completionTokens = 500;
const txData = {
user: userId,
conversationId: 'test-normalized-premium',
model,
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens, completionTokens });
const transactions = await Transaction.find({ user: userId }).sort({ tokenType: 1 });
const completionTx = transactions.find((t) => t.tokenType === 'completion');
const promptTx = transactions.find((t) => t.tokenType === 'prompt');
const premiumPromptRate = premiumTokenValues[model].prompt;
const premiumCompletionRate = premiumTokenValues[model].completion;
expect(promptTx.rate).toBe(premiumPromptRate);
expect(completionTx.rate).toBe(premiumCompletionRate);
});
it('should keep inputTokenCount as zero when promptTokens is zero', async () => {
await Balance.create({
user: userId,
tokenCredits: 100000000,
});
const txData = {
user: userId,
conversationId: 'test-zero-prompt',
model: 'claude-opus-4-6',
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens: 0, completionTokens: 100 });
const transactions = await Transaction.find({ user: userId }).sort({ tokenType: 1 });
const completionTx = transactions.find((t) => t.tokenType === 'completion');
const promptTx = transactions.find((t) => t.tokenType === 'prompt');
expect(Math.abs(promptTx.rawAmount)).toBe(0);
const standardCompletionRate = tokenValues['claude-opus-4-6'].completion;
expect(completionTx.rate).toBe(standardCompletionRate);
});
it('should not trigger premium pricing with negative promptTokens on premium model', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-negative-no-premium',
model,
context: 'test',
balance: { enabled: true },
};
await spendTokens(txData, { promptTokens: -300000, completionTokens: 500 });
const transactions = await Transaction.find({ user: userId }).sort({ tokenType: 1 });
const completionTx = transactions.find((t) => t.tokenType === 'completion');
const standardCompletionRate = tokenValues[model].completion;
expect(completionTx.rate).toBe(standardCompletionRate);
});
it('should normalize negative structured token values to zero in spendStructuredTokens', async () => {
const initialBalance = 100000000;
await Balance.create({
user: userId,
tokenCredits: initialBalance,
});
const model = 'claude-opus-4-6';
const txData = {
user: userId,
conversationId: 'test-negative-structured',
model,
context: 'test',
balance: { enabled: true },
};
const tokenUsage = {
promptTokens: { input: -100, write: 50, read: -30 },
completionTokens: -200,
};
await spendStructuredTokens(txData, tokenUsage);
const transactions = await Transaction.find({
user: userId,
conversationId: 'test-negative-structured',
}).sort({ tokenType: 1 });
const completionTx = transactions.find((t) => t.tokenType === 'completion');
const promptTx = transactions.find((t) => t.tokenType === 'prompt');
expect(Math.abs(promptTx.inputTokens)).toBe(0);
expect(promptTx.writeTokens).toBe(-50);
expect(Math.abs(promptTx.readTokens)).toBe(0);
expect(Math.abs(completionTx.rawAmount)).toBe(0);
const standardRate = tokenValues[model].completion;
expect(completionTx.rate).toBe(standardRate);
});
});
});

@@ -174,6 +174,7 @@ const tokenValues = Object.assign(
'claude-haiku-4-5': { prompt: 1, completion: 5 },
'claude-opus-4': { prompt: 15, completion: 75 },
'claude-opus-4-5': { prompt: 5, completion: 25 },
'claude-opus-4-6': { prompt: 5, completion: 25 },
'claude-sonnet-4': { prompt: 3, completion: 15 },
'command-r': { prompt: 0.5, completion: 1.5 },
'command-r-plus': { prompt: 3, completion: 15 },
@@ -310,6 +311,7 @@ const cacheTokenValues = {
'claude-sonnet-4': { write: 3.75, read: 0.3 },
'claude-opus-4': { write: 18.75, read: 1.5 },
'claude-opus-4-5': { write: 6.25, read: 0.5 },
'claude-opus-4-6': { write: 6.25, read: 0.5 },
// DeepSeek models - cache hit: $0.028/1M, cache miss: $0.28/1M
deepseek: { write: 0.28, read: 0.028 },
'deepseek-chat': { write: 0.28, read: 0.028 },
@@ -328,6 +330,15 @@
'kimi-k2-thinking-turbo': { write: 1.15, read: 0.15 },
};
/**
* Premium (tiered) pricing for models whose rates change based on prompt size.
* Each entry specifies the token threshold and the rates that apply above it.
* @type {Object.<string, {threshold: number, prompt: number, completion: number}>}
*/
const premiumTokenValues = {
'claude-opus-4-6': { threshold: 200000, prompt: 10, completion: 37.5 },
};
/**
* Retrieves the key associated with a given model name.
*
@@ -384,15 +395,27 @@ const getValueKey = (model, endpoint) => {
* @param {string} [params.model] - The model name to derive the value key from if not provided.
* @param {string} [params.endpoint] - The endpoint name to derive the value key from if not provided.
* @param {EndpointTokenConfig} [params.endpointTokenConfig] - The token configuration for the endpoint.
+* @param {number} [params.inputTokenCount] - Total input token count for tiered pricing.
* @returns {number} The multiplier for the given parameters, or a default value if not found.
*/
-const getMultiplier = ({ valueKey, tokenType, model, endpoint, endpointTokenConfig }) => {
+const getMultiplier = ({
+model,
+valueKey,
+endpoint,
+tokenType,
+inputTokenCount,
+endpointTokenConfig,
+}) => {
if (endpointTokenConfig) {
return endpointTokenConfig?.[model]?.[tokenType] ?? defaultRate;
}
if (valueKey && tokenType) {
-return tokenValues[valueKey][tokenType] ?? defaultRate;
+const premiumRate = getPremiumRate(valueKey, tokenType, inputTokenCount);
+if (premiumRate != null) {
+return premiumRate;
+}
+return tokenValues[valueKey]?.[tokenType] ?? defaultRate;
}
if (!tokenType || !model) {
@@ -404,10 +427,33 @@ const getMultiplier = ({ valueKey, tokenType, model, endpoint, endpointTokenConf
return defaultRate;
}
// If we got this far, and values[tokenType] is undefined somehow, return a rough average of default multipliers
const premiumRate = getPremiumRate(valueKey, tokenType, inputTokenCount);
if (premiumRate != null) {
return premiumRate;
}
return tokenValues[valueKey]?.[tokenType] ?? defaultRate;
};
/**
* Checks if premium (tiered) pricing applies and returns the premium rate.
* Each model defines its own threshold in `premiumTokenValues`.
* @param {string} valueKey
* @param {string} tokenType
* @param {number} [inputTokenCount]
* @returns {number|null}
*/
const getPremiumRate = (valueKey, tokenType, inputTokenCount) => {
if (inputTokenCount == null) {
return null;
}
const premiumEntry = premiumTokenValues[valueKey];
if (!premiumEntry || inputTokenCount <= premiumEntry.threshold) {
return null;
}
return premiumEntry[tokenType] ?? null;
};
/**
* Retrieves the cache multiplier for a given value key and token type. If no value key is provided,
* it attempts to derive it from the model name.
@@ -444,8 +490,10 @@ const getCacheMultiplier = ({ valueKey, cacheType, model, endpoint, endpointToke
module.exports = {
tokenValues,
premiumTokenValues,
getValueKey,
getMultiplier,
getPremiumRate,
getCacheMultiplier,
defaultRate,
cacheTokenValues,

@@ -1,3 +1,4 @@
/** Note: No hard-coded values should be used in this file. */
const { maxTokensMap } = require('@librechat/api');
const { EModelEndpoint } = require('librechat-data-provider');
const {
@@ -5,8 +6,10 @@ const {
tokenValues,
getValueKey,
getMultiplier,
getPremiumRate,
cacheTokenValues,
getCacheMultiplier,
premiumTokenValues,
} = require('./tx');
describe('getValueKey', () => {
@@ -239,6 +242,15 @@ describe('getMultiplier', () => {
expect(getMultiplier({ valueKey: '8k', tokenType: 'unknownType' })).toBe(defaultRate);
});
it('should return defaultRate if valueKey does not exist in tokenValues', () => {
expect(getMultiplier({ valueKey: 'non-existent-model', tokenType: 'prompt' })).toBe(
defaultRate,
);
expect(getMultiplier({ valueKey: 'non-existent-model', tokenType: 'completion' })).toBe(
defaultRate,
);
});
it('should derive the valueKey from the model if not provided', () => {
expect(getMultiplier({ tokenType: 'prompt', model: 'gpt-4-some-other-info' })).toBe(
tokenValues['8k'].prompt,
@@ -334,8 +346,6 @@ describe('getMultiplier', () => {
expect(getMultiplier({ model: 'openai/gpt-5.1', tokenType: 'prompt' })).toBe(
tokenValues['gpt-5.1'].prompt,
);
-expect(tokenValues['gpt-5.1'].prompt).toBe(1.25);
-expect(tokenValues['gpt-5.1'].completion).toBe(10);
});
it('should return the correct multiplier for gpt-5.2', () => {
@@ -348,8 +358,6 @@ describe('getMultiplier', () => {
expect(getMultiplier({ model: 'openai/gpt-5.2', tokenType: 'prompt' })).toBe(
tokenValues['gpt-5.2'].prompt,
);
-expect(tokenValues['gpt-5.2'].prompt).toBe(1.75);
-expect(tokenValues['gpt-5.2'].completion).toBe(14);
});
it('should return the correct multiplier for gpt-4o', () => {
@@ -815,8 +823,6 @@ describe('Deepseek Model Tests', () => {
expect(getMultiplier({ model: 'deepseek-chat', tokenType: 'completion' })).toBe(
tokenValues['deepseek-chat'].completion,
);
-expect(tokenValues['deepseek-chat'].prompt).toBe(0.28);
-expect(tokenValues['deepseek-chat'].completion).toBe(0.42);
});
it('should return correct pricing for deepseek-reasoner', () => {
@@ -826,8 +832,6 @@
expect(getMultiplier({ model: 'deepseek-reasoner', tokenType: 'completion' })).toBe(
tokenValues['deepseek-reasoner'].completion,
);
-expect(tokenValues['deepseek-reasoner'].prompt).toBe(0.28);
-expect(tokenValues['deepseek-reasoner'].completion).toBe(0.42);
});
it('should handle DeepSeek model name variations with provider prefixes', () => {
@@ -840,8 +844,8 @@
modelVariations.forEach((model) => {
const promptMultiplier = getMultiplier({ model, tokenType: 'prompt' });
const completionMultiplier = getMultiplier({ model, tokenType: 'completion' });
-expect(promptMultiplier).toBe(0.28);
-expect(completionMultiplier).toBe(0.42);
+expect(promptMultiplier).toBe(tokenValues['deepseek-chat'].prompt);
+expect(completionMultiplier).toBe(tokenValues['deepseek-chat'].completion);
});
});
@@ -860,13 +864,13 @@
);
});
-it('should return correct cache pricing values for DeepSeek models', () => {
-expect(cacheTokenValues['deepseek-chat'].write).toBe(0.28);
-expect(cacheTokenValues['deepseek-chat'].read).toBe(0.028);
-expect(cacheTokenValues['deepseek-reasoner'].write).toBe(0.28);
-expect(cacheTokenValues['deepseek-reasoner'].read).toBe(0.028);
-expect(cacheTokenValues['deepseek'].write).toBe(0.28);
-expect(cacheTokenValues['deepseek'].read).toBe(0.028);
+it('should have consistent cache pricing across DeepSeek model variants', () => {
+expect(cacheTokenValues['deepseek'].write).toBe(cacheTokenValues['deepseek-chat'].write);
+expect(cacheTokenValues['deepseek'].read).toBe(cacheTokenValues['deepseek-chat'].read);
+expect(cacheTokenValues['deepseek-reasoner'].write).toBe(
+cacheTokenValues['deepseek-chat'].write,
+);
+expect(cacheTokenValues['deepseek-reasoner'].read).toBe(cacheTokenValues['deepseek-chat'].read);
});
it('should handle DeepSeek cache multipliers with model variations', () => {
@@ -875,8 +879,8 @@
modelVariations.forEach((model) => {
const writeMultiplier = getCacheMultiplier({ model, cacheType: 'write' });
const readMultiplier = getCacheMultiplier({ model, cacheType: 'read' });
-expect(writeMultiplier).toBe(0.28);
-expect(readMultiplier).toBe(0.028);
+expect(writeMultiplier).toBe(cacheTokenValues['deepseek-chat'].write);
+expect(readMultiplier).toBe(cacheTokenValues['deepseek-chat'].read);
});
});
});
@@ -1876,6 +1880,201 @@ describe('Claude Model Tests', () => {
);
});
});
it('should return correct prompt and completion rates for Claude Opus 4.6', () => {
expect(getMultiplier({ model: 'claude-opus-4-6', tokenType: 'prompt' })).toBe(
tokenValues['claude-opus-4-6'].prompt,
);
expect(getMultiplier({ model: 'claude-opus-4-6', tokenType: 'completion' })).toBe(
tokenValues['claude-opus-4-6'].completion,
);
});
it('should handle Claude Opus 4.6 model name variations', () => {
const modelVariations = [
'claude-opus-4-6',
'claude-opus-4-6-20250801',
'claude-opus-4-6-latest',
'anthropic/claude-opus-4-6',
'claude-opus-4-6/anthropic',
'claude-opus-4-6-preview',
];
modelVariations.forEach((model) => {
const valueKey = getValueKey(model);
expect(valueKey).toBe('claude-opus-4-6');
expect(getMultiplier({ model, tokenType: 'prompt' })).toBe(
tokenValues['claude-opus-4-6'].prompt,
);
expect(getMultiplier({ model, tokenType: 'completion' })).toBe(
tokenValues['claude-opus-4-6'].completion,
);
});
});
it('should return correct cache rates for Claude Opus 4.6', () => {
expect(getCacheMultiplier({ model: 'claude-opus-4-6', cacheType: 'write' })).toBe(
cacheTokenValues['claude-opus-4-6'].write,
);
expect(getCacheMultiplier({ model: 'claude-opus-4-6', cacheType: 'read' })).toBe(
cacheTokenValues['claude-opus-4-6'].read,
);
});
it('should handle Claude Opus 4.6 cache rates with model name variations', () => {
const modelVariations = [
'claude-opus-4-6',
'claude-opus-4-6-20250801',
'claude-opus-4-6-latest',
'anthropic/claude-opus-4-6',
'claude-opus-4-6/anthropic',
'claude-opus-4-6-preview',
];
modelVariations.forEach((model) => {
expect(getCacheMultiplier({ model, cacheType: 'write' })).toBe(
cacheTokenValues['claude-opus-4-6'].write,
);
expect(getCacheMultiplier({ model, cacheType: 'read' })).toBe(
cacheTokenValues['claude-opus-4-6'].read,
);
});
});
});
describe('Premium Token Pricing', () => {
const premiumModel = 'claude-opus-4-6';
const premiumEntry = premiumTokenValues[premiumModel];
const { threshold } = premiumEntry;
const belowThreshold = threshold - 1;
const aboveThreshold = threshold + 1;
const wellAboveThreshold = threshold * 2;
it('should have premium pricing defined for claude-opus-4-6', () => {
expect(premiumEntry).toBeDefined();
expect(premiumEntry.threshold).toBeDefined();
expect(premiumEntry.prompt).toBeDefined();
expect(premiumEntry.completion).toBeDefined();
expect(premiumEntry.prompt).toBeGreaterThan(tokenValues[premiumModel].prompt);
expect(premiumEntry.completion).toBeGreaterThan(tokenValues[premiumModel].completion);
});
it('should return null from getPremiumRate when inputTokenCount is below threshold', () => {
expect(getPremiumRate(premiumModel, 'prompt', belowThreshold)).toBeNull();
expect(getPremiumRate(premiumModel, 'completion', belowThreshold)).toBeNull();
expect(getPremiumRate(premiumModel, 'prompt', threshold)).toBeNull();
});
it('should return premium rate from getPremiumRate when inputTokenCount exceeds threshold', () => {
expect(getPremiumRate(premiumModel, 'prompt', aboveThreshold)).toBe(premiumEntry.prompt);
expect(getPremiumRate(premiumModel, 'completion', aboveThreshold)).toBe(
premiumEntry.completion,
);
expect(getPremiumRate(premiumModel, 'prompt', wellAboveThreshold)).toBe(premiumEntry.prompt);
});
it('should return null from getPremiumRate when inputTokenCount is undefined or null', () => {
expect(getPremiumRate(premiumModel, 'prompt', undefined)).toBeNull();
expect(getPremiumRate(premiumModel, 'prompt', null)).toBeNull();
});
it('should return null from getPremiumRate for models without premium pricing', () => {
expect(getPremiumRate('claude-opus-4-5', 'prompt', wellAboveThreshold)).toBeNull();
expect(getPremiumRate('claude-sonnet-4', 'prompt', wellAboveThreshold)).toBeNull();
expect(getPremiumRate('gpt-4o', 'prompt', wellAboveThreshold)).toBeNull();
});
it('should return standard rate from getMultiplier when inputTokenCount is below threshold', () => {
expect(
getMultiplier({
model: premiumModel,
tokenType: 'prompt',
inputTokenCount: belowThreshold,
}),
).toBe(tokenValues[premiumModel].prompt);
expect(
getMultiplier({
model: premiumModel,
tokenType: 'completion',
inputTokenCount: belowThreshold,
}),
).toBe(tokenValues[premiumModel].completion);
});
it('should return premium rate from getMultiplier when inputTokenCount exceeds threshold', () => {
expect(
getMultiplier({
model: premiumModel,
tokenType: 'prompt',
inputTokenCount: aboveThreshold,
}),
).toBe(premiumEntry.prompt);
expect(
getMultiplier({
model: premiumModel,
tokenType: 'completion',
inputTokenCount: aboveThreshold,
}),
).toBe(premiumEntry.completion);
});
it('should return standard rate from getMultiplier when inputTokenCount is exactly at threshold', () => {
expect(
getMultiplier({ model: premiumModel, tokenType: 'prompt', inputTokenCount: threshold }),
).toBe(tokenValues[premiumModel].prompt);
});
it('should return premium rate from getMultiplier when inputTokenCount is one above threshold', () => {
expect(
getMultiplier({ model: premiumModel, tokenType: 'prompt', inputTokenCount: aboveThreshold }),
).toBe(premiumEntry.prompt);
});
it('should not apply premium pricing to models without premium entries', () => {
expect(
getMultiplier({
model: 'claude-opus-4-5',
tokenType: 'prompt',
inputTokenCount: wellAboveThreshold,
}),
).toBe(tokenValues['claude-opus-4-5'].prompt);
expect(
getMultiplier({
model: 'claude-sonnet-4',
tokenType: 'prompt',
inputTokenCount: wellAboveThreshold,
}),
).toBe(tokenValues['claude-sonnet-4'].prompt);
});
it('should use standard rate when inputTokenCount is not provided', () => {
expect(getMultiplier({ model: premiumModel, tokenType: 'prompt' })).toBe(
tokenValues[premiumModel].prompt,
);
expect(getMultiplier({ model: premiumModel, tokenType: 'completion' })).toBe(
tokenValues[premiumModel].completion,
);
});
it('should apply premium pricing through getMultiplier with valueKey path', () => {
const valueKey = getValueKey(premiumModel);
expect(getMultiplier({ valueKey, tokenType: 'prompt', inputTokenCount: aboveThreshold })).toBe(
premiumEntry.prompt,
);
expect(
getMultiplier({ valueKey, tokenType: 'completion', inputTokenCount: aboveThreshold }),
).toBe(premiumEntry.completion);
});
it('should apply standard pricing through getMultiplier with valueKey path when below threshold', () => {
const valueKey = getValueKey(premiumModel);
expect(getMultiplier({ valueKey, tokenType: 'prompt', inputTokenCount: belowThreshold })).toBe(
tokenValues[premiumModel].prompt,
);
expect(
getMultiplier({ valueKey, tokenType: 'completion', inputTokenCount: belowThreshold }),
).toBe(tokenValues[premiumModel].completion);
});
});
describe('tokens.ts and tx.js sync validation', () => {

@@ -34,8 +34,7 @@
},
"homepage": "https://librechat.ai",
"dependencies": {
-"@anthropic-ai/sdk": "^0.71.0",
-"@anthropic-ai/vertex-sdk": "^0.14.0",
+"@anthropic-ai/vertex-sdk": "^0.14.3",
"@aws-sdk/client-bedrock-runtime": "^3.980.0",
"@aws-sdk/client-s3": "^3.980.0",
"@aws-sdk/s3-request-presigner": "^3.758.0",
@@ -45,7 +44,7 @@
"@google/genai": "^1.19.0",
"@keyv/redis": "^4.3.3",
"@langchain/core": "^0.3.80",
-"@librechat/agents": "^3.1.33",
+"@librechat/agents": "^3.1.36",
"@librechat/api": "*",
"@librechat/data-schemas": "*",
"@microsoft/microsoft-graph-client": "^3.0.7",

@@ -1,3 +1,4 @@
/** Note: No hard-coded values should be used in this file. */
const { EModelEndpoint } = require('librechat-data-provider');
const {
maxTokensMap,
@@ -626,41 +627,45 @@ describe('matchModelName', () => {
describe('Meta Models Tests', () => {
describe('getModelMaxTokens', () => {
test('should return correct tokens for LLaMa 2 models', () => {
-expect(getModelMaxTokens('llama2')).toBe(4000);
-expect(getModelMaxTokens('llama2.70b')).toBe(4000);
-expect(getModelMaxTokens('llama2-13b')).toBe(4000);
-expect(getModelMaxTokens('llama2-70b')).toBe(4000);
+const llama2Tokens = maxTokensMap[EModelEndpoint.openAI]['llama2'];
+expect(getModelMaxTokens('llama2')).toBe(llama2Tokens);
+expect(getModelMaxTokens('llama2.70b')).toBe(llama2Tokens);
+expect(getModelMaxTokens('llama2-13b')).toBe(llama2Tokens);
+expect(getModelMaxTokens('llama2-70b')).toBe(llama2Tokens);
});
test('should return correct tokens for LLaMa 3 models', () => {
-expect(getModelMaxTokens('llama3')).toBe(8000);
-expect(getModelMaxTokens('llama3.8b')).toBe(8000);
-expect(getModelMaxTokens('llama3.70b')).toBe(8000);
-expect(getModelMaxTokens('llama3-8b')).toBe(8000);
-expect(getModelMaxTokens('llama3-70b')).toBe(8000);
+const llama3Tokens = maxTokensMap[EModelEndpoint.openAI]['llama3'];
+expect(getModelMaxTokens('llama3')).toBe(llama3Tokens);
+expect(getModelMaxTokens('llama3.8b')).toBe(llama3Tokens);
+expect(getModelMaxTokens('llama3.70b')).toBe(llama3Tokens);
+expect(getModelMaxTokens('llama3-8b')).toBe(llama3Tokens);
+expect(getModelMaxTokens('llama3-70b')).toBe(llama3Tokens);
});
test('should return correct tokens for LLaMa 3.1 models', () => {
-expect(getModelMaxTokens('llama3.1:8b')).toBe(127500);
-expect(getModelMaxTokens('llama3.1:70b')).toBe(127500);
-expect(getModelMaxTokens('llama3.1:405b')).toBe(127500);
-expect(getModelMaxTokens('llama3-1-8b')).toBe(127500);
-expect(getModelMaxTokens('llama3-1-70b')).toBe(127500);
-expect(getModelMaxTokens('llama3-1-405b')).toBe(127500);
+const llama31Tokens = maxTokensMap[EModelEndpoint.openAI]['llama3.1:8b'];
+expect(getModelMaxTokens('llama3.1:8b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('llama3.1:70b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('llama3.1:405b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('llama3-1-8b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('llama3-1-70b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('llama3-1-405b')).toBe(llama31Tokens);
});
test('should handle partial matches for Meta models', () => {
-// Test with full model names
-expect(getModelMaxTokens('meta/llama3.1:405b')).toBe(127500);
-expect(getModelMaxTokens('meta/llama3.1:70b')).toBe(127500);
-expect(getModelMaxTokens('meta/llama3.1:8b')).toBe(127500);
-expect(getModelMaxTokens('meta/llama3-1-8b')).toBe(127500);
+const llama31Tokens = maxTokensMap[EModelEndpoint.openAI]['llama3.1:8b'];
+const llama3Tokens = maxTokensMap[EModelEndpoint.openAI]['llama3'];
+const llama2Tokens = maxTokensMap[EModelEndpoint.openAI]['llama2'];
+expect(getModelMaxTokens('meta/llama3.1:405b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('meta/llama3.1:70b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('meta/llama3.1:8b')).toBe(llama31Tokens);
+expect(getModelMaxTokens('meta/llama3-1-8b')).toBe(llama31Tokens);
-// Test base versions
-expect(getModelMaxTokens('meta/llama3.1')).toBe(127500);
-expect(getModelMaxTokens('meta/llama3-1')).toBe(127500);
-expect(getModelMaxTokens('meta/llama3')).toBe(8000);
-expect(getModelMaxTokens('meta/llama2')).toBe(4000);
+expect(getModelMaxTokens('meta/llama3.1')).toBe(llama31Tokens);
+expect(getModelMaxTokens('meta/llama3-1')).toBe(llama31Tokens);
+expect(getModelMaxTokens('meta/llama3')).toBe(llama3Tokens);
+expect(getModelMaxTokens('meta/llama2')).toBe(llama2Tokens);
});
test('should match Deepseek model variations', () => {
@@ -678,18 +683,33 @@ describe('Meta Models Tests', () => {
);
});
-test('should return 128000 context tokens for all DeepSeek models', () => {
-expect(getModelMaxTokens('deepseek-chat')).toBe(128000);
-expect(getModelMaxTokens('deepseek-reasoner')).toBe(128000);
-expect(getModelMaxTokens('deepseek-r1')).toBe(128000);
-expect(getModelMaxTokens('deepseek-v3')).toBe(128000);
-expect(getModelMaxTokens('deepseek.r1')).toBe(128000);
+test('should return correct context tokens for all DeepSeek models', () => {
+const deepseekChatTokens = maxTokensMap[EModelEndpoint.openAI]['deepseek-chat'];
+expect(getModelMaxTokens('deepseek-chat')).toBe(deepseekChatTokens);
+expect(getModelMaxTokens('deepseek-reasoner')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek-reasoner'],
+);
+expect(getModelMaxTokens('deepseek-r1')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek-r1'],
+);
+expect(getModelMaxTokens('deepseek-v3')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek'],
+);
+expect(getModelMaxTokens('deepseek.r1')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek.r1'],
+);
});
test('should handle DeepSeek models with provider prefixes', () => {
-expect(getModelMaxTokens('deepseek/deepseek-chat')).toBe(128000);
-expect(getModelMaxTokens('openrouter/deepseek-reasoner')).toBe(128000);
-expect(getModelMaxTokens('openai/deepseek-v3')).toBe(128000);
+expect(getModelMaxTokens('deepseek/deepseek-chat')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek-chat'],
+);
+expect(getModelMaxTokens('openrouter/deepseek-reasoner')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek-reasoner'],
+);
+expect(getModelMaxTokens('openai/deepseek-v3')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['deepseek'],
+);
});
});
@@ -728,30 +748,38 @@ describe('Meta Models Tests', () => {
const { getModelMaxOutputTokens } = require('@librechat/api');
test('should return correct max output tokens for deepseek-chat', () => {
-expect(getModelMaxOutputTokens('deepseek-chat')).toBe(8000);
-expect(getModelMaxOutputTokens('deepseek-chat', EModelEndpoint.openAI)).toBe(8000);
-expect(getModelMaxOutputTokens('deepseek-chat', EModelEndpoint.custom)).toBe(8000);
+const expected = maxOutputTokensMap[EModelEndpoint.openAI]['deepseek-chat'];
+expect(getModelMaxOutputTokens('deepseek-chat')).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-chat', EModelEndpoint.openAI)).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-chat', EModelEndpoint.custom)).toBe(expected);
});
test('should return correct max output tokens for deepseek-reasoner', () => {
-expect(getModelMaxOutputTokens('deepseek-reasoner')).toBe(64000);
-expect(getModelMaxOutputTokens('deepseek-reasoner', EModelEndpoint.openAI)).toBe(64000);
-expect(getModelMaxOutputTokens('deepseek-reasoner', EModelEndpoint.custom)).toBe(64000);
+const expected = maxOutputTokensMap[EModelEndpoint.openAI]['deepseek-reasoner'];
+expect(getModelMaxOutputTokens('deepseek-reasoner')).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-reasoner', EModelEndpoint.openAI)).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-reasoner', EModelEndpoint.custom)).toBe(expected);
});
test('should return correct max output tokens for deepseek-r1', () => {
-expect(getModelMaxOutputTokens('deepseek-r1')).toBe(64000);
-expect(getModelMaxOutputTokens('deepseek-r1', EModelEndpoint.openAI)).toBe(64000);
+const expected = maxOutputTokensMap[EModelEndpoint.openAI]['deepseek-r1'];
+expect(getModelMaxOutputTokens('deepseek-r1')).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-r1', EModelEndpoint.openAI)).toBe(expected);
});
test('should return correct max output tokens for deepseek base pattern', () => {
-expect(getModelMaxOutputTokens('deepseek')).toBe(8000);
-expect(getModelMaxOutputTokens('deepseek-v3')).toBe(8000);
+const expected = maxOutputTokensMap[EModelEndpoint.openAI]['deepseek'];
+expect(getModelMaxOutputTokens('deepseek')).toBe(expected);
+expect(getModelMaxOutputTokens('deepseek-v3')).toBe(expected);
});
test('should handle DeepSeek models with provider prefixes for max output tokens', () => {
-expect(getModelMaxOutputTokens('deepseek/deepseek-chat')).toBe(8000);
-expect(getModelMaxOutputTokens('openrouter/deepseek-reasoner')).toBe(64000);
+expect(getModelMaxOutputTokens('deepseek/deepseek-chat')).toBe(
+maxOutputTokensMap[EModelEndpoint.openAI]['deepseek-chat'],
+);
+expect(getModelMaxOutputTokens('openrouter/deepseek-reasoner')).toBe(
+maxOutputTokensMap[EModelEndpoint.openAI]['deepseek-reasoner'],
+);
});
});
@@ -796,68 +824,90 @@ describe('Meta Models Tests', () => {
describe('Grok Model Tests - Tokens', () => {
describe('getModelMaxTokens', () => {
test('should return correct tokens for Grok vision models', () => {
-expect(getModelMaxTokens('grok-2-vision-1212')).toBe(32768);
-expect(getModelMaxTokens('grok-2-vision')).toBe(32768);
-expect(getModelMaxTokens('grok-2-vision-latest')).toBe(32768);
+const grok2VisionTokens = maxTokensMap[EModelEndpoint.openAI]['grok-2-vision'];
+expect(getModelMaxTokens('grok-2-vision-1212')).toBe(grok2VisionTokens);
+expect(getModelMaxTokens('grok-2-vision')).toBe(grok2VisionTokens);
+expect(getModelMaxTokens('grok-2-vision-latest')).toBe(grok2VisionTokens);
});
test('should return correct tokens for Grok beta models', () => {
-expect(getModelMaxTokens('grok-vision-beta')).toBe(8192);
-expect(getModelMaxTokens('grok-beta')).toBe(131072);
+expect(getModelMaxTokens('grok-vision-beta')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['grok-vision-beta'],
+);
+expect(getModelMaxTokens('grok-beta')).toBe(maxTokensMap[EModelEndpoint.openAI]['grok-beta']);
});
test('should return correct tokens for Grok text models', () => {
-expect(getModelMaxTokens('grok-2-1212')).toBe(131072);
-expect(getModelMaxTokens('grok-2')).toBe(131072);
-expect(getModelMaxTokens('grok-2-latest')).toBe(131072);
+const grok2Tokens = maxTokensMap[EModelEndpoint.openAI]['grok-2'];
+expect(getModelMaxTokens('grok-2-1212')).toBe(grok2Tokens);
+expect(getModelMaxTokens('grok-2')).toBe(grok2Tokens);
+expect(getModelMaxTokens('grok-2-latest')).toBe(grok2Tokens);
});
test('should return correct tokens for Grok 3 series models', () => {
-expect(getModelMaxTokens('grok-3')).toBe(131072);
-expect(getModelMaxTokens('grok-3-fast')).toBe(131072);
-expect(getModelMaxTokens('grok-3-mini')).toBe(131072);
-expect(getModelMaxTokens('grok-3-mini-fast')).toBe(131072);
+expect(getModelMaxTokens('grok-3')).toBe(maxTokensMap[EModelEndpoint.openAI]['grok-3']);
+expect(getModelMaxTokens('grok-3-fast')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['grok-3-fast'],
+);
+expect(getModelMaxTokens('grok-3-mini')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['grok-3-mini'],
+);
+expect(getModelMaxTokens('grok-3-mini-fast')).toBe(
+maxTokensMap[EModelEndpoint.openAI]['grok-3-mini-fast'],
+);
});
test('should return correct tokens for Grok 4 model', () => {
expect(getModelMaxTokens('grok-4-0709')).toBe(256000);
expect(getModelMaxTokens('grok-4-0709')).toBe(maxTokensMap[EModelEndpoint.openAI]['grok-4']);
});
test('should return correct tokens for Grok 4 Fast and Grok 4.1 Fast models', () => {
expect(getModelMaxTokens('grok-4-fast')).toBe(2000000);
expect(getModelMaxTokens('grok-4-1-fast-reasoning')).toBe(2000000);
expect(getModelMaxTokens('grok-4-1-fast-non-reasoning')).toBe(2000000);
const grok4FastTokens = maxTokensMap[EModelEndpoint.openAI]['grok-4-fast'];
const grok41FastTokens = maxTokensMap[EModelEndpoint.openAI]['grok-4-1-fast'];
expect(getModelMaxTokens('grok-4-fast')).toBe(grok4FastTokens);
expect(getModelMaxTokens('grok-4-1-fast-reasoning')).toBe(grok41FastTokens);
expect(getModelMaxTokens('grok-4-1-fast-non-reasoning')).toBe(grok41FastTokens);
});
test('should return correct tokens for Grok Code Fast model', () => {
expect(getModelMaxTokens('grok-code-fast-1')).toBe(256000);
expect(getModelMaxTokens('grok-code-fast-1')).toBe(
maxTokensMap[EModelEndpoint.openAI]['grok-code-fast'],
);
});
test('should handle partial matches for Grok models with prefixes', () => {
// Vision models should match before general models
expect(getModelMaxTokens('xai/grok-2-vision-1212')).toBe(32768);
expect(getModelMaxTokens('xai/grok-2-vision')).toBe(32768);
expect(getModelMaxTokens('xai/grok-2-vision-latest')).toBe(32768);
// Beta models
expect(getModelMaxTokens('xai/grok-vision-beta')).toBe(8192);
expect(getModelMaxTokens('xai/grok-beta')).toBe(131072);
// Text models
expect(getModelMaxTokens('xai/grok-2-1212')).toBe(131072);
expect(getModelMaxTokens('xai/grok-2')).toBe(131072);
expect(getModelMaxTokens('xai/grok-2-latest')).toBe(131072);
// Grok 3 models
expect(getModelMaxTokens('xai/grok-3')).toBe(131072);
expect(getModelMaxTokens('xai/grok-3-fast')).toBe(131072);
expect(getModelMaxTokens('xai/grok-3-mini')).toBe(131072);
expect(getModelMaxTokens('xai/grok-3-mini-fast')).toBe(131072);
// Grok 4 model
expect(getModelMaxTokens('xai/grok-4-0709')).toBe(256000);
// Grok 4 Fast and 4.1 Fast models
expect(getModelMaxTokens('xai/grok-4-fast')).toBe(2000000);
expect(getModelMaxTokens('xai/grok-4-1-fast-reasoning')).toBe(2000000);
expect(getModelMaxTokens('xai/grok-4-1-fast-non-reasoning')).toBe(2000000);
// Grok Code Fast model
expect(getModelMaxTokens('xai/grok-code-fast-1')).toBe(256000);
const grok2VisionTokens = maxTokensMap[EModelEndpoint.openAI]['grok-2-vision'];
const grokVisionBetaTokens = maxTokensMap[EModelEndpoint.openAI]['grok-vision-beta'];
const grokBetaTokens = maxTokensMap[EModelEndpoint.openAI]['grok-beta'];
const grok2Tokens = maxTokensMap[EModelEndpoint.openAI]['grok-2'];
const grok3Tokens = maxTokensMap[EModelEndpoint.openAI]['grok-3'];
const grok4Tokens = maxTokensMap[EModelEndpoint.openAI]['grok-4'];
const grok4FastTokens = maxTokensMap[EModelEndpoint.openAI]['grok-4-fast'];
const grok41FastTokens = maxTokensMap[EModelEndpoint.openAI]['grok-4-1-fast'];
const grokCodeFastTokens = maxTokensMap[EModelEndpoint.openAI]['grok-code-fast'];
expect(getModelMaxTokens('xai/grok-2-vision-1212')).toBe(grok2VisionTokens);
expect(getModelMaxTokens('xai/grok-2-vision')).toBe(grok2VisionTokens);
expect(getModelMaxTokens('xai/grok-2-vision-latest')).toBe(grok2VisionTokens);
expect(getModelMaxTokens('xai/grok-vision-beta')).toBe(grokVisionBetaTokens);
expect(getModelMaxTokens('xai/grok-beta')).toBe(grokBetaTokens);
expect(getModelMaxTokens('xai/grok-2-1212')).toBe(grok2Tokens);
expect(getModelMaxTokens('xai/grok-2')).toBe(grok2Tokens);
expect(getModelMaxTokens('xai/grok-2-latest')).toBe(grok2Tokens);
expect(getModelMaxTokens('xai/grok-3')).toBe(grok3Tokens);
expect(getModelMaxTokens('xai/grok-3-fast')).toBe(
maxTokensMap[EModelEndpoint.openAI]['grok-3-fast'],
);
expect(getModelMaxTokens('xai/grok-3-mini')).toBe(
maxTokensMap[EModelEndpoint.openAI]['grok-3-mini'],
);
expect(getModelMaxTokens('xai/grok-3-mini-fast')).toBe(
maxTokensMap[EModelEndpoint.openAI]['grok-3-mini-fast'],
);
expect(getModelMaxTokens('xai/grok-4-0709')).toBe(grok4Tokens);
expect(getModelMaxTokens('xai/grok-4-fast')).toBe(grok4FastTokens);
expect(getModelMaxTokens('xai/grok-4-1-fast-reasoning')).toBe(grok41FastTokens);
expect(getModelMaxTokens('xai/grok-4-1-fast-non-reasoning')).toBe(grok41FastTokens);
expect(getModelMaxTokens('xai/grok-code-fast-1')).toBe(grokCodeFastTokens);
});
});
@ -1062,6 +1112,56 @@ describe('Claude Model Tests', () => {
expect(matchModelName(model, EModelEndpoint.anthropic)).toBe(expectedModel);
});
});
it('should return correct context length for Claude Opus 4.6 (1M)', () => {
expect(getModelMaxTokens('claude-opus-4-6', EModelEndpoint.anthropic)).toBe(
maxTokensMap[EModelEndpoint.anthropic]['claude-opus-4-6'],
);
expect(getModelMaxTokens('claude-opus-4-6')).toBe(
maxTokensMap[EModelEndpoint.anthropic]['claude-opus-4-6'],
);
});
it('should return correct max output tokens for Claude Opus 4.6 (128K)', () => {
const { getModelMaxOutputTokens } = require('@librechat/api');
expect(getModelMaxOutputTokens('claude-opus-4-6', EModelEndpoint.anthropic)).toBe(
maxOutputTokensMap[EModelEndpoint.anthropic]['claude-opus-4-6'],
);
});
it('should handle Claude Opus 4.6 model name variations', () => {
const modelVariations = [
'claude-opus-4-6',
'claude-opus-4-6-20250801',
'claude-opus-4-6-latest',
'anthropic/claude-opus-4-6',
'claude-opus-4-6/anthropic',
'claude-opus-4-6-preview',
];
modelVariations.forEach((model) => {
const modelKey = findMatchingPattern(model, maxTokensMap[EModelEndpoint.anthropic]);
expect(modelKey).toBe('claude-opus-4-6');
expect(getModelMaxTokens(model, EModelEndpoint.anthropic)).toBe(
maxTokensMap[EModelEndpoint.anthropic]['claude-opus-4-6'],
);
});
});
it('should match model names correctly for Claude Opus 4.6', () => {
const modelVariations = [
'claude-opus-4-6',
'claude-opus-4-6-20250801',
'claude-opus-4-6-latest',
'anthropic/claude-opus-4-6',
'claude-opus-4-6/anthropic',
'claude-opus-4-6-preview',
];
modelVariations.forEach((model) => {
expect(matchModelName(model, EModelEndpoint.anthropic)).toBe('claude-opus-4-6');
});
});
});
describe('Moonshot/Kimi Model Tests', () => {
@ -1329,44 +1429,80 @@ describe('Qwen3 Model Tests', () => {
describe('GLM Model Tests (Zhipu AI)', () => {
describe('getModelMaxTokens', () => {
test('should return correct tokens for GLM models', () => {
expect(getModelMaxTokens('glm-4.6')).toBe(200000);
expect(getModelMaxTokens('glm-4.5v')).toBe(66000);
expect(getModelMaxTokens('glm-4.5-air')).toBe(131000);
expect(getModelMaxTokens('glm-4.5')).toBe(131000);
expect(getModelMaxTokens('glm-4-32b')).toBe(128000);
expect(getModelMaxTokens('glm-4')).toBe(128000);
expect(getModelMaxTokens('glm4')).toBe(128000);
expect(getModelMaxTokens('glm-4.6')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.6']);
expect(getModelMaxTokens('glm-4.5v')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.5v']);
expect(getModelMaxTokens('glm-4.5-air')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
expect(getModelMaxTokens('glm-4.5')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.5']);
expect(getModelMaxTokens('glm-4-32b')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4-32b']);
expect(getModelMaxTokens('glm-4')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4']);
expect(getModelMaxTokens('glm4')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm4']);
});
test('should handle partial matches for GLM models with provider prefixes', () => {
expect(getModelMaxTokens('z-ai/glm-4.6')).toBe(200000);
expect(getModelMaxTokens('z-ai/glm-4.5')).toBe(131000);
expect(getModelMaxTokens('z-ai/glm-4.5-air')).toBe(131000);
expect(getModelMaxTokens('z-ai/glm-4.5v')).toBe(66000);
expect(getModelMaxTokens('z-ai/glm-4-32b')).toBe(128000);
expect(getModelMaxTokens('z-ai/glm-4.6')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.6'],
);
expect(getModelMaxTokens('z-ai/glm-4.5')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5'],
);
expect(getModelMaxTokens('z-ai/glm-4.5-air')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
expect(getModelMaxTokens('z-ai/glm-4.5v')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5v'],
);
expect(getModelMaxTokens('z-ai/glm-4-32b')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4-32b'],
);
expect(getModelMaxTokens('zai/glm-4.6')).toBe(200000);
expect(getModelMaxTokens('zai/glm-4.5')).toBe(131000);
expect(getModelMaxTokens('zai/glm-4.5-air')).toBe(131000);
expect(getModelMaxTokens('zai/glm-4.5v')).toBe(66000);
expect(getModelMaxTokens('zai/glm-4.6')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.6']);
expect(getModelMaxTokens('zai/glm-4.5')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.5']);
expect(getModelMaxTokens('zai/glm-4.5-air')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
expect(getModelMaxTokens('zai/glm-4.5v')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5v'],
);
expect(getModelMaxTokens('zai-org/GLM-4.6')).toBe(200000);
expect(getModelMaxTokens('zai-org/GLM-4.5')).toBe(131000);
expect(getModelMaxTokens('zai-org/GLM-4.5-Air')).toBe(131000);
expect(getModelMaxTokens('zai-org/GLM-4.5V')).toBe(66000);
expect(getModelMaxTokens('zai-org/GLM-4-32B-0414')).toBe(128000);
expect(getModelMaxTokens('zai-org/GLM-4.6')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.6'],
);
expect(getModelMaxTokens('zai-org/GLM-4.5')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5'],
);
expect(getModelMaxTokens('zai-org/GLM-4.5-Air')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
expect(getModelMaxTokens('zai-org/GLM-4.5V')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5v'],
);
expect(getModelMaxTokens('zai-org/GLM-4-32B-0414')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4-32b'],
);
});
test('should handle GLM model variations with suffixes', () => {
expect(getModelMaxTokens('glm-4.6-fp8')).toBe(200000);
expect(getModelMaxTokens('zai-org/GLM-4.6-FP8')).toBe(200000);
expect(getModelMaxTokens('zai-org/GLM-4.5-Air-FP8')).toBe(131000);
expect(getModelMaxTokens('glm-4.6-fp8')).toBe(maxTokensMap[EModelEndpoint.openAI]['glm-4.6']);
expect(getModelMaxTokens('zai-org/GLM-4.6-FP8')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.6'],
);
expect(getModelMaxTokens('zai-org/GLM-4.5-Air-FP8')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
});
test('should prioritize more specific GLM patterns', () => {
expect(getModelMaxTokens('glm-4.5-air-custom')).toBe(131000);
expect(getModelMaxTokens('glm-4.5-custom')).toBe(131000);
expect(getModelMaxTokens('glm-4.5v-custom')).toBe(66000);
expect(getModelMaxTokens('glm-4.5-air-custom')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5-air'],
);
expect(getModelMaxTokens('glm-4.5-custom')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5'],
);
expect(getModelMaxTokens('glm-4.5v-custom')).toBe(
maxTokensMap[EModelEndpoint.openAI]['glm-4.5v'],
);
});
});



@ -148,7 +148,6 @@
"jest-file-loader": "^1.0.3",
"jest-junit": "^16.0.0",
"postcss": "^8.4.31",
"postcss-loader": "^7.1.0",
"postcss-preset-env": "^8.2.0",
"tailwindcss": "^3.4.1",
"typescript": "^5.3.3",


@ -224,10 +224,11 @@
"com_endpoint_agent": "Agent",
"com_endpoint_agent_placeholder": "Please select an Agent",
"com_endpoint_ai": "AI",
"com_endpoint_anthropic_effort": "Controls how much computational effort Claude applies. Lower effort saves tokens and reduces latency; higher effort produces more thorough responses. 'Max' enables the deepest reasoning (Opus 4.6 only).",
"com_endpoint_anthropic_maxoutputtokens": "Maximum number of tokens that can be generated in the response. Specify a lower value for shorter responses and a higher value for longer responses. Note: models may stop before reaching this maximum.",
"com_endpoint_anthropic_prompt_cache": "Prompt caching allows reusing large context or instructions across API calls, reducing costs and latency",
"com_endpoint_anthropic_temp": "Ranges from 0 to 1. Use temp closer to 0 for analytical / multiple choice, and closer to 1 for creative and generative tasks. We recommend altering this or Top P but not both.",
"com_endpoint_anthropic_thinking": "Enables internal reasoning for supported Claude models (3.7 Sonnet). Note: requires \"Thinking Budget\" to be set and lower than \"Max Output Tokens\"",
"com_endpoint_anthropic_thinking": "Enables internal reasoning for supported Claude models. For Opus 4.6, uses adaptive thinking controlled by the Effort parameter. For other models, requires \"Thinking Budget\" to be set and lower than \"Max Output Tokens\".",
"com_endpoint_anthropic_thinking_budget": "Determines the max number of tokens Claude is allowed use for its internal reasoning process. Larger budgets can improve response quality by enabling more thorough analysis for complex problems, although Claude may not use the entire budget allocated, especially at ranges above 32K. This setting must be lower than \"Max Output Tokens.\"",
"com_endpoint_anthropic_topk": "Top-k changes how the model selects tokens for output. A top-k of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).",
"com_endpoint_anthropic_topp": "Top-p changes how the model selects tokens for output. Tokens are selected from most K (see topK parameter) probable to least until the sum of their probabilities equals the top-p value.",
@ -265,6 +266,7 @@
"com_endpoint_default_with_num": "default: {{0}}",
"com_endpoint_disable_streaming": "Disable streaming responses and receive the complete response at once. Useful for models like o3 that require organization verification for streaming",
"com_endpoint_disable_streaming_label": "Disable Streaming",
"com_endpoint_effort": "Effort",
"com_endpoint_examples": " Presets",
"com_endpoint_export": "Export",
"com_endpoint_export_share": "Export/Share",
@ -1082,6 +1084,7 @@
"com_ui_manage": "Manage",
"com_ui_marketplace": "Marketplace",
"com_ui_marketplace_allow_use": "Allow using Marketplace",
"com_ui_max": "Max",
"com_ui_max_favorites_reached": "Maximum pinned items reached ({{0}}). Unpin an item to add more.",
"com_ui_max_file_size": "PNG, JPG or JPEG (max {{0}})",
"com_ui_max_tags": "Maximum number allowed is {{0}}, using latest values.",

package-lock.json (generated): file diff suppressed because it is too large (1402 changed lines).


@ -131,6 +131,15 @@
"typescript-eslint": "^8.24.0"
},
"overrides": {
"@anthropic-ai/sdk": "0.73.0",
"@librechat/agents": {
"@langchain/anthropic": {
"@anthropic-ai/sdk": "0.73.0",
"fast-xml-parser": "5.3.4"
},
"@anthropic-ai/sdk": "0.73.0",
"fast-xml-parser": "5.3.4"
},
"axios": "1.12.1",
"elliptic": "^6.6.1",
"fast-xml-parser": "5.3.4",


@ -78,7 +78,7 @@
"registry": "https://registry.npmjs.org/"
},
"peerDependencies": {
"@anthropic-ai/vertex-sdk": "^0.14.0",
"@anthropic-ai/vertex-sdk": "^0.14.3",
"@aws-sdk/client-bedrock-runtime": "^3.970.0",
"@aws-sdk/client-s3": "^3.980.0",
"@azure/identity": "^4.7.0",
@ -87,7 +87,7 @@
"@google/genai": "^1.19.0",
"@keyv/redis": "^4.3.3",
"@langchain/core": "^0.3.80",
"@librechat/agents": "^3.1.33",
"@librechat/agents": "^3.1.36",
"@librechat/data-schemas": "*",
"@modelcontextprotocol/sdk": "^1.26.0",
"@smithy/node-http-handler": "^4.4.5",


@ -1,6 +1,12 @@
import { logger } from '@librechat/data-schemas';
import { AnthropicClientOptions } from '@librechat/agents';
import { EModelEndpoint, anthropicSettings } from 'librechat-data-provider';
import {
EModelEndpoint,
AnthropicEffort,
anthropicSettings,
supportsContext1m,
supportsAdaptiveThinking,
} from 'librechat-data-provider';
import { matchModelName } from '~/utils/tokens';
/**
@ -48,7 +54,7 @@ function getClaudeHeaders(
return {
'anthropic-beta': 'token-efficient-tools-2025-02-19,output-128k-2025-02-19',
};
} else if (/claude-sonnet-4/.test(model)) {
} else if (supportsContext1m(model)) {
return {
'anthropic-beta': 'context-1m-2025-08-07',
};
@ -58,25 +64,43 @@ function getClaudeHeaders(
}
/**
* Configures reasoning-related options for Claude models
* @param {AnthropicClientOptions & { max_tokens?: number }} anthropicInput The request options object
* @param {Object} extendedOptions Additional client configuration options
* @param {boolean} extendedOptions.thinking Whether thinking is enabled in client config
* @param {number|null} extendedOptions.thinkingBudget The token budget for thinking
* @returns {Object} Updated request options
* Configures reasoning-related options for Claude models.
* Models supporting adaptive thinking (Opus 4.6+, Sonnet 5+) use effort control instead of manual budget_tokens.
*/
function configureReasoning(
anthropicInput: AnthropicClientOptions & { max_tokens?: number },
extendedOptions: { thinking?: boolean; thinkingBudget?: number | null } = {},
extendedOptions: {
thinking?: boolean;
thinkingBudget?: number | null;
effort?: AnthropicEffort | string | null;
} = {},
): AnthropicClientOptions & { max_tokens?: number } {
const updatedOptions = { ...anthropicInput };
const currentMaxTokens = updatedOptions.max_tokens ?? updatedOptions.maxTokens;
const modelName = updatedOptions.model ?? '';
if (extendedOptions.thinking && modelName && supportsAdaptiveThinking(modelName)) {
updatedOptions.thinking = { type: 'adaptive' };
const effort = extendedOptions.effort;
if (effort && effort !== AnthropicEffort.unset) {
updatedOptions.invocationKwargs = {
...updatedOptions.invocationKwargs,
output_config: { effort },
};
}
if (currentMaxTokens == null) {
updatedOptions.max_tokens = anthropicSettings.maxOutputTokens.reset(modelName);
}
return updatedOptions;
}
if (
extendedOptions.thinking &&
updatedOptions?.model &&
(/claude-3[-.]7/.test(updatedOptions.model) ||
/claude-(?:sonnet|opus|haiku)-[4-9]/.test(updatedOptions.model))
modelName &&
(/claude-3[-.]7/.test(modelName) || /claude-(?:sonnet|opus|haiku)-[4-9]/.test(modelName))
) {
updatedOptions.thinking = {
...updatedOptions.thinking,
@ -100,7 +124,7 @@ function configureReasoning(
updatedOptions.thinking.type === 'enabled' &&
(currentMaxTokens == null || updatedOptions.thinking.budget_tokens > currentMaxTokens)
) {
const maxTokens = anthropicSettings.maxOutputTokens.reset(updatedOptions.model ?? '');
const maxTokens = anthropicSettings.maxOutputTokens.reset(modelName);
updatedOptions.max_tokens = currentMaxTokens ?? maxTokens;
logger.warn(
@ -111,11 +135,11 @@ function configureReasoning(
updatedOptions.thinking.budget_tokens = Math.min(
updatedOptions.thinking.budget_tokens,
Math.floor(updatedOptions.max_tokens * 0.9),
Math.floor((updatedOptions.max_tokens ?? 0) * 0.9),
);
}
return updatedOptions;
}
export { checkPromptCacheSupport, getClaudeHeaders, configureReasoning };
export { checkPromptCacheSupport, getClaudeHeaders, configureReasoning, supportsAdaptiveThinking };
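A minimal usage sketch of the updated configureReasoning, with expected results mirroring the specs in this PR (types elided):

// Adaptive path (Opus 4.6+, Sonnet 5+): effort flows into invocationKwargs.output_config
const adaptive = configureReasoning({ model: 'claude-opus-4-6' }, { thinking: true, effort: 'high' });
// adaptive.thinking                       -> { type: 'adaptive' }
// adaptive.invocationKwargs.output_config -> { effort: 'high' }
// adaptive.max_tokens                     -> anthropicSettings.maxOutputTokens.reset('claude-opus-4-6')

// Legacy path (claude-3.7 and 4.x models below the adaptive cutoff): manual budget,
// capped at 90% of max_tokens
const legacy = configureReasoning(
  { model: 'claude-sonnet-4-5', max_tokens: 8192 },
  { thinking: true, thinkingBudget: 5000 },
);
// legacy.thinking -> { type: 'enabled', budget_tokens: 5000 }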


@ -77,13 +77,11 @@ export async function initializeAnthropic({
...(vertexConfig && { vertexConfig }),
};
/** @type {undefined | TBaseEndpoint} */
const anthropicConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic];
const allConfig = appConfig?.endpoints?.all;
const result = getLLMConfig(credentials, clientOptions);
// Apply stream rate delay
if (anthropicConfig?.streamRate) {
(result.llmConfig as Record<string, unknown>)._lc_stream_delay = anthropicConfig.streamRate;
}


@ -1,5 +1,6 @@
import { getLLMConfig } from './llm';
import { AnthropicEffort } from 'librechat-data-provider';
import type * as t from '~/types';
import { getLLMConfig } from './llm';
jest.mock('https-proxy-agent', () => ({
HttpsProxyAgent: jest.fn().mockImplementation((proxy) => ({ proxy })),
@ -835,13 +836,19 @@ describe('getLLMConfig', () => {
expect(result.llmConfig.maxTokens).toBe(32000);
});
// opus-4-5+ get 64K
const opus64kModels = ['claude-opus-4-5', 'claude-opus-4-7', 'claude-opus-4-10'];
opus64kModels.forEach((model) => {
// opus-4-5 gets 64K
const opus64kResult = getLLMConfig('test-key', {
modelOptions: { model: 'claude-opus-4-5' },
});
expect(opus64kResult.llmConfig.maxTokens).toBe(64000);
// opus-4-6+ get 128K
const opus128kModels = ['claude-opus-4-7', 'claude-opus-4-10'];
opus128kModels.forEach((model) => {
const result = getLLMConfig('test-key', {
modelOptions: { model },
});
expect(result.llmConfig.maxTokens).toBe(64000);
expect(result.llmConfig.maxTokens).toBe(128000);
});
});
@ -910,6 +917,126 @@ describe('getLLMConfig', () => {
expect(result.llmConfig.maxTokens).toBe(32000);
});
it('should use adaptive thinking for Opus 4.6 instead of enabled + budget_tokens', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-opus-4-6',
thinking: true,
thinkingBudget: 10000,
},
});
expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
expect(result.llmConfig.thinking).not.toHaveProperty('budget_tokens');
expect(result.llmConfig.maxTokens).toBe(128000);
});
it('should set effort via output_config for adaptive thinking models', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-opus-4-6',
thinking: true,
effort: AnthropicEffort.medium,
},
});
expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
expect(result.llmConfig.invocationKwargs).toHaveProperty('output_config');
expect(result.llmConfig.invocationKwargs?.output_config).toEqual({
effort: AnthropicEffort.medium,
});
});
it('should set effort via output_config even without thinking for adaptive models', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-opus-4-6',
thinking: false,
effort: AnthropicEffort.low,
},
});
expect(result.llmConfig.thinking).toBeUndefined();
expect(result.llmConfig.invocationKwargs).toHaveProperty('output_config');
expect(result.llmConfig.invocationKwargs?.output_config).toEqual({
effort: AnthropicEffort.low,
});
});
it('should NOT set adaptive thinking or effort for non-adaptive models', () => {
const nonAdaptiveModels = [
'claude-opus-4-5',
'claude-opus-4-1',
'claude-sonnet-4-5',
'claude-sonnet-4',
'claude-haiku-4-5',
];
nonAdaptiveModels.forEach((model) => {
const result = getLLMConfig('test-key', {
modelOptions: {
model,
thinking: true,
thinkingBudget: 10000,
effort: AnthropicEffort.medium,
},
});
if (result.llmConfig.thinking != null) {
expect((result.llmConfig.thinking as unknown as { type: string }).type).not.toBe(
'adaptive',
);
}
expect(result.llmConfig.invocationKwargs?.output_config).toBeUndefined();
});
});
it('should strip adaptive thinking if it somehow reaches a non-adaptive model', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-sonnet-4-5',
thinking: true,
thinkingBudget: 5000,
},
});
expect(result.llmConfig.thinking).toMatchObject({
type: 'enabled',
budget_tokens: 5000,
});
expect(result.llmConfig.invocationKwargs?.output_config).toBeUndefined();
});
it('should exclude topP/topK for Opus 4.6 with adaptive thinking', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-opus-4-6',
thinking: true,
topP: 0.9,
topK: 40,
},
});
expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
expect(result.llmConfig).not.toHaveProperty('topP');
expect(result.llmConfig).not.toHaveProperty('topK');
});
it('should include topP/topK for Opus 4.6 when thinking is disabled', () => {
const result = getLLMConfig('test-key', {
modelOptions: {
model: 'claude-opus-4-6',
thinking: false,
topP: 0.9,
topK: 40,
},
});
expect(result.llmConfig.thinking).toBeUndefined();
expect(result.llmConfig).toHaveProperty('topP', 0.9);
expect(result.llmConfig).toHaveProperty('topK', 40);
});
it('should respect model-specific maxOutputTokens for Claude 4.x models', () => {
const testCases = [
{ model: 'claude-sonnet-4-5', maxOutputTokens: 50000, expected: 50000 },
@ -960,7 +1087,7 @@ describe('getLLMConfig', () => {
});
});
it('should future-proof Claude 5.x Opus models with 64K default', () => {
it('should future-proof Claude 5.x Opus models with 128K default', () => {
const testCases = [
'claude-opus-5',
'claude-opus-5-0',
@ -972,28 +1099,28 @@ describe('getLLMConfig', () => {
const result = getLLMConfig('test-key', {
modelOptions: { model },
});
expect(result.llmConfig.maxTokens).toBe(64000);
expect(result.llmConfig.maxTokens).toBe(128000);
});
});
it('should future-proof Claude 6-9.x models with correct defaults', () => {
const testCases = [
// Claude 6.x - All get 64K since they're version 5+
// Claude 6.x - Sonnet/Haiku get 64K, Opus gets 128K
{ model: 'claude-sonnet-6', expected: 64000 },
{ model: 'claude-haiku-6-0', expected: 64000 },
{ model: 'claude-opus-6-1', expected: 64000 }, // opus 6+ gets 64K
{ model: 'claude-opus-6-1', expected: 128000 },
// Claude 7.x
{ model: 'claude-sonnet-7-20270101', expected: 64000 },
{ model: 'claude-haiku-7.5', expected: 64000 },
{ model: 'claude-opus-7', expected: 64000 }, // opus 7+ gets 64K
{ model: 'claude-opus-7', expected: 128000 },
// Claude 8.x
{ model: 'claude-sonnet-8', expected: 64000 },
{ model: 'claude-haiku-8-2', expected: 64000 },
{ model: 'claude-opus-8-latest', expected: 64000 }, // opus 8+ gets 64K
{ model: 'claude-opus-8-latest', expected: 128000 },
// Claude 9.x
{ model: 'claude-sonnet-9', expected: 64000 },
{ model: 'claude-haiku-9', expected: 64000 },
{ model: 'claude-opus-9', expected: 64000 }, // opus 9+ gets 64K
{ model: 'claude-opus-9', expected: 128000 },
];
testCases.forEach(({ model, expected }) => {

View file

@ -7,7 +7,12 @@ import type {
AnthropicConfigOptions,
AnthropicCredentials,
} from '~/types/anthropic';
import { checkPromptCacheSupport, getClaudeHeaders, configureReasoning } from './helpers';
import {
supportsAdaptiveThinking,
checkPromptCacheSupport,
configureReasoning,
getClaudeHeaders,
} from './helpers';
import {
createAnthropicVertexClient,
isAnthropicVertexCredentials,
@ -83,15 +88,14 @@ function getLLMConfig(
promptCache: options.modelOptions?.promptCache ?? anthropicSettings.promptCache.default,
thinkingBudget:
options.modelOptions?.thinkingBudget ?? anthropicSettings.thinkingBudget.default,
effort: options.modelOptions?.effort ?? anthropicSettings.effort.default,
};
/** Couldn't figure out a way to still loop through the object while deleting the overlapping keys when porting this
* over from javascript, so for now they are being deleted manually until a better way presents itself.
*/
if (options.modelOptions) {
delete options.modelOptions.thinking;
delete options.modelOptions.promptCache;
delete options.modelOptions.thinkingBudget;
delete options.modelOptions.effort;
} else {
throw new Error('No modelOptions provided');
}
@ -145,10 +149,33 @@ function getLLMConfig(
requestOptions = configureReasoning(requestOptions, systemOptions);
if (!/claude-3[-.]7/.test(mergedOptions.model)) {
requestOptions.topP = mergedOptions.topP;
requestOptions.topK = mergedOptions.topK;
} else if (requestOptions.thinking == null) {
if (supportsAdaptiveThinking(mergedOptions.model)) {
if (
systemOptions.effort &&
(systemOptions.effort as string) !== '' &&
!requestOptions.invocationKwargs?.output_config
) {
requestOptions.invocationKwargs = {
...requestOptions.invocationKwargs,
output_config: { effort: systemOptions.effort },
};
}
} else {
if (
requestOptions.thinking != null &&
(requestOptions.thinking as unknown as { type: string }).type === 'adaptive'
) {
delete requestOptions.thinking;
}
if (requestOptions.invocationKwargs?.output_config) {
delete requestOptions.invocationKwargs.output_config;
}
}
const hasActiveThinking = requestOptions.thinking != null;
const isThinkingModel =
/claude-3[-.]7/.test(mergedOptions.model) || supportsAdaptiveThinking(mergedOptions.model);
if (!isThinkingModel || !hasActiveThinking) {
requestOptions.topP = mergedOptions.topP;
requestOptions.topK = mergedOptions.topK;
}
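The net effect of this gate, as exercised by the llm.spec.ts cases above: sampling parameters are only forwarded when no thinking is active on a thinking-capable model. For example:

const cfg = getLLMConfig('test-key', {
  modelOptions: { model: 'claude-opus-4-6', thinking: true, topP: 0.9, topK: 40 },
});
// cfg.llmConfig has no topP/topK while adaptive thinking is active;
// with thinking: false, both pass through unchanged.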


@ -1,10 +1,10 @@
import path from 'path';
import { AnthropicVertex } from '@anthropic-ai/vertex-sdk';
import { GoogleAuth } from 'google-auth-library';
import { ClientOptions } from '@anthropic-ai/sdk';
import { AuthKeys } from 'librechat-data-provider';
import { loadServiceKey } from '~/utils/key';
import { AnthropicVertex } from '@anthropic-ai/vertex-sdk';
import type { ClientOptions } from '@anthropic-ai/sdk';
import type { AnthropicCredentials, VertexAIClientOptions } from '~/types/anthropic';
import { loadServiceKey } from '~/utils/key';
/**
* Options for loading Vertex AI credentials


@ -613,4 +613,113 @@ describe('initializeBedrock', () => {
expect(result.llmConfig).toHaveProperty('applicationInferenceProfile', inferenceProfileArn);
});
});
describe('Opus 4.6 Adaptive Thinking', () => {
it('should configure adaptive thinking with default maxTokens for Opus 4.6', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-opus-4-6-v1',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(result.llmConfig.maxTokens).toBe(16000);
expect(amrf.anthropic_beta).toEqual(
expect.arrayContaining(['output-128k-2025-02-19', 'context-1m-2025-08-07']),
);
});
it('should pass effort via output_config for Opus 4.6', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-opus-4-6-v1',
effort: 'medium',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(amrf.output_config).toEqual({ effort: 'medium' });
});
it('should respect user-provided maxTokens for Opus 4.6', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-opus-4-6-v1',
maxTokens: 32000,
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
expect(result.llmConfig.maxTokens).toBe(32000);
});
it('should handle cross-region Opus 4.6 model IDs', async () => {
const params = createMockParams({
model_parameters: {
model: 'us.anthropic.claude-opus-4-6-v1',
effort: 'low',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(result.llmConfig).toHaveProperty('model', 'us.anthropic.claude-opus-4-6-v1');
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(amrf.output_config).toEqual({ effort: 'low' });
});
it('should use enabled thinking for non-adaptive models (Sonnet 4.5)', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-sonnet-4-5-20250929-v1:0',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'enabled', budget_tokens: 2000 });
expect(amrf.output_config).toBeUndefined();
expect(result.llmConfig.maxTokens).toBe(8192);
});
it('should not include output_config when effort is empty', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-opus-4-6-v1',
effort: '',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(amrf.output_config).toBeUndefined();
});
it('should strip effort for non-adaptive models', async () => {
const params = createMockParams({
model_parameters: {
model: 'anthropic.claude-opus-4-1-20250805-v1:0',
effort: 'high',
},
});
const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'enabled', budget_tokens: 2000 });
expect(amrf.output_config).toBeUndefined();
expect(amrf.effort).toBeUndefined();
});
});
});


@ -44,6 +44,10 @@ export interface ThinkingConfigEnabled {
type: 'enabled';
}
export interface ThinkingConfigAdaptive {
type: 'adaptive';
}
/**
* Configuration for enabling Claude's extended thinking.
*
@ -55,7 +59,10 @@ export interface ThinkingConfigEnabled {
* [extended thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking)
* for details.
*/
export type ThinkingConfigParam = ThinkingConfigEnabled | ThinkingConfigDisabled;
export type ThinkingConfigParam =
| ThinkingConfigEnabled
| ThinkingConfigDisabled
| ThinkingConfigAdaptive;
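Both variants now type-check against the widened union, e.g. (budget field shape per the enabled config above):

const manual: ThinkingConfigParam = { type: 'enabled', budget_tokens: 2000 };
const adaptive: ThinkingConfigParam = { type: 'adaptive' };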
export type AnthropicModelOptions = Partial<Omit<AnthropicParameters, 'thinking'>> & {
thinking?: AnthropicParameters['thinking'] | null;


@ -151,6 +151,7 @@ const anthropicModels = {
'claude-4': 200000,
'claude-opus-4': 200000,
'claude-opus-4-5': 200000,
'claude-opus-4-6': 1000000,
};
const deepseekModels = {
@ -394,6 +395,7 @@ const anthropicMaxOutputs = {
'claude-sonnet-4': 64000,
'claude-opus-4': 32000,
'claude-opus-4-5': 64000,
'claude-opus-4-6': 128000,
'claude-3.5-sonnet': 8192,
'claude-3-5-sonnet': 8192,
'claude-3.7-sonnet': 128000,
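Together, these two entries give Opus 4.6 its headline numbers, as the token specs earlier in this diff verify:

getModelMaxTokens('claude-opus-4-6', EModelEndpoint.anthropic);       // 1000000 (1M context window)
getModelMaxOutputTokens('claude-opus-4-6', EModelEndpoint.anthropic); // 128000 max output tokens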


@ -1,5 +1,161 @@
import { bedrockInputParser } from '../src/bedrock';
import type { BedrockConverseInput } from '../src/bedrock';
import {
supportsAdaptiveThinking,
bedrockOutputParser,
bedrockInputParser,
supportsContext1m,
} from '../src/bedrock';
describe('supportsAdaptiveThinking', () => {
test('should return true for claude-opus-4-6', () => {
expect(supportsAdaptiveThinking('claude-opus-4-6')).toBe(true);
});
test('should return true for claude-opus-4.6', () => {
expect(supportsAdaptiveThinking('claude-opus-4.6')).toBe(true);
});
test('should return true for claude-opus-4-7 (future)', () => {
expect(supportsAdaptiveThinking('claude-opus-4-7')).toBe(true);
});
test('should return true for claude-opus-5 (future)', () => {
expect(supportsAdaptiveThinking('claude-opus-5')).toBe(true);
});
test('should return true for claude-opus-9 (future)', () => {
expect(supportsAdaptiveThinking('claude-opus-9')).toBe(true);
});
test('should return true for claude-sonnet-5 (future)', () => {
expect(supportsAdaptiveThinking('claude-sonnet-5')).toBe(true);
});
test('should return true for claude-sonnet-6 (future)', () => {
expect(supportsAdaptiveThinking('claude-sonnet-6')).toBe(true);
});
test('should return false for claude-opus-4-5', () => {
expect(supportsAdaptiveThinking('claude-opus-4-5')).toBe(false);
});
test('should return false for claude-opus-4', () => {
expect(supportsAdaptiveThinking('claude-opus-4')).toBe(false);
});
test('should return false for claude-opus-4-0', () => {
expect(supportsAdaptiveThinking('claude-opus-4-0')).toBe(false);
});
test('should return false for claude-sonnet-4-5', () => {
expect(supportsAdaptiveThinking('claude-sonnet-4-5')).toBe(false);
});
test('should return false for claude-sonnet-4', () => {
expect(supportsAdaptiveThinking('claude-sonnet-4')).toBe(false);
});
test('should return false for claude-3-7-sonnet', () => {
expect(supportsAdaptiveThinking('claude-3-7-sonnet')).toBe(false);
});
test('should return false for unrelated model', () => {
expect(supportsAdaptiveThinking('gpt-4o')).toBe(false);
});
test('should handle Bedrock model ID with prefix stripping', () => {
expect(supportsAdaptiveThinking('anthropic.claude-opus-4-6-v1:0')).toBe(true);
});
test('should handle cross-region Bedrock model ID', () => {
expect(supportsAdaptiveThinking('us.anthropic.claude-opus-4-6-v1')).toBe(true);
});
test('should return true for claude-4-6-opus (alternate naming)', () => {
expect(supportsAdaptiveThinking('claude-4-6-opus')).toBe(true);
});
test('should return true for anthropic.claude-4-6-opus (alternate naming)', () => {
expect(supportsAdaptiveThinking('anthropic.claude-4-6-opus')).toBe(true);
});
test('should return true for claude-5-sonnet (alternate naming)', () => {
expect(supportsAdaptiveThinking('claude-5-sonnet')).toBe(true);
});
test('should return true for anthropic.claude-5-sonnet (alternate naming)', () => {
expect(supportsAdaptiveThinking('anthropic.claude-5-sonnet')).toBe(true);
});
test('should return false for claude-4-5-opus (alternate naming, below threshold)', () => {
expect(supportsAdaptiveThinking('claude-4-5-opus')).toBe(false);
});
test('should return false for claude-4-sonnet (alternate naming, below threshold)', () => {
expect(supportsAdaptiveThinking('claude-4-sonnet')).toBe(false);
});
});
describe('supportsContext1m', () => {
test('should return true for claude-sonnet-4', () => {
expect(supportsContext1m('claude-sonnet-4')).toBe(true);
});
test('should return true for claude-sonnet-4-5', () => {
expect(supportsContext1m('claude-sonnet-4-5')).toBe(true);
});
test('should return true for claude-sonnet-5 (future)', () => {
expect(supportsContext1m('claude-sonnet-5')).toBe(true);
});
test('should return true for claude-opus-4-6', () => {
expect(supportsContext1m('claude-opus-4-6')).toBe(true);
});
test('should return true for claude-opus-5 (future)', () => {
expect(supportsContext1m('claude-opus-5')).toBe(true);
});
test('should return false for claude-opus-4-5', () => {
expect(supportsContext1m('claude-opus-4-5')).toBe(false);
});
test('should return false for claude-opus-4', () => {
expect(supportsContext1m('claude-opus-4')).toBe(false);
});
test('should return false for claude-3-7-sonnet', () => {
expect(supportsContext1m('claude-3-7-sonnet')).toBe(false);
});
test('should return false for claude-sonnet-3', () => {
expect(supportsContext1m('claude-sonnet-3')).toBe(false);
});
test('should return false for unrelated model', () => {
expect(supportsContext1m('gpt-4o')).toBe(false);
});
test('should return true for claude-4-sonnet (alternate naming)', () => {
expect(supportsContext1m('claude-4-sonnet')).toBe(true);
});
test('should return true for claude-5-sonnet (alternate naming)', () => {
expect(supportsContext1m('claude-5-sonnet')).toBe(true);
});
test('should return true for claude-4-6-opus (alternate naming)', () => {
expect(supportsContext1m('claude-4-6-opus')).toBe(true);
});
test('should return false for claude-3-sonnet (alternate naming, below threshold)', () => {
expect(supportsContext1m('claude-3-sonnet')).toBe(false);
});
test('should return false for claude-4-5-opus (alternate naming, below threshold)', () => {
expect(supportsContext1m('claude-4-5-opus')).toBe(false);
});
});
describe('bedrockInputParser', () => {
describe('Model Matching for Reasoning Configuration', () => {
@ -7,7 +163,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-3-7-sonnet',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -18,7 +174,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-sonnet-4',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -28,22 +184,25 @@ describe('bedrockInputParser', () => {
]);
});
test('should match anthropic.claude-opus-5 model without 1M context header', () => {
test('should match anthropic.claude-opus-5 model with adaptive thinking and 1M context', () => {
const input = {
model: 'anthropic.claude-opus-5',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
expect(additionalFields.anthropic_beta).toEqual(['output-128k-2025-02-19']);
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.thinkingBudget).toBeUndefined();
expect(additionalFields.anthropic_beta).toEqual([
'output-128k-2025-02-19',
'context-1m-2025-08-07',
]);
});
test('should match anthropic.claude-haiku-6 model without 1M context header', () => {
const input = {
model: 'anthropic.claude-haiku-6',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -54,7 +213,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-4-sonnet',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -68,7 +227,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-4.5-sonnet',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -82,7 +241,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-4-7-sonnet',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -96,7 +255,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'anthropic.claude-sonnet-4-20250514-v1:0',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
@ -110,7 +269,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'some-other-model',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
expect(result.additionalModelRequestFields).toBeUndefined();
});
@ -118,7 +277,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'moonshot.kimi-k2-0711-thinking',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as
| Record<string, unknown>
| undefined;
@ -129,7 +288,7 @@ describe('bedrockInputParser', () => {
const input = {
model: 'deepseek.deepseek-r1',
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as
| Record<string, unknown>
| undefined;
@ -141,7 +300,7 @@ describe('bedrockInputParser', () => {
model: 'anthropic.claude-sonnet-4',
thinking: false,
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBeUndefined();
expect(additionalFields.thinkingBudget).toBeUndefined();
@ -157,10 +316,307 @@ describe('bedrockInputParser', () => {
thinking: true,
thinkingBudget: 3000,
};
const result = bedrockInputParser.parse(input) as BedrockConverseInput;
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(3000);
});
});
describe('Opus 4.6 Adaptive Thinking', () => {
test('should default to adaptive thinking for anthropic.claude-opus-4-6-v1', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.thinkingBudget).toBeUndefined();
expect(additionalFields.anthropic_beta).toEqual([
'output-128k-2025-02-19',
'context-1m-2025-08-07',
]);
});
test('should handle cross-region model ID us.anthropic.claude-opus-4-6-v1', () => {
const input = {
model: 'us.anthropic.claude-opus-4-6-v1',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.thinkingBudget).toBeUndefined();
expect(additionalFields.anthropic_beta).toEqual([
'output-128k-2025-02-19',
'context-1m-2025-08-07',
]);
});
test('should handle cross-region model ID global.anthropic.claude-opus-4-6-v1', () => {
const input = {
model: 'global.anthropic.claude-opus-4-6-v1',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.anthropic_beta).toEqual([
'output-128k-2025-02-19',
'context-1m-2025-08-07',
]);
});
test('should pass effort parameter via output_config for adaptive models', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
effort: 'medium',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.output_config).toEqual({ effort: 'medium' });
expect(additionalFields.effort).toBeUndefined();
});
test('should not include output_config when effort is unset (empty string)', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
effort: '',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.output_config).toBeUndefined();
});
test('should respect thinking=false for adaptive models', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
thinking: false,
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBeUndefined();
expect(additionalFields.thinkingBudget).toBeUndefined();
expect(additionalFields.anthropic_beta).toEqual([
'output-128k-2025-02-19',
'context-1m-2025-08-07',
]);
});
test('should preserve effort when thinking=false for adaptive models', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
thinking: false,
effort: 'high',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBeUndefined();
expect(additionalFields.thinkingBudget).toBeUndefined();
expect(additionalFields.output_config).toEqual({ effort: 'high' });
expect(additionalFields.effort).toBeUndefined();
});
test('should strip effort for non-adaptive thinking models', () => {
const input = {
model: 'anthropic.claude-sonnet-4-5-20250929-v1:0',
effort: 'high',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.thinkingBudget).toBe(2000);
expect(additionalFields.effort).toBeUndefined();
expect(additionalFields.output_config).toBeUndefined();
});
test('should not include effort for Opus 4.5 (non-adaptive)', () => {
const input = {
model: 'anthropic.claude-opus-4-5-v1:0',
effort: 'low',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toBe(true);
expect(additionalFields.effort).toBeUndefined();
expect(additionalFields.output_config).toBeUndefined();
});
test('should support max effort for Opus 4.6', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
effort: 'max',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const additionalFields = result.additionalModelRequestFields as Record<string, unknown>;
expect(additionalFields.thinking).toEqual({ type: 'adaptive' });
expect(additionalFields.output_config).toEqual({ effort: 'max' });
});
});
describe('bedrockOutputParser with configureThinking', () => {
test('should preserve adaptive thinking config and set default maxTokens', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-opus-4-6-v1',
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
const amrf = output.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(output.maxTokens).toBe(16000);
expect(output.maxOutputTokens).toBeUndefined();
});
test('should respect user-provided maxTokens for adaptive model', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-opus-4-6-v1',
maxTokens: 32000,
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
expect(output.maxTokens).toBe(32000);
});
test('should convert thinking=true to enabled config for non-adaptive models', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-sonnet-4-5-20250929-v1:0',
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
const amrf = output.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'enabled', budget_tokens: 2000 });
expect(output.maxTokens).toBe(8192);
});
test('should pass output_config through for adaptive model with effort', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-opus-4-6-v1',
effort: 'low',
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
const amrf = output.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.thinking).toEqual({ type: 'adaptive' });
expect(amrf.output_config).toEqual({ effort: 'low' });
});
test('should use adaptive default maxTokens (16000) over maxOutputTokens for adaptive models', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-opus-4-6-v1',
}) as Record<string, unknown>;
parsed.maxOutputTokens = undefined;
(parsed as Record<string, unknown>).maxTokens = undefined;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
expect(output.maxTokens).toBe(16000);
});
test('should use enabled default maxTokens (8192) for non-adaptive thinking models', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-sonnet-4-5-20250929-v1:0',
}) as Record<string, unknown>;
parsed.maxOutputTokens = undefined;
(parsed as Record<string, unknown>).maxTokens = undefined;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
expect(output.maxTokens).toBe(8192);
});
test('should use default thinking budget (2000) when no custom budget is set', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-3-7-sonnet',
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
const amrf = output.additionalModelRequestFields as Record<string, unknown>;
const thinking = amrf.thinking as { type: string; budget_tokens: number };
expect(thinking.budget_tokens).toBe(2000);
});
test('should override default thinking budget with custom value', () => {
const parsed = bedrockInputParser.parse({
model: 'anthropic.claude-3-7-sonnet',
thinkingBudget: 5000,
maxTokens: 8192,
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
const amrf = output.additionalModelRequestFields as Record<string, unknown>;
const thinking = amrf.thinking as { type: string; budget_tokens: number };
expect(thinking.budget_tokens).toBe(5000);
});
test('should remove additionalModelRequestFields when thinking disabled and no other fields', () => {
const parsed = bedrockInputParser.parse({
model: 'some-other-model',
}) as Record<string, unknown>;
const output = bedrockOutputParser(parsed as Record<string, unknown>);
expect(output.additionalModelRequestFields).toBeUndefined();
});
});
describe('Model switching cleanup', () => {
test('should strip anthropic_beta when switching from Anthropic to non-Anthropic model', () => {
const staleConversationData = {
model: 'openai.gpt-oss-120b-1:0',
promptCache: true,
additionalModelRequestFields: {
anthropic_beta: ['output-128k-2025-02-19', 'context-1m-2025-08-07'],
thinking: { type: 'adaptive' },
output_config: { effort: 'high' },
},
};
const result = bedrockInputParser.parse(staleConversationData) as Record<string, unknown>;
const amrf = result.additionalModelRequestFields as Record<string, unknown> | undefined;
expect(amrf?.anthropic_beta).toBeUndefined();
expect(amrf?.thinking).toBeUndefined();
expect(amrf?.output_config).toBeUndefined();
expect(result.promptCache).toBeUndefined();
});
test('should strip promptCache when switching from Claude to non-Claude/Nova model', () => {
const staleConversationData = {
model: 'deepseek.deepseek-r1',
promptCache: true,
};
const result = bedrockInputParser.parse(staleConversationData) as Record<string, unknown>;
expect(result.promptCache).toBeUndefined();
});
test('should preserve promptCache for Claude models', () => {
const input = {
model: 'anthropic.claude-sonnet-4-20250514-v1:0',
promptCache: true,
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
expect(result.promptCache).toBe(true);
});
test('should preserve promptCache for Nova models', () => {
const input = {
model: 'amazon.nova-pro-v1:0',
promptCache: true,
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
expect(result.promptCache).toBe(true);
});
test('should strip stale thinking config from additionalModelRequestFields for non-Anthropic models', () => {
const staleConversationData = {
model: 'moonshot.kimi-k2-0711-thinking',
additionalModelRequestFields: {
thinking: { type: 'enabled', budget_tokens: 2000 },
thinkingBudget: 2000,
anthropic_beta: ['output-128k-2025-02-19'],
},
};
const result = bedrockInputParser.parse(staleConversationData) as Record<string, unknown>;
const amrf = result.additionalModelRequestFields as Record<string, unknown> | undefined;
expect(amrf?.anthropic_beta).toBeUndefined();
expect(amrf?.thinking).toBeUndefined();
expect(amrf?.thinkingBudget).toBeUndefined();
});
test('should not strip anthropic_beta when staying on an Anthropic model', () => {
const input = {
model: 'anthropic.claude-opus-4-6-v1',
};
const result = bedrockInputParser.parse(input) as Record<string, unknown>;
const amrf = result.additionalModelRequestFields as Record<string, unknown>;
expect(amrf.anthropic_beta).toBeDefined();
expect(Array.isArray(amrf.anthropic_beta)).toBe(true);
});
});
});


@ -1,10 +1,12 @@
import { z } from 'zod';
import * as s from './schemas';
type ThinkingConfig = {
type: 'enabled';
budget_tokens: number;
};
const DEFAULT_ENABLED_MAX_TOKENS = 8192;
const DEFAULT_ADAPTIVE_MAX_TOKENS = 16000;
const DEFAULT_THINKING_BUDGET = 2000;
type ThinkingConfig = { type: 'enabled'; budget_tokens: number } | { type: 'adaptive' };
type AnthropicReasoning = {
thinking?: ThinkingConfig | boolean;
thinkingBudget?: number;
@ -15,6 +17,64 @@ type AnthropicInput = BedrockConverseInput & {
AnthropicReasoning;
};
/** Extracts opus major/minor version from both naming formats */
function parseOpusVersion(model: string): { major: number; minor: number } | null {
const nameFirst = model.match(/claude-opus[-.]?(\d+)(?:[-.](\d+))?/);
if (nameFirst) {
return {
major: parseInt(nameFirst[1], 10),
minor: nameFirst[2] != null ? parseInt(nameFirst[2], 10) : 0,
};
}
const numFirst = model.match(/claude-(\d+)(?:[-.](\d+))?-opus/);
if (numFirst) {
return {
major: parseInt(numFirst[1], 10),
minor: numFirst[2] != null ? parseInt(numFirst[2], 10) : 0,
};
}
return null;
}
/** Extracts sonnet major version from both naming formats */
function parseSonnetVersion(model: string): number | null {
const nameFirst = model.match(/claude-sonnet[-.]?(\d+)/);
if (nameFirst) {
return parseInt(nameFirst[1], 10);
}
const numFirst = model.match(/claude-(\d+)(?:[-.]?\d+)?-sonnet/);
if (numFirst) {
return parseInt(numFirst[1], 10);
}
return null;
}
/** Checks if a model supports adaptive thinking (Opus 4.6+, Sonnet 5+) */
export function supportsAdaptiveThinking(model: string): boolean {
const opus = parseOpusVersion(model);
if (opus && (opus.major > 4 || (opus.major === 4 && opus.minor >= 6))) {
return true;
}
const sonnet = parseSonnetVersion(model);
if (sonnet != null && sonnet >= 5) {
return true;
}
return false;
}
/** Checks if a model qualifies for the context-1m beta header (Sonnet 4+, Opus 4.6+, Opus 5+) */
export function supportsContext1m(model: string): boolean {
const sonnet = parseSonnetVersion(model);
if (sonnet != null && sonnet >= 4) {
return true;
}
const opus = parseOpusVersion(model);
if (opus && (opus.major > 4 || (opus.major === 4 && opus.minor >= 6))) {
return true;
}
return false;
}
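A few illustrative calls against these helpers (model strings chosen to exercise the regexes; not all correspond to shipped model IDs):

supportsAdaptiveThinking('claude-opus-4-6'); // => true (major 4, minor 6)
supportsAdaptiveThinking('anthropic.claude-opus-4-5-v1'); // => false (minor 5 < 6)
supportsAdaptiveThinking('claude-sonnet-5'); // => true (Sonnet 5+)
supportsAdaptiveThinking('claude-3-7-sonnet'); // => false (Sonnet 3)
supportsContext1m('claude-sonnet-4-20250514'); // => true (Sonnet 4+)
supportsContext1m('claude-opus-4-5'); // => false (below the 4.6 threshold)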
/**
* Gets the appropriate anthropic_beta headers for Bedrock Anthropic models.
* Bedrock uses `anthropic_beta` (with underscore) in additionalModelRequestFields.
@@ -38,7 +98,7 @@ function getBedrockAnthropicBetaHeaders(model: string): string[] {
betaHeaders.push('output-128k-2025-02-19');
}
if (isSonnet4PlusModel) {
if (isSonnet4PlusModel || supportsAdaptiveThinking(model)) {
betaHeaders.push('context-1m-2025-08-07');
}
@@ -67,6 +127,7 @@ export const bedrockInputSchema = s.tConversationSchema
stop: true,
thinking: true,
thinkingBudget: true,
effort: true,
promptCache: true,
/* Catch-all fields */
topK: true,
@@ -75,10 +136,11 @@ export const bedrockInputSchema = s.tConversationSchema
.transform((obj) => {
if ((obj as AnthropicInput).additionalModelRequestFields?.thinking != null) {
const _obj = obj as AnthropicInput;
obj.thinking = !!_obj.additionalModelRequestFields.thinking;
const thinking = _obj.additionalModelRequestFields.thinking;
obj.thinking = !!thinking;
obj.thinkingBudget =
typeof _obj.additionalModelRequestFields.thinking === 'object'
? (_obj.additionalModelRequestFields.thinking as ThinkingConfig)?.budget_tokens
typeof thinking === 'object' && 'budget_tokens' in thinking
? thinking.budget_tokens
: undefined;
delete obj.additionalModelRequestFields;
}
@@ -109,6 +171,7 @@ export const bedrockInputParser = s.tConversationSchema
stop: true,
thinking: true,
thinkingBudget: true,
effort: true,
promptCache: true,
/* Catch-all fields */
topK: true,
@@ -157,34 +220,77 @@ export const bedrockInputParser = s.tConversationSchema
typedData.model,
))
) {
if (additionalFields.thinking === undefined) {
additionalFields.thinking = true;
} else if (additionalFields.thinking === false) {
delete additionalFields.thinking;
delete additionalFields.thinkingBudget;
const isAdaptive = supportsAdaptiveThinking(typedData.model as string);
if (isAdaptive) {
const effort = additionalFields.effort;
if (effort && typeof effort === 'string' && effort !== '') {
additionalFields.output_config = { effort };
}
delete additionalFields.effort;
if (additionalFields.thinking === false) {
delete additionalFields.thinking;
delete additionalFields.thinkingBudget;
} else {
additionalFields.thinking = { type: 'adaptive' };
delete additionalFields.thinkingBudget;
}
} else {
if (additionalFields.thinking === undefined) {
additionalFields.thinking = true;
} else if (additionalFields.thinking === false) {
delete additionalFields.thinking;
delete additionalFields.thinkingBudget;
}
if (additionalFields.thinking === true && additionalFields.thinkingBudget === undefined) {
additionalFields.thinkingBudget = DEFAULT_THINKING_BUDGET;
}
delete additionalFields.effort;
}
if (additionalFields.thinking === true && additionalFields.thinkingBudget === undefined) {
additionalFields.thinkingBudget = 2000;
}
if (typedData.model.includes('anthropic.')) {
const betaHeaders = getBedrockAnthropicBetaHeaders(typedData.model);
if ((typedData.model as string).includes('anthropic.')) {
const betaHeaders = getBedrockAnthropicBetaHeaders(typedData.model as string);
if (betaHeaders.length > 0) {
additionalFields.anthropic_beta = betaHeaders;
}
}
} else if (additionalFields.thinking != null || additionalFields.thinkingBudget != null) {
} else {
delete additionalFields.thinking;
delete additionalFields.thinkingBudget;
delete additionalFields.effort;
delete additionalFields.output_config;
delete additionalFields.anthropic_beta;
}
const isAnthropicModel =
typeof typedData.model === 'string' && typedData.model.includes('anthropic.');
/** Strip stale anthropic_beta from previously-persisted additionalModelRequestFields */
if (
!isAnthropicModel &&
typeof typedData.additionalModelRequestFields === 'object' &&
typedData.additionalModelRequestFields != null
) {
const amrf = typedData.additionalModelRequestFields as Record<string, unknown>;
delete amrf.anthropic_beta;
delete amrf.thinking;
delete amrf.thinkingBudget;
delete amrf.effort;
delete amrf.output_config;
}
/** Default promptCache for claude and nova models, if not defined */
if (
typeof typedData.model === 'string' &&
(typedData.model.includes('claude') || typedData.model.includes('nova')) &&
typedData.promptCache === undefined
(typedData.model.includes('claude') || typedData.model.includes('nova'))
) {
typedData.promptCache = true;
if (typedData.promptCache === undefined) {
typedData.promptCache = true;
}
} else if (typedData.promptCache === true) {
typedData.promptCache = undefined;
}
if (Object.keys(additionalFields).length > 0) {
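Between the stale-field cleanup and the promptCache defaulting above, the parse-time behavior is roughly as follows (a sketch consistent with the tests earlier in this diff):

// Claude and Nova models default promptCache to true when it is undefined...
bedrockInputParser.parse({ model: 'amazon.nova-pro-v1:0' });
// => includes promptCache: true
// ...while a stale promptCache persisted from a previous Claude conversation is
// dropped when switching to a non-Claude/non-Nova model.
bedrockInputParser.parse({ model: 'deepseek.deepseek-r1', promptCache: true });
// => promptCache stripped from the result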
@@ -212,20 +318,34 @@ export const bedrockInputParser = s.tConversationSchema
*/
function configureThinking(data: AnthropicInput): AnthropicInput {
const updatedData = { ...data };
if (updatedData.additionalModelRequestFields?.thinking === true) {
updatedData.maxTokens = updatedData.maxTokens ?? updatedData.maxOutputTokens ?? 8192;
const thinking = updatedData.additionalModelRequestFields?.thinking;
if (thinking === true) {
updatedData.maxTokens =
updatedData.maxTokens ?? updatedData.maxOutputTokens ?? DEFAULT_ENABLED_MAX_TOKENS;
delete updatedData.maxOutputTokens;
const thinkingConfig: AnthropicReasoning['thinking'] = {
const thinkingConfig: ThinkingConfig = {
type: 'enabled',
budget_tokens: updatedData.additionalModelRequestFields.thinkingBudget ?? 2000,
budget_tokens:
updatedData.additionalModelRequestFields?.thinkingBudget ?? DEFAULT_THINKING_BUDGET,
};
if (thinkingConfig.budget_tokens > updatedData.maxTokens) {
thinkingConfig.budget_tokens = Math.floor(updatedData.maxTokens * 0.9);
}
updatedData.additionalModelRequestFields.thinking = thinkingConfig;
delete updatedData.additionalModelRequestFields.thinkingBudget;
updatedData.additionalModelRequestFields!.thinking = thinkingConfig;
delete updatedData.additionalModelRequestFields!.thinkingBudget;
} else if (
typeof thinking === 'object' &&
thinking != null &&
(thinking as { type: string }).type === 'adaptive'
) {
updatedData.maxTokens =
updatedData.maxTokens ?? updatedData.maxOutputTokens ?? DEFAULT_ADAPTIVE_MAX_TOKENS;
delete updatedData.maxOutputTokens;
delete updatedData.additionalModelRequestFields!.thinkingBudget;
}
return updatedData;
}
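In effect, the two branches above behave as sketched below, using the defaults defined at the top of this file (the partial inputs are cast for brevity):

// 'enabled' thinking: the budget is clamped to 90% of maxTokens when it would exceed it.
configureThinking({
  maxTokens: 4096,
  additionalModelRequestFields: { thinking: true, thinkingBudget: 10000 },
} as AnthropicInput);
// => thinking: { type: 'enabled', budget_tokens: 3686 }, i.e. Math.floor(4096 * 0.9)
// Adaptive thinking: no budget; maxTokens falls back to DEFAULT_ADAPTIVE_MAX_TOKENS.
configureThinking({
  additionalModelRequestFields: { thinking: { type: 'adaptive' } },
} as AnthropicInput);
// => maxTokens: 16000, thinkingBudget deleted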
@@ -268,8 +388,8 @@ export const bedrockOutputParser = (data: Record<string, unknown>) => {
}
result = configureThinking(result as AnthropicInput);
// Remove additionalModelRequestFields from the result if it doesn't contain any config
if ((result as AnthropicInput).additionalModelRequestFields?.thinking == null) {
const amrf = result.additionalModelRequestFields as Record<string, unknown> | undefined;
if (!amrf || Object.keys(amrf).length === 0) {
delete result.additionalModelRequestFields;
}

View file

@@ -1133,6 +1133,7 @@ const sharedOpenAIModels = [
];
const sharedAnthropicModels = [
'claude-opus-4-6',
'claude-sonnet-4-5',
'claude-sonnet-4-5-20250929',
'claude-haiku-4-5',
@@ -1153,6 +1154,7 @@ const sharedAnthropicModels = [
];
export const bedrockModels = [
'anthropic.claude-opus-4-6-v1',
'anthropic.claude-sonnet-4-5-20250929-v1:0',
'anthropic.claude-haiku-4-5-20251001-v1:0',
'anthropic.claude-opus-4-1-20250805-v1:0',

View file

@@ -5,6 +5,7 @@ import {
openAISettings,
googleSettings,
ReasoningEffort,
AnthropicEffort,
ReasoningSummary,
BedrockProviders,
anthropicSettings,
@@ -445,6 +446,26 @@ const anthropic: Record<string, SettingDefinition> = {
showDefault: false,
columnSpan: 2,
},
effort: {
key: 'effort',
label: 'com_endpoint_effort',
labelCode: true,
description: 'com_endpoint_anthropic_effort',
descriptionCode: true,
type: 'enum',
default: anthropicSettings.effort.default,
component: 'slider',
options: anthropicSettings.effort.options,
enumMappings: {
[AnthropicEffort.unset]: 'com_ui_auto',
[AnthropicEffort.low]: 'com_ui_low',
[AnthropicEffort.medium]: 'com_ui_medium',
[AnthropicEffort.high]: 'com_ui_high',
[AnthropicEffort.max]: 'com_ui_max',
},
optionType: 'model',
columnSpan: 4,
},
};
const bedrock: Record<string, SettingDefinition> = {
@@ -734,6 +755,7 @@ const anthropicConfig: SettingsConfiguration = [
anthropic.promptCache,
anthropic.thinking,
anthropic.thinkingBudget,
anthropic.effort,
anthropic.web_search,
librechat.fileTokenLimit,
];
@@ -754,6 +776,7 @@ const anthropicCol2: SettingsConfiguration = [
anthropic.promptCache,
anthropic.thinking,
anthropic.thinkingBudget,
anthropic.effort,
anthropic.web_search,
librechat.fileTokenLimit,
];
@@ -772,6 +795,7 @@ const bedrockAnthropic: SettingsConfiguration = [
bedrock.promptCache,
anthropic.thinking,
anthropic.thinkingBudget,
anthropic.effort,
librechat.fileTokenLimit,
];
@@ -829,6 +853,7 @@ const bedrockAnthropicCol2: SettingsConfiguration = [
bedrock.promptCache,
anthropic.thinking,
anthropic.thinkingBudget,
anthropic.effort,
librechat.fileTokenLimit,
];

View file

@@ -74,81 +74,83 @@ describe('anthropicSettings', () => {
});
});
describe('Claude Opus 4.5+ models (64K limit - future-proof)', () => {
describe('Claude Opus 4.5 models (64K limit)', () => {
it('should return 64K for claude-opus-4-5', () => {
expect(reset('claude-opus-4-5')).toBe(64000);
});
it('should return 64K for claude-opus-4-6', () => {
expect(reset('claude-opus-4-6')).toBe(64000);
});
it('should return 64K for claude-opus-4-7', () => {
expect(reset('claude-opus-4-7')).toBe(64000);
});
it('should return 64K for claude-opus-4-8', () => {
expect(reset('claude-opus-4-8')).toBe(64000);
});
it('should return 64K for claude-opus-4-9', () => {
expect(reset('claude-opus-4-9')).toBe(64000);
});
it('should return 64K for claude-opus-4.5', () => {
expect(reset('claude-opus-4.5')).toBe(64000);
});
});
it('should return 64K for claude-opus-4.6', () => {
expect(reset('claude-opus-4.6')).toBe(64000);
describe('Claude Opus 4.6+ models (128K limit - future-proof)', () => {
it('should return 128K for claude-opus-4-6', () => {
expect(reset('claude-opus-4-6')).toBe(128000);
});
it('should return 128K for claude-opus-4.6', () => {
expect(reset('claude-opus-4.6')).toBe(128000);
});
it('should return 128K for claude-opus-4-7', () => {
expect(reset('claude-opus-4-7')).toBe(128000);
});
it('should return 128K for claude-opus-4-8', () => {
expect(reset('claude-opus-4-8')).toBe(128000);
});
it('should return 128K for claude-opus-4-9', () => {
expect(reset('claude-opus-4-9')).toBe(128000);
});
});
describe('Claude Opus 4.10+ models (double-digit minor versions)', () => {
it('should return 64K for claude-opus-4-10', () => {
expect(reset('claude-opus-4-10')).toBe(64000);
it('should return 128K for claude-opus-4-10', () => {
expect(reset('claude-opus-4-10')).toBe(128000);
});
it('should return 64K for claude-opus-4-11', () => {
expect(reset('claude-opus-4-11')).toBe(64000);
it('should return 128K for claude-opus-4-11', () => {
expect(reset('claude-opus-4-11')).toBe(128000);
});
it('should return 64K for claude-opus-4-15', () => {
expect(reset('claude-opus-4-15')).toBe(64000);
it('should return 128K for claude-opus-4-15', () => {
expect(reset('claude-opus-4-15')).toBe(128000);
});
it('should return 64K for claude-opus-4-20', () => {
expect(reset('claude-opus-4-20')).toBe(64000);
it('should return 128K for claude-opus-4-20', () => {
expect(reset('claude-opus-4-20')).toBe(128000);
});
it('should return 64K for claude-opus-4.10', () => {
expect(reset('claude-opus-4.10')).toBe(64000);
it('should return 128K for claude-opus-4.10', () => {
expect(reset('claude-opus-4.10')).toBe(128000);
});
});
describe('Claude Opus 5+ models (future major versions)', () => {
it('should return 64K for claude-opus-5', () => {
expect(reset('claude-opus-5')).toBe(64000);
it('should return 128K for claude-opus-5', () => {
expect(reset('claude-opus-5')).toBe(128000);
});
it('should return 64K for claude-opus-6', () => {
expect(reset('claude-opus-6')).toBe(64000);
it('should return 128K for claude-opus-6', () => {
expect(reset('claude-opus-6')).toBe(128000);
});
it('should return 64K for claude-opus-7', () => {
expect(reset('claude-opus-7')).toBe(64000);
it('should return 128K for claude-opus-7', () => {
expect(reset('claude-opus-7')).toBe(128000);
});
it('should return 64K for claude-opus-9', () => {
expect(reset('claude-opus-9')).toBe(64000);
it('should return 128K for claude-opus-9', () => {
expect(reset('claude-opus-9')).toBe(128000);
});
it('should return 64K for claude-opus-5-0', () => {
expect(reset('claude-opus-5-0')).toBe(64000);
it('should return 128K for claude-opus-5-0', () => {
expect(reset('claude-opus-5-0')).toBe(128000);
});
it('should return 64K for claude-opus-5.0', () => {
expect(reset('claude-opus-5.0')).toBe(64000);
it('should return 128K for claude-opus-5.0', () => {
expect(reset('claude-opus-5.0')).toBe(128000);
});
});
@@ -157,8 +159,8 @@ describe('anthropicSettings', () => {
expect(reset('claude-opus-4-5-20250420')).toBe(64000);
});
it('should return 64K for claude-opus-4-6-20260101', () => {
expect(reset('claude-opus-4-6-20260101')).toBe(64000);
it('should return 128K for claude-opus-4-6-20260101', () => {
expect(reset('claude-opus-4-6-20260101')).toBe(128000);
});
it('should return 32K for claude-opus-4-1-20250805', () => {
@@ -234,8 +236,8 @@ describe('anthropicSettings', () => {
it('should match claude-opus45 (no separator after opus)', () => {
// The regex allows optional separators, so "45" can follow directly
// In practice, Anthropic uses separators, but regex is permissive
expect(reset('claude-opus45')).toBe(64000);
// "45" is treated as major version 45 (>= 5), so it gets 128K
expect(reset('claude-opus45')).toBe(128000);
});
});
});
@@ -278,16 +280,28 @@ describe('anthropicSettings', () => {
expect(set(50000, 'claude-opus-4-5')).toBe(50000);
});
it('should cap at 64K for claude-opus-4-6', () => {
expect(set(80000, 'claude-opus-4-6')).toBe(64000);
it('should allow 80K for claude-opus-4-6 (128K cap)', () => {
expect(set(80000, 'claude-opus-4-6')).toBe(80000);
});
it('should cap at 64K for claude-opus-5', () => {
expect(set(100000, 'claude-opus-5')).toBe(64000);
it('should cap at 128K for claude-opus-4-6', () => {
expect(set(150000, 'claude-opus-4-6')).toBe(128000);
});
it('should cap at 64K for claude-opus-4-10', () => {
expect(set(100000, 'claude-opus-4-10')).toBe(64000);
it('should cap at 128K for claude-opus-5', () => {
expect(set(150000, 'claude-opus-5')).toBe(128000);
});
it('should allow 100K for claude-opus-5', () => {
expect(set(100000, 'claude-opus-5')).toBe(100000);
});
it('should cap at 128K for claude-opus-4-10', () => {
expect(set(150000, 'claude-opus-4-10')).toBe(128000);
});
it('should allow 100K for claude-opus-4-10', () => {
expect(set(100000, 'claude-opus-4-10')).toBe(100000);
});
});

View file

@@ -173,6 +173,14 @@ export enum ReasoningEffort {
xhigh = 'xhigh',
}
export enum AnthropicEffort {
unset = '',
low = 'low',
medium = 'medium',
high = 'high',
max = 'max',
}
export enum ReasoningSummary {
none = '',
auto = 'auto',
@@ -201,6 +209,7 @@ export const imageDetailValue = {
export const eImageDetailSchema = z.nativeEnum(ImageDetail);
export const eReasoningEffortSchema = z.nativeEnum(ReasoningEffort);
export const eAnthropicEffortSchema = z.nativeEnum(AnthropicEffort);
export const eReasoningSummarySchema = z.nativeEnum(ReasoningSummary);
export const eVerbositySchema = z.nativeEnum(Verbosity);
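For reference, z.nativeEnum validates against the enum's runtime values, so the empty-string unset member round-trips (a self-contained sketch mirroring the definitions above):

import { z } from 'zod';
enum AnthropicEffort { unset = '', low = 'low', medium = 'medium', high = 'high', max = 'max' }
const eAnthropicEffortSchema = z.nativeEnum(AnthropicEffort);
eAnthropicEffortSchema.parse('high'); // => 'high'
eAnthropicEffortSchema.parse(''); // => '' (AnthropicEffort.unset)
eAnthropicEffortSchema.safeParse('extreme').success; // => false (not a member)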
@@ -382,6 +391,10 @@ export const anthropicSettings = {
step: 1 as const,
default: DEFAULT_MAX_OUTPUT,
reset: (modelName: string) => {
if (/claude-opus[-.]?(?:4[-.]?(?:[6-9]|\d{2,})|[5-9]|\d{2,})/.test(modelName)) {
return ANTHROPIC_MAX_OUTPUT;
}
if (/claude-(?:sonnet|haiku)[-.]?[4-9]/.test(modelName)) {
return CLAUDE_4_64K_MAX_OUTPUT;
}
@@ -397,6 +410,13 @@ export const anthropicSettings = {
return DEFAULT_MAX_OUTPUT;
},
set: (value: number, modelName: string) => {
if (/claude-opus[-.]?(?:4[-.]?(?:[6-9]|\d{2,})|[5-9]|\d{2,})/.test(modelName)) {
if (value > ANTHROPIC_MAX_OUTPUT) {
return ANTHROPIC_MAX_OUTPUT;
}
return value;
}
if (/claude-(?:sonnet|haiku)[-.]?[4-9]/.test(modelName) && value > CLAUDE_4_64K_MAX_OUTPUT) {
return CLAUDE_4_64K_MAX_OUTPUT;
}
@@ -445,6 +465,16 @@ export const anthropicSettings = {
default: LEGACY_ANTHROPIC_MAX_OUTPUT,
},
},
effort: {
default: AnthropicEffort.unset,
options: [
AnthropicEffort.unset,
AnthropicEffort.low,
AnthropicEffort.medium,
AnthropicEffort.high,
AnthropicEffort.max,
],
},
web_search: {
default: false as const,
},
@@ -703,6 +733,8 @@ export const tConversationSchema = z.object({
verbosity: eVerbositySchema.optional().nullable(),
/* OpenAI: use Responses API */
useResponsesApi: z.boolean().optional(),
/* Anthropic: Effort control */
effort: eAnthropicEffortSchema.optional().nullable(),
/* OpenAI Responses API / Anthropic API / Google API */
web_search: z.boolean().optional(),
/* disable streaming */
@@ -825,6 +857,7 @@ export const tQueryParamsSchema = tConversationSchema
promptCache: true,
thinking: true,
thinkingBudget: true,
effort: true,
/** @endpoints bedrock */
region: true,
/** @endpoints bedrock */
@@ -1122,6 +1155,7 @@ export const anthropicBaseSchema = tConversationSchema.pick({
promptCache: true,
thinking: true,
thinkingBudget: true,
effort: true,
artifacts: true,
iconURL: true,
greeting: true,

View file

@@ -52,6 +52,7 @@ export type TEndpointOption = Pick<
| 'promptCache'
| 'thinking'
| 'thinkingBudget'
| 'effort'
// Assistant/Agent fields
| 'assistant_id'
| 'agent_id'

View file

@@ -83,6 +83,9 @@ export const conversationPreset = {
thinkingBudget: {
type: Number,
},
effort: {
type: String,
},
system: {
type: String,
},

View file

@@ -30,6 +30,7 @@ export interface IPreset extends Document {
promptCache?: boolean;
thinking?: boolean;
thinkingBudget?: number;
effort?: string;
system?: string;
resendFiles?: boolean;
imageDetail?: string;

View file

@@ -28,6 +28,7 @@ export interface IConversation extends Document {
promptCache?: boolean;
thinking?: boolean;
thinkingBudget?: number;
effort?: string;
system?: string;
resendFiles?: boolean;
imageDetail?: string;