🤖 feat: Claude Opus 4.6 - 1M Context, Premium Pricing, Adaptive Thinking (#11670)

* feat: Implement tiered token pricing for the Claude Opus 4.6 model

- Added support for tiered pricing based on input token count for the Claude Opus 4.6 model.
- Updated token value calculations to include inputTokenCount for accurate pricing.
- Enhanced transaction handling to apply premium rates when input tokens exceed defined thresholds.
- Introduced comprehensive tests to validate pricing logic for both standard and premium rates across various scenarios.
- Updated related utility functions and models to accommodate new pricing structure.

This change improves the flexibility and accuracy of token pricing for the Claude Opus 4.6 model, ensuring users are charged appropriately based on their usage.
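In sketch form, the tier selection reduces to a threshold check on the request's input tokens. The `TieredTokenValues` shape, the 200K threshold, and the rates below are illustrative assumptions, not LibreChat's actual `tokenValues` entries:

```ts
/** Hypothetical shape: per-model rates with an optional premium tier. */
interface TieredTokenValues {
  prompt: number;
  completion: number;
  premium?: { threshold: number; prompt: number; completion: number };
}

/** Pick standard or premium rates based on the request's input token count. */
function getTieredRates(
  values: TieredTokenValues,
  inputTokenCount?: number,
): { prompt: number; completion: number } {
  const premium = values.premium;
  if (premium && inputTokenCount != null && inputTokenCount > premium.threshold) {
    return { prompt: premium.prompt, completion: premium.completion };
  }
  return { prompt: values.prompt, completion: values.completion };
}

// e.g. a 1M-context model billed at a higher rate past 200K input tokens
const opusRates: TieredTokenValues = {
  prompt: 5,
  completion: 25,
  premium: { threshold: 200000, prompt: 7.5, completion: 37.5 },
};
console.log(getTieredRates(opusRates, 250000)); // { prompt: 7.5, completion: 37.5 }
```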

* feat: Add effort field to conversation and preset schemas

- Introduced a new optional `effort` field of type `String` in both the `IPreset` and `IConversation` interfaces.
- Updated the `conversationPreset` schema to include the `effort` field, so the selected reasoning effort persists with conversations and presets.
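The schema change itself is small; in sketch form (surrounding fields elided, not the literal schema definition):

```ts
// Sketch: the new optional field as it would appear in the shared
// conversation/preset schema definition.
const conversationPreset = {
  // ...existing fields (model, thinking, thinkingBudget, etc.)
  effort: {
    type: String,
  },
};
```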

* chore: Clean up unused variable and comments in initialize function

* chore: update dependencies and SDK versions

- Updated @anthropic-ai/sdk to version 0.73.0 in package.json and overrides.
- Updated @anthropic-ai/vertex-sdk to version 0.14.3 in packages/api/package.json.
- Updated @librechat/agents to version 3.1.34 in packages/api/package.json.
- Refactored imports in packages/api/src/endpoints/anthropic/vertex.ts for consistency.

* chore: remove postcss-loader from dependencies

* feat: Bedrock model support for adaptive thinking configuration

- Updated .env.example to include new Bedrock model IDs for Claude Opus 4.6.
- Refactored bedrockInputParser to support adaptive thinking for Opus models, allowing for dynamic thinking configurations.
- Introduced a new function to check model compatibility with adaptive thinking.
- Added an optional `effort` field to the input schemas and updated related configurations.
- Enhanced tests to validate the new adaptive thinking logic and model configurations.
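For orientation, the request fields the parser ends up emitting for an adaptive-thinking Opus model look roughly like this (values mirror the Bedrock tests further down; the actual assembly inside `bedrockInputParser` is more involved):

```ts
// Sketch of Bedrock additionalModelRequestFields for an adaptive-thinking
// Opus model: adaptive thinking, effort routed via output_config, and the
// beta flags for 128K output and 1M context.
const additionalModelRequestFields = {
  thinking: { type: 'adaptive' },
  output_config: { effort: 'medium' },
  anthropic_beta: ['output-128k-2025-02-19', 'context-1m-2025-08-07'],
};
```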

* feat: Add tests for Opus 4.6 adaptive thinking configuration

* feat: Update model references for Opus 4.6 by removing version suffix

* feat: Update @librechat/agents to version 3.1.35 in package.json and package-lock.json

* chore: update @librechat/agents to version 3.1.36 in package.json and package-lock.json

* feat: Normalize inputTokenCount for spendTokens and enhance transaction handling

- Introduced normalization for promptTokens to ensure inputTokenCount does not go negative.
- Updated transaction logic to reflect normalized inputTokenCount in pricing calculations.
- Added comprehensive tests to validate the new normalization logic and its impact on transaction rates for both standard and premium models.
- Refactored related functions to improve clarity and maintainability of token value calculations.
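The clamp itself is a one-liner; a minimal sketch (the real logic lives in the spendTokens/transaction path and handles more cases):

```ts
// Clamp a possibly-negative or missing token count to a safe value
// before it feeds into tiered-pricing selection.
function normalizeTokenCount(count?: number): number {
  return Math.max(0, count ?? 0);
}

normalizeTokenCount(-150); // 0 — a negative prompt count can no longer flip the tier
normalizeTokenCount(undefined); // 0
normalizeTokenCount(250000); // 250000 — premium-tier territory
```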

* chore: Simplify adaptive thinking configuration in helpers.ts

- Removed unnecessary type casting for the thinking property in updatedOptions.
- Ensured that adaptive thinking is directly assigned when conditions are met, improving code clarity.

* refactor: Replace hard-coded token values with dynamic retrieval from maxTokensMap in model tests

* fix: Ensure non-negative token values in spendTokens calculations

- Updated token value retrieval to use Math.max for prompt and completion tokens, preventing negative values.
- Enhanced clarity in token calculations for both prompt and completion transactions.

* test: Add test for normalization of negative structured token values in spendStructuredTokens

- Implemented a test to ensure that negative structured token values are normalized to zero during token spending.
- Verified that the transaction rates remain consistent with the expected standard values after normalization.

* refactor: Bedrock model support for adaptive thinking and context handling

- Added tests for various alternate naming conventions of Claude models to validate adaptive thinking and context support.
- Refactored `supportsAdaptiveThinking` and `supportsContext1m` functions to utilize new parsing methods for model version extraction.
- Updated `bedrockInputParser` to handle effort configurations more effectively and strip unnecessary fields for non-adaptive models.
- Improved handling of anthropic model configurations in the input parser.
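A minimal sketch of what such version extraction can look like — the regex and helper names here are illustrative, and the real helpers also cover Sonnet 5+ and other model families:

```ts
// Extract major/minor from a Claude Opus model ID, tolerating provider
// prefixes ("anthropic.", "us.anthropic.") and Bedrock suffixes ("-v1", ":0").
function parseOpusVersion(model: string): { major: number; minor: number } | null {
  const match = model.match(/claude-opus-(\d+)(?:[-.](\d+))?/);
  if (!match) {
    return null;
  }
  return { major: Number(match[1]), minor: Number(match[2] ?? 0) };
}

// Adaptive thinking: Opus 4.6 and later (sketch only; Sonnet 5+ is
// handled analogously in the real implementation).
function supportsAdaptiveThinkingSketch(model: string): boolean {
  const v = parseOpusVersion(model);
  if (!v) {
    return false;
  }
  return v.major > 4 || (v.major === 4 && v.minor >= 6);
}

supportsAdaptiveThinkingSketch('us.anthropic.claude-opus-4-6-v1'); // true
supportsAdaptiveThinkingSketch('claude-opus-4-5'); // false
```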

* fix: Improve token value retrieval in getMultiplier function

- Updated the token value retrieval logic to use optional chaining for better safety against undefined values.
- Added a test case to ensure that the function returns the default rate when the provided valueKey does not exist in tokenValues.
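The fix is essentially one optional-chaining guard; a sketch with illustrative names:

```ts
type TokenType = 'prompt' | 'completion';

// An unknown valueKey now falls through to the default rate instead of
// throwing while reading a property of `undefined`.
function getMultiplierSketch(
  tokenValues: Record<string, Partial<Record<TokenType, number>>>,
  valueKey: string,
  tokenType: TokenType,
  defaultRate: number,
): number {
  return tokenValues[valueKey]?.[tokenType] ?? defaultRate;
}
```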
Danny Avila authored this commit on 2026-02-06 18:35:36 -05:00 (committed via GitHub).
commit 41e2348d47 · parent 1d5f2eb04b
32 changed files with 2902 additions and 1087 deletions

packages/api/package.json

@@ -78,7 +78,7 @@
     "registry": "https://registry.npmjs.org/"
   },
   "peerDependencies": {
-    "@anthropic-ai/vertex-sdk": "^0.14.0",
+    "@anthropic-ai/vertex-sdk": "^0.14.3",
     "@aws-sdk/client-bedrock-runtime": "^3.970.0",
     "@aws-sdk/client-s3": "^3.980.0",
     "@azure/identity": "^4.7.0",
@@ -87,7 +87,7 @@
     "@google/genai": "^1.19.0",
     "@keyv/redis": "^4.3.3",
     "@langchain/core": "^0.3.80",
-    "@librechat/agents": "^3.1.33",
+    "@librechat/agents": "^3.1.36",
     "@librechat/data-schemas": "*",
     "@modelcontextprotocol/sdk": "^1.26.0",
     "@smithy/node-http-handler": "^4.4.5",

packages/api/src/endpoints/anthropic/helpers.ts

@@ -1,6 +1,12 @@
 import { logger } from '@librechat/data-schemas';
 import { AnthropicClientOptions } from '@librechat/agents';
-import { EModelEndpoint, anthropicSettings } from 'librechat-data-provider';
+import {
+  EModelEndpoint,
+  AnthropicEffort,
+  anthropicSettings,
+  supportsContext1m,
+  supportsAdaptiveThinking,
+} from 'librechat-data-provider';
 import { matchModelName } from '~/utils/tokens';

 /**
@@ -48,7 +54,7 @@ function getClaudeHeaders(
     return {
       'anthropic-beta': 'token-efficient-tools-2025-02-19,output-128k-2025-02-19',
     };
-  } else if (/claude-sonnet-4/.test(model)) {
+  } else if (supportsContext1m(model)) {
     return {
       'anthropic-beta': 'context-1m-2025-08-07',
     };
@@ -58,25 +64,43 @@ function getClaudeHeaders(
 }

 /**
- * Configures reasoning-related options for Claude models
- * @param {AnthropicClientOptions & { max_tokens?: number }} anthropicInput The request options object
- * @param {Object} extendedOptions Additional client configuration options
- * @param {boolean} extendedOptions.thinking Whether thinking is enabled in client config
- * @param {number|null} extendedOptions.thinkingBudget The token budget for thinking
- * @returns {Object} Updated request options
+ * Configures reasoning-related options for Claude models.
+ * Models supporting adaptive thinking (Opus 4.6+, Sonnet 5+) use effort control instead of manual budget_tokens.
  */
 function configureReasoning(
   anthropicInput: AnthropicClientOptions & { max_tokens?: number },
-  extendedOptions: { thinking?: boolean; thinkingBudget?: number | null } = {},
+  extendedOptions: {
+    thinking?: boolean;
+    thinkingBudget?: number | null;
+    effort?: AnthropicEffort | string | null;
+  } = {},
 ): AnthropicClientOptions & { max_tokens?: number } {
   const updatedOptions = { ...anthropicInput };
   const currentMaxTokens = updatedOptions.max_tokens ?? updatedOptions.maxTokens;
+  const modelName = updatedOptions.model ?? '';
+
+  if (extendedOptions.thinking && modelName && supportsAdaptiveThinking(modelName)) {
+    updatedOptions.thinking = { type: 'adaptive' };
+    const effort = extendedOptions.effort;
+    if (effort && effort !== AnthropicEffort.unset) {
+      updatedOptions.invocationKwargs = {
+        ...updatedOptions.invocationKwargs,
+        output_config: { effort },
+      };
+    }
+    if (currentMaxTokens == null) {
+      updatedOptions.max_tokens = anthropicSettings.maxOutputTokens.reset(modelName);
+    }
+    return updatedOptions;
+  }
+
   if (
     extendedOptions.thinking &&
-    updatedOptions?.model &&
-    (/claude-3[-.]7/.test(updatedOptions.model) ||
-      /claude-(?:sonnet|opus|haiku)-[4-9]/.test(updatedOptions.model))
+    modelName &&
+    (/claude-3[-.]7/.test(modelName) || /claude-(?:sonnet|opus|haiku)-[4-9]/.test(modelName))
   ) {
     updatedOptions.thinking = {
       ...updatedOptions.thinking,
@@ -100,7 +124,7 @@ function configureReasoning(
     updatedOptions.thinking.type === 'enabled' &&
     (currentMaxTokens == null || updatedOptions.thinking.budget_tokens > currentMaxTokens)
   ) {
-    const maxTokens = anthropicSettings.maxOutputTokens.reset(updatedOptions.model ?? '');
+    const maxTokens = anthropicSettings.maxOutputTokens.reset(modelName);
     updatedOptions.max_tokens = currentMaxTokens ?? maxTokens;

     logger.warn(
@@ -111,11 +135,11 @@ function configureReasoning(
     updatedOptions.thinking.budget_tokens = Math.min(
       updatedOptions.thinking.budget_tokens,
-      Math.floor(updatedOptions.max_tokens * 0.9),
+      Math.floor((updatedOptions.max_tokens ?? 0) * 0.9),
     );
   }

   return updatedOptions;
 }

-export { checkPromptCacheSupport, getClaudeHeaders, configureReasoning };
+export { checkPromptCacheSupport, getClaudeHeaders, configureReasoning, supportsAdaptiveThinking };

packages/api/src/endpoints/anthropic/initialize.ts

@@ -77,13 +77,11 @@ export async function initializeAnthropic({
     ...(vertexConfig && { vertexConfig }),
   };

-  /** @type {undefined | TBaseEndpoint} */
   const anthropicConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic];
   const allConfig = appConfig?.endpoints?.all;

   const result = getLLMConfig(credentials, clientOptions);

-  // Apply stream rate delay
   if (anthropicConfig?.streamRate) {
     (result.llmConfig as Record<string, unknown>)._lc_stream_delay = anthropicConfig.streamRate;
   }

packages/api/src/endpoints/anthropic/llm.spec.ts

@@ -1,5 +1,6 @@
-import { getLLMConfig } from './llm';
+import { AnthropicEffort } from 'librechat-data-provider';
 import type * as t from '~/types';
+import { getLLMConfig } from './llm';

 jest.mock('https-proxy-agent', () => ({
   HttpsProxyAgent: jest.fn().mockImplementation((proxy) => ({ proxy })),
@@ -835,13 +836,19 @@ describe('getLLMConfig', () => {
       expect(result.llmConfig.maxTokens).toBe(32000);
     });

-    // opus-4-5+ get 64K
-    const opus64kModels = ['claude-opus-4-5', 'claude-opus-4-7', 'claude-opus-4-10'];
-    opus64kModels.forEach((model) => {
+    // opus-4-5 gets 64K
+    const opus64kResult = getLLMConfig('test-key', {
+      modelOptions: { model: 'claude-opus-4-5' },
+    });
+    expect(opus64kResult.llmConfig.maxTokens).toBe(64000);
+
+    // opus-4-6+ get 128K
+    const opus128kModels = ['claude-opus-4-7', 'claude-opus-4-10'];
+    opus128kModels.forEach((model) => {
       const result = getLLMConfig('test-key', {
         modelOptions: { model },
       });
-      expect(result.llmConfig.maxTokens).toBe(64000);
+      expect(result.llmConfig.maxTokens).toBe(128000);
     });
   });
@@ -910,6 +917,126 @@ describe('getLLMConfig', () => {
     expect(result.llmConfig.maxTokens).toBe(32000);
   });

+  it('should use adaptive thinking for Opus 4.6 instead of enabled + budget_tokens', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-opus-4-6',
+        thinking: true,
+        thinkingBudget: 10000,
+      },
+    });
+
+    expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
+    expect(result.llmConfig.thinking).not.toHaveProperty('budget_tokens');
+    expect(result.llmConfig.maxTokens).toBe(128000);
+  });
+
+  it('should set effort via output_config for adaptive thinking models', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-opus-4-6',
+        thinking: true,
+        effort: AnthropicEffort.medium,
+      },
+    });
+
+    expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
+    expect(result.llmConfig.invocationKwargs).toHaveProperty('output_config');
+    expect(result.llmConfig.invocationKwargs?.output_config).toEqual({
+      effort: AnthropicEffort.medium,
+    });
+  });
+
+  it('should set effort via output_config even without thinking for adaptive models', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-opus-4-6',
+        thinking: false,
+        effort: AnthropicEffort.low,
+      },
+    });
+
+    expect(result.llmConfig.thinking).toBeUndefined();
+    expect(result.llmConfig.invocationKwargs).toHaveProperty('output_config');
+    expect(result.llmConfig.invocationKwargs?.output_config).toEqual({
+      effort: AnthropicEffort.low,
+    });
+  });
+
+  it('should NOT set adaptive thinking or effort for non-adaptive models', () => {
+    const nonAdaptiveModels = [
+      'claude-opus-4-5',
+      'claude-opus-4-1',
+      'claude-sonnet-4-5',
+      'claude-sonnet-4',
+      'claude-haiku-4-5',
+    ];
+
+    nonAdaptiveModels.forEach((model) => {
+      const result = getLLMConfig('test-key', {
+        modelOptions: {
+          model,
+          thinking: true,
+          thinkingBudget: 10000,
+          effort: AnthropicEffort.medium,
+        },
+      });
+
+      if (result.llmConfig.thinking != null) {
+        expect((result.llmConfig.thinking as unknown as { type: string }).type).not.toBe(
+          'adaptive',
+        );
+      }
+      expect(result.llmConfig.invocationKwargs?.output_config).toBeUndefined();
+    });
+  });
+
+  it('should strip adaptive thinking if it somehow reaches a non-adaptive model', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-sonnet-4-5',
+        thinking: true,
+        thinkingBudget: 5000,
+      },
+    });
+
+    expect(result.llmConfig.thinking).toMatchObject({
+      type: 'enabled',
+      budget_tokens: 5000,
+    });
+    expect(result.llmConfig.invocationKwargs?.output_config).toBeUndefined();
+  });
+
+  it('should exclude topP/topK for Opus 4.6 with adaptive thinking', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-opus-4-6',
+        thinking: true,
+        topP: 0.9,
+        topK: 40,
+      },
+    });
+
+    expect((result.llmConfig.thinking as unknown as { type: string }).type).toBe('adaptive');
+    expect(result.llmConfig).not.toHaveProperty('topP');
+    expect(result.llmConfig).not.toHaveProperty('topK');
+  });
+
+  it('should include topP/topK for Opus 4.6 when thinking is disabled', () => {
+    const result = getLLMConfig('test-key', {
+      modelOptions: {
+        model: 'claude-opus-4-6',
+        thinking: false,
+        topP: 0.9,
+        topK: 40,
+      },
+    });
+
+    expect(result.llmConfig.thinking).toBeUndefined();
+    expect(result.llmConfig).toHaveProperty('topP', 0.9);
+    expect(result.llmConfig).toHaveProperty('topK', 40);
+  });
+
   it('should respect model-specific maxOutputTokens for Claude 4.x models', () => {
     const testCases = [
       { model: 'claude-sonnet-4-5', maxOutputTokens: 50000, expected: 50000 },
@@ -960,7 +1087,7 @@ describe('getLLMConfig', () => {
     });
   });

-  it('should future-proof Claude 5.x Opus models with 64K default', () => {
+  it('should future-proof Claude 5.x Opus models with 128K default', () => {
     const testCases = [
       'claude-opus-5',
       'claude-opus-5-0',
@@ -972,28 +1099,28 @@ describe('getLLMConfig', () => {
       const result = getLLMConfig('test-key', {
         modelOptions: { model },
       });
-      expect(result.llmConfig.maxTokens).toBe(64000);
+      expect(result.llmConfig.maxTokens).toBe(128000);
     });
   });

   it('should future-proof Claude 6-9.x models with correct defaults', () => {
     const testCases = [
-      // Claude 6.x - All get 64K since they're version 5+
+      // Claude 6.x - Sonnet/Haiku get 64K, Opus gets 128K
       { model: 'claude-sonnet-6', expected: 64000 },
       { model: 'claude-haiku-6-0', expected: 64000 },
-      { model: 'claude-opus-6-1', expected: 64000 }, // opus 6+ gets 64K
+      { model: 'claude-opus-6-1', expected: 128000 },
       // Claude 7.x
       { model: 'claude-sonnet-7-20270101', expected: 64000 },
       { model: 'claude-haiku-7.5', expected: 64000 },
-      { model: 'claude-opus-7', expected: 64000 }, // opus 7+ gets 64K
+      { model: 'claude-opus-7', expected: 128000 },
       // Claude 8.x
       { model: 'claude-sonnet-8', expected: 64000 },
       { model: 'claude-haiku-8-2', expected: 64000 },
-      { model: 'claude-opus-8-latest', expected: 64000 }, // opus 8+ gets 64K
+      { model: 'claude-opus-8-latest', expected: 128000 },
       // Claude 9.x
       { model: 'claude-sonnet-9', expected: 64000 },
       { model: 'claude-haiku-9', expected: 64000 },
-      { model: 'claude-opus-9', expected: 64000 }, // opus 9+ gets 64K
+      { model: 'claude-opus-9', expected: 128000 },
     ];

     testCases.forEach(({ model, expected }) => {

packages/api/src/endpoints/anthropic/llm.ts

@@ -7,7 +7,12 @@ import type {
   AnthropicConfigOptions,
   AnthropicCredentials,
 } from '~/types/anthropic';
-import { checkPromptCacheSupport, getClaudeHeaders, configureReasoning } from './helpers';
+import {
+  supportsAdaptiveThinking,
+  checkPromptCacheSupport,
+  configureReasoning,
+  getClaudeHeaders,
+} from './helpers';
 import {
   createAnthropicVertexClient,
   isAnthropicVertexCredentials,
@@ -83,15 +88,14 @@ function getLLMConfig(
     promptCache: options.modelOptions?.promptCache ?? anthropicSettings.promptCache.default,
     thinkingBudget:
       options.modelOptions?.thinkingBudget ?? anthropicSettings.thinkingBudget.default,
+    effort: options.modelOptions?.effort ?? anthropicSettings.effort.default,
   };

-  /** Couldn't figure out a way to still loop through the object while deleting the overlapping keys when porting this
-   * over from javascript, so for now they are being deleted manually until a better way presents itself.
-   */
   if (options.modelOptions) {
     delete options.modelOptions.thinking;
     delete options.modelOptions.promptCache;
     delete options.modelOptions.thinkingBudget;
+    delete options.modelOptions.effort;
   } else {
     throw new Error('No modelOptions provided');
   }
@@ -145,10 +149,33 @@ function getLLMConfig(
   requestOptions = configureReasoning(requestOptions, systemOptions);

-  if (!/claude-3[-.]7/.test(mergedOptions.model)) {
-    requestOptions.topP = mergedOptions.topP;
-    requestOptions.topK = mergedOptions.topK;
-  } else if (requestOptions.thinking == null) {
+  if (supportsAdaptiveThinking(mergedOptions.model)) {
+    if (
+      systemOptions.effort &&
+      (systemOptions.effort as string) !== '' &&
+      !requestOptions.invocationKwargs?.output_config
+    ) {
+      requestOptions.invocationKwargs = {
+        ...requestOptions.invocationKwargs,
+        output_config: { effort: systemOptions.effort },
+      };
+    }
+  } else {
+    if (
+      requestOptions.thinking != null &&
+      (requestOptions.thinking as unknown as { type: string }).type === 'adaptive'
+    ) {
+      delete requestOptions.thinking;
+    }
+    if (requestOptions.invocationKwargs?.output_config) {
+      delete requestOptions.invocationKwargs.output_config;
+    }
+  }
+
+  const hasActiveThinking = requestOptions.thinking != null;
+  const isThinkingModel =
+    /claude-3[-.]7/.test(mergedOptions.model) || supportsAdaptiveThinking(mergedOptions.model);
+  if (!isThinkingModel || !hasActiveThinking) {
     requestOptions.topP = mergedOptions.topP;
     requestOptions.topK = mergedOptions.topK;
   }

packages/api/src/endpoints/anthropic/vertex.ts

@@ -1,10 +1,10 @@
 import path from 'path';
+import { AnthropicVertex } from '@anthropic-ai/vertex-sdk';
 import { GoogleAuth } from 'google-auth-library';
-import { ClientOptions } from '@anthropic-ai/sdk';
 import { AuthKeys } from 'librechat-data-provider';
-import { loadServiceKey } from '~/utils/key';
-import { AnthropicVertex } from '@anthropic-ai/vertex-sdk';
+import type { ClientOptions } from '@anthropic-ai/sdk';
 import type { AnthropicCredentials, VertexAIClientOptions } from '~/types/anthropic';
+import { loadServiceKey } from '~/utils/key';

 /**
  * Options for loading Vertex AI credentials


@@ -613,4 +613,113 @@ describe('initializeBedrock', () => {
       expect(result.llmConfig).toHaveProperty('applicationInferenceProfile', inferenceProfileArn);
     });
   });
+
+  describe('Opus 4.6 Adaptive Thinking', () => {
+    it('should configure adaptive thinking with default maxTokens for Opus 4.6', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-opus-4-6-v1',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(amrf.thinking).toEqual({ type: 'adaptive' });
+      expect(result.llmConfig.maxTokens).toBe(16000);
+      expect(amrf.anthropic_beta).toEqual(
+        expect.arrayContaining(['output-128k-2025-02-19', 'context-1m-2025-08-07']),
+      );
+    });
+
+    it('should pass effort via output_config for Opus 4.6', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-opus-4-6-v1',
+          effort: 'medium',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(amrf.thinking).toEqual({ type: 'adaptive' });
+      expect(amrf.output_config).toEqual({ effort: 'medium' });
+    });
+
+    it('should respect user-provided maxTokens for Opus 4.6', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-opus-4-6-v1',
+          maxTokens: 32000,
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+
+      expect(result.llmConfig.maxTokens).toBe(32000);
+    });
+
+    it('should handle cross-region Opus 4.6 model IDs', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'us.anthropic.claude-opus-4-6-v1',
+          effort: 'low',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(result.llmConfig).toHaveProperty('model', 'us.anthropic.claude-opus-4-6-v1');
+      expect(amrf.thinking).toEqual({ type: 'adaptive' });
+      expect(amrf.output_config).toEqual({ effort: 'low' });
+    });
+
+    it('should use enabled thinking for non-adaptive models (Sonnet 4.5)', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-sonnet-4-5-20250929-v1:0',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(amrf.thinking).toEqual({ type: 'enabled', budget_tokens: 2000 });
+      expect(amrf.output_config).toBeUndefined();
+      expect(result.llmConfig.maxTokens).toBe(8192);
+    });
+
+    it('should not include output_config when effort is empty', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-opus-4-6-v1',
+          effort: '',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(amrf.thinking).toEqual({ type: 'adaptive' });
+      expect(amrf.output_config).toBeUndefined();
+    });
+
+    it('should strip effort for non-adaptive models', async () => {
+      const params = createMockParams({
+        model_parameters: {
+          model: 'anthropic.claude-opus-4-1-20250805-v1:0',
+          effort: 'high',
+        },
+      });
+
+      const result = (await initializeBedrock(params)) as BedrockLLMConfigResult;
+      const amrf = result.llmConfig.additionalModelRequestFields as Record<string, unknown>;
+
+      expect(amrf.thinking).toEqual({ type: 'enabled', budget_tokens: 2000 });
+      expect(amrf.output_config).toBeUndefined();
+      expect(amrf.effort).toBeUndefined();
+    });
+  });
 });

packages/api/src/types/anthropic.ts

@@ -44,6 +44,10 @@ export interface ThinkingConfigEnabled {
   type: 'enabled';
 }

+export interface ThinkingConfigAdaptive {
+  type: 'adaptive';
+}
+
 /**
  * Configuration for enabling Claude's extended thinking.
  *
@@ -55,7 +59,10 @@ export interface ThinkingConfigEnabled {
  * [extended thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking)
  * for details.
  */
-export type ThinkingConfigParam = ThinkingConfigEnabled | ThinkingConfigDisabled;
+export type ThinkingConfigParam =
+  | ThinkingConfigEnabled
+  | ThinkingConfigDisabled
+  | ThinkingConfigAdaptive;

 export type AnthropicModelOptions = Partial<Omit<AnthropicParameters, 'thinking'>> & {
   thinking?: AnthropicParameters['thinking'] | null;

packages/api/src/utils/tokens.ts

@@ -151,6 +151,7 @@ const anthropicModels = {
   'claude-4': 200000,
   'claude-opus-4': 200000,
   'claude-opus-4-5': 200000,
+  'claude-opus-4-6': 1000000,
 };

 const deepseekModels = {
@@ -394,6 +395,7 @@ const anthropicMaxOutputs = {
   'claude-sonnet-4': 64000,
   'claude-opus-4': 32000,
   'claude-opus-4-5': 64000,
+  'claude-opus-4-6': 128000,
   'claude-3.5-sonnet': 8192,
   'claude-3-5-sonnet': 8192,
   'claude-3.7-sonnet': 128000,