🤖 feat: Anthropic Vertex AI Support (#10780)

* feat: Add Anthropic Vertex AI Support

* Remove changes from the unused AnthropicClient class

* Add @anthropic-ai/vertex-sdk as peerDependency to packages/api

* Clean up Vertex AI credentials handling

* feat: websearch header

* feat: add prompt caching support for Anthropic Vertex AI

- Support both OpenAI format (input_token_details) and Anthropic format (cache_*_input_tokens) for token usage tracking

- Filter out unsupported anthropic-beta header values for Vertex AI (prompt-caching, max-tokens, output-128k, token-efficient-tools, context-1m)
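
A minimal sketch of the header filtering described above, assuming a comma-separated `anthropic-beta` header value; the helper name and option shape are illustrative, while the unsupported-value list comes from this commit:

```ts
/** Beta flags this commit lists as unsupported on Vertex AI. */
const UNSUPPORTED_VERTEX_BETAS = [
  'prompt-caching',
  'max-tokens',
  'output-128k',
  'token-efficient-tools',
  'context-1m',
];

/** Hypothetical helper: drop `anthropic-beta` values Vertex AI rejects. */
function filterVertexBetaHeader(header: string): string | undefined {
  const kept = header
    .split(',')
    .map((value) => value.trim())
    .filter((value) => !UNSUPPORTED_VERTEX_BETAS.some((beta) => value.includes(beta)));
  return kept.length > 0 ? kept.join(',') : undefined;
}
```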

* feat: Add Vertex AI support for Anthropic models

- Introduced configuration options for running Anthropic models via Google Cloud Vertex AI in the YAML file.
- Updated ModelService to prioritize Vertex AI models from the configuration.
- Enhanced endpoint configuration to enable Anthropic endpoint when Vertex AI is configured.
- Implemented validation and processing for Vertex AI credentials and options.
- Added new types and schemas for Vertex AI configuration in the data provider.
- Created utility functions for loading and validating Vertex AI credentials and configurations.
- Updated various services to integrate Vertex AI options into the Anthropic client setup.

* 🔒 fix: Improve error handling for missing credentials in LLM configuration

- Updated the `getLLMConfig` function to throw a specific error message when credentials are missing, enhancing clarity for users.
- Refactored the `parseCredentials` function to handle plain API key strings more gracefully, returning them wrapped in an object if JSON parsing fails.
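
A sketch of the described fallback, with assumed names (the real code stores the key under the endpoint's credential key):

```ts
/** Try JSON first; wrap a plain API key string in an object if parsing fails. */
function parseCredentials(raw: string): Record<string, unknown> {
  try {
    return JSON.parse(raw);
  } catch {
    // Not valid JSON: treat the value as a bare API key (key name assumed).
    return { apiKey: raw };
  }
}
```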

* 🔧 refactor: Clean up code formatting and improve readability

- Updated the `setOptions` method in `AgentClient` to use an underscore-prefixed parameter name (`_options`), signaling the argument is intentionally unused.
- Refactored error handling in `loadDefaultModels` for better readability.
- Removed unnecessary blank lines in `initialize.js`, `endpoints.ts`, and `vertex.ts` to streamline the code.
- Enhanced formatting in `validateVertexConfig` for improved consistency and clarity.

* 🔧 refactor: Enhance Vertex AI Model Configuration and Integration

- Updated the YAML configuration to support visible model names and deployment mappings for Vertex AI.
- Refactored the `loadDefaultModels` function to utilize the new model name structure.
- Improved the `initializeClient` function to pass full Vertex AI configuration, including model mappings.
- Added utility functions to map visible model names to deployment names, enhancing the integration of Vertex AI models.
- Updated various services and types to accommodate the new model configuration schema and improve overall clarity and functionality.
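
A sketch of the visible-name-to-deployment resolution, matching the three YAML formats documented later in this diff; the function and type names are assumptions:

```ts
type VertexModelEntry = boolean | { deploymentName?: string };

interface VertexModelsConfig {
  /** Optional shared default used by the mixed (Option 3) format. */
  deploymentName?: string;
  models: string[] | Record<string, VertexModelEntry>;
}

/** Resolve the Vertex AI deployment name from the UI-visible model name. */
function resolveDeployment(config: VertexModelsConfig, visibleName: string): string {
  if (Array.isArray(config.models)) {
    return visibleName; // legacy array format: visible name IS the deployment name
  }
  const entry = config.models[visibleName];
  if (entry && typeof entry === 'object' && entry.deploymentName) {
    return entry.deploymentName;
  }
  // `true` (or a missing per-model deploymentName) falls back to the default.
  return config.deploymentName ?? visibleName;
}
```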

* 🔧 chore: Update @anthropic-ai/sdk dependency to version 0.71.0 in package.json and package-lock.json

* refactor: Change clientOptions declaration from let to const in initialize.ts for better code clarity

* chore: repository cleanup

* 🌊 feat: Resumable LLM Streams with Horizontal Scaling (#10926)

* feat: Implement Resumable Generation Jobs with SSE Support

- Introduced GenerationJobManager to handle resumable LLM generation jobs independently of HTTP connections.
- Added support for subscribing to ongoing generation jobs via SSE, allowing clients to reconnect and receive updates without losing progress.
- Enhanced existing agent controllers and routes to integrate resumable functionality, including job creation, completion, and error handling.
- Updated client-side hooks to manage adaptive SSE streams, switching between standard and resumable modes based on user settings.
- Added UI components and settings for enabling/disabling resumable streams, improving user experience during unstable connections.
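
A hypothetical client-side sketch of subscribing to an in-progress job over SSE; the route shape is an assumption, not the project's actual API:

```ts
/** Reconnect to an ongoing generation job and stream its events. */
function subscribeToJob(streamId: string, onEvent: (data: unknown) => void): () => void {
  const source = new EventSource(`/api/agents/chat/stream/${streamId}`); // URL assumed
  source.onmessage = (event) => onEvent(JSON.parse(event.data));
  // The browser retries dropped connections automatically; return a teardown fn.
  return () => source.close();
}
```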

* WIP: resuming

* WIP: resumable stream

* feat: Enhance Stream Management with Abort Functionality

- Updated the abort endpoint to support aborting ongoing generation streams using either streamId or conversationId.
- Introduced a new mutation hook `useAbortStreamMutation` for client-side integration.
- Added `useStreamStatus` query to monitor stream status and facilitate resuming conversations.
- Enhanced `useChatHelpers` to incorporate abort functionality when stopping generation.
- Improved `useResumableSSE` to handle stream errors and token refresh seamlessly.
- Updated `useResumeOnLoad` to check for active streams and resume conversations appropriately.

* fix: Update query parameter handling in useChatHelpers

- Refactored the logic for determining the query parameter used in fetching messages to prioritize paramId from the URL, falling back to conversationId only if paramId is not available. This change ensures consistency with the ChatView component's expectations.

* fix: improve syncing when switching conversations

* fix: Prevent memory leaks in useResumableSSE by clearing handler maps on stream completion and cleanup

* fix: Improve content type mismatch handling in useStepHandler

- Enhanced the condition for detecting content type mismatches to include additional checks, ensuring more robust validation of content types before processing updates.

* fix: Allow dynamic content creation in useChatFunctions

- Updated the initial response handling to avoid pre-initializing content types, enabling dynamic creation of content parts based on incoming delta events. This change supports various content types such as think and text.

* fix: Refine response message handling in useStepHandler

- Updated logic to determine the appropriate response message based on the last message's origin, ensuring correct message replacement or appending based on user interaction. This change enhances the accuracy of message updates in the chat flow.

* refactor: Enhance GenerationJobManager with In-Memory Implementations

- Introduced InMemoryJobStore, InMemoryEventTransport, and InMemoryContentState for improved job management and event handling.
- Updated GenerationJobManager to utilize these new implementations, allowing for better separation of concerns and easier maintenance.
- Enhanced job metadata handling to support user messages and response IDs for resumable functionality.
- Improved cleanup and state management processes to prevent memory leaks and ensure efficient resource usage.

* refactor: Enhance GenerationJobManager with improved subscriber handling

- Updated RuntimeJobState to include allSubscribersLeftHandlers for managing client disconnections without affecting subscriber count.
- Refined createJob and subscribe methods to ensure generation starts only when the first real client connects.
- Added detailed documentation for methods and properties to clarify the synchronization of job generation with client readiness.
- Improved logging for subscriber checks and event handling to facilitate debugging and monitoring.

* chore: Adjust timeout for subscriber readiness in ResumableAgentController

- Reduced the timeout duration from 5000ms to 2500ms in the startGeneration function to improve responsiveness when waiting for subscriber readiness. This change aims to enhance the efficiency of the agent's background generation process.

* refactor: Update GenerationJobManager documentation and structure

- Enhanced the documentation for GenerationJobManager to clarify the architecture and pluggable service design.
- Updated comments to reflect the potential for Redis integration and the need for async refactoring.
- Improved the structure of the GenerationJob facade to emphasize the unified API while allowing for implementation swapping without affecting consumer code.

* refactor: Convert GenerationJobManager methods to async for improved performance

- Updated methods in GenerationJobManager and InMemoryJobStore to be asynchronous, enhancing the handling of job creation, retrieval, and management.
- Adjusted the ResumableAgentController and related routes to await job operations, ensuring proper flow and error handling.
- Increased timeout duration in ResumableAgentController's startGeneration function to 3500ms for better subscriber readiness management.

* refactor: Simplify initial response handling in useChatFunctions

- Removed unnecessary pre-initialization of content types in the initial response, allowing for dynamic content creation based on incoming delta events. This change enhances flexibility in handling various content types in the chat flow.

* refactor: Clarify content handling logic in useStepHandler

- Updated comments to better explain the handling of initialContent and existingContent in edit and resume scenarios.
- Simplified the logic for merging content, ensuring that initialContent is used directly when available, improving clarity and maintainability.

* refactor: Improve message handling logic in useStepHandler

- Enhanced the logic for managing messages in multi-tab scenarios, ensuring that the most up-to-date message history is utilized.
- Removed existing response placeholders and ensured user messages are included, improving the accuracy of message updates in the chat flow.

* fix: Remove unnecessary content length logging in chat stream response

- Simplified the debug message while retaining essential information about run steps, enhancing clarity in logging without losing critical context.

* refactor: Integrate streamId handling for resumable attachment support

- Added streamId parameter to various functions to support resumable mode in tool loading and memory processing.
- Updated related methods to ensure proper handling of attachments and responses based on the presence of streamId, enhancing the overall streaming experience.
- Improved logging and attachment management to accommodate both standard and resumable modes.

* refactor: Streamline abort handling and integrate GenerationJobManager for improved job management

- Removed the abortControllers middleware and integrated abort handling directly into GenerationJobManager.
- Updated abortMessage function to utilize GenerationJobManager for aborting jobs by conversation ID, enhancing clarity and efficiency.
- Simplified cleanup processes and improved error handling during abort operations.
- Enhanced metadata management for jobs, including endpoint and model information, to facilitate better tracking and resource management.

* refactor: Unify streamId and conversationId handling for improved job management

- Updated ResumableAgentController and AgentController to generate conversationId upfront, ensuring it matches streamId for consistency.
- Simplified job creation and metadata management by removing redundant conversationId updates from callbacks.
- Refactored abortMiddleware and related methods to utilize the unified streamId/conversationId approach, enhancing clarity in job handling.
- Removed deprecated methods from GenerationJobManager and InMemoryJobStore, streamlining the codebase and improving maintainability.

* refactor: Enhance resumable SSE handling with improved UI state management and error recovery

- Added UI state restoration on successful SSE connection to indicate ongoing submission.
- Implemented detailed error handling for network failures, including retry logic with exponential backoff.
- Introduced abort event handling to reset UI state on intentional stream closure.
- Enhanced debugging capabilities for testing reconnection and clean close scenarios.
- Updated generation function to retry on network errors, improving resilience during submission processes.
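
A sketch of the retry behavior described above, assuming a generic async operation and illustrative delay constants:

```ts
/** Retry an operation with exponentially growing delays, capped at 10s. */
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const delayMs = Math.min(baseDelayMs * 2 ** attempt, 10_000);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```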

* refactor: Consolidate content state management into IJobStore for improved job handling

- Removed InMemoryContentState and integrated its functionality into InMemoryJobStore, streamlining content state management.
- Updated GenerationJobManager to utilize jobStore for content state operations, enhancing clarity and reducing redundancy.
- Introduced RedisJobStore for horizontal scaling, allowing for efficient job management and content reconstruction from chunks.
- Updated IJobStore interface to reflect changes in content state handling, ensuring consistency across implementations.

* feat: Introduce Redis-backed stream services for enhanced job management

- Added createStreamServices function to configure job store and event transport, supporting both Redis and in-memory options.
- Updated GenerationJobManager to allow configuration with custom job stores and event transports, improving flexibility for different deployment scenarios.
- Refactored IJobStore interface to support asynchronous content retrieval, ensuring compatibility with Redis implementations.
- Implemented RedisEventTransport for real-time event delivery across instances, enhancing scalability and responsiveness.
- Updated InMemoryJobStore to align with new async patterns for content and run step retrieval, ensuring consistent behavior across storage options.
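
A sketch of the selection logic implied by createStreamServices; the stub classes stand in for the real implementations, whose constructor signatures are assumptions:

```ts
// Stand-ins for the commit's implementations (shapes assumed).
class InMemoryJobStore {}
class InMemoryEventTransport {}
class RedisJobStore {
  constructor(public client: unknown) {}
}
class RedisEventTransport {
  constructor(public client: unknown) {}
}

/** Pick Redis-backed services when enabled and a client is available. */
function createStreamServices(useRedis: boolean, redisClient?: unknown) {
  if (useRedis && redisClient) {
    return {
      jobStore: new RedisJobStore(redisClient),
      eventTransport: new RedisEventTransport(redisClient),
    };
  }
  return { jobStore: new InMemoryJobStore(), eventTransport: new InMemoryEventTransport() };
}
```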

* refactor: Remove redundant debug logging in GenerationJobManager and RedisEventTransport

- Eliminated unnecessary debug statements in GenerationJobManager related to subscriber actions and job updates, enhancing log clarity.
- Removed debug logging in RedisEventTransport for subscription and subscriber disconnection events, streamlining the logging output.
- Cleaned up debug messages in RedisJobStore to focus on essential information, improving overall logging efficiency.

* refactor: Enhance job state management and TTL configuration in RedisJobStore

- Updated the RedisJobStore to allow customizable TTL values for job states, improving flexibility in job management.
- Refactored the handling of job expiration and cleanup processes to align with new TTL configurations.
- Simplified the response structure in the chat status endpoint by consolidating state retrieval, enhancing clarity and performance.
- Improved comments and documentation for better understanding of the changes made.

* refactor: Add cleanupOnComplete option to GenerationJobManager for flexible resource management

- Introduced a new configuration option, cleanupOnComplete, allowing immediate cleanup of event transport and job resources upon job completion.
- Updated completeJob and abortJob methods to respect the cleanupOnComplete setting, enhancing memory management.
- Improved cleanup logic in the cleanup method to handle orphaned resources effectively.
- Enhanced documentation and comments for better clarity on the new functionality.

* refactor: Update TTL configuration for completed jobs in InMemoryJobStore

- Changed the TTL for completed jobs from 5 minutes to 0, allowing for immediate cleanup.
- Enhanced cleanup logic to respect the new TTL setting, improving resource management.
- Updated comments for clarity on the behavior of the TTL configuration.

* refactor: Enhance RedisJobStore with local graph caching for improved performance

- Introduced a local cache for graph references using WeakRef to optimize reconnects for the same instance.
- Updated job deletion and cleanup methods to manage the local cache effectively, ensuring stale entries are removed.
- Enhanced content retrieval methods to prioritize local cache access, reducing Redis round-trips for same-instance reconnects.
- Improved documentation and comments for clarity on the caching mechanism and its benefits.
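
A sketch of the WeakRef caching pattern described above, with an assumed graph type and key shape; entries vanish once the referent is garbage-collected, so same-instance reconnects skip Redis without pinning memory:

```ts
const graphCache = new Map<string, WeakRef<object>>();

/** Return a locally cached graph, pruning entries whose referent was GC'd. */
function getCachedGraph(streamId: string): object | undefined {
  const ref = graphCache.get(streamId);
  const graph = ref?.deref();
  if (ref && !graph) {
    graphCache.delete(streamId); // stale entry: referent was collected
  }
  return graph;
}

function cacheGraph(streamId: string, graph: object): void {
  graphCache.set(streamId, new WeakRef(graph));
}
```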

* feat: Add integration tests for GenerationJobManager, RedisEventTransport, and RedisJobStore, add Redis Cluster support

- Introduced comprehensive integration tests for GenerationJobManager, covering both in-memory and Redis modes to ensure consistent job management and event handling.
- Added tests for RedisEventTransport to validate pub/sub functionality, including cross-instance event delivery and error handling.
- Implemented integration tests for RedisJobStore, focusing on multi-instance job access, content reconstruction from chunks, and consumer group behavior.
- Enhanced test setup and teardown processes to ensure a clean environment for each test run, improving reliability and maintainability.

* fix: Improve error handling in GenerationJobManager for allSubscribersLeft handlers

- Enhanced the error handling logic when retrieving content parts for allSubscribersLeft handlers, ensuring that any failures are logged appropriately.
- Updated the promise chain to catch errors from getContentParts, improving robustness and clarity in error reporting.

* ci: Improve Redis client disconnection handling in integration tests

- Updated the afterAll cleanup logic in integration tests for GenerationJobManager, RedisEventTransport, and RedisJobStore to use `quit()` for graceful disconnection of the Redis client.
- Added fallback to `disconnect()` if `quit()` fails, enhancing robustness in resource management during test teardown.
- Improved comments for clarity on the disconnection process and error handling.

* refactor: Enhance GenerationJobManager and event transports for improved resource management

- Updated GenerationJobManager to prevent immediate cleanup of eventTransport upon job completion, allowing final events to transmit fully before cleanup.
- Added orphaned stream cleanup logic in GenerationJobManager to handle streams without corresponding jobs.
- Introduced getTrackedStreamIds method in both InMemoryEventTransport and RedisEventTransport for better management of orphaned streams.
- Improved comments for clarity on resource management and cleanup processes.

* refactor: Update GenerationJobManager and ResumableAgentController for improved event handling

- Modified GenerationJobManager to resolve readyPromise immediately, eliminating startup latency and allowing early event buffering for late subscribers.
- Enhanced event handling logic to replay buffered events when the first subscriber connects, ensuring no events are lost due to race conditions.
- Updated comments for clarity on the new event synchronization mechanism and its benefits in both Redis and in-memory modes.
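
A sketch of the buffer-and-replay mechanism described above (class and method names assumed): events emitted before any subscriber attaches are held, then replayed to the first subscriber so the startup race loses nothing:

```ts
class BufferedEmitter<T> {
  private buffer: T[] = [];
  private subscriber?: (event: T) => void;

  emit(event: T): void {
    if (this.subscriber) {
      this.subscriber(event);
    } else {
      this.buffer.push(event); // no subscriber yet: hold the event
    }
  }

  subscribe(fn: (event: T) => void): void {
    this.subscriber = fn;
    for (const event of this.buffer) {
      fn(event); // replay everything emitted before the client connected
    }
    this.buffer = [];
  }
}
```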

* fix: Update cache integration test command for stream to ensure proper execution

- Modified the test command for cache integration related to streams by adding the --forceExit flag to prevent hanging tests.
- This change enhances the reliability of the test suite by ensuring all tests complete as expected.

* feat: Add active job management for user and show progress in conversation list

- Implemented a new endpoint to retrieve active generation job IDs for the current user, enhancing user experience by allowing visibility of ongoing tasks.
- Integrated active job tracking in the Conversations component, displaying generation indicators based on active jobs.
- Optimized job management in the GenerationJobManager and InMemoryJobStore to support user-specific job queries, ensuring efficient resource handling and cleanup.
- Updated relevant components and hooks to utilize the new active jobs feature, improving overall application responsiveness and user feedback.

* feat: Implement active job tracking by user in RedisJobStore

- Added functionality to retrieve active job IDs for a specific user, enhancing user experience by allowing visibility of ongoing tasks.
- Implemented self-healing cleanup for stale job entries, ensuring accurate tracking of active jobs.
- Updated job creation, update, and deletion methods to manage user-specific job sets effectively.
- Enhanced integration tests to validate the new user-specific job management features.

* refactor: Simplify job deletion logic by removing user job cleanup from InMemoryJobStore and RedisJobStore

* WIP: Add backend inspect script for easier debugging in production

* refactor: title generation logic

- Changed the title generation endpoint from POST to GET, allowing for more efficient retrieval of titles based on conversation ID.
- Implemented exponential backoff for title fetching retries, improving responsiveness and reducing server load.
- Introduced a queuing mechanism for title generation, ensuring titles are generated only after job completion.
- Updated relevant components and hooks to utilize the new title generation logic, enhancing user experience and application performance.

* feat: Enhance updateConvoInAllQueries to support moving conversations to the top

* chore: temporarily remove added multi-convo

* refactor: Update active jobs query integration for optimistic updates on abort

- Introduced a new interface for active jobs response to standardize data handling.
- Updated query keys for active jobs to ensure consistency across components.
- Enhanced job management logic in hooks to properly reflect active job states, improving overall application responsiveness.

* refactor: Add useResumableStreamToggle hook to manage resumable streams for legacy/assistants endpoints

- Introduced a new hook, useResumableStreamToggle, to automatically toggle resumable streams off for assistants endpoints and restore the previous value when switching away.
- Updated ChatView component to utilize the new hook, enhancing the handling of streaming behavior based on endpoint type.
- Refactored imports in ChatView for better organization.

* refactor: streamline conversation title generation handling

- Removed unused type definition for TGenTitleMutation in mutations.ts to clean up the codebase.
- Integrated queueTitleGeneration call in useEventHandlers to trigger title generation for new conversations, enhancing the responsiveness of the application.

* feat: Add USE_REDIS_STREAMS configuration for stream job storage

- Introduced USE_REDIS_STREAMS to control Redis usage for resumable stream job storage; when not explicitly set, it defaults to the value of USE_REDIS (see the sketch after this list).
- Updated cacheConfig to include USE_REDIS_STREAMS and modified createStreamServices to utilize this new configuration.
- Enhanced unit tests to validate the behavior of USE_REDIS_STREAMS under various environment settings, ensuring correct defaults and overrides.
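
A sketch of the defaulting rule, where `isEnabled` stands in for the project's env-flag parser:

```ts
const isEnabled = (value?: string) => value?.toLowerCase() === 'true';

// USE_REDIS_STREAMS follows USE_REDIS unless explicitly set.
const USE_REDIS_STREAMS =
  process.env.USE_REDIS_STREAMS !== undefined
    ? isEnabled(process.env.USE_REDIS_STREAMS)
    : isEnabled(process.env.USE_REDIS);
```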

* fix: title generation queue management for assistants

- Introduced a queueListeners mechanism to notify changes in the title generation queue, improving responsiveness for non-resumable streams.
- Updated the useTitleGeneration hook to track queue changes with a queueVersion state, ensuring accurate updates when jobs complete.
- Refactored the queueTitleGeneration function to trigger listeners upon adding new conversation IDs, enhancing the overall title generation flow.
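
A sketch of the queue/listener mechanism described above; the names are illustrative, not the project's actual exports:

```ts
const titleQueue = new Set<string>();
const queueListeners = new Set<() => void>();

/** Queue a conversation for title generation and notify subscribers. */
function queueTitleGeneration(conversationId: string): void {
  titleQueue.add(conversationId);
  for (const notify of queueListeners) {
    notify(); // e.g. bump a queueVersion state inside useTitleGeneration
  }
}

/** Subscribe to queue changes; returns an unsubscribe function. */
function onQueueChange(listener: () => void): () => void {
  queueListeners.add(listener);
  return () => queueListeners.delete(listener);
}
```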

* refactor: streamline agent controller and remove legacy resumable handling

- Updated the AgentController to route all requests to ResumableAgentController, simplifying the logic.
- Deprecated the legacy non-resumable path, providing a clear migration path for future use.
- Adjusted setHeaders middleware to remove unnecessary checks for resumable mode.
- Cleaned up the useResumableSSE hook to eliminate redundant query parameters, enhancing clarity and performance.

* feat: Add USE_REDIS_STREAMS configuration to .env.example

- Updated .env.example to include USE_REDIS_STREAMS setting, allowing control over Redis usage for resumable LLM streams.
- Provided additional context on the behavior of USE_REDIS_STREAMS when not explicitly set, enhancing clarity for configuration management.

* refactor: remove unused setHeaders middleware from chat route

- Eliminated the setHeaders middleware from the chat route, streamlining the request handling process.
- This change contributes to cleaner code and improved performance by reducing unnecessary middleware checks.

* fix: Add streamId parameter for resumable stream handling across services (actions, mcp oauth)

* fix(flow): add immediate abort handling and fix intervalId initialization

- Add immediate abort handler that responds instantly to abort signal
- Declare intervalId before cleanup function to prevent 'Cannot access before initialization' error
- Consolidate cleanup logic into single function to avoid duplicate cleanup
- Properly remove abort event listener on cleanup
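
A sketch of the ordering fix, assuming a generic polling flow: intervalId is declared before the cleanup closure that references it, and a single idempotent cleanup serves both the normal path and an immediate abort:

```ts
function pollWithAbort(poll: () => void, signal: AbortSignal, ms = 1000): void {
  // Declared before cleanup() to avoid "cannot access before initialization".
  let intervalId: ReturnType<typeof setInterval> | undefined;

  const cleanup = () => {
    if (intervalId !== undefined) {
      clearInterval(intervalId);
      intervalId = undefined;
    }
    signal.removeEventListener('abort', onAbort); // avoid leaking the listener
  };

  const onAbort = () => cleanup(); // responds instantly to the abort signal

  signal.addEventListener('abort', onAbort);
  intervalId = setInterval(poll, ms);
}
```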

* fix(mcp): clean up OAuth flows on abort and simplify flow handling

- Add abort handler in reconnectServer to clean up mcp_oauth and mcp_get_tokens flows
- Update createAbortHandler to clean up both flow types on tool call abort
- Pass abort signal to createFlow in returnOnOAuth path
- Simplify handleOAuthRequired to always cancel existing flows and start fresh
- This ensures user always gets a new OAuth URL instead of waiting for stale flows

* fix(agents): handle 'new' conversationId and improve abort reliability

- Treat 'new' as placeholder that needs UUID in request controller
- Send JSON response immediately before tool loading for faster SSE connection
- Use job's abort controller instead of prelimAbortController
- Emit errors to stream if headers already sent
- Skip 'new' as valid ID in abort endpoint
- Add fallback to find active jobs by userId when conversationId is 'new'

* fix(stream): detect early abort and prevent navigation to non-existent conversation

- Abort controller on job completion to signal pending operations
- Detect early abort (no content, no responseMessageId) in abortJob
- Set conversation and responseMessage to null for early aborts
- Add earlyAbort flag to final event for frontend detection
- Remove unused text field from AbortResult interface
- Frontend handles earlyAbort by staying on/navigating to new chat

* test(mcp): update test to expect signal parameter in createFlow

* 🔧 refactor: Update Vertex AI Configuration Handling

- Simplified the logic for enabling Vertex AI in the Anthropic initialization process, ensuring it defaults to enabled unless explicitly set to false.
- Adjusted the Vertex AI schema to make the 'enabled' property optional, defaulting to true when the configuration is present.
- Updated related comments and documentation for clarity on the configuration behavior.

* 🔧 chore: Update Anthropic Configuration and Logging Enhancements

- Changed the default region for Anthropic Vertex AI from 'global' to 'us-east5' in the .env.example file for better regional alignment.
- Added debug logging to handle non-JSON credentials in the Anthropic client, improving error visibility during credential parsing.
- Updated the service key path resolution in the Vertex AI client to use the current working directory, enhancing flexibility in file location.

---------

Co-authored-by: Ziyan <5621658+Ziyann@users.noreply.github.com>
Co-authored-by: Aron Gates <aron@muonspace.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Joseph Licata authored 2025-12-30 18:16:52 -05:00, committed by GitHub
commit 90c63a56f3 (parent 716d2a9f3c)
22 changed files with 1386 additions and 60 deletions


@@ -124,6 +124,10 @@ ANTHROPIC_API_KEY=user_provided
# ANTHROPIC_MODELS=claude-opus-4-20250514,claude-sonnet-4-20250514,claude-3-7-sonnet-20250219,claude-3-5-sonnet-20241022,claude-3-5-haiku-20241022,claude-3-opus-20240229,claude-3-sonnet-20240229,claude-3-haiku-20240307
# ANTHROPIC_REVERSE_PROXY=
# Set to true to use Anthropic models through Google Vertex AI instead of direct API
# ANTHROPIC_USE_VERTEX=
# ANTHROPIC_VERTEX_REGION=us-east5
#============#
# Azure #
#============#


@@ -34,6 +34,8 @@
},
"homepage": "https://librechat.ai",
"dependencies": {
"@anthropic-ai/sdk": "^0.71.0",
"@anthropic-ai/vertex-sdk": "^0.14.0",
"@aws-sdk/client-bedrock-runtime": "^3.941.0",
"@aws-sdk/client-s3": "^3.758.0",
"@aws-sdk/s3-request-presigner": "^3.758.0",


@@ -250,9 +250,7 @@ class AgentClient extends BaseClient {
return this.contentParts;
}
setOptions(options) {
logger.info('[api/server/controllers/agents/client.js] setOptions', options);
}
setOptions(_options) {}
/**
* `AgentClient` is not opinionated about vision requests, so we don't do anything here
@@ -747,10 +745,16 @@ class AgentClient extends BaseClient {
if (!collectedUsage || !collectedUsage.length) {
return;
}
// Support both OpenAI format (input_token_details) and Anthropic format (cache_*_input_tokens)
const firstUsage = collectedUsage[0];
const input_tokens =
(collectedUsage[0]?.input_tokens || 0) +
(Number(collectedUsage[0]?.input_token_details?.cache_creation) || 0) +
(Number(collectedUsage[0]?.input_token_details?.cache_read) || 0);
(firstUsage?.input_tokens || 0) +
(Number(firstUsage?.input_token_details?.cache_creation) ||
Number(firstUsage?.cache_creation_input_tokens) ||
0) +
(Number(firstUsage?.input_token_details?.cache_read) ||
Number(firstUsage?.cache_read_input_tokens) ||
0);
let output_tokens = 0;
let previousTokens = input_tokens; // Start with original input
@@ -760,8 +764,13 @@
continue;
}
const cache_creation = Number(usage.input_token_details?.cache_creation) || 0;
const cache_read = Number(usage.input_token_details?.cache_read) || 0;
// Support both OpenAI format (input_token_details) and Anthropic format (cache_*_input_tokens)
const cache_creation =
Number(usage.input_token_details?.cache_creation) ||
Number(usage.cache_creation_input_tokens) ||
0;
const cache_read =
Number(usage.input_token_details?.cache_read) || Number(usage.cache_read_input_tokens) || 0;
const txMetadata = {
context,


@@ -1,4 +1,4 @@
const { isUserProvided } = require('@librechat/api');
const { isUserProvided, isEnabled } = require('@librechat/api');
const { EModelEndpoint } = require('librechat-data-provider');
const { generateConfig } = require('~/server/utils/handleText');
@@ -23,7 +23,9 @@ module.exports = {
openAIApiKey,
azureOpenAIApiKey,
userProvidedOpenAI,
[EModelEndpoint.anthropic]: generateConfig(anthropicApiKey),
[EModelEndpoint.anthropic]: generateConfig(
anthropicApiKey || isEnabled(process.env.ANTHROPIC_USE_VERTEX),
),
[EModelEndpoint.openAI]: generateConfig(openAIApiKey, OPENAI_REVERSE_PROXY),
[EModelEndpoint.azureOpenAI]: generateConfig(azureOpenAIApiKey, AZURE_OPENAI_BASEURL),
[EModelEndpoint.assistants]: generateConfig(


@@ -43,6 +43,14 @@ async function getEndpointsConfig(req) {
};
}
// Enable Anthropic endpoint when Vertex AI is configured in YAML
if (appConfig.endpoints?.[EModelEndpoint.anthropic]?.vertexConfig?.enabled) {
/** @type {Omit<TConfig, 'order'>} */
mergedConfig[EModelEndpoint.anthropic] = {
userProvide: false,
};
}
if (appConfig.endpoints?.[EModelEndpoint.azureOpenAI]?.assistants) {
/** @type {Omit<TConfig, 'order'>} */
mergedConfig[EModelEndpoint.azureAssistants] = {


@@ -6,6 +6,7 @@ const {
getOpenAIModels,
getGoogleModels,
} = require('@librechat/api');
const { getAppConfig } = require('./app');
/**
* Loads the default models for the application.
@@ -15,16 +16,21 @@
*/
async function loadDefaultModels(req) {
try {
const appConfig = req.config ?? (await getAppConfig({ role: req.user?.role }));
const vertexConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic]?.vertexConfig;
const [openAI, anthropic, azureOpenAI, assistants, azureAssistants, google, bedrock] =
await Promise.all([
getOpenAIModels({ user: req.user.id }).catch((error) => {
logger.error('Error fetching OpenAI models:', error);
return [];
}),
getAnthropicModels({ user: req.user.id }).catch((error) => {
logger.error('Error fetching Anthropic models:', error);
return [];
}),
getAnthropicModels({ user: req.user.id, vertexModels: vertexConfig?.modelNames }).catch(
(error) => {
logger.error('Error fetching Anthropic models:', error);
return [];
},
),
getOpenAIModels({ user: req.user.id, azure: true }).catch((error) => {
logger.error('Error fetching Azure OpenAI models:', error);
return [];


@@ -253,6 +253,67 @@ endpoints:
# minRelevanceScore: 0.45
# # (optional) Agent Capabilities available to all users. Omit the ones you wish to exclude. Defaults to list below.
# capabilities: ["execute_code", "file_search", "actions", "tools"]
# Anthropic endpoint configuration with Vertex AI support
# Use this to run Anthropic Claude models through Google Cloud Vertex AI
# anthropic:
# # (optional) Stream rate limiting in milliseconds
# streamRate: 20
# # (optional) Title model for conversation titles
# titleModel: claude-3.5-haiku # Use the visible model name (key from models config)
#
# # Vertex AI Configuration - enables running Claude models via Google Cloud
# # This is similar to Azure OpenAI but for Anthropic models on Google Cloud
# # Vertex AI is automatically enabled when this config section is present
# vertex:
# # Vertex AI region (optional, defaults to 'us-east5')
# # Available regions: us-east5, us-central1, europe-west1, europe-west4, asia-southeast1
# region: "us-east5"
# # Path to Google service account key file (optional)
# # If not specified, uses GOOGLE_SERVICE_KEY_FILE env var or default path (api/data/auth.json)
# # The project_id is automatically extracted from the service key file
# # serviceKeyFile: "/path/to/service-account.json"
# # Google Cloud Project ID (optional) - auto-detected from service key file
# # Only specify if you need to override the project_id in your service key
# # projectId: "${VERTEX_PROJECT_ID}"
#
# # ============================================================================
# # Model Configuration - Set Visible Model Names and Deployment Mappings
# # Similar to Azure OpenAI model naming pattern
# # ============================================================================
#
# # Option 1: Simple array (legacy format - model name = deployment name)
# # Use this if you want the technical model IDs to show in the UI
# # models:
# # - "claude-sonnet-4-20250514"
# # - "claude-3-7-sonnet-20250219"
# # - "claude-3-5-sonnet-v2@20241022"
# # - "claude-3-5-haiku@20241022"
#
# # Option 2: Object format with custom visible names (RECOMMENDED)
# # The key is the visible model name shown in the UI (can be any name you want)
# # The deploymentName is the actual Vertex AI model ID used for API calls
# # You can use friendly names (avoid spaces for cleaner YAML) or technical IDs as keys
# models:
# claude-opus-4.5:
# deploymentName: claude-opus-4-5@20251101
# claude-sonnet-4:
# deploymentName: claude-sonnet-4-20250514
# claude-3.7-sonnet:
# deploymentName: claude-3-7-sonnet-20250219
# claude-3.5-sonnet:
# deploymentName: claude-3-5-sonnet-v2@20241022
# claude-3.5-haiku:
# deploymentName: claude-3-5-haiku@20241022
#
# # Option 3: Mixed format with default deploymentName
# # Set a default deploymentName and use boolean values for models
# # deploymentName: claude-sonnet-4-20250514
# # models:
# # claude-sonnet-4: true # Will use the default deploymentName
# # claude-3.5-haiku:
# # deploymentName: claude-3-5-haiku@20241022 # Override for this model
custom:
# Groq Example
- name: 'groq'

package-lock.json (generated)

@@ -48,6 +48,8 @@
"version": "v0.8.2-rc1",
"license": "ISC",
"dependencies": {
"@anthropic-ai/sdk": "^0.71.0",
"@anthropic-ai/vertex-sdk": "^0.14.0",
"@aws-sdk/client-bedrock-runtime": "^3.941.0",
"@aws-sdk/client-s3": "^3.758.0",
"@aws-sdk/s3-request-presigner": "^3.758.0",
@@ -133,6 +135,596 @@
"supertest": "^7.1.0"
}
},
"api/node_modules/@anthropic-ai/sdk": {
"version": "0.71.2",
"resolved": "https://registry.npmjs.org/@anthropic-ai/sdk/-/sdk-0.71.2.tgz",
"integrity": "sha512-TGNDEUuEstk/DKu0/TflXAEt+p+p/WhTlFzEnoosvbaDU2LTjm42igSdlL0VijrKpWejtOKxX0b8A7uc+XiSAQ==",
"license": "MIT",
"dependencies": {
"json-schema-to-ts": "^3.1.1"
},
"bin": {
"anthropic-ai-sdk": "bin/cli"
},
"peerDependencies": {
"zod": "^3.25.0 || ^4.0.0"
},
"peerDependenciesMeta": {
"zod": {
"optional": true
}
}
},
"api/node_modules/@aws-sdk/client-bedrock-runtime": {
"version": "3.941.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-bedrock-runtime/-/client-bedrock-runtime-3.941.0.tgz",
"integrity": "sha512-hvOhVSe1OHTh8EvK/rIbURc0KmBSEceVKfF9TrLkwLbvLFZEGsy2y6lHi4CFuH5WYMPU0C1wLWfd2bgkLvsMcA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-crypto/sha256-browser": "5.2.0",
"@aws-crypto/sha256-js": "5.2.0",
"@aws-sdk/core": "3.940.0",
"@aws-sdk/credential-provider-node": "3.940.0",
"@aws-sdk/eventstream-handler-node": "3.936.0",
"@aws-sdk/middleware-eventstream": "3.936.0",
"@aws-sdk/middleware-host-header": "3.936.0",
"@aws-sdk/middleware-logger": "3.936.0",
"@aws-sdk/middleware-recursion-detection": "3.936.0",
"@aws-sdk/middleware-user-agent": "3.940.0",
"@aws-sdk/middleware-websocket": "3.936.0",
"@aws-sdk/region-config-resolver": "3.936.0",
"@aws-sdk/token-providers": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@aws-sdk/util-endpoints": "3.936.0",
"@aws-sdk/util-user-agent-browser": "3.936.0",
"@aws-sdk/util-user-agent-node": "3.940.0",
"@smithy/config-resolver": "^4.4.3",
"@smithy/core": "^3.18.5",
"@smithy/eventstream-serde-browser": "^4.2.5",
"@smithy/eventstream-serde-config-resolver": "^4.3.5",
"@smithy/eventstream-serde-node": "^4.2.5",
"@smithy/fetch-http-handler": "^5.3.6",
"@smithy/hash-node": "^4.2.5",
"@smithy/invalid-dependency": "^4.2.5",
"@smithy/middleware-content-length": "^4.2.5",
"@smithy/middleware-endpoint": "^4.3.12",
"@smithy/middleware-retry": "^4.4.12",
"@smithy/middleware-serde": "^4.2.6",
"@smithy/middleware-stack": "^4.2.5",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/node-http-handler": "^4.4.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/smithy-client": "^4.9.8",
"@smithy/types": "^4.9.0",
"@smithy/url-parser": "^4.2.5",
"@smithy/util-base64": "^4.3.0",
"@smithy/util-body-length-browser": "^4.2.0",
"@smithy/util-body-length-node": "^4.2.1",
"@smithy/util-defaults-mode-browser": "^4.3.11",
"@smithy/util-defaults-mode-node": "^4.2.14",
"@smithy/util-endpoints": "^3.2.5",
"@smithy/util-middleware": "^4.2.5",
"@smithy/util-retry": "^4.2.5",
"@smithy/util-stream": "^4.5.6",
"@smithy/util-utf8": "^4.2.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/client-sso": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-sso/-/client-sso-3.940.0.tgz",
"integrity": "sha512-SdqJGWVhmIURvCSgkDditHRO+ozubwZk9aCX9MK8qxyOndhobCndW1ozl3hX9psvMAo9Q4bppjuqy/GHWpjB+A==",
"license": "Apache-2.0",
"dependencies": {
"@aws-crypto/sha256-browser": "5.2.0",
"@aws-crypto/sha256-js": "5.2.0",
"@aws-sdk/core": "3.940.0",
"@aws-sdk/middleware-host-header": "3.936.0",
"@aws-sdk/middleware-logger": "3.936.0",
"@aws-sdk/middleware-recursion-detection": "3.936.0",
"@aws-sdk/middleware-user-agent": "3.940.0",
"@aws-sdk/region-config-resolver": "3.936.0",
"@aws-sdk/types": "3.936.0",
"@aws-sdk/util-endpoints": "3.936.0",
"@aws-sdk/util-user-agent-browser": "3.936.0",
"@aws-sdk/util-user-agent-node": "3.940.0",
"@smithy/config-resolver": "^4.4.3",
"@smithy/core": "^3.18.5",
"@smithy/fetch-http-handler": "^5.3.6",
"@smithy/hash-node": "^4.2.5",
"@smithy/invalid-dependency": "^4.2.5",
"@smithy/middleware-content-length": "^4.2.5",
"@smithy/middleware-endpoint": "^4.3.12",
"@smithy/middleware-retry": "^4.4.12",
"@smithy/middleware-serde": "^4.2.6",
"@smithy/middleware-stack": "^4.2.5",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/node-http-handler": "^4.4.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/smithy-client": "^4.9.8",
"@smithy/types": "^4.9.0",
"@smithy/url-parser": "^4.2.5",
"@smithy/util-base64": "^4.3.0",
"@smithy/util-body-length-browser": "^4.2.0",
"@smithy/util-body-length-node": "^4.2.1",
"@smithy/util-defaults-mode-browser": "^4.3.11",
"@smithy/util-defaults-mode-node": "^4.2.14",
"@smithy/util-endpoints": "^3.2.5",
"@smithy/util-middleware": "^4.2.5",
"@smithy/util-retry": "^4.2.5",
"@smithy/util-utf8": "^4.2.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/core": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/core/-/core-3.940.0.tgz",
"integrity": "sha512-KsGD2FLaX5ngJao1mHxodIVU9VYd1E8810fcYiGwO1PFHDzf5BEkp6D9IdMeQwT8Q6JLYtiiT1Y/o3UCScnGoA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@aws-sdk/xml-builder": "3.930.0",
"@smithy/core": "^3.18.5",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/property-provider": "^4.2.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/signature-v4": "^5.3.5",
"@smithy/smithy-client": "^4.9.8",
"@smithy/types": "^4.9.0",
"@smithy/util-base64": "^4.3.0",
"@smithy/util-middleware": "^4.2.5",
"@smithy/util-utf8": "^4.2.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-env": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-env/-/credential-provider-env-3.940.0.tgz",
"integrity": "sha512-/G3l5/wbZYP2XEQiOoIkRJmlv15f1P3MSd1a0gz27lHEMrOJOGq66rF1Ca4OJLzapWt3Fy9BPrZAepoAX11kMw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/property-provider": "^4.2.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-http": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-http/-/credential-provider-http-3.940.0.tgz",
"integrity": "sha512-dOrc03DHElNBD6N9Okt4U0zhrG4Wix5QUBSZPr5VN8SvmjD9dkrrxOkkJaMCl/bzrW7kbQEp7LuBdbxArMmOZQ==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/fetch-http-handler": "^5.3.6",
"@smithy/node-http-handler": "^4.4.5",
"@smithy/property-provider": "^4.2.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/smithy-client": "^4.9.8",
"@smithy/types": "^4.9.0",
"@smithy/util-stream": "^4.5.6",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-ini": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-ini/-/credential-provider-ini-3.940.0.tgz",
"integrity": "sha512-gn7PJQEzb/cnInNFTOaDoCN/hOKqMejNmLof1W5VW95Qk0TPO52lH8R4RmJPnRrwFMswOWswTOpR1roKNLIrcw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/credential-provider-env": "3.940.0",
"@aws-sdk/credential-provider-http": "3.940.0",
"@aws-sdk/credential-provider-login": "3.940.0",
"@aws-sdk/credential-provider-process": "3.940.0",
"@aws-sdk/credential-provider-sso": "3.940.0",
"@aws-sdk/credential-provider-web-identity": "3.940.0",
"@aws-sdk/nested-clients": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/credential-provider-imds": "^4.2.5",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-node": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-node/-/credential-provider-node-3.940.0.tgz",
"integrity": "sha512-M8NFAvgvO6xZjiti5kztFiAYmSmSlG3eUfr4ZHSfXYZUA/KUdZU/D6xJyaLnU8cYRWBludb6K9XPKKVwKfqm4g==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/credential-provider-env": "3.940.0",
"@aws-sdk/credential-provider-http": "3.940.0",
"@aws-sdk/credential-provider-ini": "3.940.0",
"@aws-sdk/credential-provider-process": "3.940.0",
"@aws-sdk/credential-provider-sso": "3.940.0",
"@aws-sdk/credential-provider-web-identity": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/credential-provider-imds": "^4.2.5",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-process": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-process/-/credential-provider-process-3.940.0.tgz",
"integrity": "sha512-pILBzt5/TYCqRsJb7vZlxmRIe0/T+FZPeml417EK75060ajDGnVJjHcuVdLVIeKoTKm9gmJc9l45gon6PbHyUQ==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-sso": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-sso/-/credential-provider-sso-3.940.0.tgz",
"integrity": "sha512-q6JMHIkBlDCOMnA3RAzf8cGfup+8ukhhb50fNpghMs1SNBGhanmaMbZSgLigBRsPQW7fOk2l8jnzdVLS+BB9Uw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/client-sso": "3.940.0",
"@aws-sdk/core": "3.940.0",
"@aws-sdk/token-providers": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/credential-provider-web-identity": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-web-identity/-/credential-provider-web-identity-3.940.0.tgz",
"integrity": "sha512-9QLTIkDJHHaYL0nyymO41H8g3ui1yz6Y3GmAN1gYQa6plXisuFBnGAbmKVj7zNvjWaOKdF0dV3dd3AFKEDoJ/w==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/nested-clients": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/eventstream-handler-node": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-handler-node/-/eventstream-handler-node-3.936.0.tgz",
"integrity": "sha512-4zIbhdRmol2KosIHmU31ATvNP0tkJhDlRj9GuawVJoEnMvJA1pd2U3SRdiOImJU3j8pT46VeS4YMmYxfjGHByg==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/eventstream-codec": "^4.2.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-eventstream": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-eventstream/-/middleware-eventstream-3.936.0.tgz",
"integrity": "sha512-XQSH8gzLkk8CDUDxyt4Rdm9owTpRIPdtg2yw9Y2Wl5iSI55YQSiC3x8nM3c4Y4WqReJprunFPK225ZUDoYCfZA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/protocol-http": "^5.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-host-header": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-host-header/-/middleware-host-header-3.936.0.tgz",
"integrity": "sha512-tAaObaAnsP1XnLGndfkGWFuzrJYuk9W0b/nLvol66t8FZExIAf/WdkT2NNAWOYxljVs++oHnyHBCxIlaHrzSiw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/protocol-http": "^5.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-logger": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-logger/-/middleware-logger-3.936.0.tgz",
"integrity": "sha512-aPSJ12d3a3Ea5nyEnLbijCaaYJT2QjQ9iW+zGh5QcZYXmOGWbKVyPSxmVOboZQG+c1M8t6d2O7tqrwzIq8L8qw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-recursion-detection": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-recursion-detection/-/middleware-recursion-detection-3.936.0.tgz",
"integrity": "sha512-l4aGbHpXM45YNgXggIux1HgsCVAvvBoqHPkqLnqMl9QVapfuSTjJHfDYDsx1Xxct6/m7qSMUzanBALhiaGO2fA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@aws/lambda-invoke-store": "^0.2.0",
"@smithy/protocol-http": "^5.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-user-agent": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-user-agent/-/middleware-user-agent-3.940.0.tgz",
"integrity": "sha512-nJbLrUj6fY+l2W2rIB9P4Qvpiy0tnTdg/dmixRxrU1z3e8wBdspJlyE+AZN4fuVbeL6rrRrO/zxQC1bB3cw5IA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@aws-sdk/util-endpoints": "3.936.0",
"@smithy/core": "^3.18.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/middleware-websocket": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-websocket/-/middleware-websocket-3.936.0.tgz",
"integrity": "sha512-bPe3rqeugyj/MmjP0yBSZox2v1Wa8Dv39KN+RxVbQroLO8VUitBo6xyZ0oZebhZ5sASwSg58aDcMlX0uFLQnTA==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@aws-sdk/util-format-url": "3.936.0",
"@smithy/eventstream-codec": "^4.2.5",
"@smithy/eventstream-serde-browser": "^4.2.5",
"@smithy/fetch-http-handler": "^5.3.6",
"@smithy/protocol-http": "^5.3.5",
"@smithy/signature-v4": "^5.3.5",
"@smithy/types": "^4.9.0",
"@smithy/util-hex-encoding": "^4.2.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">= 14.0.0"
}
},
"api/node_modules/@aws-sdk/nested-clients": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/nested-clients/-/nested-clients-3.940.0.tgz",
"integrity": "sha512-x0mdv6DkjXqXEcQj3URbCltEzW6hoy/1uIL+i8gExP6YKrnhiZ7SzuB4gPls2UOpK5UqLiqXjhRLfBb1C9i4Dw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-crypto/sha256-browser": "5.2.0",
"@aws-crypto/sha256-js": "5.2.0",
"@aws-sdk/core": "3.940.0",
"@aws-sdk/middleware-host-header": "3.936.0",
"@aws-sdk/middleware-logger": "3.936.0",
"@aws-sdk/middleware-recursion-detection": "3.936.0",
"@aws-sdk/middleware-user-agent": "3.940.0",
"@aws-sdk/region-config-resolver": "3.936.0",
"@aws-sdk/types": "3.936.0",
"@aws-sdk/util-endpoints": "3.936.0",
"@aws-sdk/util-user-agent-browser": "3.936.0",
"@aws-sdk/util-user-agent-node": "3.940.0",
"@smithy/config-resolver": "^4.4.3",
"@smithy/core": "^3.18.5",
"@smithy/fetch-http-handler": "^5.3.6",
"@smithy/hash-node": "^4.2.5",
"@smithy/invalid-dependency": "^4.2.5",
"@smithy/middleware-content-length": "^4.2.5",
"@smithy/middleware-endpoint": "^4.3.12",
"@smithy/middleware-retry": "^4.4.12",
"@smithy/middleware-serde": "^4.2.6",
"@smithy/middleware-stack": "^4.2.5",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/node-http-handler": "^4.4.5",
"@smithy/protocol-http": "^5.3.5",
"@smithy/smithy-client": "^4.9.8",
"@smithy/types": "^4.9.0",
"@smithy/url-parser": "^4.2.5",
"@smithy/util-base64": "^4.3.0",
"@smithy/util-body-length-browser": "^4.2.0",
"@smithy/util-body-length-node": "^4.2.1",
"@smithy/util-defaults-mode-browser": "^4.3.11",
"@smithy/util-defaults-mode-node": "^4.2.14",
"@smithy/util-endpoints": "^3.2.5",
"@smithy/util-middleware": "^4.2.5",
"@smithy/util-retry": "^4.2.5",
"@smithy/util-utf8": "^4.2.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/region-config-resolver": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/region-config-resolver/-/region-config-resolver-3.936.0.tgz",
"integrity": "sha512-wOKhzzWsshXGduxO4pqSiNyL9oUtk4BEvjWm9aaq6Hmfdoydq6v6t0rAGHWPjFwy9z2haovGRi3C8IxdMB4muw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/config-resolver": "^4.4.3",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/token-providers": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/token-providers/-/token-providers-3.940.0.tgz",
"integrity": "sha512-k5qbRe/ZFjW9oWEdzLIa2twRVIEx7p/9rutofyrRysrtEnYh3HAWCngAnwbgKMoiwa806UzcTRx0TjyEpnKcCg==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/core": "3.940.0",
"@aws-sdk/nested-clients": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/property-provider": "^4.2.5",
"@smithy/shared-ini-file-loader": "^4.4.0",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/types": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/types/-/types-3.936.0.tgz",
"integrity": "sha512-uz0/VlMd2pP5MepdrHizd+T+OKfyK4r3OA9JI+L/lPKg0YFQosdJNCKisr6o70E3dh8iMpFYxF1UN/4uZsyARg==",
"license": "Apache-2.0",
"dependencies": {
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/util-endpoints": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-endpoints/-/util-endpoints-3.936.0.tgz",
"integrity": "sha512-0Zx3Ntdpu+z9Wlm7JKUBOzS9EunwKAb4KdGUQQxDqh5Lc3ta5uBoub+FgmVuzwnmBu9U1Os8UuwVTH0Lgu+P5w==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/types": "^4.9.0",
"@smithy/url-parser": "^4.2.5",
"@smithy/util-endpoints": "^3.2.5",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/util-format-url": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-format-url/-/util-format-url-3.936.0.tgz",
"integrity": "sha512-MS5eSEtDUFIAMHrJaMERiHAvDPdfxc/T869ZjDNFAIiZhyc037REw0aoTNeimNXDNy2txRNZJaAUn/kE4RwN+g==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/querystring-builder": "^4.2.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws-sdk/util-user-agent-browser": {
"version": "3.936.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-user-agent-browser/-/util-user-agent-browser-3.936.0.tgz",
"integrity": "sha512-eZ/XF6NxMtu+iCma58GRNRxSq4lHo6zHQLOZRIeL/ghqYJirqHdenMOwrzPettj60KWlv827RVebP9oNVrwZbw==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/types": "3.936.0",
"@smithy/types": "^4.9.0",
"bowser": "^2.11.0",
"tslib": "^2.6.2"
}
},
"api/node_modules/@aws-sdk/util-user-agent-node": {
"version": "3.940.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-user-agent-node/-/util-user-agent-node-3.940.0.tgz",
"integrity": "sha512-dlD/F+L/jN26I8Zg5x0oDGJiA+/WEQmnSE27fi5ydvYnpfQLwThtQo9SsNS47XSR/SOULaaoC9qx929rZuo74A==",
"license": "Apache-2.0",
"dependencies": {
"@aws-sdk/middleware-user-agent": "3.940.0",
"@aws-sdk/types": "3.936.0",
"@smithy/node-config-provider": "^4.3.5",
"@smithy/types": "^4.9.0",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
},
"peerDependencies": {
"aws-crt": ">=1.0.0"
},
"peerDependenciesMeta": {
"aws-crt": {
"optional": true
}
}
},
"api/node_modules/@aws-sdk/xml-builder": {
"version": "3.930.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/xml-builder/-/xml-builder-3.930.0.tgz",
"integrity": "sha512-YIfkD17GocxdmlUVc3ia52QhcWuRIUJonbF8A2CYfcWNV3HzvAqpcPeC0bYUhkK+8e8YO1ARnLKZQE0TlwzorA==",
"license": "Apache-2.0",
"dependencies": {
"@smithy/types": "^4.9.0",
"fast-xml-parser": "5.2.5",
"tslib": "^2.6.2"
},
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@aws/lambda-invoke-store": {
"version": "0.2.1",
"resolved": "https://registry.npmjs.org/@aws/lambda-invoke-store/-/lambda-invoke-store-0.2.1.tgz",
"integrity": "sha512-sIyFcoPZkTtNu9xFeEoynMef3bPJIAbOfUh+ueYcfhVl6xm2VRtMcMclSxmZCMnHHd4hlYKJeq/aggmBEWynww==",
"license": "Apache-2.0",
"engines": {
"node": ">=18.0.0"
}
},
"api/node_modules/@node-saml/node-saml": {
"version": "5.1.0",
"resolved": "https://registry.npmjs.org/@node-saml/node-saml/-/node-saml-5.1.0.tgz",
@@ -2097,6 +2689,16 @@
}
}
},
"node_modules/@anthropic-ai/vertex-sdk": {
"version": "0.14.0",
"resolved": "https://registry.npmjs.org/@anthropic-ai/vertex-sdk/-/vertex-sdk-0.14.0.tgz",
"integrity": "sha512-YIonqYEwQ9ILvpeOUBRBCv+91nzIs/MAZIAJ6yyJ3muwoTbZdEu54A2HcM4nRHH+Gy1vxz0FVau6aGSayXNeWQ==",
"license": "MIT",
"dependencies": {
"@anthropic-ai/sdk": ">=0.50.3 <1",
"google-auth-library": "^9.4.2"
}
},
"node_modules/@ariakit/core": {
"version": "0.4.15",
"resolved": "https://registry.npmjs.org/@ariakit/core/-/core-0.4.15.tgz",
@@ -43015,6 +43617,7 @@
"typescript": "^5.0.4"
},
"peerDependencies": {
"@anthropic-ai/vertex-sdk": "^0.14.0",
"@aws-sdk/client-bedrock-runtime": "^3.941.0",
"@aws-sdk/client-s3": "^3.758.0",
"@azure/identity": "^4.7.0",


@@ -134,6 +134,7 @@
"axios": "1.12.1",
"elliptic": "^6.6.1",
"form-data": "^4.0.4",
"tslib": "^2.8.1",
"mdast-util-gfm-autolink-literal": "2.0.0",
"remark-gfm": {
"mdast-util-gfm-autolink-literal": "2.0.0"


@@ -79,6 +79,7 @@
"registry": "https://registry.npmjs.org/"
},
"peerDependencies": {
"@anthropic-ai/vertex-sdk": "^0.14.0",
"@aws-sdk/client-bedrock-runtime": "^3.941.0",
"@aws-sdk/client-s3": "^3.758.0",
"@azure/identity": "^4.7.0",


@@ -1,3 +1,4 @@
export * from './helpers';
export * from './llm';
export * from './vertex';
export * from './initialize';


@@ -1,14 +1,16 @@
import { EModelEndpoint } from 'librechat-data-provider';
import { EModelEndpoint, AuthKeys } from 'librechat-data-provider';
import type { BaseInitializeParams, InitializeResultBase, AnthropicConfigOptions } from '~/types';
import { checkUserKeyExpiry } from '~/utils';
import { checkUserKeyExpiry, isEnabled } from '~/utils';
import { loadAnthropicVertexCredentials, getVertexCredentialOptions } from './vertex';
import { getLLMConfig } from './llm';
/**
* Initializes Anthropic endpoint configuration.
* Supports both direct API key authentication and Google Cloud Vertex AI.
*
* @param params - Configuration parameters
* @returns Promise resolving to Anthropic configuration options
* @throws Error if API key is not provided
* @throws Error if API key is not provided (when not using Vertex AI)
*/
export async function initializeAnthropic({
req,
@@ -20,45 +22,66 @@ export async function initializeAnthropic({
const appConfig = req.config;
const { ANTHROPIC_API_KEY, ANTHROPIC_REVERSE_PROXY, PROXY } = process.env;
const { key: expiresAt } = req.body;
const isUserProvided = ANTHROPIC_API_KEY === 'user_provided';
const anthropicApiKey = isUserProvided
? await db.getUserKey({ userId: req.user?.id ?? '', name: EModelEndpoint.anthropic })
: ANTHROPIC_API_KEY;
let credentials: Record<string, unknown> = {};
let vertexOptions: { region?: string; projectId?: string } | undefined;
if (!anthropicApiKey) {
throw new Error('Anthropic API key not provided. Please provide it again.');
/** @type {undefined | import('librechat-data-provider').TVertexAIConfig} */
const vertexConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic]?.vertexConfig;
// Check for Vertex AI configuration: YAML config takes priority over env var
// When vertexConfig exists and enabled is not explicitly false, Vertex AI is enabled
const useVertexAI =
(vertexConfig && vertexConfig.enabled !== false) || isEnabled(process.env.ANTHROPIC_USE_VERTEX);
if (useVertexAI) {
// Load credentials with optional YAML config overrides
const credentialOptions = vertexConfig ? getVertexCredentialOptions(vertexConfig) : undefined;
credentials = await loadAnthropicVertexCredentials(credentialOptions);
// Store vertex options for client creation
if (vertexConfig) {
vertexOptions = {
region: vertexConfig.region,
projectId: vertexConfig.projectId,
};
}
} else {
const isUserProvided = ANTHROPIC_API_KEY === 'user_provided';
const anthropicApiKey = isUserProvided
? await db.getUserKey({ userId: req.user?.id ?? '', name: EModelEndpoint.anthropic })
: ANTHROPIC_API_KEY;
if (!anthropicApiKey) {
throw new Error('Anthropic API key not provided. Please provide it again.');
}
if (expiresAt && isUserProvided) {
checkUserKeyExpiry(expiresAt, EModelEndpoint.anthropic);
}
credentials[AuthKeys.ANTHROPIC_API_KEY] = anthropicApiKey;
}
if (expiresAt && isUserProvided) {
checkUserKeyExpiry(expiresAt, EModelEndpoint.anthropic);
}
let clientOptions: AnthropicConfigOptions = {};
/** @type {undefined | TBaseEndpoint} */
const anthropicConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic];
if (anthropicConfig) {
clientOptions = {
...clientOptions,
// Note: _lc_stream_delay is set on modelOptions in the result
};
}
const allConfig = appConfig?.endpoints?.all;
clientOptions = {
const clientOptions: AnthropicConfigOptions = {
proxy: PROXY ?? undefined,
reverseProxyUrl: ANTHROPIC_REVERSE_PROXY ?? undefined,
modelOptions: {
...(model_parameters ?? {}),
user: req.user?.id,
},
...clientOptions,
// Pass Vertex AI options if configured
...(vertexOptions && { vertexOptions }),
// Pass full Vertex AI config including model mappings
...(vertexConfig && { vertexConfig }),
};
const result = getLLMConfig(anthropicApiKey, clientOptions);
/** @type {undefined | TBaseEndpoint} */
const anthropicConfig = appConfig?.endpoints?.[EModelEndpoint.anthropic];
const allConfig = appConfig?.endpoints?.all;
const result = getLLMConfig(credentials, clientOptions);
// Apply stream rate delay
if (anthropicConfig?.streamRate) {

View file

@@ -238,9 +238,12 @@ describe('getLLMConfig', () => {
});
describe('Edge cases', () => {
it('should handle missing apiKey', () => {
const result = getLLMConfig(undefined, { modelOptions: {} });
expect(result.llmConfig).not.toHaveProperty('apiKey');
it('should throw error when missing credentials', () => {
expect(() => {
getLLMConfig(undefined, { modelOptions: {} });
}).toThrow(
'Invalid credentials provided. Please provide either a valid Anthropic API key or service account credentials for Vertex AI.',
);
});
it('should handle empty modelOptions', () => {

View file

@@ -1,8 +1,40 @@
import { Dispatcher, ProxyAgent } from 'undici';
import { logger } from '@librechat/data-schemas';
import { AnthropicClientOptions } from '@librechat/agents';
import { anthropicSettings, removeNullishValues } from 'librechat-data-provider';
import type { AnthropicLLMConfigResult, AnthropicConfigOptions } from '~/types/anthropic';
import { anthropicSettings, removeNullishValues, AuthKeys } from 'librechat-data-provider';
import type {
AnthropicLLMConfigResult,
AnthropicConfigOptions,
AnthropicCredentials,
} from '~/types/anthropic';
import { checkPromptCacheSupport, getClaudeHeaders, configureReasoning } from './helpers';
import {
createAnthropicVertexClient,
isAnthropicVertexCredentials,
getVertexDeploymentName,
} from './vertex';
/**
* Parses credentials from string or object format
* - If a valid JSON string is passed, it parses and returns the object
* - If a plain API key string is passed, it wraps it in an AnthropicCredentials object
* - If an object is passed, it returns it directly
* - If undefined, returns an empty object
*/
function parseCredentials(
credentials: string | AnthropicCredentials | undefined,
): AnthropicCredentials {
if (typeof credentials === 'string') {
try {
return JSON.parse(credentials);
} catch {
// If not valid JSON, treat as a plain API key
logger.debug('[Anthropic] Credentials not JSON, treating as API key');
return { [AuthKeys.ANTHROPIC_API_KEY]: credentials };
}
}
return credentials && typeof credentials === 'object' ? credentials : {};
}
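
For reference, the input shapes this helper handles, sketched with made-up values (this assumes the `AuthKeys` enum's string values match their member names):

```ts
parseCredentials('sk-ant-example-key');
// -> { ANTHROPIC_API_KEY: 'sk-ant-example-key' }   (not JSON: wrapped as a plain API key)

parseCredentials('{"ANTHROPIC_API_KEY":"sk-ant-example-key"}');
// -> { ANTHROPIC_API_KEY: 'sk-ant-example-key' }   (valid JSON string: parsed)

parseCredentials({ ANTHROPIC_API_KEY: 'sk-ant-example-key' });
// -> returned as-is                                (already an object)

parseCredentials(undefined);
// -> {}                                            (getLLMConfig throws on this downstream)
```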
/** Known Anthropic parameters that map directly to the client config */
export const knownAnthropicParams = new Set([
@@ -38,13 +70,13 @@ function applyDefaultParams(target: Record<string, unknown>, defaults: Record<st
/**
* Generates configuration options for creating an Anthropic language model (LLM) instance.
* @param apiKey - The API key for authentication with Anthropic.
* @param credentials - The API key for authentication with Anthropic, or credentials object for Vertex AI.
* @param options={} - Additional options for configuring the LLM.
* @returns Configuration options for creating an Anthropic LLM instance, with null and undefined values removed.
*/
function getLLMConfig(
apiKey?: string,
options: AnthropicConfigOptions = {} as AnthropicConfigOptions,
credentials: string | AnthropicCredentials | undefined,
options: AnthropicConfigOptions = {},
): AnthropicLLMConfigResult {
const systemOptions = {
thinking: options.modelOptions?.thinking ?? anthropicSettings.thinking.default,
@@ -74,7 +106,6 @@ function getLLMConfig(
let enableWebSearch = mergedOptions.web_search;
let requestOptions: AnthropicClientOptions & { stream?: boolean } = {
apiKey,
model: mergedOptions.model,
stream: mergedOptions.stream,
temperature: mergedOptions.temperature,
@@ -89,6 +120,29 @@
},
};
const creds = parseCredentials(credentials);
const apiKey = creds[AuthKeys.ANTHROPIC_API_KEY] ?? null;
if (isAnthropicVertexCredentials(creds)) {
// Vertex AI configuration - use custom client with optional YAML config
// Map the visible model name to the actual deployment name for Vertex AI
const deploymentName = getVertexDeploymentName(
requestOptions.model ?? '',
options.vertexConfig,
);
requestOptions.model = deploymentName;
requestOptions.createClient = () =>
createAnthropicVertexClient(creds, requestOptions.clientOptions, options.vertexOptions);
} else if (apiKey) {
// Direct API configuration
requestOptions.apiKey = apiKey;
} else {
throw new Error(
'Invalid credentials provided. Please provide either a valid Anthropic API key or service account credentials for Vertex AI.',
);
}
requestOptions = configureReasoning(requestOptions, systemOptions);
if (!/claude-3[-.]7/.test(mergedOptions.model)) {
@@ -180,6 +234,17 @@ function getLLMConfig(
type: 'web_search_20250305',
name: 'web_search',
});
if (isAnthropicVertexCredentials(creds)) {
if (!requestOptions.clientOptions) {
requestOptions.clientOptions = {};
}
requestOptions.clientOptions.defaultHeaders = {
...requestOptions.clientOptions.defaultHeaders,
'anthropic-beta': 'web-search-2025-03-05',
};
}
}
return {

View file

@@ -0,0 +1,222 @@
import path from 'path';
import { AnthropicVertex } from '@anthropic-ai/vertex-sdk';
import { GoogleAuth } from 'google-auth-library';
import { ClientOptions } from '@anthropic-ai/sdk';
import { AuthKeys } from 'librechat-data-provider';
import { loadServiceKey } from '~/utils/key';
import type { AnthropicCredentials, VertexAIClientOptions } from '~/types/anthropic';
/**
* Options for loading Vertex AI credentials
*/
export interface VertexCredentialOptions {
/** Path to service account key file (overrides env var) */
serviceKeyFile?: string;
/** Project ID for Vertex AI */
projectId?: string;
/** Region for Vertex AI */
region?: string;
}
/**
* Interface for Vertex AI configuration from YAML config.
* This matches the TVertexAISchema from librechat-data-provider.
*/
export interface VertexAIConfigInput {
enabled?: boolean;
projectId?: string;
region?: string;
serviceKeyFile?: string;
deploymentName?: string;
models?: string[] | Record<string, boolean | { deploymentName?: string }>;
}
/**
* Loads Google service account configuration for Vertex AI.
* Supports both YAML configuration and environment variables.
* @param options - Optional configuration from YAML or other sources
*/
export async function loadAnthropicVertexCredentials(
options?: VertexCredentialOptions,
): Promise<AnthropicCredentials> {
/** Path priority: options > env var > default location */
const serviceKeyPath =
options?.serviceKeyFile ||
process.env.GOOGLE_SERVICE_KEY_FILE ||
path.join(process.cwd(), 'api', 'data', 'auth.json');
const serviceKey = await loadServiceKey(serviceKeyPath);
if (!serviceKey) {
throw new Error(
`Google service account not found or could not be loaded from ${serviceKeyPath}`,
);
}
return {
[AuthKeys.GOOGLE_SERVICE_KEY]: serviceKey,
};
}
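
Resolution order for the key path, per the function above (the mount path below is hypothetical):

```ts
// 1) options.serviceKeyFile (YAML)  2) GOOGLE_SERVICE_KEY_FILE env var  3) <cwd>/api/data/auth.json
const creds = await loadAnthropicVertexCredentials({
  serviceKeyFile: '/run/secrets/vertex-sa.json', // hypothetical secret mount
});
// -> { GOOGLE_SERVICE_KEY: <parsed service-account JSON> }
// Throws if no key can be loaded from any of the three locations.
```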
/**
* Creates Vertex credential options from a Vertex AI configuration object.
* @param config - The Vertex AI configuration (from YAML config or other sources)
*/
export function getVertexCredentialOptions(config?: VertexAIConfigInput): VertexCredentialOptions {
return {
serviceKeyFile: config?.serviceKeyFile,
projectId: config?.projectId,
region: config?.region,
};
}
/**
* Checks if credentials are for Vertex AI (has service account key but no API key)
*/
export function isAnthropicVertexCredentials(credentials: AnthropicCredentials): boolean {
return !!credentials[AuthKeys.GOOGLE_SERVICE_KEY] && !credentials[AuthKeys.ANTHROPIC_API_KEY];
}
/**
* Filters anthropic-beta header values to only include those supported by Vertex AI.
* Vertex AI rejects prompt-caching-2024-07-31 but we use 'prompt-caching-vertex' as a
* marker to trigger cache_control application in the agents package.
*/
function filterVertexHeaders(headers?: Record<string, string>): Record<string, string> | undefined {
if (!headers) {
return undefined;
}
const filteredHeaders = { ...headers };
const anthropicBeta = filteredHeaders['anthropic-beta'];
if (anthropicBeta) {
// Filter out unsupported beta values for Vertex AI
const supportedValues = anthropicBeta
.split(',')
.map((v) => v.trim())
.filter((v) => {
// Remove prompt-caching headers (Vertex handles caching via cache_control in body)
if (v.includes('prompt-caching')) {
return false;
}
// Remove max-tokens headers (Vertex has its own limits)
if (v.includes('max-tokens')) {
return false;
}
// Remove output-128k headers
if (v.includes('output-128k')) {
return false;
}
// Remove token-efficient-tools headers
if (v.includes('token-efficient-tools')) {
return false;
}
// Remove context-1m headers
if (v.includes('context-1m')) {
return false;
}
return true;
});
if (supportedValues.length > 0) {
filteredHeaders['anthropic-beta'] = supportedValues.join(',');
} else {
delete filteredHeaders['anthropic-beta'];
}
}
return Object.keys(filteredHeaders).length > 0 ? filteredHeaders : undefined;
}
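
A worked example of the filter; the combined header value is contrived, but the individual beta strings appear elsewhere in this diff:

```ts
filterVertexHeaders({
  'anthropic-beta': 'prompt-caching-2024-07-31,web-search-2025-03-05,output-128k-2025-02-19',
  'x-custom': 'kept',
});
// -> { 'anthropic-beta': 'web-search-2025-03-05', 'x-custom': 'kept' }
// If every beta value is filtered out, the 'anthropic-beta' key is removed entirely;
// if no headers remain at all, the function returns undefined.
```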
/**
* Gets the deployment name for a given model name from the Vertex AI configuration.
* Maps visible model names to actual deployment names (model IDs).
* @param modelName - The visible model name (e.g., "Claude Opus 4.5")
* @param vertexConfig - The Vertex AI configuration with model mappings
* @returns The deployment name to use with the API (e.g., "claude-opus-4-5@20251101")
*/
export function getVertexDeploymentName(
modelName: string,
vertexConfig?: VertexAIConfigInput,
): string {
if (!vertexConfig?.models) {
// No models configuration, return model name as-is
return modelName;
}
// If models is an array, check if modelName is in the array
if (Array.isArray(vertexConfig.models)) {
// Legacy format - no deployment mapping
return modelName;
}
// If models is an object, look up the deployment name
const modelConfig = vertexConfig.models[modelName];
if (!modelConfig) {
// Model not found in config, return as-is
return modelName;
}
if (typeof modelConfig === 'boolean') {
// Model is true/false - use default deployment name or model name
return vertexConfig.deploymentName || modelName;
}
// Model has its own deployment name
return modelConfig.deploymentName || vertexConfig.deploymentName || modelName;
}
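
Mapping behavior at a glance, with invented config values that mirror the JSDoc example above:

```ts
const cfg: VertexAIConfigInput = {
  deploymentName: 'claude-sonnet-4@20250514',
  models: {
    'Claude Opus 4.5': { deploymentName: 'claude-opus-4-5@20251101' },
    'Claude Sonnet 4': true,
  },
};
getVertexDeploymentName('Claude Opus 4.5', cfg); // 'claude-opus-4-5@20251101' (per-model mapping)
getVertexDeploymentName('Claude Sonnet 4', cfg); // 'claude-sonnet-4@20250514' (default deployment)
getVertexDeploymentName('Unlisted Model', cfg);  // 'Unlisted Model' (not in config: passed through)
getVertexDeploymentName('anything', undefined);  // 'anything' (no config at all)
```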
/**
* Creates and configures a Vertex AI client for Anthropic.
* Supports both YAML configuration and environment variables for region/projectId.
* The projectId is automatically extracted from the service key if not explicitly provided.
* @param credentials - The Google service account credentials
* @param options - SDK client options
* @param vertexOptions - Vertex AI specific options (region, projectId) from YAML config
*/
export function createAnthropicVertexClient(
credentials: AnthropicCredentials,
options?: ClientOptions,
vertexOptions?: VertexAIClientOptions,
): AnthropicVertex {
const serviceKey = credentials[AuthKeys.GOOGLE_SERVICE_KEY];
if (!serviceKey) {
throw new Error('Google service account key is required for Vertex AI');
}
// Priority: vertexOptions > env vars > service key project_id
const region = vertexOptions?.region || process.env.ANTHROPIC_VERTEX_REGION || 'us-east5';
const projectId =
vertexOptions?.projectId || process.env.VERTEX_PROJECT_ID || serviceKey.project_id;
try {
const googleAuth = new GoogleAuth({
credentials: serviceKey,
scopes: 'https://www.googleapis.com/auth/cloud-platform',
...(projectId && { projectId }),
});
// Filter out unsupported anthropic-beta header values for Vertex AI
const filteredOptions = options
? {
...options,
defaultHeaders: filterVertexHeaders(
options.defaultHeaders as Record<string, string> | undefined,
),
}
: undefined;
return new AnthropicVertex({
region: region,
googleAuth: googleAuth,
...(projectId && { projectId }),
...filteredOptions,
});
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
throw new Error(`Failed to create Vertex AI client: ${message}`);
}
}
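
An end-to-end usage sketch, assuming a service key is available on disk; paths and IDs are placeholders:

```ts
const credentials = await loadAnthropicVertexCredentials({
  serviceKeyFile: '/run/secrets/vertex-sa.json',
});
const client = createAnthropicVertexClient(credentials, undefined, {
  region: 'europe-west1',      // else ANTHROPIC_VERTEX_REGION, else 'us-east5'
  projectId: 'my-gcp-project', // else VERTEX_PROJECT_ID, else the key's own project_id
});
// `client` is an AnthropicVertex instance; in getLLMConfig it is created lazily via createClient.
```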

View file

@@ -340,8 +340,16 @@ export async function fetchAnthropicModels(
* @param opts - Options for fetching models
* @returns Promise resolving to array of model IDs
*/
export async function getAnthropicModels(opts: { user?: string } = {}): Promise<string[]> {
export async function getAnthropicModels(
opts: { user?: string; vertexModels?: string[] } = {},
): Promise<string[]> {
const models = defaultModels[EModelEndpoint.anthropic];
// Vertex AI models from YAML config take priority
if (opts.vertexModels && opts.vertexModels.length > 0) {
return opts.vertexModels;
}
if (process.env.ANTHROPIC_MODELS) {
return splitAndTrim(process.env.ANTHROPIC_MODELS);
}
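
The resulting model-list precedence, sketched with an invented entry:

```ts
await getAnthropicModels({ vertexModels: ['Claude Opus 4.5'] }); // 1) YAML vertex model names win
// 2) otherwise: ANTHROPIC_MODELS env var, split and trimmed
// 3) otherwise: defaultModels[EModelEndpoint.anthropic]
```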

View file

@@ -1,11 +1,28 @@
import { z } from 'zod';
import { Dispatcher } from 'undici';
import { anthropicSchema } from 'librechat-data-provider';
import { AuthKeys, anthropicSchema, TVertexAISchema } from 'librechat-data-provider';
import type { AnthropicClientOptions } from '@librechat/agents';
import type { LLMConfigResult } from './openai';
import type { GoogleServiceKey } from '../utils/key';
export type AnthropicParameters = z.infer<typeof anthropicSchema>;
export type AnthropicCredentials = {
[AuthKeys.GOOGLE_SERVICE_KEY]?: GoogleServiceKey;
[AuthKeys.ANTHROPIC_API_KEY]?: string;
};
/**
* Vertex AI client options for configuring the Anthropic Vertex client.
* These options are typically loaded from the YAML config or environment variables.
*/
export interface VertexAIClientOptions {
/** Google Cloud region for Vertex AI (e.g., 'us-east5', 'europe-west1') */
region?: string;
/** Google Cloud Project ID */
projectId?: string;
}
export interface ThinkingConfigDisabled {
type: 'disabled';
}
@@ -60,6 +77,10 @@ export interface AnthropicConfigOptions {
addParams?: Record<string, unknown>;
/** Parameters to drop/exclude from the configuration */
dropParams?: string[];
/** Vertex AI specific options for Google Cloud configuration */
vertexOptions?: VertexAIClientOptions;
/** Full Vertex AI configuration including model mappings from YAML config */
vertexConfig?: TVertexAISchema;
}
/**

View file

@@ -359,6 +359,67 @@ export const azureEndpointSchema = z
export type TAzureConfig = Omit<z.infer<typeof azureEndpointSchema>, 'groups'> &
TAzureConfigValidationResult;
/**
* Vertex AI model configuration - similar to Azure model config
* Allows specifying deployment name for each model
*/
export const vertexModelConfigSchema = z
.object({
/** The actual model ID/deployment name used by Vertex AI API */
deploymentName: z.string().optional(),
})
.or(z.boolean());
export type TVertexModelConfig = z.infer<typeof vertexModelConfigSchema>;
/**
* Vertex AI configuration schema for Anthropic models served via Google Cloud Vertex AI.
* Similar to Azure configuration, this allows running Anthropic models through Google Cloud.
*/
export const vertexAISchema = z.object({
/** Enable Vertex AI mode for Anthropic (defaults to true when vertex config is present) */
enabled: z.boolean().optional(),
/** Google Cloud Project ID (optional - auto-detected from service key file if not provided) */
projectId: z.string().optional(),
/** Vertex AI region (e.g., 'us-east5', 'europe-west1') */
region: z.string().default('us-east5'),
/** Optional: Path to service account key file */
serviceKeyFile: z.string().optional(),
/** Optional: Default deployment name for all models (can be overridden per model) */
deploymentName: z.string().optional(),
/** Optional: Available models - can be string array or object with deploymentName mapping */
models: z.union([z.array(z.string()), z.record(z.string(), vertexModelConfigSchema)]).optional(),
});
export type TVertexAISchema = z.infer<typeof vertexAISchema>;
export type TVertexModelMap = Record<string, string>;
/**
* Validated Vertex AI configuration result
*/
export type TVertexAIConfig = TVertexAISchema & {
isValid: boolean;
errors: string[];
modelNames?: string[];
modelDeploymentMap?: TVertexModelMap;
};
/**
* Anthropic endpoint schema with optional Vertex AI configuration.
* Extends baseEndpointSchema with Vertex AI support.
*/
export const anthropicEndpointSchema = baseEndpointSchema.merge(
z.object({
/** Vertex AI configuration for running Anthropic models on Google Cloud */
vertex: vertexAISchema.optional(),
/** Optional: List of available models */
models: z.array(z.string()).optional(),
}),
);
export type TAnthropicEndpoint = z.infer<typeof anthropicEndpointSchema>;
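
A hedged example of what the new endpoint schema accepts, assuming `baseEndpointSchema`'s own fields are all optional; values are made up:

```ts
const endpointCfg = anthropicEndpointSchema.parse({
  vertex: {
    projectId: '${VERTEX_PROJECT_ID}', // env placeholder, resolved later by extractEnvVariable
    serviceKeyFile: '/run/secrets/vertex-sa.json',
    deploymentName: 'claude-sonnet-4@20250514',
    models: {
      'Claude Opus 4.5': { deploymentName: 'claude-opus-4-5@20251101' },
      'Claude Sonnet 4': true,
    },
  },
});
// endpointCfg.vertex?.region === 'us-east5' (schema default)
// endpointCfg.vertex?.enabled stays undefined, which downstream code treats as enabled
```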
const ttsOpenaiSchema = z.object({
url: z.string().optional(),
apiKey: z.string(),
@@ -886,7 +947,7 @@ export const configSchema = z.object({
all: baseEndpointSchema.optional(),
[EModelEndpoint.openAI]: baseEndpointSchema.optional(),
[EModelEndpoint.google]: baseEndpointSchema.optional(),
[EModelEndpoint.anthropic]: baseEndpointSchema.optional(),
[EModelEndpoint.anthropic]: anthropicEndpointSchema.optional(),
[EModelEndpoint.azureOpenAI]: azureEndpointSchema.optional(),
[EModelEndpoint.azureAssistants]: assistantEndpointSchema.optional(),
[EModelEndpoint.assistants]: assistantEndpointSchema.optional(),
@@ -1499,6 +1560,12 @@ export enum AuthKeys {
* Note: this is not for Environment Variables, but to access encrypted object values.
*/
GOOGLE_API_KEY = 'GOOGLE_API_KEY',
/**
* API key to use with Anthropic.
*
* Note: this is not for Environment Variables, but to access encrypted object values.
*/
ANTHROPIC_API_KEY = 'ANTHROPIC_API_KEY',
}
/**

View file

@@ -1,9 +1,10 @@
import { EModelEndpoint } from 'librechat-data-provider';
import type { TCustomConfig, TAgentsEndpoint } from 'librechat-data-provider';
import type { TCustomConfig, TAgentsEndpoint, TAnthropicEndpoint } from 'librechat-data-provider';
import type { AppConfig } from '~/types';
import { azureAssistantsDefaults, assistantsConfigSetup } from './assistants';
import { agentsConfigSetup } from './agents';
import { azureConfigSetup } from './azure';
import { vertexConfigSetup } from './vertex';
/**
* Loads custom config endpoints
@@ -43,12 +44,26 @@ export const loadEndpoints = (
loadedEndpoints[EModelEndpoint.agents] = agentsConfigSetup(config, agentsDefaults);
// Handle Anthropic endpoint with Vertex AI configuration
if (endpoints?.[EModelEndpoint.anthropic]) {
const anthropicConfig = endpoints[EModelEndpoint.anthropic] as TAnthropicEndpoint;
const vertexConfig = vertexConfigSetup(config);
loadedEndpoints[EModelEndpoint.anthropic] = {
...anthropicConfig,
// If Vertex AI is enabled, use the visible model names from vertex config
// Otherwise, use the models array from anthropic config
...(vertexConfig?.modelNames && { models: vertexConfig.modelNames }),
// Attach validated Vertex AI config if present
...(vertexConfig && { vertexConfig }),
};
}
const endpointKeys = [
EModelEndpoint.openAI,
EModelEndpoint.google,
EModelEndpoint.custom,
EModelEndpoint.bedrock,
EModelEndpoint.anthropic,
];
endpointKeys.forEach((key) => {
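
Net effect of this merge on the loaded endpoint, as a loosely typed sketch (the model name is invented):

```ts
declare const anthropicConfig: Record<string, unknown>; // the YAML anthropic block
declare const vertexConfig: { modelNames?: string[] };  // validated vertex config

const merged = {
  ...anthropicConfig,
  ...(vertexConfig?.modelNames && { models: vertexConfig.modelNames }), // visible names from vertex
  ...(vertexConfig && { vertexConfig }), // consumed later by initializeAnthropic
};
```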

View file

@@ -3,4 +3,5 @@ export * from './interface';
export * from './service';
export * from './specs';
export * from './turnstile';
export * from './vertex';
export * from './web';

View file

@@ -0,0 +1,198 @@
import logger from '~/config/winston';
import {
EModelEndpoint,
extractEnvVariable,
envVarRegex,
TVertexModelMap,
} from 'librechat-data-provider';
import type {
TCustomConfig,
TVertexAISchema,
TVertexAIConfig,
TAnthropicEndpoint,
TVertexModelConfig,
} from 'librechat-data-provider';
/**
* Default Vertex AI models available through Google Cloud
* These are the standard Anthropic model names as served by Vertex AI
*/
export const defaultVertexModels = [
'claude-sonnet-4-20250514',
'claude-3-7-sonnet-20250219',
'claude-3-5-sonnet-v2@20241022',
'claude-3-5-sonnet@20240620',
'claude-3-5-haiku@20241022',
'claude-3-opus@20240229',
'claude-3-haiku@20240307',
];
/**
* Processes models configuration and creates deployment name mapping
* Similar to Azure's model mapping logic
* @param models - The models configuration (can be array or object)
* @param defaultDeploymentName - Optional default deployment name
* @returns Object containing modelNames array and modelDeploymentMap
*/
function processVertexModels(
models: string[] | Record<string, TVertexModelConfig> | undefined,
defaultDeploymentName?: string,
): { modelNames: string[]; modelDeploymentMap: TVertexModelMap } {
const modelNames: string[] = [];
const modelDeploymentMap: TVertexModelMap = {};
if (!models) {
// No models specified, use defaults
for (const model of defaultVertexModels) {
modelNames.push(model);
modelDeploymentMap[model] = model; // Default: model name = deployment name
}
return { modelNames, modelDeploymentMap };
}
if (Array.isArray(models)) {
// Legacy format: simple array of model names
for (const modelName of models) {
modelNames.push(modelName);
// If a default deployment name is provided, use it for all models
// Otherwise, model name is the deployment name
modelDeploymentMap[modelName] = defaultDeploymentName || modelName;
}
} else {
// New format: object with model names as keys and config as values
for (const [modelName, modelConfig] of Object.entries(models)) {
modelNames.push(modelName);
if (typeof modelConfig === 'boolean') {
// Model is set to true/false - use default deployment name or model name
modelDeploymentMap[modelName] = defaultDeploymentName || modelName;
} else if (modelConfig?.deploymentName) {
// Model has its own deployment name specified
modelDeploymentMap[modelName] = modelConfig.deploymentName;
} else {
// Model is an object but no deployment name - use default or model name
modelDeploymentMap[modelName] = defaultDeploymentName || modelName;
}
}
}
return { modelNames, modelDeploymentMap };
}
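
Concretely, the three accepted shapes resolve like this (model and deployment names invented):

```ts
// 1) undefined -> the default model list, each name mapped to itself
processVertexModels(undefined);

// 2) string[] (legacy) -> no default given, so each name maps to itself
processVertexModels(['claude-3-5-haiku@20241022']);

// 3) object -> per-model deploymentName > default deploymentName > the model name itself
processVertexModels(
  { 'Claude Opus 4.5': { deploymentName: 'claude-opus-4-5@20251101' }, 'Claude Haiku 4.5': true },
  'claude-haiku-4-5@20251001',
);
// -> modelNames: ['Claude Opus 4.5', 'Claude Haiku 4.5']
//    modelDeploymentMap: {
//      'Claude Opus 4.5': 'claude-opus-4-5@20251101',
//      'Claude Haiku 4.5': 'claude-haiku-4-5@20251001',
//    }
```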
/**
* Validates and processes Vertex AI configuration
* @param vertexConfig - The Vertex AI configuration object
* @returns Validated configuration with errors if any
*/
export function validateVertexConfig(
vertexConfig: TVertexAISchema | undefined,
): TVertexAIConfig | null {
if (!vertexConfig) {
return null;
}
const errors: string[] = [];
// Extract and validate environment variables
// projectId is optional - will be auto-detected from service key if not provided
const projectId = vertexConfig.projectId ? extractEnvVariable(vertexConfig.projectId) : undefined;
const region = extractEnvVariable(vertexConfig.region || 'us-east5');
const serviceKeyFile = vertexConfig.serviceKeyFile
? extractEnvVariable(vertexConfig.serviceKeyFile)
: undefined;
const defaultDeploymentName = vertexConfig.deploymentName
? extractEnvVariable(vertexConfig.deploymentName)
: undefined;
// Check for unresolved environment variables
if (projectId && envVarRegex.test(projectId)) {
errors.push(
`Vertex AI projectId environment variable "${vertexConfig.projectId}" was not found.`,
);
}
if (envVarRegex.test(region)) {
errors.push(`Vertex AI region environment variable "${vertexConfig.region}" was not found.`);
}
if (serviceKeyFile && envVarRegex.test(serviceKeyFile)) {
errors.push(
`Vertex AI serviceKeyFile environment variable "${vertexConfig.serviceKeyFile}" was not found.`,
);
}
if (defaultDeploymentName && envVarRegex.test(defaultDeploymentName)) {
errors.push(
`Vertex AI deploymentName environment variable "${vertexConfig.deploymentName}" was not found.`,
);
}
// Process models and create deployment mapping
const { modelNames, modelDeploymentMap } = processVertexModels(
vertexConfig.models,
defaultDeploymentName,
);
// Note: projectId is optional - if not provided, it will be auto-detected from the service key file
const isValid = errors.length === 0;
return {
enabled: vertexConfig.enabled !== false,
projectId,
region,
serviceKeyFile,
deploymentName: defaultDeploymentName,
models: vertexConfig.models,
modelNames,
modelDeploymentMap,
isValid,
errors,
};
}
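
One failure mode worth calling out: an unresolved `${...}` placeholder is reported as an error rather than passed through silently. A sketch, assuming `MISSING_VAR` is unset:

```ts
const result = validateVertexConfig({ projectId: '${MISSING_VAR}', region: 'us-east5' });
// extractEnvVariable leaves the placeholder intact, envVarRegex still matches it, so:
// result?.isValid === false
// result?.errors[0] reports that the projectId environment variable was not found
```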
/**
* Sets up the Vertex AI configuration from the config (`librechat.yaml`) file.
* Similar to azureConfigSetup, this processes and validates the Vertex AI configuration.
* @param config - The loaded custom configuration.
* @returns The validated Vertex AI configuration or null if not configured.
*/
export function vertexConfigSetup(config: Partial<TCustomConfig>): TVertexAIConfig | null {
const anthropicConfig = config.endpoints?.[EModelEndpoint.anthropic] as
| TAnthropicEndpoint
| undefined;
if (!anthropicConfig?.vertex) {
return null;
}
const vertexConfig = anthropicConfig.vertex;
// Skip if explicitly disabled (enabled: false)
// When vertex config exists, it's enabled by default unless explicitly set to false
if (vertexConfig.enabled === false) {
return null;
}
const validatedConfig = validateVertexConfig(vertexConfig);
if (!validatedConfig) {
return null;
}
if (!validatedConfig.isValid) {
const errorString = validatedConfig.errors.join('\n');
const errorMessage = 'Invalid Vertex AI configuration:\n' + errorString;
logger.error(errorMessage);
throw new Error(errorMessage);
}
logger.info('Vertex AI configuration loaded successfully', {
projectId: validatedConfig.projectId,
region: validatedConfig.region,
modelCount: validatedConfig.modelNames?.length || 0,
models: validatedConfig.modelNames,
});
return validatedConfig;
}
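
Call-site behavior, summarized; the config literals are illustrative and loosely typed:

```ts
vertexConfigSetup({ endpoints: {} });
// -> null: no anthropic.vertex block at all

vertexConfigSetup({
  endpoints: { anthropic: { vertex: { enabled: false, region: 'us-east5' } } },
});
// -> null: vertex present but explicitly disabled

// With a valid vertex block it logs a summary (projectId, region, model count) and returns
// the validated TVertexAIConfig; unresolved env placeholders are logged and thrown instead.
```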

View file

@@ -6,9 +6,11 @@ import type {
TCustomConfig,
TMemoryConfig,
EModelEndpoint,
TVertexAIConfig,
TAgentsEndpoint,
TCustomEndpoints,
TAssistantEndpoint,
TAnthropicEndpoint,
} from 'librechat-data-provider';
export type JsonSchemaType = {
@@ -99,8 +101,11 @@ export interface AppConfig {
google?: Partial<TEndpoint>;
/** Bedrock endpoint configuration */
bedrock?: Partial<TEndpoint>;
/** Anthropic endpoint configuration */
anthropic?: Partial<TEndpoint>;
/** Anthropic endpoint configuration with optional Vertex AI support */
anthropic?: Partial<TAnthropicEndpoint> & {
/** Validated Vertex AI configuration */
vertexConfig?: TVertexAIConfig;
};
/** Azure OpenAI endpoint configuration */
azureOpenAI?: TAzureConfig;
/** Assistants endpoint configuration */