Commit graph

12 commits

Author SHA1 Message Date
Danny Avila
8318446704
💁 refactor: Better Config UX for MCP STDIO with customUserVars (#12226)
* refactor: Better UX for MCP stdio with Custom User Variables

- Updated the ConnectionsRepository to prevent connections when customUserVars are defined, improving security and access control.
- Modified the MCPServerInspector to skip capabilities fetch when customUserVars are present, streamlining server inspection.
- Added tests to validate connection restrictions with customUserVars, ensuring robust handling of various server configurations.

This change enhances the overall integrity of the connection management process by enforcing stricter rules around custom user variables.

* fix: guard against empty customUserVars and add JSDoc context

- Extract `hasCustomUserVars()` helper to guard against truthy `{}`
  (Zod's `.record().optional()` yields `{}` on empty input, not `undefined`)
- Add JSDoc to `isAllowedToConnectToServer` explaining why customUserVars
  servers are excluded from app-level connections

* test: improve customUserVars test coverage and fixture hygiene

- Add no-connection-provided test for MCPServerInspector (production path)
- Fix test descriptions to match actual fixture values
- Replace real package name with fictional @test/mcp-stdio-server
2026-03-14 21:22:25 -04:00
Danny Avila
32cadb1cc5
🩹 fix: MCP Server Recovery from Startup Inspection Failures (#12145)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
* feat: MCP server reinitialization recovery mechanism

- Added functionality to store a stub configuration for MCP servers that fail inspection at startup, allowing for recovery via reinitialization.
- Introduced `reinspectServer` method in `MCPServersRegistry` to handle reinspection of previously failed servers.
- Enhanced `MCPServersInitializer` to log and manage server initialization failures, ensuring proper handling of inspection failures.
- Added integration tests to verify the recovery process for unreachable MCP servers, ensuring that stub configurations are stored and can be reinitialized successfully.
- Updated type definitions to include `inspectionFailed` flag in server configurations for better state management.

* fix: MCP server handling for inspection failures

- Updated `reinitMCPServer` to return a structured response when the server is unreachable, providing clearer feedback on the failure.
- Modified `ConnectionsRepository` to prevent connections to servers marked as inspection failed, improving error handling.
- Adjusted `MCPServersRegistry` methods to ensure proper management of server states, including throwing errors for non-failed servers during reinspection.
- Enhanced integration tests to validate the behavior of the system when dealing with unreachable MCP servers and inspection failures, ensuring robust recovery mechanisms.

* fix: Clear all cached server configurations in MCPServersRegistry

- Added a comment to clarify the necessity of clearing all cached server configurations when updating a server's configuration, as the cache is keyed by userId without a reverse index for enumeration.

* fix: Update integration test for file_tools_server inspection handling

- Modified the test to verify that the `file_tools_server` is stored as a stub when inspection fails, ensuring it can be reinitialized correctly.
- Adjusted expectations to confirm that the `inspectionFailed` flag is set to true for the stub configuration, enhancing the robustness of the recovery mechanism.

* test: Add unit tests for reinspecting servers in MCPServersRegistry

- Introduced tests for the `reinspectServer` method to validate error handling when called on a healthy server and when the server does not exist.
- Ensured that appropriate exceptions are thrown for both scenarios, enhancing the robustness of server state management.

* test: Add integration test for concurrent reinspectServer calls

- Introduced a new test to validate that multiple concurrent calls to reinspectServer do not crash or corrupt the server state.
- Ensured that at least one call succeeds and any failures are due to the server not being in a failed state, enhancing the reliability of the reinitialization process.

* test: Enhance integration test for concurrent MCP server reinitialization

- Added a new test to validate that concurrent calls to reinitialize the MCP server do not crash or corrupt the server state.
- Ensured that at least one call succeeds and that failures are handled gracefully, improving the reliability of the reinitialization process.
- Reset MCPManager instance after each test to maintain a clean state for subsequent tests.
2026-03-08 21:49:04 -04:00
Danny Avila
d3c06052d7
🗝️ feat: Credential Variables for DB-Sourced MCP Servers (#12044)
* feat: Allow Credential Variables in Headers for DB-sourced MCP Servers

- Removed the hasCustomUserVars check from ToolService.js, directly retrieving userMCPAuthMap.
- Updated MCPConnectionFactory and related classes to include a dbSourced flag for better handling of database-sourced configurations.
- Added integration tests to ensure proper behavior of dbSourced servers, verifying that sensitive placeholders are not resolved while allowing customUserVars.
- Adjusted various MCP-related files to accommodate the new dbSourced logic, ensuring consistent handling across the codebase.

* chore: MCPConnectionFactory Tests with Additional Flow Metadata for typing

- Updated MCPConnectionFactory tests to include new fields in flowMetadata: serverUrl and state.
- Enhanced mockFlowData in multiple test cases to reflect the updated structure, ensuring comprehensive coverage of the OAuth flow scenarios.
- Added authorization_endpoint to metadata in the test setup for improved validation of the OAuth process.

* refactor: Simplify MCPManager Configuration Handling

- Removed unnecessary type assertions and streamlined the retrieval of server configuration in MCPManager.
- Enhanced the handling of OAuth and database-sourced flags for improved clarity and efficiency.
- Updated tests to reflect changes in user object structure and ensure proper processing of MCP environment variables.

* refactor: Optimize User MCP Auth Map Retrieval in ToolService

- Introduced conditional loading of userMCPAuthMap based on the presence of MCP-delimited tools, improving efficiency by avoiding unnecessary calls.
- Updated the loadToolDefinitionsWrapper and loadAgentTools functions to reflect this change, enhancing overall performance and clarity.

* test: Add userMCPAuthMap gating tests in ToolService

- Introduced new tests to validate the logic for determining if MCP tools are present in the agent's tool list.
- Implemented various scenarios to ensure accurate detection of MCP tools, including edge cases for empty, undefined, and null tool lists.
- Enhanced clarity and coverage of the ToolService capability checking logic.

* refactor: Enhance MCP Environment Variable Processing

- Simplified the handling of the dbSourced parameter in the processMCPEnv function.
- Introduced a failsafe mechanism to derive dbSourced from options if not explicitly provided, improving robustness and clarity in MCP environment variable processing.

* refactor: Update Regex Patterns for Credential Placeholders in ServerConfigsDB

- Modified regex patterns to include additional credential/env placeholders that should not be allowed in user-provided configurations.
- Clarified comments to emphasize the security risks associated with credential exfiltration when MCP servers are shared between users.

* chore: field order

* refactor: Clean Up dbSourced Parameter Handling in processMCPEnv

- Reintroduced the failsafe mechanism for deriving the dbSourced parameter from options, ensuring clarity and robustness in MCP environment variable processing.
- Enhanced code readability by maintaining consistent comment structure.

* refactor: Update MCPOptions Type to Include Optional dbId

- Modified the processMCPEnv function to extend the MCPOptions type, allowing for an optional dbId property.
- Simplified the logic for deriving the dbSourced parameter by directly checking the dbId property, enhancing code clarity and maintainability.
2026-03-03 18:02:37 -05:00
Danny Avila
a0f9782e60
🪣 fix: Prevent Memory Retention from AsyncLocalStorage Context Propagation (#11942)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
* fix: store hide_sequential_outputs before processStream clears config

processStream now clears config.configurable after completion to break
memory retention chains. Save hide_sequential_outputs to a local
variable before calling runAgents so the post-stream filter still works.

* feat: memory diagnostics

* chore: expose garbage collection in backend inspect command

Updated the backend inspect command in package.json to include the --expose-gc flag, enabling garbage collection diagnostics for improved memory management during development.

* chore: update @librechat/agents dependency to version 3.1.52

Bumped the version of @librechat/agents in package.json and package-lock.json to ensure compatibility and access to the latest features and fixes.

* fix: clear heavy config state after processStream to prevent memory leaks

Break the reference chain from LangGraph's internal __pregel_scratchpad
through @langchain/core RunTree.extra[lc:child_config] into the
AsyncLocalStorage context captured by timers and I/O handles.

After stream completion, null out symbol-keyed scratchpad properties
(currentTaskInput), config.configurable, and callbacks. Also call
Graph.clearHeavyState() to release config, signal, content maps,
handler registry, and tool sessions.

* chore: fix imports for memory utils

* chore: add circular dependency check in API build step

Enhanced the backend review workflow to include a check for circular dependencies during the API build process. If a circular dependency is detected, an error message is displayed, and the process exits with a failure status.

* chore: update API build step to include circular dependency detection

Modified the backend review workflow to rename the API package installation step to reflect its new functionality, which now includes detection of circular dependencies during the build process.

* chore: add memory diagnostics option to .env.example

Included a commented-out configuration option for enabling memory diagnostics in the .env.example file, which logs heap and RSS snapshots every 60 seconds when activated.

* chore: remove redundant agentContexts cleanup in disposeClient function

Streamlined the disposeClient function by eliminating duplicate cleanup logic for agentContexts, ensuring efficient memory management during client disposal.

* refactor: move runOutsideTracing utility to utils and update its usage

Refactored the runOutsideTracing function by relocating it to the utils module for better organization. Updated the tool execution handler to utilize the new import, ensuring consistent tracing behavior during tool execution.

* refactor: enhance connection management and diagnostics

Added a method to ConnectionsRepository for retrieving the active connection count. Updated UserConnectionManager to utilize this new method for app connection count reporting. Refined the OAuthReconnectionTracker's getStats method to improve clarity in diagnostics. Introduced a new tracing utility in the utils module to streamline tracing context management. Additionally, added a safeguard in memory diagnostics to prevent unnecessary snapshot collection for very short intervals.

* refactor: enhance tracing utility and add memory diagnostics tests

Refactored the runOutsideTracing function to improve warning logic when the AsyncLocalStorage context is missing. Added tests for memory diagnostics and tracing utilities to ensure proper functionality and error handling. Introduced a new test suite for memory diagnostics, covering snapshot collection and garbage collection behavior.
2026-02-25 17:41:23 -05:00
Danny Avila
3bf715e05e
♻️ refactor: On-demand MCP connections: remove proactive reconnect, default to available (#11839)
* feat: Implement reconnection staggering and backoff jitter for MCP connections

- Enhanced the reconnection logic in OAuthReconnectionManager to stagger reconnection attempts for multiple servers, reducing the risk of connection storms.
- Introduced a backoff delay with random jitter in MCPConnection to improve reconnection behavior during network issues.
- Updated the ConnectionsRepository to handle multiple server connections concurrently with a defined concurrency limit.

Added tests to ensure the new reconnection strategy works as intended.

* refactor: Update MCP server query configuration for improved data freshness

- Reduced stale time from 5 minutes to 30 seconds to ensure quicker updates on server initialization.
- Enabled refetching on window focus and mount to enhance data accuracy during user interactions.

* ♻️  refactor: On-demand MCP connections; remove proactive reconnection, default to available

  - Remove reconnectServers() from refresh controller (connection storm root cause)
  - Stop gating server selection on connection status; add to selection immediately
  - Render agent panel tools from DB cache, not live connection status
  - Proceed to cached tools on init failure (only gate on OAuth)
  - Remove unused batchToggleServers()
  - Reduce useMCPServersQuery staleTime from 5min to 30s, enable refetchOnMount/WindowFocus

* refactor: Optimize MCP tool initialization and server connection logic

- Adjusted tool initialization to only occur if no cached tools are available, improving efficiency.
- Updated comments for clarity on server connection and tool fetching processes.
- Removed unnecessary connection status checks during server selection to streamline the user experience.
2026-02-17 22:33:57 -05:00
Danny Avila
924be3b647
🛡️ fix: Implement TOCTOU-Safe SSRF Protection for Actions and MCP (#11722)
* refactor: better SSRF Protection in Action and Tool Services

- Added `createSSRFSafeAgents` function to create HTTP/HTTPS agents that block connections to private/reserved IP addresses, enhancing security against SSRF attacks.
- Updated `createActionTool` to accept a `useSSRFProtection` parameter, allowing the use of SSRF-safe agents during tool execution.
- Modified `processRequiredActions` and `loadAgentTools` to utilize the new SSRF protection feature based on allowed domains configuration.
- Introduced `resolveHostnameSSRF` function to validate resolved IPs against private ranges, preventing potential SSRF vulnerabilities.
- Enhanced tests for domain resolution and private IP detection to ensure robust SSRF protection mechanisms are in place.

* feat: Implement SSRF protection in MCP connections

- Added `createSSRFSafeUndiciConnect` function to provide SSRF-safe DNS lookup options for undici agents.
- Updated `MCPConnection`, `MCPConnectionFactory`, and `ConnectionsRepository` to include `useSSRFProtection` parameter, enabling SSRF protection based on server configuration.
- Enhanced `MCPManager` and `UserConnectionManager` to utilize SSRF protection when establishing connections.
- Updated tests to validate the integration of SSRF protection across various components, ensuring robust security measures are in place.

* refactor: WS MCPConnection with SSRF protection and async transport construction

- Added `resolveHostnameSSRF` to validate WebSocket hostnames against private IP addresses, enhancing SSRF protection.
- Updated `constructTransport` method to be asynchronous, ensuring proper handling of SSRF checks before establishing connections.
- Improved error handling for WebSocket transport to prevent connections to potentially unsafe addresses.

* test: Enhance ActionRequest tests for SSRF-safe agent passthrough

- Added tests to verify that httpAgent and httpsAgent are correctly passed to axios.create when provided in ActionRequest.
- Included scenarios to ensure agents are not included when no options are specified.
- Enhanced coverage for POST requests to confirm agent passthrough functionality.
- Improved overall test robustness for SSRF protection in ActionRequest execution.
2026-02-11 22:09:58 -05:00
Atef Bellaaj
99f8bd2ce6
🏗️ feat: Dynamic MCP Server Infrastructure with Access Control (#10787)
* Feature: Dynamic MCP Server with Full UI Management

* 🚦 feat: Add MCP Connection Status icons to MCPBuilder panel (#10805)

* feature: Add MCP server connection status icons to MCPBuilder panel

* refactor: Simplify MCPConfigDialog rendering in MCPBuilderPanel

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
Co-authored-by: Danny Avila <danny@librechat.ai>

* fix: address code review feedback for MCP server management

- Fix OAuth secret preservation to avoid mutating input parameter
  by creating a merged config copy in ServerConfigsDB.update()

- Improve error handling in getResourcePermissionsMap to propagate
  critical errors instead of silently returning empty Map

- Extract duplicated MCP server filter logic by exposing selectableServers
  from useMCPServerManager hook and using it in MCPSelect component

* test: Update PermissionService tests to throw errors on invalid resource types

- Changed the test for handling invalid resource types to ensure it throws an error instead of returning an empty permissions map.
- Updated the expectation to check for the specific error message when an invalid resource type is provided.

* feat: Implement retry logic for MCP server creation to handle race conditions

- Enhanced the createMCPServer method to include retry logic with exponential backoff for handling duplicate key errors during concurrent server creation.
- Updated tests to verify that all concurrent requests succeed and that unique server names are generated.
- Added a helper function to identify MongoDB duplicate key errors, improving error handling during server creation.

* refactor: StatusIcon to use CircleCheck for connected status

- Replaced the PlugZap icon with CircleCheck in the ConnectedStatusIcon component to better represent the connected state.
- Ensured consistent icon usage across the component for improved visual clarity.

* test: Update AccessControlService tests to throw errors on invalid resource types

- Modified the test for invalid resource types to ensure it throws an error with a specific message instead of returning an empty permissions map.
- This change enhances error handling and improves test coverage for the AccessControlService.

* fix: Update error message for missing server name in MCP server retrieval

- Changed the error message returned when the server name is not provided from 'MCP ID is required' to 'Server name is required' for better clarity and accuracy in the API response.

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2025-12-11 16:38:37 -05:00
Atef Bellaaj
ad6ba4b6d1
🧬 refactor: Wire Database Methods into MCP Package via Registry Pattern (#10715)
* Refactor: MCPServersRegistry Singleton Pattern with Dependency Injection for DB methods consumption

* refactor: error handling in MCP initialization and improve logging for MCPServersRegistry instance creation.

- Added checks for mongoose instance in ServerConfigsDB constructor and refined error messages for clarity.
- Reorder and use type imports

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2025-12-11 16:37:12 -05:00
Atef Bellaaj
da473bf43a
🗃️ refactor: Simplify MCP Server Config to Two-Repository Pattern (#10705)
* refactor(mcp): simplify registry to two-repository architecture with explicit storage

* Chore: address AI Review comments

* Simplify MCP config cache architecture and remove legacy code:
Follow-up cleanup to commit d2bfdd033 which refactored MCP registry to two-repository architecture. This removes leftover legacy abstractions that were no longer used.
 What changed:
  - Simplified ServerConfigsCacheFactory.create() from 3 params to 2 (namespace, leaderOnly)
  - Removed unused scope: 'Shared' | 'Private' parameter (only 'Shared' was ever used)
  - Removed dead set() and getNamespace() methods from cache classes
  - Updated JSDoc to reflect two-repository architecture (Cache + DB) instead of old three-tier system
  - Fixed stale mocks and comments referencing removed sharedAppServers, sharedUserServers, privateServersCache

  Files changed:
  - ServerConfigsCacheFactory.ts - Simplified factory signature
  - ServerConfigsCacheRedis.ts - Removed scope, renamed owner→namespace
  - ServerConfigsCacheInMemory.ts - Removed unused methods
  - MCPServersRegistry.ts - Updated JSDoc, simplified factory call
  - RegistryStatusCache.ts - Removed stale JSDoc reference
  - MCPManager.test.ts - Fixed legacy mock
  - ServerConfigsCacheFactory.test.ts - Updated test assertions

* fix: Update error message in MCPServersRegistry for clarity

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
2025-12-11 16:37:12 -05:00
Atef Bellaaj
ac68e629e6
📡 refactor: MCP Runtime Config Sync with Redis Distributed Locking (#10352)
* 🔄 Refactoring: MCP Runtime Configuration Reload
 - PrivateServerConfigs own cache classes (inMemory and Redis).
 - Connections staleness detection by comparing (connection.createdAt and config.LastUpdatedAt)
 - ConnectionsRepo access Registry instead of in memory config dict and renew stale connections
 - MCPManager: adjusted init of ConnectionsRepo (app level)
 - UserConnectionManager: renew stale connections
 - skipped test, to test "should only clear keys in its own namespace"
 - MCPPrivateServerLoader: new component to manage logic of loading / editing private servers on runtime
 - PrivateServersLoadStatusCache to track private server cache status
 - New unit and integration tests.
Misc:
 - add es lint rule to enforce line between class methods

* Fix cluster mode batch update and delete workarround. Fixed unit tests for cluster mode.

* Fix Keyv redis clear cache namespace  awareness issue + Integration tests fixes

* chore: address copilot comments

* Fixing rebase issue: removed the mcp config fallback in single getServerConfig method:
- to not to interfere with the logic of the right Tier (APP/USER/Private)
- If userId is null, the getServerConfig should not return configs that are a SharedUser tier and not APP tier

* chore: add dev-staging branch to workflow triggers for backend, cache integration, and ESLint checks

---------

Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
2025-12-11 16:36:15 -05:00
Danny Avila
d7d02766ea
🏷️ feat: Request Placeholders for Custom Endpoint & MCP Headers (#9095)
* feat: Add conversation ID support to custom endpoint headers

- Add LIBRECHAT_CONVERSATION_ID to customUserVars when provided
- Pass conversation ID to header resolution for dynamic headers
- Add comprehensive test coverage

Enables custom endpoints to access conversation context using {{LIBRECHAT_CONVERSATION_ID}} placeholder.

* fix: filter out unresolved placeholders from headers (thanks @MrunmayS)

* feat: add support for request body placeholders in custom endpoint headers

- Add {{LIBRECHAT_BODY_*}} placeholders for conversationId, parentMessageId, messageId
- Update tests to reflect new body placeholder functionality

* refactor resolveHeaders

* style: minor styling cleanup

* fix: type error in unit test

* feat: add body to other endpoints

* feat: add body for mcp tool calls

* chore: remove changes that unnecessarily increase scope after clarification of requirements

* refactor: move http.ts to packages/api and have RequestBody intersect with Express request body

* refactor: processMCPEnv now uses single object argument pattern

* refactor: update processMCPEnv to use 'options' parameter and align types across MCP connection classes

* feat: enhance MCP connection handling with dynamic request headers to pass request body fields

---------

Co-authored-by: Gopal Sharma <gopalsharma@gopal.sharma1>
Co-authored-by: s10gopal <36487439+s10gopal@users.noreply.github.com>
Co-authored-by: Dustin Healy <dustinhealy1@gmail.com>
2025-08-16 20:45:55 -04:00
Theo N. Truong
8780a78165
♻️ refactor: MCPManager for Scalability, Fix App-Level Detection, Add Lazy Connections (#8930)
* feat: MCP Connection management overhaul - Making MCPManager manageable

Refactor the monolithic MCPManager into focused, single-responsibility classes:

• MCPServersRegistry: Server configuration discovery and metadata management
• UserConnectionManager: Manages user-level connections
• ConnectionsRepository: Low-level connection pool with lazy loading
• MCPConnectionFactory: Handles MCP connection creation with OAuth support

New Features:
• Lazy loading of app-level connections for horizontal scaling
• Automatic reconnection for app-level connections
• Enhanced OAuth detection with explicit requiresOAuth flag
• Centralized MCP configuration management

Bug Fixes:
• App-level connection detection in MCPManager.callTool
• MCP Connection Reinitialization route behavior

Optimizations:
• MCPConnection.isConnected() caching to reduce overhead
• Concurrent server metadata retrieval instead of sequential

This refactoring addresses scalability bottlenecks and improves reliability
while maintaining backward compatibility with existing configurations.

* feat: Enabled import order in eslint.

* # Moved tests to __tests__ folder
# added tests for MCPServersRegistry.ts

* # Add unit tests for ConnectionsRepository functionality

* # Add unit tests for MCPConnectionFactory functionality

* # Reorganize MCP connection tests and improve error handling

* # reordering imports

* # Update testPathIgnorePatterns in jest.config.mjs to exclude development TypeScript files

* # removed mcp/manager.ts
2025-08-13 11:45:06 -04:00