* fix: store hide_sequential_outputs before processStream clears config
processStream now clears config.configurable after completion to break
memory retention chains. Save hide_sequential_outputs to a local
variable before calling runAgents so the post-stream filter still works.
* feat: memory diagnostics
* chore: expose garbage collection in backend inspect command
Updated the backend inspect command in package.json to include the --expose-gc flag, enabling garbage collection diagnostics for improved memory management during development.
* chore: update @librechat/agents dependency to version 3.1.52
Bumped the version of @librechat/agents in package.json and package-lock.json to ensure compatibility and access to the latest features and fixes.
* fix: clear heavy config state after processStream to prevent memory leaks
Break the reference chain from LangGraph's internal __pregel_scratchpad
through @langchain/core RunTree.extra[lc:child_config] into the
AsyncLocalStorage context captured by timers and I/O handles.
After stream completion, null out symbol-keyed scratchpad properties
(currentTaskInput), config.configurable, and callbacks. Also call
Graph.clearHeavyState() to release config, signal, content maps,
handler registry, and tool sessions.
* chore: fix imports for memory utils
* chore: add circular dependency check in API build step
Enhanced the backend review workflow to include a check for circular dependencies during the API build process. If a circular dependency is detected, an error message is displayed, and the process exits with a failure status.
* chore: update API build step to include circular dependency detection
Modified the backend review workflow to rename the API package installation step to reflect its new functionality, which now includes detection of circular dependencies during the build process.
* chore: add memory diagnostics option to .env.example
Included a commented-out configuration option for enabling memory diagnostics in the .env.example file, which logs heap and RSS snapshots every 60 seconds when activated.
* chore: remove redundant agentContexts cleanup in disposeClient function
Streamlined the disposeClient function by eliminating duplicate cleanup logic for agentContexts, ensuring efficient memory management during client disposal.
* refactor: move runOutsideTracing utility to utils and update its usage
Refactored the runOutsideTracing function by relocating it to the utils module for better organization. Updated the tool execution handler to utilize the new import, ensuring consistent tracing behavior during tool execution.
* refactor: enhance connection management and diagnostics
Added a method to ConnectionsRepository for retrieving the active connection count. Updated UserConnectionManager to utilize this new method for app connection count reporting. Refined the OAuthReconnectionTracker's getStats method to improve clarity in diagnostics. Introduced a new tracing utility in the utils module to streamline tracing context management. Additionally, added a safeguard in memory diagnostics to prevent unnecessary snapshot collection for very short intervals.
* refactor: enhance tracing utility and add memory diagnostics tests
Refactored the runOutsideTracing function to improve warning logic when the AsyncLocalStorage context is missing. Added tests for memory diagnostics and tracing utilities to ensure proper functionality and error handling. Introduced a new test suite for memory diagnostics, covering snapshot collection and garbage collection behavior.
* feat: Implement reconnection staggering and backoff jitter for MCP connections
- Enhanced the reconnection logic in OAuthReconnectionManager to stagger reconnection attempts for multiple servers, reducing the risk of connection storms.
- Introduced a backoff delay with random jitter in MCPConnection to improve reconnection behavior during network issues.
- Updated the ConnectionsRepository to handle multiple server connections concurrently with a defined concurrency limit.
Added tests to ensure the new reconnection strategy works as intended.
* refactor: Update MCP server query configuration for improved data freshness
- Reduced stale time from 5 minutes to 30 seconds to ensure quicker updates on server initialization.
- Enabled refetching on window focus and mount to enhance data accuracy during user interactions.
* ♻️ refactor: On-demand MCP connections; remove proactive reconnection, default to available
- Remove reconnectServers() from refresh controller (connection storm root cause)
- Stop gating server selection on connection status; add to selection immediately
- Render agent panel tools from DB cache, not live connection status
- Proceed to cached tools on init failure (only gate on OAuth)
- Remove unused batchToggleServers()
- Reduce useMCPServersQuery staleTime from 5min to 30s, enable refetchOnMount/WindowFocus
* refactor: Optimize MCP tool initialization and server connection logic
- Adjusted tool initialization to only occur if no cached tools are available, improving efficiency.
- Updated comments for clarity on server connection and tool fetching processes.
- Removed unnecessary connection status checks during server selection to streamline the user experience.
* Refactor: MCPServersRegistry Singleton Pattern with Dependency Injection for DB methods consumption
* refactor: error handling in MCP initialization and improve logging for MCPServersRegistry instance creation.
- Added checks for mongoose instance in ServerConfigsDB constructor and refined error messages for clarity.
- Reorder and use type imports
---------
Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
* fix: update @librechat/agents dependency to version 3.0.29
* chore: fix typing by replacing TUser with IUser
* chore: import order
* fix: replace TUser with IUser in run and OAuthReconnectionManager modules
* fix: update @librechat/agents dependency to version 3.0.30
* refactor: Restructure MCP registry system with caching
- Split MCPServersRegistry into modular components:
- MCPServerInspector: handles server inspection and health checks
- MCPServersInitializer: manages server initialization logic
- MCPServersRegistry: simplified registry coordination
- Add distributed caching layer:
- ServerConfigsCacheRedis: Redis-backed configuration cache
- ServerConfigsCacheInMemory: in-memory fallback cache
- RegistryStatusCache: distributed leader election state
- Add promise utilities (withTimeout) replacing Promise.race patterns
- Add comprehensive cache integration tests for all cache implementations
- Remove unused MCPManager.getAllToolFunctions method
* fix: Update OAuth flow to include user-specific headers
* chore: Update Jest configuration to ignore additional test files
- Added patterns to ignore files ending with .helper.ts and .helper.d.ts in testPathIgnorePatterns for cleaner test runs.
* fix: oauth headers in callback
* chore: Update Jest testPathIgnorePatterns to exclude helper files
- Modified testPathIgnorePatterns in package.json to ignore files ending with .helper.ts and .helper.d.ts for cleaner test execution.
* ci: update test mocks
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* initialize servers sequentially
* adjust for exported properties that are not nullable anymore
* use underscore separator
* mock with set
* customize init timeout via env var
* refactor for readability, use loaded conns for tool functions
* address PR comments
* clean up fire-and-forget
* fix tests
* initialize servers sequentially
* adjust for exported properties that are not nullable anymore
* use underscore separator
* mock with set
* customize init timeout via env var
* refactor: Implement gradual backoff polling for oauth connection status with timeout handling
* refactor: Enhance OAuth polling with gradual backoff and timeout handling; update reconnection tracking
* refactor: reconnection timeout behavior in OAuthReconnectionManager and OAuthReconnectionTracker
- Implement tests to verify reconnection timeout handling, including tracking of reconnection states and cleanup of timed-out entries.
- Enhance existing methods in OAuthReconnectionManager and OAuthReconnectionTracker to support timeout checks and cleanup logic.
- Ensure proper handling of multiple servers with different timeout periods and edge cases for active states.
* chore: remove comment
* refactor: Enforce strict 3-minute OAuth timeout with updated polling intervals and improved timeout handling
* refactor: Remove unused polling logic and prevent duplicate polling for servers in MCP server manager
* refactor: Update localization key for no memories message in MemoryViewer
* refactor: Improve MCP tool initialization by handling server failures
- Introduced a mechanism to track failed MCP servers, preventing retries for unavailable servers.
- Added logging for failed tool creation attempts to enhance debugging and monitoring.
* refactor: Update reconnection timeout to enforce a strict 3-minute limit
* ci: Update reconnection timeout tests to reflect a strict 3-minute limit
* ci: Update reconnection timeout tests to enforce a strict 3-minute limit
* chore: Remove unused MCP connection timeout message