Commit graph

4 commits

Author SHA1 Message Date
matt burnett
ad5c51f62b
⛈️ fix: MCP Reconnection Storm Prevention with Circuit Breaker, Backoff, and Tool Stubs (#12162)
* fix: MCP reconnection stability - circuit breaker, throttling, and cooldown retry

* Comment and logging cleanup

* fix broken tests
2026-03-10 14:21:36 -04:00
Danny Avila
a0f9782e60
🪣 fix: Prevent Memory Retention from AsyncLocalStorage Context Propagation (#11942)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
* fix: store hide_sequential_outputs before processStream clears config

processStream now clears config.configurable after completion to break
memory retention chains. Save hide_sequential_outputs to a local
variable before calling runAgents so the post-stream filter still works.

* feat: memory diagnostics

* chore: expose garbage collection in backend inspect command

Updated the backend inspect command in package.json to include the --expose-gc flag, enabling garbage collection diagnostics for improved memory management during development.

* chore: update @librechat/agents dependency to version 3.1.52

Bumped the version of @librechat/agents in package.json and package-lock.json to ensure compatibility and access to the latest features and fixes.

* fix: clear heavy config state after processStream to prevent memory leaks

Break the reference chain from LangGraph's internal __pregel_scratchpad
through @langchain/core RunTree.extra[lc:child_config] into the
AsyncLocalStorage context captured by timers and I/O handles.

After stream completion, null out symbol-keyed scratchpad properties
(currentTaskInput), config.configurable, and callbacks. Also call
Graph.clearHeavyState() to release config, signal, content maps,
handler registry, and tool sessions.

* chore: fix imports for memory utils

* chore: add circular dependency check in API build step

Enhanced the backend review workflow to include a check for circular dependencies during the API build process. If a circular dependency is detected, an error message is displayed, and the process exits with a failure status.

* chore: update API build step to include circular dependency detection

Modified the backend review workflow to rename the API package installation step to reflect its new functionality, which now includes detection of circular dependencies during the build process.

* chore: add memory diagnostics option to .env.example

Included a commented-out configuration option for enabling memory diagnostics in the .env.example file, which logs heap and RSS snapshots every 60 seconds when activated.

* chore: remove redundant agentContexts cleanup in disposeClient function

Streamlined the disposeClient function by eliminating duplicate cleanup logic for agentContexts, ensuring efficient memory management during client disposal.

* refactor: move runOutsideTracing utility to utils and update its usage

Refactored the runOutsideTracing function by relocating it to the utils module for better organization. Updated the tool execution handler to utilize the new import, ensuring consistent tracing behavior during tool execution.

* refactor: enhance connection management and diagnostics

Added a method to ConnectionsRepository for retrieving the active connection count. Updated UserConnectionManager to utilize this new method for app connection count reporting. Refined the OAuthReconnectionTracker's getStats method to improve clarity in diagnostics. Introduced a new tracing utility in the utils module to streamline tracing context management. Additionally, added a safeguard in memory diagnostics to prevent unnecessary snapshot collection for very short intervals.

* refactor: enhance tracing utility and add memory diagnostics tests

Refactored the runOutsideTracing function to improve warning logic when the AsyncLocalStorage context is missing. Added tests for memory diagnostics and tracing utilities to ensure proper functionality and error handling. Introduced a new test suite for memory diagnostics, covering snapshot collection and garbage collection behavior.
2026-02-25 17:41:23 -05:00
Danny Avila
96870e0da0
refactor: MCP OAuth Polling with Gradual Backoff and Timeout Handling (#9752)
Some checks failed
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Has been cancelled
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Has been cancelled
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Has been cancelled
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Has been cancelled
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Has been cancelled
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Has been cancelled
* refactor: Implement gradual backoff polling for oauth connection status with timeout handling

* refactor: Enhance OAuth polling with gradual backoff and timeout handling; update reconnection tracking

* refactor: reconnection timeout behavior in OAuthReconnectionManager and OAuthReconnectionTracker

- Implement tests to verify reconnection timeout handling, including tracking of reconnection states and cleanup of timed-out entries.
- Enhance existing methods in OAuthReconnectionManager and OAuthReconnectionTracker to support timeout checks and cleanup logic.
- Ensure proper handling of multiple servers with different timeout periods and edge cases for active states.

* chore: remove comment

* refactor: Enforce strict 3-minute OAuth timeout with updated polling intervals and improved timeout handling

* refactor: Remove unused polling logic and prevent duplicate polling for servers in MCP server manager

* refactor: Update localization key for no memories message in MemoryViewer

* refactor: Improve MCP tool initialization by handling server failures

- Introduced a mechanism to track failed MCP servers, preventing retries for unavailable servers.
- Added logging for failed tool creation attempts to enhance debugging and monitoring.

* refactor: Update reconnection timeout to enforce a strict 3-minute limit

* ci: Update reconnection timeout tests to reflect a strict 3-minute limit

* ci: Update reconnection timeout tests to enforce a strict 3-minute limit

* chore: Remove unused MCP connection timeout message
2025-09-21 22:58:19 -04:00
Federico Ruggi
d04da60b3b
💫 feat: MCP OAuth Auto-Reconnect (#9646)
* add oauth reconnect tracker

* add connection tracker to mcp manager

* reconnect oauth mcp servers function

* call reconnection in auth controller

* make sure to check connection in panel

* wait for isConnected

* add const for poll interval

* add logging to tryReconnect

* check expiration

* check mcp manager is not null

* check mcp manager is not null

* add test for reconnecting mcp server

* unify logic inside OAuthReconnectionManager

* test reconnection manager, adjust

* chore: reorder import statements in index.js

* chore: imports

* chore: imports

* chore: imports

* chore: imports

* chore: imports

* chore: imports and use types explicitly

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2025-09-17 16:49:36 -04:00