🧪 chore: MCP Reconnect Storm Follow-Up Fixes and Integration Tests (#12172)

* 🧪 test: Add reconnection storm regression tests for MCPConnection

Introduced a comprehensive test suite for reconnection storm scenarios, validating circuit breaker, throttling, cooldown, and timeout fixes. The tests utilize real MCP SDK transports and a StreamableHTTP server to ensure accurate behavior under rapid connect/disconnect cycles and error handling for SSE 400/405 responses. This enhances the reliability of the MCPConnection by ensuring proper handling of reconnection logic and circuit breaker functionality.

* 🔧 fix: Update createUnavailableToolStub to return structured response

Modified the `createUnavailableToolStub` function to return an array containing the unavailable message and a null value, enhancing the response structure. Additionally, added a debug log to skip tool creation when the result is null, improving the handling of reconnection scenarios in the MCP service.

* 🧪 test: Enhance MCP tool creation tests for cache and throttle interactions

Added new test cases for the `createMCPTool` function to validate the caching behavior when tools are unavailable or throttled. The tests ensure that tools are correctly cached as missing and prevent unnecessary reconnects across different users, improving the reliability of the MCP service under concurrent usage scenarios. Additionally, introduced a test for the `createMCPTools` function to verify that it returns an empty array when reconnect is throttled, ensuring proper handling of throttling logic.

* 📝 docs: Update AGENTS.md with testing philosophy and guidelines

Expanded the testing section in AGENTS.md to emphasize the importance of using real logic over mocks, advocating for the use of spies and real dependencies in tests. Added specific recommendations for testing with MongoDB and MCP SDK, highlighting the need to mock only uncontrollable external services. This update aims to improve testing practices and encourage more robust test implementations.

* 🧪 test: Enhance reconnection storm tests with socket tracking and SSE handling

Updated the reconnection storm test suite to include a new socket tracking mechanism for better resource management during tests. Improved the handling of SSE 400/405 responses by ensuring they are processed in the same branch as 404 errors, preventing unhandled cases. This enhances the reliability of the MCPConnection under rapid reconnect scenarios and ensures proper error handling.

* 🔧 fix: Implement cache eviction for stale reconnect attempts and missing tools

Added an `evictStale` function to manage the size of the `lastReconnectAttempts` and `missingToolCache` maps, ensuring they do not exceed a maximum cache size. This enhancement improves resource management by removing outdated entries based on a specified time-to-live (TTL), thereby optimizing the MCP service's performance during reconnection scenarios.
This commit is contained in:
Danny Avila 2026-03-10 17:44:13 -04:00 committed by GitHub
parent c0e876a2e6
commit 6167ce6e57
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 736 additions and 2 deletions

View file

@ -45,6 +45,7 @@ const {
getMCPSetupData,
checkOAuthFlowStatus,
getServerConnectionStatus,
createUnavailableToolStub,
} = require('./MCP');
jest.mock('./Config', () => ({
@ -1098,6 +1099,188 @@ describe('User parameter passing tests', () => {
});
});
describe('createUnavailableToolStub', () => {
it('should return a tool whose _call returns a valid CONTENT_AND_ARTIFACT two-tuple', async () => {
const stub = createUnavailableToolStub('myTool', 'myServer');
// invoke() goes through langchain's base tool, which checks responseFormat.
// CONTENT_AND_ARTIFACT requires [content, artifact] — a bare string would throw:
// "Tool response format is "content_and_artifact" but the output was not a two-tuple"
const result = await stub.invoke({});
// If we reach here without throwing, the two-tuple format is correct.
// invoke() returns the content portion of [content, artifact] as a string.
expect(result).toContain('temporarily unavailable');
});
});
describe('negative tool cache and throttle interaction', () => {
it('should cache tool as missing even when throttled (cross-user dedup)', async () => {
const mockUser = { id: 'throttle-test-user' };
const mockRes = { write: jest.fn(), flush: jest.fn() };
// First call: reconnect succeeds but tool not found
mockReinitMCPServer.mockResolvedValueOnce({
availableTools: {},
});
await createMCPTool({
res: mockRes,
user: mockUser,
toolKey: `missing-tool${D}cache-dedup-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
// Second call within 10s for DIFFERENT tool on same server:
// reconnect is throttled (returns null), tool is still cached as missing.
// This is intentional: the cache acts as cross-user dedup since the
// throttle is per-user-per-server and can't prevent N different users
// from each triggering their own reconnect.
const result2 = await createMCPTool({
res: mockRes,
user: mockUser,
toolKey: `other-tool${D}cache-dedup-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
expect(result2).toBeDefined();
expect(result2.name).toContain('other-tool');
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
});
it('should prevent user B from triggering reconnect when user A already cached the tool', async () => {
const userA = { id: 'cache-user-A' };
const userB = { id: 'cache-user-B' };
const mockRes = { write: jest.fn(), flush: jest.fn() };
// User A: real reconnect, tool not found → cached
mockReinitMCPServer.mockResolvedValueOnce({
availableTools: {},
});
await createMCPTool({
res: mockRes,
user: userA,
toolKey: `shared-tool${D}cross-user-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
// User B requests the SAME tool within 10s.
// The negative cache is keyed by toolKey (no user prefix), so user B
// gets a cache hit and no reconnect fires. This is the cross-user
// storm protection: without this, user B's unthrottled first request
// would trigger a second reconnect to the same server.
const result = await createMCPTool({
res: mockRes,
user: userB,
toolKey: `shared-tool${D}cross-user-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
expect(result).toBeDefined();
expect(result.name).toContain('shared-tool');
// reinitMCPServer still called only once — user B hit the cache
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
});
it('should prevent user B from triggering reconnect for throttle-cached tools', async () => {
const userA = { id: 'storm-user-A' };
const userB = { id: 'storm-user-B' };
const mockRes = { write: jest.fn(), flush: jest.fn() };
// User A: real reconnect for tool-1, tool not found → cached
mockReinitMCPServer.mockResolvedValueOnce({
availableTools: {},
});
await createMCPTool({
res: mockRes,
user: userA,
toolKey: `tool-1${D}storm-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
// User A: tool-2 on same server within 10s → throttled → cached from throttle
await createMCPTool({
res: mockRes,
user: userA,
toolKey: `tool-2${D}storm-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
// User B requests tool-2 — gets cache hit from the throttle-cached entry.
// Without this caching, user B would trigger a real reconnect since
// user B has their own throttle key and hasn't reconnected yet.
const result = await createMCPTool({
res: mockRes,
user: userB,
toolKey: `tool-2${D}storm-server`,
provider: 'openai',
userMCPAuthMap: {},
availableTools: undefined,
});
expect(result).toBeDefined();
expect(result.name).toContain('tool-2');
// Still only 1 real reconnect — user B was protected by the cache
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
});
});
describe('createMCPTools throttle handling', () => {
it('should return empty array with debug log when reconnect is throttled', async () => {
const mockUser = { id: 'throttle-tools-user' };
const mockRes = { write: jest.fn(), flush: jest.fn() };
// First call: real reconnect
mockReinitMCPServer.mockResolvedValueOnce({
tools: [{ name: 'tool1' }],
availableTools: {
[`tool1${D}throttle-tools-server`]: {
function: { description: 'Tool 1', parameters: {} },
},
},
});
await createMCPTools({
res: mockRes,
user: mockUser,
serverName: 'throttle-tools-server',
provider: 'openai',
userMCPAuthMap: {},
});
// Second call within 10s — throttled
const result = await createMCPTools({
res: mockRes,
user: mockUser,
serverName: 'throttle-tools-server',
provider: 'openai',
userMCPAuthMap: {},
});
expect(result).toEqual([]);
// reinitMCPServer called only once — second was throttled
expect(mockReinitMCPServer).toHaveBeenCalledTimes(1);
// Should log at debug level (not warn) for throttled case
expect(logger.debug).toHaveBeenCalledWith(expect.stringContaining('Reconnect throttled'));
});
});
describe('User parameter integrity', () => {
it('should preserve user object properties through the call chain', async () => {
const complexUser = {