🌊 fix: Prevent Buffered Event Duplication on SSE Resume Connections (#12225)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run

* fix: skipBufferReplay for job resume connections

- Introduced a new option `skipBufferReplay` in the `subscribe` method of `GenerationJobManagerClass` to prevent duplication of events when resuming a connection.
- Updated the logic to conditionally skip replaying buffered events if a sync event has already been sent, enhancing the efficiency of event handling during reconnections.
- Added integration tests to verify the correct behavior of the new option, ensuring that no buffered events are replayed when `skipBufferReplay` is true, while still allowing for normal replay behavior when false.

* refactor: Update GenerationJobManager to handle sync events more efficiently

- Modified the `subscribe` method to utilize a new `skipBufferReplay` option, allowing for the prevention of duplicate events during resume connections.
- Enhanced the logic in the `chat/stream` route to conditionally skip replaying buffered events if a sync event has already been sent, improving event handling efficiency.
- Updated integration tests to verify the correct behavior of the new option, ensuring that no buffered events are replayed when `skipBufferReplay` is true, while maintaining normal replay behavior when false.

* test: Enhance GenerationJobManager integration tests for Redis mode

- Updated integration tests to conditionally run based on the USE_REDIS environment variable, allowing for better control over Redis-related tests.
- Refactored test descriptions to utilize a dynamic `describeRedis` function, improving clarity and organization of tests related to Redis functionality.
- Removed redundant checks for Redis availability within individual tests, streamlining the test logic and enhancing readability.

* fix: sync handler state for new messages on resume

The sync event's else branch (new response message) was missing
resetContentHandler() and syncStepMessage() calls, leaving stale
handler state that caused subsequent deltas to build on partial
content instead of the synced aggregatedContent.

* feat: atomic subscribeWithResume to close resume event gap

Replaces separate getResumeState() + subscribe() calls with a single
subscribeWithResume() that atomically drains earlyEventBuffer between
the resume snapshot and the subscribe. In in-memory mode, drained events
are returned as pendingEvents for the client to replay after sync.
In Redis mode, pendingEvents is empty since chunks are already persisted.

The route handler now uses the atomic method for resume connections and
extracted shared SSE write helpers to reduce duplication. The client
replays any pendingEvents through the existing step/content handlers
after applying aggregatedContent from the sync payload.

* fix: only capture gap events in subscribeWithResume, not pre-snapshot buffer

The previous implementation drained the entire earlyEventBuffer into
pendingEvents, but pre-snapshot events are already reflected in
aggregatedContent. Replaying them re-introduced the duplication bug
through a different vector.

Now records buffer length before getResumeState() and slices from that
index, so only events arriving during the async gap are returned as
pendingEvents.

Also:
- Handle pendingEvents when resumeState is null (replay directly)
- Hoist duplicate test helpers to shared scope
- Remove redundant writableEnded guard in onDone
This commit is contained in:
Danny Avila 2026-03-14 10:54:26 -04:00 committed by GitHub
parent cbdc6f6060
commit 7bc793b18d
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 700 additions and 259 deletions

View file

@ -226,12 +226,12 @@ export default function useResumableSSE(
if (data.sync != null) {
console.log('[ResumableSSE] SYNC received', {
runSteps: data.resumeState?.runSteps?.length ?? 0,
pendingEvents: data.pendingEvents?.length ?? 0,
});
const runId = v4();
setActiveRunId(runId);
// Replay run steps
if (data.resumeState?.runSteps) {
for (const runStep of data.resumeState.runSteps) {
stepHandler({ event: 'on_run_step', data: runStep }, {
@ -241,19 +241,15 @@ export default function useResumableSSE(
}
}
// Set message content from aggregatedContent
if (data.resumeState?.aggregatedContent && userMessage?.messageId) {
const messages = getMessages() ?? [];
const userMsgId = userMessage.messageId;
const serverResponseId = data.resumeState.responseMessageId;
// Find the EXACT response message - prioritize responseMessageId from server
// This is critical when there are multiple responses to the same user message
let responseIdx = -1;
if (serverResponseId) {
responseIdx = messages.findIndex((m) => m.messageId === serverResponseId);
}
// Fallback: find by parentMessageId pattern (for new messages)
if (responseIdx < 0) {
responseIdx = messages.findIndex(
(m) =>
@ -272,7 +268,6 @@ export default function useResumableSSE(
});
if (responseIdx >= 0) {
// Update existing response message with aggregatedContent
const updated = [...messages];
const oldContent = updated[responseIdx]?.content;
updated[responseIdx] = {
@ -285,25 +280,34 @@ export default function useResumableSSE(
newContentLength: data.resumeState.aggregatedContent?.length,
});
setMessages(updated);
// Sync both content handler and step handler with the updated message
// so subsequent deltas build on synced content, not stale content
resetContentHandler();
syncStepMessage(updated[responseIdx]);
console.log('[ResumableSSE] SYNC complete, handlers synced');
} else {
// Add new response message
const responseId = serverResponseId ?? `${userMsgId}_`;
setMessages([
...messages,
{
messageId: responseId,
parentMessageId: userMsgId,
conversationId: currentSubmission.conversation?.conversationId ?? '',
text: '',
content: data.resumeState.aggregatedContent,
isCreatedByUser: false,
} as TMessage,
]);
const newMessage = {
messageId: responseId,
parentMessageId: userMsgId,
conversationId: currentSubmission.conversation?.conversationId ?? '',
text: '',
content: data.resumeState.aggregatedContent,
isCreatedByUser: false,
} as TMessage;
setMessages([...messages, newMessage]);
resetContentHandler();
syncStepMessage(newMessage);
}
}
if (data.pendingEvents?.length > 0) {
console.log(`[ResumableSSE] Replaying ${data.pendingEvents.length} pending events`);
const submission = { ...currentSubmission, userMessage } as EventSubmission;
for (const pendingEvent of data.pendingEvents) {
if (pendingEvent.event != null) {
stepHandler(pendingEvent, submission);
} else if (pendingEvent.type != null) {
contentHandler({ data: pendingEvent, submission });
}
}
}