🗂️ feat: Better Persistence for Code Execution Files Between Sessions (#11362)

* refactor: process code output files for re-use (WIP)

* feat: file attachment handling with additional metadata for downloads

* refactor: Update directory path logic for local file saving based on basePath

* refactor: file attachment handling to support TFile type and improve data merging logic

* feat: thread filtering of code-generated files

- Introduced parentMessageId parameter in addedConvo and initialize functions to enhance thread management.
- Updated related methods to utilize parentMessageId for retrieving messages and filtering code-generated files by conversation threads.
- Enhanced type definitions to include parentMessageId in relevant interfaces for better clarity and usage.
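
One way to picture the thread filtering described above is to walk `parentMessageId` links from a starting message up to the root, then keep only files whose `messageId` lies on that path. The following is a minimal sketch under that assumption; `Msg`, `CodeFile`, `collectThreadIds`, and `filterFilesByThread` are hypothetical names used for illustration and are not taken from the codebase.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
interface Msg {
  messageId: string;
  parentMessageId: string | null;
}

interface CodeFile {
  file_id: string;
  messageId: string;
}

/** Walks parent links from `startId` toward the root and returns the visited IDs. */
function collectThreadIds(messages: Msg[], startId: string): Set<string> {
  const byId = new Map<string, Msg>();
  for (const m of messages) {
    byId.set(m.messageId, m);
  }

  const ids = new Set<string>();
  let current: string | null = startId;
  // The `ids.has` check also guards against accidental cycles in the data.
  while (current !== null && !ids.has(current)) {
    const msg: Msg | undefined = byId.get(current);
    if (msg === undefined) {
      break; // Unknown parent: stop at the last known message.
    }
    ids.add(msg.messageId);
    current = msg.parentMessageId;
  }
  return ids;
}

/** Keeps only the files attached to a message on the given thread. */
function filterFilesByThread(files: CodeFile[], threadIds: Set<string>): CodeFile[] {
  return files.filter((f) => threadIds.has(f.messageId));
}
```

Files produced on a sibling branch of the conversation are dropped because their `messageId` never appears on the path from the current message to the root.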

* chore: imports/params ordering

* feat: update file model to use messageId for filtering and processing

- Changed references from 'message' to 'messageId' in file-related methods for consistency.
- Added messageId field to the file schema and updated related types.
- Enhanced file processing logic to accommodate the new messageId structure.

* feat: enhance file retrieval methods to support user-uploaded execute_code files

- Added a new method `getUserCodeFiles` to retrieve user-uploaded execute_code files, excluding code-generated files.
- Updated existing file retrieval methods to improve filtering logic and handle edge cases.
- Enhanced thread data extraction to collect both message IDs and file IDs efficiently.
- Integrated `getUserCodeFiles` into relevant endpoints for better file management in conversations.
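
The split between user-uploaded and code-generated files can be sketched as an in-memory predicate. This is not the actual `getUserCodeFiles` implementation; it assumes, for illustration only, that code-generated files are marked with a `context` of `'execute_code'` and that user uploads available to the code tool carry a `metadata.fileIdentifier`. `FileRecord` and `selectUserCodeFiles` are hypothetical names.

```typescript
// Hypothetical, simplified file record; the real schema has more fields.
interface FileRecord {
  file_id: string;
  context?: string;
  metadata?: { fileIdentifier?: string };
}

// ASSUMPTION: code-generated files carry this context value, while user
// uploads usable by the code tool carry a metadata.fileIdentifier.
const EXECUTE_CODE_CONTEXT = 'execute_code';

/** Returns requested files that look user-uploaded rather than code-generated. */
function selectUserCodeFiles(files: FileRecord[], fileIds: string[]): FileRecord[] {
  const wanted = new Set(fileIds);
  return files.filter(
    (f) =>
      wanted.has(f.file_id) &&
      f.context !== EXECUTE_CODE_CONTEXT &&
      typeof f.metadata?.fileIdentifier === 'string',
  );
}
```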

* chore: update @librechat/agents package version to 3.0.78 in package-lock.json and related package.json files

* refactor: file processing and retrieval logic

- Added a fallback mechanism for download URLs when files exceed size limits or cannot be processed locally.
- Implemented a deduplication strategy for code-generated files based on conversationId and filename to optimize storage.
- Updated file retrieval methods to ensure proper filtering by messageIds, preventing orphaned files from being included.
- Introduced comprehensive tests for new thread data extraction functionality, covering edge cases and performance considerations.
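
The deduplication and fallback behaviour above can be sketched as follows. This is an illustrative in-memory model, not the project's `processCodeOutput`: `CodeFileStore`, `processOutput`, the `Outcome` type, and the generated `file-N` IDs are all hypothetical, and the size limit is passed in as a plain number.

```typescript
interface StoredFile {
  file_id: string;
  conversationId: string;
  filename: string;
  bytes: number;
}

/** In-memory stand-in for a file collection, keyed by (conversationId, filename). */
class CodeFileStore {
  private byKey = new Map<string, StoredFile>();
  private nextId = 1;

  /** Reuses the existing file_id when the same filename reappears in a conversation. */
  upsert(conversationId: string, filename: string, bytes: number): StoredFile {
    const key = JSON.stringify([conversationId, filename]);
    const existing = this.byKey.get(key);
    const file: StoredFile = {
      file_id: existing ? existing.file_id : `file-${this.nextId++}`,
      conversationId,
      filename,
      bytes,
    };
    this.byKey.set(key, file);
    return file;
  }

  get size(): number {
    return this.byKey.size;
  }
}

type Outcome =
  | { kind: 'stored'; file: StoredFile }
  | { kind: 'download_url'; url: string };

interface CodeOutput {
  conversationId: string;
  filename: string;
  bytes: number;
  url: string;
}

/** Stores the output locally, or falls back to its download URL when it is too large. */
function processOutput(store: CodeFileStore, limitBytes: number, output: CodeOutput): Outcome {
  if (output.bytes > limitBytes) {
    return { kind: 'download_url', url: output.url };
  }
  return {
    kind: 'stored',
    file: store.upsert(output.conversationId, output.filename, output.bytes),
  };
}
```

Keying on the pair rather than on the filename alone keeps identically named outputs in different conversations separate.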

* fix: improve file retrieval tests and handling of optional properties

- Updated tests to access possibly-null query results using non-null assertions (`!`).
- Modified test descriptions for clarity regarding the exclusion of execute_code files.
- Ensured that the retrieval logic correctly reflects the expected outcomes for file queries.
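
For context on the non-null assertions used in these tests: the `!` operator only silences the compiler and is erased at runtime, so it does not check anything. The sketch below contrasts it with an explicit guard; `FileDoc`, `findFiles`, and `requireFiles` are hypothetical names, not part of the test suite.

```typescript
interface FileDoc {
  file_id: string;
  text?: string;
}

// Hypothetical lookup whose return type admits null, as a query helper might.
function findFiles(docs: FileDoc[] | null): FileDoc[] | null {
  return docs;
}

const files = findFiles([{ file_id: 'abc' }]);

// Non-null assertion: compiles, but performs no runtime check.
const idsAsserted = files!.map((f) => f.file_id);

// Explicit guard: fails with a clear message if the value is actually null.
function requireFiles(value: FileDoc[] | null): FileDoc[] {
  if (value === null) {
    throw new Error('expected files, got null');
  }
  return value;
}

const idsGuarded = requireFiles(files).map((f) => f.file_id);
```

In a test, either form is usually acceptable when a preceding `toHaveLength` expectation would already fail on a null result.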

* test: add comprehensive unit tests for processCodeOutput functionality

- Introduced a new test suite for the processCodeOutput function, covering various scenarios including file retrieval, creation, and processing for both image and non-image files.
- Implemented mocks for dependencies such as axios, logger, and file models to isolate tests and ensure reliable outcomes.
- Validated behavior for existing files, new file creation, and error handling, including size limits and fallback mechanisms.
- Enhanced test coverage for metadata handling and usage increment logic, ensuring robust verification of file processing outcomes.

* test: enhance file size limit enforcement in processCodeOutput tests

- Introduced a configurable file size limit for tests to improve flexibility and coverage.
- Mocked the `librechat-data-provider` to allow dynamic adjustment of file size limits during tests.
- Updated the file size limit enforcement test to validate behavior when files exceed specified limits, ensuring proper fallback to download URLs.
- Reset file size limit after tests to maintain isolation for subsequent test cases.
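
The adjust-then-reset pattern for the size limit can be sketched without any test framework. This is a simplified illustration rather than the Jest mock used in the suite: `config`, `DEFAULT_LIMIT`, `exceedsLimit`, and `withFileSizeLimit` are hypothetical stand-ins for the mocked module and its limit.

```typescript
// Hypothetical stand-in for a mocked config module; not the real
// librechat-data-provider API.
const DEFAULT_LIMIT = 1024;
const config = { fileSizeLimit: DEFAULT_LIMIT };

/** Code under test reads the limit at call time, so tests can adjust it. */
function exceedsLimit(bytes: number): boolean {
  return bytes > config.fileSizeLimit;
}

/** Runs `run` with a temporary limit and always restores the previous value. */
function withFileSizeLimit<T>(limit: number, run: () => T): T {
  const previous = config.fileSizeLimit;
  config.fileSizeLimit = limit;
  try {
    return run();
  } finally {
    // Restoring in `finally` keeps later tests isolated even if `run` throws.
    config.fileSizeLimit = previous;
  }
}
```

A test would call, for example, `withFileSizeLimit(10, () => exceedsLimit(50))` and expect `true`, while the same check outside the helper still uses the default limit.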
Danny Avila 2026-01-16 10:06:24 -05:00
parent c18dc0d894
commit 75c02a1a18
GPG key ID: BF31EEB2C5CA0956
22 changed files with 1362 additions and 81 deletions

@@ -130,7 +130,7 @@ describe('File Methods', () => {
 const files = await fileMethods.getFiles({ user: userId });
 expect(files).toHaveLength(3);
-expect(files.map((f) => f.file_id)).toEqual(expect.arrayContaining(fileIds));
+expect(files!.map((f) => f.file_id)).toEqual(expect.arrayContaining(fileIds));
 });
 it('should exclude text field by default', async () => {
@@ -149,7 +149,7 @@ describe('File Methods', () => {
 const files = await fileMethods.getFiles({ file_id: fileId });
 expect(files).toHaveLength(1);
-expect(files[0].text).toBeUndefined();
+expect(files![0].text).toBeUndefined();
 });
 });
@@ -207,7 +207,7 @@ describe('File Methods', () => {
 expect(files[0].file_id).toBe(contextFileId);
 });
-it('should retrieve files for execute_code tool', async () => {
+it('should not retrieve execute_code files (handled by getCodeGeneratedFiles)', async () => {
 const userId = new mongoose.Types.ObjectId();
 const codeFileId = uuidv4();
@@ -218,14 +218,16 @@ describe('File Methods', () => {
 filepath: '/uploads/code.py',
 type: 'text/x-python',
 bytes: 100,
+context: FileContext.execute_code,
 metadata: { fileIdentifier: 'some-identifier' },
 });
+// execute_code files are explicitly excluded from getToolFilesByIds
+// They are retrieved via getCodeGeneratedFiles and getUserCodeFiles instead
 const toolSet = new Set([EToolResources.execute_code]);
 const files = await fileMethods.getToolFilesByIds([codeFileId], toolSet);
-expect(files).toHaveLength(1);
-expect(files[0].file_id).toBe(codeFileId);
+expect(files).toHaveLength(0);
 });
 });
@@ -490,7 +492,7 @@ describe('File Methods', () => {
 const remaining = await fileMethods.getFiles({});
 expect(remaining).toHaveLength(1);
-expect(remaining[0].user?.toString()).toBe(otherUserId.toString());
+expect(remaining![0].user?.toString()).toBe(otherUserId.toString());
 });
 });