LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-04-04 06:47:19 +02:00

Author	SHA1	Message	Date
Danny Avila	e442984364	💣 fix: Harden against falsified ZIP metadata in ODT parsing (#12320 ) * security: replace JSZip metadata guard with yauzl streaming decompression The ODT decompressed-size guard was checking JSZip's private _data.uncompressedSize fields, which are populated from the ZIP central directory — attacker-controlled metadata. A crafted ODT with falsified uncompressedSize values bypassed the 50MB cap entirely, allowing content.xml decompression to exhaust Node.js heap memory (DoS). Replace JSZip with yauzl for ODT extraction. The new extractOdtContentXml function uses yauzl's streaming API: it lazily iterates ZIP entries, opens a decompression stream for content.xml, and counts real bytes as they arrive from the inflate stream. The stream is destroyed the moment the byte count crosses ODT_MAX_DECOMPRESSED_SIZE, aborting the inflate before the full payload is materialised in memory. 
- Remove jszip from direct dependencies (still transitive via mammoth) - Add yauzl + @types/yauzl - Update zip-bomb test to verify streaming abort with DEFLATE payload * fix: close file descriptor leaks and declare jszip test dependency - Use a shared `finish()` helper in extractOdtContentXml that calls zipfile.close() on every exit path (success, size cap, missing entry, openReadStream errors, zipfile errors). Without this, any error path leaked one OS file descriptor permanently — uploading many malformed ODTs could exhaust the process FD limit (a distinct DoS vector). - Add jszip to devDependencies so the zip-bomb test has an explicit dependency rather than relying on mammoth's transitive jszip. - Update JSDoc to document that all exit paths close the zipfile. * fix: move yauzl from dependencies to peerDependencies Matches the established pattern for runtime parser libraries in packages/api: mammoth, pdfjs-dist, and xlsx are all peerDependencies (provided by the consuming /api workspace) with devDependencies for testing. yauzl was incorrectly placed in dependencies. * fix: add yauzl to /api dependencies to satisfy peer dep packages/api declares yauzl as a peerDependency; /api is the consuming workspace that must provide it at runtime, matching the pattern used for mammoth, pdfjs-dist, and xlsx.	2026-03-19 22:13:40 -04:00
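The byte-counting cap described in this commit can be sketched roughly as follows. This is not the project's extractOdtContentXml: a lazy synchronous iterable stands in for yauzl's inflate stream, and the names readWithCap and MAX_DECOMPRESSED_SIZE are hypothetical. The point it illustrates is that the limit is enforced on bytes actually produced, not on sizes declared in ZIP metadata.

```typescript
// Hypothetical stand-in for ODT_MAX_DECOMPRESSED_SIZE (50MB per the commit message).
const MAX_DECOMPRESSED_SIZE = 50 * 1024 * 1024;

// Consume chunks lazily and count real bytes as they arrive.
// The iterable is an assumption standing in for a decompression stream.
function readWithCap(
  chunks: Iterable<Uint8Array>,
  maxBytes: number = MAX_DECOMPRESSED_SIZE,
): Uint8Array {
  const parts: Uint8Array[] = [];
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
    if (total > maxBytes) {
      // Throwing out of for...of closes the iterator, so a lazy producer
      // stops generating data instead of materialising the full payload.
      throw new Error(`Decompressed size exceeds ${maxBytes} bytes`);
    }
    parts.push(chunk);
  }
  // Concatenate the accepted chunks into one buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

With a real stream the equivalent step would be destroying the stream when the counter crosses the limit, as the commit message describes.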
Pol Burkardt Freire	7e74165c3c	📖 feat: Add Native ODT Document Parser Support (#12303 ) * fix: add ODT support to native document parser * fix: replace execSync with jszip for ODT parsing * docs: update documentParserMimeTypes comment to include odt * fix: improve ODT XML extraction and add empty.odt fixture - Scope extraction to <office:body> to exclude metadata/style nodes - Map </text:p> and </text:h> closings to newlines, preserving paragraph structure instead of collapsing everything to a single line - Handle <text:line-break/> as explicit newlines - Strip remaining tags, normalize horizontal whitespace, cap consecutive blank lines at one - Regenerate sample.odt as a two-paragraph fixture so the test exercises multi-paragraph output - Add empty.odt fixture and test asserting 'No text found in document' * fix: address review findings in ODT parser - Use static `import JSZip from 'jszip'` instead of dynamic import; jszip is CommonJS-only with no ESM/Jest-isolation concern (F1) - Decode the five standard XML entities after tag-stripping so documents with &, <, >, ", ' send correct text to the LLM (F2) - Remove @types/jszip devDependency; jszip ships bundled declarations and @types/jszip is a stale 2020 stub that would shadow them (F3) - Handle <text:tab/> → \t and <text:s .../> → ' ' before the generic tag stripper so tab-aligned and multi-space content is preserved (F4) - Add sample-entities.odt fixture and test covering entity decoding, tab, and spacing-element handling (F5) - Rename 'throws for empty odt' → 'throws for odt with no extractable text' to distinguish from a zero-byte/corrupt file case (F8) * fix: add decompressed content size cap to odtToText (F6) Reads uncompressed entry sizes from the JSZip internal metadata before extracting any content. Throws if the total exceeds 50MB, preventing a crafted ODT with a high-ratio compressed payload from exhausting heap. 
Adds a corresponding test using a real DEFLATE-compressed ZIP (~51KB on disk, 51MB uncompressed) to verify the guard fires before any extraction. * fix: add java to codeTypeMapping for file upload support .java files were rejected with "Unable to determine file type" because browsers send an empty MIME type for them and codeTypeMapping had no 'java' entry for inferMimeType() to fall back on. text/x-java was already present in all five validation lists (fullMimeTypesList, codeInterpreterMimeTypesList, retrievalMimeTypesList, textMimeTypes, retrievalMimeTypes), so mapping to it (not text/plain) ensures .java uploads work for both File Search and Code Interpreter. Closes #12307 * fix: address follow-up review findings (A-E) A: regenerate package-lock.json after removing @types/jszip from package.json; without this npm ci was still installing the stale 2020 type stubs and TypeScript was resolving against them B: replace dynamic import('jszip') in the zip-bomb test with the same static import already used in production; jszip is CJS-only with no ESM/Jest isolation concern C: document that the _data.uncompressedSize guard fails open if jszip renames the private field (accepted limitation, test would catch it) D: rename 'preserves tabs' test to 'normalizes tab and spacing elements to spaces' since <text:tab> is collapsed to a space, not kept as \t E: fix test.each([ formatting artifact (missing newline after '[') --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-03-19 15:49:52 -04:00
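The ODT text-extraction steps listed in this commit (scope to office:body, map paragraph and heading closings to newlines, collapse tab and spacing elements, strip tags, decode entities, normalize whitespace) might look something like the sketch below. The function name odtXmlToText and the exact regular expressions are assumptions for illustration, not the repository's code.

```typescript
// Hypothetical helper: turns the content.xml of an ODT into plain text.
function odtXmlToText(contentXml: string): string {
  // Scope extraction to <office:body> so metadata and style nodes are excluded.
  const bodyMatch = contentXml.match(/<office:body[^>]*>([\s\S]*?)<\/office:body>/);
  const body = bodyMatch?.[1] ?? '';

  const text = body
    // Closing paragraph and heading tags become newlines.
    .replace(/<\/text:(p|h)>/g, '\n')
    // Explicit line breaks become newlines.
    .replace(/<text:line-break\s*\/>/g, '\n')
    // Tab and spacing elements collapse to a single space.
    .replace(/<text:(tab|s)\b[^>]*\/>/g, ' ')
    // Strip all remaining tags.
    .replace(/<[^>]+>/g, '')
    // Decode the five standard XML entities; &amp; goes last to avoid double decoding.
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&')
    // Normalize horizontal whitespace and trim each line.
    .replace(/[ \t]+/g, ' ')
    .split('\n')
    .map((line) => line.trim())
    .join('\n')
    // Cap consecutive blank lines at one.
    .replace(/\n{3,}/g, '\n\n')
    .trim();

  if (!text) {
    throw new Error('No text found in document');
  }
  return text;
}
```

Decoding entities only after tag stripping matters: decoding first would turn an escaped `&lt;b&gt;` in the document text into something the tag stripper then removes.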
Danny Avila	68435cdcd0	🧯 fix: Add Pre-Parse File Size Guard to Document Parser (#12275 ) Prevent memory exhaustion DoS by rejecting documents exceeding 15MB before reading them into memory, closing the gap between the 512MB upload limit and unbounded in-memory parsing.	2026-03-17 02:36:18 -04:00
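A minimal sketch of such a pre-parse guard, assuming the caller already knows the on-disk size (for example from the upload metadata or a stat call) before any read happens. The names MAX_PARSE_BYTES and assertParsableSize are hypothetical.

```typescript
// 15MB limit as stated in the commit message; the constant name is an assumption.
const MAX_PARSE_BYTES = 15 * 1024 * 1024;

// Reject oversized documents before they are read into memory.
function assertParsableSize(sizeBytes: number, maxBytes: number = MAX_PARSE_BYTES): void {
  if (sizeBytes > maxBytes) {
    throw new Error(
      `Document is ${sizeBytes} bytes, which exceeds the ${maxBytes}-byte parsing limit`,
    );
  }
}
```

The check is cheap because it only needs the size, so it can run ahead of any buffer allocation.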
Danny Avila	c6982dc180	🛡️ fix: Agent Permission Check on Image Upload Route (#12219 ) * fix: add agent permission check to image upload route * refactor: remove unused SystemRoles import and format test file for clarity * fix: address review findings for image upload agent permission check * refactor: move agent upload auth logic to TypeScript in packages/api Extract pure authorization logic from agentPermCheck.js into checkAgentUploadAuth() in packages/api/src/files/agentUploadAuth.ts. The function returns a structured result ({ allowed, status, error }) instead of writing HTTP responses directly, eliminating the dual responsibility and confusing sentinel return value. The JS wrapper in /api is now a thin adapter that translates the result to HTTP. * test: rewrite image upload permission tests as integration tests Replace mock-heavy images-agent-perm.spec.js with integration tests using MongoMemoryServer, real models, and real PermissionService. Follows the established pattern in files.agents.test.js. Moves test to sibling location (images.agents.test.js) matching backend convention. Adds temp file cleanup assertions on 403/404 responses and covers message_file exemption paths (boolean true, string "true", false). * fix: widen AgentUploadAuthDeps types to accept ObjectId from Mongoose The injected getAgent returns Mongoose documents where _id and author are Types.ObjectId at runtime, not string. Widen the DI interface to accept string | Types.ObjectId for _id, author, and resourceId so the contract accurately reflects real callers. * chore: move agent upload auth into files/agents/ subdirectory * refactor: delete agentPermCheck.js wrapper, move verifyAgentUploadPermission to packages/api The /api-only dependencies (getAgent, checkPermission) are now passed as object-field params from the route call sites. Both images.js and files.js import verifyAgentUploadPermission from @librechat/api and inject the deps directly, eliminating the intermediate JS wrapper. 
* style: fix import type ordering in agent upload auth * fix: prevent token TTL race in MCPTokenStorage.storeTokens When expires_in is provided, use it directly instead of round-tripping through Date arithmetic. The previous code computed accessTokenExpiry as a Date, then after an async encryptV2 call, recomputed expiresIn by subtracting Date.now(). On loaded CI runners the elapsed time caused Math.floor to truncate to 0, triggering the 1-year fallback and making the token appear permanently valid — so refresh never fired.	2026-03-14 02:57:56 -04:00
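The token TTL race described at the end of this commit can be reproduced with plain arithmetic. The sketch below is a reconstruction from the commit message, not the MCPTokenStorage code: both function names are hypothetical, and the elapsed time of the async encryption step is modelled as an explicit nowMs argument.

```typescript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Earlier approach as described: compute an expiry timestamp, then derive the
// TTL again later by subtracting the current time.
function ttlViaDateRoundTrip(expiresIn: number, issuedAtMs: number, nowMs: number): number {
  const expiryMs = issuedAtMs + expiresIn * 1000;
  const remaining = Math.floor((expiryMs - nowMs) / 1000);
  // A remaining value of 0 is treated as "no expiry provided" and hits the fallback.
  return remaining > 0 ? remaining : ONE_YEAR_SECONDS;
}

// Fixed approach as described: when expires_in is provided, use it directly.
function ttlDirect(expiresIn: number | undefined): number {
  return expiresIn !== undefined && expiresIn > 0 ? expiresIn : ONE_YEAR_SECONDS;
}
```

For a token with expires_in of 1 second and 50 ms of elapsed time, the round trip floors 0.95 to 0 and returns the one-year fallback, while the direct version returns 1.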
Danny Avila	3b84cc048a	🧮 fix: XLSX/XLS Upload-as-Text via Buffer-Based SheetJS Parsing (#12098 ) * 🔧 fix: Update Excel sheet parsing to use fs.promises.readFile and correct import for xlsx - Modified the excelSheetToText function to read the file using fs.promises.readFile instead of directly accessing the file path. - Updated the import statement for the xlsx library to use the correct read method, ensuring proper functionality in parsing Excel sheets. * 🔧 fix: Update document parsing methods to use buffer for file reading - Modified the wordDocToText function to read the file as a buffer using fs.promises.readFile, ensuring compatibility with the mammoth library. - Updated the excelSheetToText function to read the Excel file as a buffer, addressing issues with the xlsx library's handling of dynamic imports and file access. * feat: Add tests for empty xlsx document parsing and validate xlsx imports - Introduced a new test case to verify that the `parseDocument` function correctly handles an empty xlsx file with only a sheet name, ensuring it returns the expected document structure. - Added a test to confirm that the `xlsx` library exports `read` and `utils` as named imports, validating the functionality of the library integration. - Included a new empty xlsx file to support the test cases.	2026-03-06 00:21:55 -05:00
Danny Avila	046e92217f	🧩 feat: OpenDocument Format File Upload and Native ODS Parsing (#11959 ) * ✨ feat: Add support for OpenDocument MIME types in file configuration Updated the applicationMimeTypes regex to include support for OASIS OpenDocument formats, enhancing the file type recognition capabilities of the data provider. * feat: document processing with OpenDocument support Added support for OpenDocument Spreadsheet (ODS) MIME type in the file processing service and updated the document parser to handle ODS files. Included tests to verify correct parsing of ODS documents and updated file configuration to recognize OpenDocument formats. * refactor: Enhance document processing to support additional Excel MIME types Updated the document processing logic to utilize a regex for matching Excel MIME types, improving flexibility in handling various Excel file formats. Added tests to ensure correct parsing of new MIME types, including multiple Excel variants and OpenDocument formats. Adjusted file configuration to include these MIME types for better recognition in the file processing service. * feat: Add support for additional OpenDocument MIME types in file processing Enhanced the document processing service to support ODT, ODP, and ODG MIME types. Updated tests to verify correct routing through the OCR strategy for these new formats. Adjusted documentation to reflect changes in handled MIME types for improved clarity.	2026-02-26 14:39:49 -05:00
Dustin Healy	1d0a4c501f	🪨 feat: AWS Bedrock Document Uploads (#11912 ) * feat: add aws bedrock upload to provider support * chore: address copilot comments * feat: add shared Bedrock document format types and MIME mapping Bedrock Converse API accepts 9 document formats beyond PDF. Add BedrockDocumentFormat union type, MIME-to-format mapping, and helpers in data-provider so both client and backend can reference them. * refactor: generalize Bedrock PDF validation to support all document types Rename validateBedrockPdf to validateBedrockDocument with MIME-aware logic: 4.5MB hard limit applies to all types, PDF header check only runs for application/pdf. Adds test coverage for non-PDF documents. * feat: support all Bedrock document formats in encoding pipeline Widen file type gates to accept csv, doc, docx, xls, xlsx, html, txt, md for Bedrock. Uses shared MIME-to-format map instead of hardcoded 'pdf'. Other providers' PDF-only paths remain unchanged. * feat: expand Bedrock file upload UI to accept all document types Add 'image_document_extended' upload type for Bedrock with accept filters for all 9 supported formats. Update drag-and-drop validation to use isBedrockDocumentType helper. * fix: route Bedrock document types through provider pipeline	2026-02-23 22:32:44 -05:00
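As a rough illustration of the MIME-aware validation this commit describes, the sketch below applies the 4.5MB limit to every type and the header check only to PDFs. The type name BedrockDocumentFormat and the function name validateBedrockDocument come from the commit message, but the signature, the result shape, and the MIME table (filled in with the standard MIME types for each listed format) are assumptions rather than the project's definitions.

```typescript
type BedrockDocumentFormat =
  | 'pdf' | 'csv' | 'doc' | 'docx' | 'xls' | 'xlsx' | 'html' | 'txt' | 'md';

// Assumed mapping using the standard MIME type for each format.
const mimeToFormat: Record<string, BedrockDocumentFormat> = {
  'application/pdf': 'pdf',
  'text/csv': 'csv',
  'application/msword': 'doc',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': 'docx',
  'application/vnd.ms-excel': 'xls',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': 'xlsx',
  'text/html': 'html',
  'text/plain': 'txt',
  'text/markdown': 'md',
};

// 4.5MB hard limit, as stated in the commit message.
const BEDROCK_MAX_BYTES = 4.5 * 1024 * 1024;

interface ValidationResult {
  isValid: boolean;
  error?: string;
}

function validateBedrockDocument(bytes: Uint8Array, mimeType: string): ValidationResult {
  const format: BedrockDocumentFormat | undefined = mimeToFormat[mimeType];
  if (!format) {
    return { isValid: false, error: `Unsupported document type: ${mimeType}` };
  }
  // The size limit applies to every document type.
  if (bytes.length > BEDROCK_MAX_BYTES) {
    return { isValid: false, error: 'Document exceeds the 4.5MB limit' };
  }
  // The header check only runs for PDFs.
  if (format === 'pdf') {
    const header = String.fromCharCode(...Array.from(bytes.slice(0, 5)));
    if (header !== '%PDF-') {
      return { isValid: false, error: 'Invalid PDF header' };
    }
  }
  return { isValid: true };
}
```

Returning a result object instead of throwing lets the caller decide how to surface the error.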
Danny Avila	7ce898d6a0	📄 feat: Local Text Extraction for PDF, DOCX, and XLS/XLSX (#11900 ) * feat: Added "document parser" OCR strategy The document parser uses libraries to parse the text out of known document types. This lets LibreChat handle some complex document types without having to use a secondary service (like Mistral or standing up a RAG API server). To enable the document parser, set the ocr strategy to "document_parser" in librechat.yaml. We now support: - PDFs using pdfjs - DOCX using mammoth - XLS/XLSX using SheetJS (The associated packages were also added to the project.) * fix: applied Copilot code review suggestions - Properly calculate length of text based on UTF8. - Avoid issues with loading / blocking PDF parsing. * fix: improved docs on parseDocument() * chore: move to packages/api for TS support * refactor: make document processing the default ocr strategy - Introduced support for additional document types in the OCR strategy, including PDF, DOCX, and XLS/XLSX. - Updated the file upload handling to dynamically select the appropriate parsing strategy based on the file type. - Refactored the document parsing functions to use asynchronous imports for improved performance and maintainability. * test: add unit tests for processAgentFileUpload functionality - Introduced a new test suite for the processAgentFileUpload function in process.spec.js. - Implemented various test cases to validate OCR strategy selection based on file types, including PDF, DOCX, XLSX, and XLS. - Mocked dependencies to ensure isolated testing of file upload handling and strategy selection logic. - Enhanced coverage for scenarios involving OCR capability checks and default strategy fallbacks. * chore: update pdfjs-dist version and enhance document parsing tests - Bumped pdfjs-dist dependency to version 5.4.624 in both api and packages/api. - Refactored document parsing tests to use 'originalname' instead of 'filename' for file objects. 
- Added a new test case for parsing XLS files to improve coverage of document types supported by the parser. - Introduced a sample XLS file for testing purposes. * feat: enforce text size limit and improve OCR fallback handling in processAgentFileUpload - Added a check to ensure extracted text does not exceed the 15MB storage limit, throwing an error if it does. - Refactored the OCR handling logic to improve fallback behavior when the configured OCR fails, ensuring a more robust document processing flow. - Enhanced unit tests to cover scenarios for oversized text and fallback mechanisms, ensuring proper error handling and functionality. * fix: correct OCR URL construction in performOCR function - Updated the OCR URL construction to ensure it correctly appends '/ocr' to the base URL if not already present, improving the reliability of the OCR request. --------- Co-authored-by: Dan Lew <daniel@mightyacorn.com>	2026-02-22 14:22:45 -05:00
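Two of the fixes in this commit are easy to get subtly wrong, so a small sketch may help: measuring extracted text in UTF-8 bytes rather than string length, and appending '/ocr' to the base URL only when it is missing. All names below (utf8ByteLength, assertTextWithinLimit, buildOcrUrl, MAX_TEXT_BYTES) are hypothetical, and the trailing-slash handling is an added assumption not mentioned in the commit.

```typescript
// 15MB storage limit as stated in the commit message; the name is an assumption.
const MAX_TEXT_BYTES = 15 * 1024 * 1024;

// String.length counts UTF-16 code units; storage limits care about encoded bytes.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

function assertTextWithinLimit(text: string, maxBytes: number = MAX_TEXT_BYTES): void {
  const size = utf8ByteLength(text);
  if (size > maxBytes) {
    throw new Error(`Extracted text is ${size} bytes, over the ${maxBytes}-byte limit`);
  }
}

// Append '/ocr' only if the base URL does not already end with it.
function buildOcrUrl(baseURL: string): string {
  const trimmed = baseURL.replace(/\/+$/, ''); // assumption: ignore trailing slashes
  return trimmed.endsWith('/ocr') ? trimmed : `${trimmed}/ocr`;
}
```

For example, 'héllo' has a string length of 5 but encodes to 6 bytes, which is why a length-based check can undercount.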
ethanlaj	2513e0a423	🔧 feat: `deleteRagFile` utility for Consistent RAG API document deletion (#11493 ) * 🔧 feat: Implement deleteRagFile utility for RAG API document deletion across storage strategies * chore: import order * chore: import order & remove unnecessary comments --------- Co-authored-by: Danny Avila <danacordially@gmail.com>	2026-02-14 13:57:01 -05:00
papasaidfine	4fe223eedd	🎞️ feat: OpenRouter Audio/Video File Upload Support (#11070 ) * Added video upload support for OpenRouter - Added VIDEO_URL content type to support video_url message format - Implemented OpenRouter video encoding using base64 data URLs - Extended encodeAndFormatVideos() to handle OpenRouter provider - Updated UI to accept video uploads for OpenRouter (mp4, webm, mpeg, mov) - Fixed case-sensitivity in provider detection for agents - Made isDocumentSupportedProvider() and isOpenAILikeProvider() case-insensitive Videos are now converted to data:video/mp4;base64,... format compatible with OpenRouter's API requirements per their documentation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * refactor: change multimodal and google_multimodal to more transparent variable names of image_document and image_document_video_audio (also google_multimodal doesn't apply as much since we are adding support for video and audio uploads for open router) * fix: revert .toLowerCase change to isOpenAILikeProvider and isDocumentSupportedProvider which broke upload to provider detection for openAI endpoints * wip: add audio support to openrouter * fix: filetypes now properly parsed and sent rather than destructured mimetypes for openrouter * refactor: Omit to Exclude for ESLint * feat: update DragDropModal for new openrouter support * fix: special case openrouter for lower case provider (currently getting issues with the provider coming in as 'OpenRouter' and our enum being 'openrouter') This will probably require a larger refactor later to handle case insensitivity for all providers, but that will have to be thoroughly tested in its own isolated PR --------- 
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>	2025-12-25 13:23:29 -05:00
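The base64 data URL encoding mentioned in this commit could be sketched as below. The helper name toDataUrl is hypothetical, and the sketch relies on the global btoa function (available in browsers and recent Node.js versions) rather than whatever encoding routine the project uses.

```typescript
// Hypothetical helper: encode raw bytes as a data URL such as data:video/mp4;base64,...
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  // Build a binary string, one character per byte, for btoa.
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i] ?? 0);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}
```

For large videos a streaming or Buffer-based encoder would likely be preferable, since this version builds the whole string in memory.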
rossbg	959984f959	⏱️ fix: Increase RAG API Text Parsing Timeout (#10562 ) * fix: increase RAG API text parsing timeout for large files * ci: Update text.spec.ts --------- Co-authored-by: Rosen Simov <rosen.simov@endurosat.com> Co-authored-by: Danny Avila <danny@librechat.ai>	2025-11-25 14:54:53 -05:00
Dustin Healy	dfcaff9b00	📷 fix: Use 'media' type for Google multimodal attachments (#10586 ) * fix: change google multimodal attachments to use type: 'media' * chore: Update @librechat/agents to version 3.0.27 in package.json and package-lock.json --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2025-11-19 18:31:05 -05:00
Danny Avila	937563f645	🖼️ feat: File Size and MIME Type Filtering at Agent level (#10446 ) * refactor: add image file size validation as part of payload build * feat: implement file size and MIME type filtering in endpoint configuration * chore: import order	2025-11-10 21:36:48 -05:00
Danny Avila	2524d33362	📂 refactor: Cleanup File Filtering Logic, Improve Validation (#10414 ) * feat: add filterFilesByEndpointConfig to filter disabled file processing by provider * chore: explicit define of endpointFileConfig for better debugging * refactor: move `normalizeEndpointName` to data-provider as used app-wide * chore: remove overrideEndpoint from useFileHandling * refactor: improve endpoint file config selection * refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling * refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components * test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling * refactor: streamline agent endpoint assignment and improve file filtering logic * feat: add error handling for disabled file uploads in endpoint configuration * refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint * refactor: streamline requestFiles handling in initializeAgent function * fix: getEndpointFileConfig partial config merging scenarios * refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types * refactor: user-configured default file config in getEndpointFileConfig * fix: prevent file handling when endpoint is disabled and file is dragged to chat * refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks * fix: prioritize endpointType based on agent.endpoint in file filtering logic * fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting	2025-11-10 19:05:30 -05:00
Danny Avila	360ec22964	⚗️ refactor: Provider File Validation with Configurable Size Limits (#10405 ) * chore: correct type for ServerRequest * chore: improve ServerRequest typing across several modules * feat: Add PDF configured limit validation - Introduced comprehensive tests for PDF validation across multiple providers, ensuring correct behavior for file size limits and edge cases. - Enhanced the `validatePdf` function to accept an optional configured file size limit, allowing for stricter validation based on user configurations. - Updated related functions to utilize the new validation logic, ensuring consistent behavior across different providers. * chore: Update Request type to ServerRequest in audio and video encoding modules * refactor: move `getConfiguredFileSizeLimit` utility * feat: Add video and audio validation with configurable size limits - Introduced `validateVideo` and `validateAudio` functions to validate media files against provider-specific size limits. - Enhanced validation logic to consider optional configured file size limits, allowing for more flexible file handling. - Added comprehensive tests for video and audio validation across different providers, ensuring correct behavior for various scenarios. * refactor: Update PDF and media validation to allow higher configured limits - Modified validation logic to accept user-configured file size limits that exceed provider defaults, ensuring correct acceptance of files within the specified range. - Updated tests to reflect changes in validation behavior, confirming that files are accepted when within the configured limits. - Enhanced documentation in tests to clarify expected outcomes with the new validation rules. 
* chore: Add @types/node-fetch dependency to package.json and package-lock.json - Included the @types/node-fetch package to enhance type definitions for node-fetch usage. - Updated package-lock.json to reflect the addition of the new dependency. * fix: Rename FileConfigInput to TFileConfig	2025-11-07 10:57:15 -05:00
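One way to read the final validation rule in this commit is that a configured limit, when present, replaces the provider default even if it is higher. The sketch below encodes that reading; the function names and result shape are hypothetical, and the actual validatePdf, validateVideo, and validateAudio functions may combine the limits differently.

```typescript
interface SizeCheck {
  isValid: boolean;
  limit: number;
}

// Assumed rule: the user-configured limit wins when set, otherwise the
// provider default applies.
function effectiveSizeLimit(providerLimit: number, configuredLimit?: number): number {
  return configuredLimit ?? providerLimit;
}

function validateFileSize(
  sizeBytes: number,
  providerLimit: number,
  configuredLimit?: number,
): SizeCheck {
  const limit = effectiveSizeLimit(providerLimit, configuredLimit);
  return { isValid: sizeBytes <= limit, limit };
}
```

Under this reading a 40MB file fails against a 32MB provider default but passes once the configuration raises the limit to 50MB.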
Danny Avila	f59daaeecc	📄 feat: Context Field for Anthropic Documents (PDF) (#10148 ) * fix: Remove ephemeral cache control from document encoding function * refactor: Improve document encoding types and add file context for anthropic messages api - Added AnthropicDocumentBlock interface to define the structure for documents from the Anthropic provider. - Updated encodeAndFormatDocuments function to utilize the new type and include optional context for filenames. - Refactored DocumentResult to use a union type for various document formats, improving type safety and clarity.	2025-10-16 16:24:14 -04:00
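For orientation, a document block carrying a filename as context might be shaped like the sketch below. The interface name AnthropicDocumentBlock is taken from the commit message, but its fields here are an approximation of a base64 PDF document block, and both the helper toAnthropicDocument and the exact context string are assumptions.

```typescript
// Approximate shape of a base64 PDF document block with an optional context field.
interface AnthropicDocumentBlock {
  type: 'document';
  source: {
    type: 'base64';
    media_type: 'application/pdf';
    data: string;
  };
  context?: string;
}

// Hypothetical helper: attach the filename as context when one is available.
function toAnthropicDocument(base64Data: string, filename?: string): AnthropicDocumentBlock {
  const block: AnthropicDocumentBlock = {
    type: 'document',
    source: { type: 'base64', media_type: 'application/pdf', data: base64Data },
  };
  if (filename) {
    block.context = `File: "${filename}"`; // assumed wording
  }
  return block;
}
```

Leaving context undefined when no filename is known keeps the block minimal instead of sending an empty string.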
Danny Avila	07d0abc9fd	🖼️ fix: Extract File Context & Persist Attachments (#10069 ) - problem: `addImageUrls` had a side effect that was being leveraged before to populate both the `ocr` message field, now `fileContext`, and `client.options.attachments`, which would record the user's uploaded message attachments to the user message when saved to the database and returned at the end of the request lifecycle - solution: created dedicated handling for file context, and made sure to populate `allFiles` with non-provider attachments	2025-10-10 05:35:37 -04:00
Danny Avila	bcd97aad2f	📎 feat: Direct Provider Attachment Support for Multimodal Content (#9994 ) * 📎 feat: Direct Provider Attachment Support for Multimodal Content * 📑 feat: Anthropic Direct Provider Upload (#9072) * feat: implement Anthropic native PDF support with document preservation - Add comprehensive debug logging throughout PDF processing pipeline - Refactor attachment processing to separate image and document handling - Create distinct addImageURLs(), addDocuments(), and processAttachments() methods - Fix critical bugs in stream handling and parameter passing - Add streamToBuffer utility for proper stream-to-buffer conversion - Remove api/agents submodule from repository 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: remove out of scope formatting changes * fix: stop duplication of file in chat on end of response stream * chore: bring back file search and ocr options * chore: localize upload to provider string in file menu * refactor: change createMenuItems args to fit new pattern introduced by anthropic-native-pdf-support * feat: add cache point for pdfs processed by anthropic endpoint since they are unlikely to change and should benefit from caching * feat: combine Upload Image into Upload to Provider since they 
both perform direct upload and change provider upload icon to reflect multimodal upload * feat: add citations support according to docs * refactor: remove redundant 'document' check since documents are handled properly by formatMessage in the agents repo now * refactor: change upload logic so anthropic endpoint isn't exempted from normal upload path using Agents for consistency with the rest of the upload logic * fix: include width and height in return from uploadLocalFile so images are correctly identified when going through an AgentUpload in addImageURLs * chore: remove client specific handling since the direct provider stuff is handled by the agent client * feat: handle documents in AgentClient so no need for change to agents repo * chore: removed unused changes * chore: remove auto generated comments from OG commit * feat: add logic for agents to use direct to provider uploads if supported (currently just anthropic) * fix: reintroduce role check to fix render error because of undefined value for Content Part * fix: actually fix render bug by using proper isCreatedByUser check and making sure our mutation of formattedMessage.content is consistent --------- Co-authored-by: Andres Restrepo <andres@thelinuxkid.com> Co-authored-by: Claude <noreply@anthropic.com> 📁 feat: Send Attachments Directly to Provider (OpenAI) (#9098) * refactor: change references from direct upload to direct attach to better reflect functionality since we are just using base64 encoding strategy now rather than Files/File API for sending our attachments directly to the provider, the upload nomenclature no longer makes sense. 
direct_attach better describes the different methods of sending attachments to providers anyways even if we later introduce direct upload support * feat: add upload to provider option for openai (and agent) ui * chore: move anthropic pdf validator over to packages/api * feat: simple pdf validation according to openai docs * feat: add provider agnostic validatePdf logic to start handling multiple endpoints * feat: add handling for openai specific documentPart formatting * refactor: move require statement to proper place at top of file * chore: add in openAI endpoint for the rest of the document handling logic * feat: add direct attach support for azureOpenAI endpoint and agents * feat: add pdf validation for azureOpenAI endpoint * refactor: unify all the endpoint checks with isDocumentSupportedEndpoint * refactor: consolidate Upload to Provider vs Upload image logic for clarity * refactor: remove anthropic from anthropic_multimodal fileType since we support multiple providers now 🗂️ feat: Send Attachments Directly to Provider (Google) (#9100) * feat: add validation for google PDFs and add google endpoint as a document supporting endpoint * feat: add proper pdf formatting for google endpoints (requires PR #14 in agents) * feat: add multimodal support for google endpoint attachments * feat: add audio file svg * fix: refactor attachments logic so multi-attachment messages work properly * feat: add video file svg * fix: allows for followup questions of uploaded multimodal attachments * fix: remove incorrect final message filtering that was breaking Attachment component rendering fix: manualy rename 'documents' to 'Documents' in git since it wasn't picked up due to case insensitivity in dir name fix: add logic so filepicker for a google agent has proper filetype filtering 🛫 refactor: Move Encoding Logic to packages/api (#9182) * refactor: move audio encode over to TS * refactor: audio encoding now functional in LC again * refactor: move video encode over to TS * 
refactor: move document encode over to TS * refactor: video encoding now functional in LC again * refactor: document encoding now functional in LC again * fix: extend file type options in AttachFileMenu to include 'google_multimodal' and update dependency array to include agent?.provider * feat: only accept pdfs if responses api is enabled for openai convos chore: address ESLint comments chore: add missing audio mimetype * fix: type safety for message content parts and improve null handling * chore: reorder AttachFileMenuProps for consistency and clarity * chore: import order in AttachFileMenu * fix: improve null handling for text parts in parseTextParts function * fix: remove no longer used unsupported capability error message for file uploads * fix: OpenAI Direct File Attachment Format * fix: update encodeAndFormatDocuments to support OpenAI responses API and enhance document result types * refactor: broaden providers supported for documents * feat: enhance DragDrop context and modal to support document uploads based on provider capabilities * fix: reorder import statements for consistency in video encoding module --------- Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>	2025-10-06 17:30:16 -04:00
Danny Avila	4b5b46604c	🔍 refactor: OCR Fully Optional with Defaults for "Upload as Text" (#9856) * refactor: move `loadOCRConfig` from `packages/data-provider` to `packages/api` and return `undefined` if not explicitly configured * fix: loadOCRConfig import from @librechat/api * refactor: update defaultTextMimeTypes to support virtually all file types for text parsing * fix: improve OCR capability check and error message for unsupported file types * ci: remove unnecessary ocr expectation from AppService test	2025-09-26 11:56:11 -04:00
Danny Avila	2489670f54	📂 refactor: File Read Operations (#9747) * fix: axios response logging for text parsing, remove console logging, remove jsdoc * refactor: error logging in logAxiosError function to handle various error types with type guards * refactor: enhance text parsing with improved error handling and async file reading * refactor: replace synchronous file reading with asynchronous methods for improved performance and memory management * ci: update tests	2025-09-20 10:17:24 -04:00
Danny Avila	5bfb06b417	💻 feat: Add Proxy Config for Mistral OCR API (#9629) * 💻 feat: Add proxy configuration support for Mistral OCR API requests * refactor: Implement proxy support for Mistral API requests using HttpsProxyAgent	2025-09-14 18:50:41 -04:00
Danny Avila	e2a6937ca6	⚙️ fix: Update OCR context to use `req.config` (#9367)	2025-08-29 10:06:03 -04:00
Danny Avila	7742b18c9c	🔧 fix: Upload Audio as Text missing Param (#9356)	2025-08-28 21:07:30 -04:00
Danny Avila	48f6f8f2f8	📎 feat: Upload as Text Support for Plaintext, STT, RAG, and Token Limits (#8868 ) * 🪶 feat: Add Support for Uploading Plaintext Files feat: delineate between OCR and text handling in fileConfig field of config file - also adds support for passing in mimetypes as just plain file extensions feat: add showLabel bool to support future synthetic component DynamicDropdownInput feat: add new combination dropdown-input component in params panel to support file type token limits refactor: move hovercard to side to align with other hovercards chore: clean up autogenerated comments feat: add delineation to file upload path between text and ocr configured filetypes feat: add token limit checks during file upload refactor: move textParsing out of ocrEnabled logic refactor: clean up types for filetype config refactor: finish decoupling DynamicDropdownInput from fileTokenLimits fix: move image token cost function into file to fix circular dependency causing unittest to fail and remove unused var for linter chore: remove out of scope code following review refactor: make fileTokenLimit conform to existing styles chore: remove unused localization string chore: undo changes to DynamicInput and other strays feat: add fileTokenLimit to all provider config panels fix: move textParsing back into ocr tool_resource block for now so that it doesn't interfere with other upload types * 📤 feat: Add RAG API Endpoint Support for Text Parsing (#8849) * feat: implement RAG API integration for text parsing with fallback to native parsing * chore: remove TODO now that placeholder and fllback are implemented * ✈️ refactor: Migrate Text Parsing to TS (#8892) * refactor: move generateShortLivedToken to packages/api * refactor: move textParsing logic into packages/api * refactor: reduce nesting and dry code with createTextFile * fix: add proper source handling * fix: mock new parseText and parseTextNative functions in jest file * ci: add test coverage for textParser * 💬 feat: Add 
Audio File Support to Upload as Text (#8893) * feat: add STT support for Upload as Text * refactor: move processAudioFile to packages/api * refactor: move textParsing from utils to files * fix: remove audio/mp3 from unsupported mimetypes test since it is now supported * ✂️ feat: Configurable File Token Limits and Truncation (#8911) * feat: add configurable fileTokenLimit default value * fix: add stt to fileConfig merge logic * fix: add fileTokenLimit to mergeFileConfig logic so configurable value is actually respected from yaml * feat: add token limiting to parsed text files * fix: add extraction logic and update tests so fileTokenLimit isnt sent to LLM providers * fix: address comments * refactor: rename textTokenLimiter.ts to text.ts * chore: update form-data package to address CVE-2025-7783 and update package-lock * feat: use default supported mime types for ocr on frontend file validation * fix: should be using logger.debug not console.debug * fix: mock existsSync in text.spec.ts * fix: mock logger rather than every one of its function calls * fix: reorganize imports and streamline file upload processing logic * refactor: update createTextFile function to use destructured parameters and improve readability * chore: update file validation to use EToolResources for improved type safety * chore: update import path for types in audio processing module * fix: update file configuration access and replace console.debug with logger.debug for improved logging --------- Co-authored-by: Dustin Healy <dustinhealy1@gmail.com> Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>	2025-08-27 03:44:39 -04:00
Danny Avila	9a210971f5	🛜 refactor: Streamline App Config Usage (#9234 ) * WIP: app.locals refactoring WIP: appConfig fix: update memory configuration retrieval to use getAppConfig based on user role fix: update comment for AppConfig interface to clarify purpose 🏷️ refactor: Update tests to use getAppConfig for endpoint configurations ci: Update AppService tests to initialize app config instead of app.locals ci: Integrate getAppConfig into remaining tests refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests refactor: Rename initializeAppConfig to setAppConfig and update related tests ci: Mock getAppConfig in various tests to provide default configurations refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests chore: rename `Config/getAppConfig` -> `Config/app` fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters chore: correct parameter documentation for imageOutputType in ToolService.js refactor: remove `getCustomConfig` dependency in config route refactor: update domain validation to use appConfig for allowed domains refactor: use appConfig registration property chore: remove app parameter from AppService invocation refactor: update AppConfig interface to correct registration and turnstile configurations refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration ci: update related tests refactor: update getAppConfig call in getCustomConfigSpeech to include user role fix: update 
appConfig usage to access allowedDomains from actions instead of registration refactor: enhance AppConfig to include fileStrategies and update related file strategy logic refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions chore: remove deprecated unused RunManager refactor: get balance config primarily from appConfig refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic refactor: remove getCustomConfig usage and use app config in file citations refactor: consolidate endpoint loading logic into loadEndpoints function refactor: update appConfig access to use endpoints structure across various services refactor: implement custom endpoints configuration and streamline endpoint loading logic refactor: update getAppConfig call to include user role parameter refactor: streamline endpoint configuration and enhance appConfig usage across services refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file refactor: add type annotation for loadedEndpoints in loadEndpoints function refactor: move /services/Files/images/parse to TS API chore: add missing FILE_CITATIONS permission to IRole interface refactor: restructure toolkits to TS API refactor: separate manifest logic into its own module refactor: consolidate tool loading logic into a new tools module for startup logic refactor: move interface config logic to TS API refactor: migrate checkEmailConfig to TypeScript and update imports refactor: add FunctionTool interface and availableTools to AppConfig refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig` WIP: fix tests * fix: rebase conflicts * refactor: remove app.locals references * refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware * refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients * test: add balance 
configuration to titleConvo method in AgentClient tests * chore: remove unused `openai-chat-tokens` package * chore: remove unused imports in initializeMCPs.js * refactor: update balance configuration to use getAppConfig instead of getBalanceConfig * refactor: integrate configMiddleware for centralized configuration handling * refactor: optimize email domain validation by removing unnecessary async calls * refactor: simplify multer storage configuration by removing async calls * refactor: reorder imports for better readability in user.js * refactor: replace getAppConfig calls with req.config for improved performance * chore: replace getAppConfig calls with req.config in tests for centralized configuration handling * chore: remove unused override config * refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config * chore: remove customConfig parameter from TTSService constructor * refactor: pass appConfig from request to processFileCitations for improved configuration handling * refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config` * test: add mockAppConfig to processFileCitations tests for improved configuration handling * fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor * fix: type safety in useExportConversation * refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached * chore: change `MongoUser` typedef to `IUser` * fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest * fix: remove unused setAppConfig mock from Server configuration tests	2025-08-26 12:10:18 -04:00
Danny Avila	33834cd484	🧹 feat: Automatic File Cleanup for Mistral OCR Uploads (#8827) * chore: Handle optional token_endpoint in OAuth metadata discovery * chore: Simplify permission typing logic in checkAccess function * feat: Implement `deleteMistralFile` function and integrate file cleanup in `uploadMistralOCR`	2025-08-03 17:11:14 -04:00
Danny Avila	7e37211458	🗝️ refactor: `loadServiceKey` to Support Stringified JSON and Env Var Renaming (#8317) * feat: Enhance loadServiceKey to support stringified JSON input * chore: Update GOOGLE_SERVICE_KEY_FILE_PATH to GOOGLE_SERVICE_KEY_FILE for consistency	2025-07-08 21:07:33 -04:00
Danny Avila	59d00e99f3	🔍 feat: Fetch Google Service Key and Consolidate Key Loading Logic (#8179)	2025-07-01 22:37:29 -04:00
Danny Avila	20100e120b	🔑 feat: Set Google Service Key File Path (#8130)	2025-06-29 17:09:37 -04:00
Danny Avila	3f3cfefc52	🗒️ feat: Add Google Vertex AI Mistral OCR Strategy (#8125) * Implemented new uploadGoogleVertexMistralOCR function for processing OCR using Google Vertex AI. * Added vertexMistralOCRStrategy to handle file uploads. * Updated FileSources and OCRStrategy enums to include vertexai_mistral_ocr. * Introduced helper functions for JWT creation and Google service account configuration loading.	2025-06-28 13:26:03 -04:00
Danny Avila	d39b99971f	🧠 fix: Agent Title Config & Resource Handling (#8028 ) * 🔧 fix: enhance client options handling in AgentClient and set default recursion limit - Updated the recursion limit to default to 25 if not specified in agentsEConfig. - Enhanced client options in AgentClient to include model parameters such as apiKey and anthropicApiUrl from agentModelParams. - Updated requestOptions in the anthropic endpoint to use reverseProxyUrl as anthropicApiUrl. * Enhance LLM configuration tests with edge case handling * chore add return type annotation for getCustomEndpointConfig function * fix: update modelOptions handling to use optional chaining and default to empty object in multiple endpoint initializations * chore: update @librechat/agents to version 2.4.42 * refactor: streamline agent endpoint configuration and enhance client options handling for title generations - Introduced a new `getProviderConfig` function to centralize provider configuration logic. - Updated `AgentClient` to utilize the new provider configuration, improving clarity and maintainability. - Removed redundant code related to endpoint initialization and model parameter handling. - Enhanced error logging for missing endpoint configurations. 
* fix: add abort handling for image generation and editing in OpenAIImageTools * ci: enhance getLLMConfig tests to verify fetchOptions and dispatcher properties * fix: use optional chaining for endpointOption properties in getOptions * fix: increase title generation timeout from 25s to 45s, pass `endpointOption` to `getOptions` * fix: update file filtering logic in getToolFilesByIds to ensure text field is properly checked * fix: add error handling for empty OCR results in uploadMistralOCR and uploadAzureMistralOCR * fix: enhance error handling in file upload to include 'No OCR result' message * chore: update error messages in uploadMistralOCR and uploadAzureMistralOCR * fix: enhance filtering logic in getToolFilesByIds to include context checks for OCR resources to only include files directly attached to agent --------- Co-authored-by: Matt Burnett <matt.burnett@shopify.com>	2025-06-23 19:44:24 -04:00
Danny Avila	0103b4b08a	🧹 chore: Cleanup base64 Handling for Azure Mistral OCR (#7892) * 🧹 chore: Remove Comments and Cleanup base64 handling for Azure Mistral OCR * chore: Remove unnecessary await from MCP instructions formatting in AgentClient * ci: Update document_url regex in MistralOCR tests to support PDF format	2025-06-13 18:17:25 -04:00
Danny Avila	5f2d1c5dc9	👁️ feat: Azure Mistral OCR Strategy (#7888 ) * 👁️ feat: Add Azure Mistral OCR strategy and endpoint integration This commit introduces a new OCR strategy named 'azure_mistral_ocr', allowing the use of a Mistral OCR endpoint deployed on Azure. The configuration, schemas, and file upload strategies have been updated to support this integration, enabling seamless OCR processing via Azure-hosted Mistral services. * 🗑️ chore: Clean up .gitignore by removing commented-out uncommon directory name * chore: remove unused vars * refactor: Move createAxiosInstance to packages/api/utils and update imports - Removed the createAxiosInstance function from the config module and relocated it to a new utils module for better organization. - Updated import paths in relevant files to reflect the new location of createAxiosInstance. - Added tests for createAxiosInstance to ensure proper functionality and proxy configuration handling. * chore: move axios helpers to packages/api - Added logAxiosError function to @librechat/api for centralized error logging. - Updated imports across various files to use the new logAxiosError function. - Removed the old axios.js utility file as it is no longer needed. * chore: Update Jest moduleNameMapper for improved path resolution - Added a new mapping for '~/' to resolve module paths in Jest configuration, enhancing import handling for the project. * feat: Implement Mistral OCR API integration in TS * chore: Update MistralOCR tests based on new imports * fix: Enhance MistralOCR configuration handling and tests - Introduced helper functions for resolving configuration values from environment variables or hardcoded settings. - Updated the uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration resolution logic. - Improved test cases to ensure correct behavior when mixing environment variables and hardcoded values. 
- Mocked file upload and signed URL responses in tests to validate functionality without external dependencies. * feat: Enhance MistralOCR functionality with improved configuration and error handling - Introduced helper functions for loading authentication configuration and resolving values from environment variables. - Updated uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration logic. - Added utility functions for processing OCR results and creating error messages. - Improved document type determination and result aggregation for better OCR processing. * refactor: Reorganize OCR type imports in Mistral CRUD file - Moved OCRResult, OCRResultPage, and OCRImage imports to a more logical grouping for better readability and maintainability. * feat: Add file exports to API and create files index * chore: Update OCR types for enhanced structure and clarity - Redesigned OCRImage interface to include mandatory fields and improved naming conventions. - Added PageDimensions interface for better representation of page metrics. - Updated OCRResultPage to include dimensions and mandatory images array. - Refined OCRResult to include document annotation and usage information. * refactor: use TS counterpart of uploadOCR methods * ci: Update MistralOCR tests to reflect new OCR result structure * chore: Bump version of @librechat/api to 1.2.3 in package.json and package-lock.json * chore: Update CONFIG_VERSION to 1.2.8 * chore: remove unused sendEvent function from config module (now imported from '@librechat/api') * chore: remove MistralOCR service files and tests (now in '@librechat/api') * ci: update logger import in ModelService tests to use @librechat/data-schemas --------- Co-authored-by: arthurolivierfortin <arthurolivier.fortin@gmail.com>	2025-06-13 15:14:57 -04:00

33 commits