* security: replace JSZip metadata guard with yauzl streaming decompression
The ODT decompressed-size guard was checking JSZip's private
_data.uncompressedSize fields, which are populated from the ZIP central
directory — attacker-controlled metadata. A crafted ODT with falsified
uncompressedSize values bypassed the 50MB cap entirely, allowing
content.xml decompression to exhaust Node.js heap memory (DoS).
Replace JSZip with yauzl for ODT extraction. The new extractOdtContentXml
function uses yauzl's streaming API: it lazily iterates ZIP entries,
opens a decompression stream for content.xml, and counts real bytes as
they arrive from the inflate stream. The stream is destroyed the moment
the byte count crosses ODT_MAX_DECOMPRESSED_SIZE, aborting the inflate
before the full payload is materialised in memory.
- Remove jszip from direct dependencies (still transitive via mammoth)
- Add yauzl + @types/yauzl
- Update zip-bomb test to verify streaming abort with DEFLATE payload
* fix: close file descriptor leaks and declare jszip test dependency
- Use a shared `finish()` helper in extractOdtContentXml that calls
zipfile.close() on every exit path (success, size cap, missing entry,
openReadStream errors, zipfile errors). Without this, any error path
leaked one OS file descriptor permanently — uploading many malformed
ODTs could exhaust the process FD limit (a distinct DoS vector).
- Add jszip to devDependencies so the zip-bomb test has an explicit
dependency rather than relying on mammoth's transitive jszip.
- Update JSDoc to document that all exit paths close the zipfile.
* fix: move yauzl from dependencies to peerDependencies
Matches the established pattern for runtime parser libraries in
packages/api: mammoth, pdfjs-dist, and xlsx are all peerDependencies
(provided by the consuming /api workspace) with devDependencies for
testing. yauzl was incorrectly placed in dependencies.
* fix: add yauzl to /api dependencies to satisfy peer dep
packages/api declares yauzl as a peerDependency; /api is the consuming
workspace that must provide it at runtime, matching the pattern used
for mammoth, pdfjs-dist, and xlsx.
* fix: add ODT support to native document parser
* fix: replace execSync with jszip for ODT parsing
* docs: update documentParserMimeTypes comment to include odt
* fix: improve ODT XML extraction and add empty.odt fixture
- Scope extraction to <office:body> to exclude metadata/style nodes
- Map </text:p> and </text:h> closings to newlines, preserving paragraph
structure instead of collapsing everything to a single line
- Handle <text:line-break/> as explicit newlines
- Strip remaining tags, normalize horizontal whitespace, cap consecutive
blank lines at one
- Regenerate sample.odt as a two-paragraph fixture so the test exercises
multi-paragraph output
- Add empty.odt fixture and test asserting 'No text found in document'
* fix: address review findings in ODT parser
- Use static `import JSZip from 'jszip'` instead of dynamic import;
jszip is CommonJS-only with no ESM/Jest-isolation concern (F1)
- Decode the five standard XML entities after tag-stripping so
documents with &, <, >, ", ' send correct text to the LLM (F2)
- Remove @types/jszip devDependency; jszip ships bundled declarations
and @types/jszip is a stale 2020 stub that would shadow them (F3)
- Handle <text:tab/> → \t and <text:s .../> → ' ' before the generic
tag stripper so tab-aligned and multi-space content is preserved (F4)
- Add sample-entities.odt fixture and test covering entity decoding,
tab, and spacing-element handling (F5)
- Rename 'throws for empty odt' → 'throws for odt with no extractable
text' to distinguish from a zero-byte/corrupt file case (F8)
* fix: add decompressed content size cap to odtToText (F6)
Reads uncompressed entry sizes from the JSZip internal metadata before
extracting any content. Throws if the total exceeds 50MB, preventing a
crafted ODT with a high-ratio compressed payload from exhausting heap.
Adds a corresponding test using a real DEFLATE-compressed ZIP (~51KB on
disk, 51MB uncompressed) to verify the guard fires before any extraction.
* fix: add java to codeTypeMapping for file upload support
.java files were rejected with "Unable to determine file type" because
browsers send an empty MIME type for them and codeTypeMapping had no
'java' entry for inferMimeType() to fall back on.
text/x-java was already present in all five validation lists
(fullMimeTypesList, codeInterpreterMimeTypesList, retrievalMimeTypesList,
textMimeTypes, retrievalMimeTypes), so mapping to it (not text/plain)
ensures .java uploads work for both File Search and Code Interpreter.
Closes#12307
* fix: address follow-up review findings (A-E)
A: regenerate package-lock.json after removing @types/jszip from
package.json; without this npm ci was still installing the stale
2020 type stubs and TypeScript was resolving against them
B: replace dynamic import('jszip') in the zip-bomb test with the same
static import already used in production; jszip is CJS-only with no
ESM/Jest isolation concern
C: document that the _data.uncompressedSize guard fails open if jszip
renames the private field (accepted limitation, test would catch it)
D: rename 'preserves tabs' test to 'normalizes tab and spacing elements
to spaces' since <text:tab> is collapsed to a space, not kept as \t
E: fix test.each([ formatting artifact (missing newline after '[')
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
Prevent memory exhaustion DoS by rejecting documents exceeding 15MB
before reading them into memory, closing the gap between the 512MB
upload limit and unbounded in-memory parsing.
* fix: add agent permission check to image upload route
* refactor: remove unused SystemRoles import and format test file for clarity
* fix: address review findings for image upload agent permission check
* refactor: move agent upload auth logic to TypeScript in packages/api
Extract pure authorization logic from agentPermCheck.js into
checkAgentUploadAuth() in packages/api/src/files/agentUploadAuth.ts.
The function returns a structured result ({ allowed, status, error })
instead of writing HTTP responses directly, eliminating the dual
responsibility and confusing sentinel return value. The JS wrapper
in /api is now a thin adapter that translates the result to HTTP.
* test: rewrite image upload permission tests as integration tests
Replace mock-heavy images-agent-perm.spec.js with integration tests
using MongoMemoryServer, real models, and real PermissionService.
Follows the established pattern in files.agents.test.js. Moves test
to sibling location (images.agents.test.js) matching backend convention.
Adds temp file cleanup assertions on 403/404 responses and covers
message_file exemption paths (boolean true, string "true", false).
* fix: widen AgentUploadAuthDeps types to accept ObjectId from Mongoose
The injected getAgent returns Mongoose documents where _id and author
are Types.ObjectId at runtime, not string. Widen the DI interface to
accept string | Types.ObjectId for _id, author, and resourceId so the
contract accurately reflects real callers.
* chore: move agent upload auth into files/agents/ subdirectory
* refactor: delete agentPermCheck.js wrapper, move verifyAgentUploadPermission to packages/api
The /api-only dependencies (getAgent, checkPermission) are now passed
as object-field params from the route call sites. Both images.js and
files.js import verifyAgentUploadPermission from @librechat/api and
inject the deps directly, eliminating the intermediate JS wrapper.
* style: fix import type ordering in agent upload auth
* fix: prevent token TTL race in MCPTokenStorage.storeTokens
When expires_in is provided, use it directly instead of round-tripping
through Date arithmetic. The previous code computed accessTokenExpiry
as a Date, then after an async encryptV2 call, recomputed expiresIn by
subtracting Date.now(). On loaded CI runners the elapsed time caused
Math.floor to truncate to 0, triggering the 1-year fallback and making
the token appear permanently valid — so refresh never fired.
* 🔧 fix: Update Excel sheet parsing to use fs.promises.readFile and correct import for xlsx
- Modified the excelSheetToText function to read the file using fs.promises.readFile instead of directly accessing the file path.
- Updated the import statement for the xlsx library to use the correct read method, ensuring proper functionality in parsing Excel sheets.
* 🔧 fix: Update document parsing methods to use buffer for file reading
- Modified the wordDocToText function to read the file as a buffer using fs.promises.readFile, ensuring compatibility with the mammoth library.
- Updated the excelSheetToText function to read the Excel file as a buffer, addressing issues with the xlsx library's handling of dynamic imports and file access.
* feat: Add tests for empty xlsx document parsing and validate xlsx imports
- Introduced a new test case to verify that the `parseDocument` function correctly handles an empty xlsx file with only a sheet name, ensuring it returns the expected document structure.
- Added a test to confirm that the `xlsx` library exports `read` and `utils` as named imports, validating the functionality of the library integration.
- Included a new empty xlsx file to support the test cases.
* ✨ feat: Add support for OpenDocument MIME types in file configuration
Updated the applicationMimeTypes regex to include support for OASIS OpenDocument formats, enhancing the file type recognition capabilities of the data provider.
* feat: document processing with OpenDocument support
Added support for OpenDocument Spreadsheet (ODS) MIME type in the file processing service and updated the document parser to handle ODS files. Included tests to verify correct parsing of ODS documents and updated file configuration to recognize OpenDocument formats.
* refactor: Enhance document processing to support additional Excel MIME types
Updated the document processing logic to utilize a regex for matching Excel MIME types, improving flexibility in handling various Excel file formats. Added tests to ensure correct parsing of new MIME types, including multiple Excel variants and OpenDocument formats. Adjusted file configuration to include these MIME types for better recognition in the file processing service.
* feat: Add support for additional OpenDocument MIME types in file processing
Enhanced the document processing service to support ODT, ODP, and ODG MIME types. Updated tests to verify correct routing through the OCR strategy for these new formats. Adjusted documentation to reflect changes in handled MIME types for improved clarity.
* feat: add aws bedrock upload to provider support
* chore: address copilot comments
* feat: add shared Bedrock document format types and MIME mapping
Bedrock Converse API accepts 9 document formats beyond PDF. Add
BedrockDocumentFormat union type, MIME-to-format mapping, and helpers
in data-provider so both client and backend can reference them.
* refactor: generalize Bedrock PDF validation to support all document types
Rename validateBedrockPdf to validateBedrockDocument with MIME-aware
logic: 4.5MB hard limit applies to all types, PDF header check only
runs for application/pdf. Adds test coverage for non-PDF documents.
* feat: support all Bedrock document formats in encoding pipeline
Widen file type gates to accept csv, doc, docx, xls, xlsx, html, txt,
md for Bedrock. Uses shared MIME-to-format map instead of hardcoded
'pdf'. Other providers' PDF-only paths remain unchanged.
* feat: expand Bedrock file upload UI to accept all document types
Add 'image_document_extended' upload type for Bedrock with accept
filters for all 9 supported formats. Update drag-and-drop validation
to use isBedrockDocumentType helper.
* fix: route Bedrock document types through provider pipeline
* feat: Added "document parser" OCR strategy
The document parser uses libraries to parse the text out of known document types.
This lets LibreChat handle some complex document types without having to use a
secondary service (like Mistral or standing up a RAG API server).
To enable the document parser, set the ocr strategy to "document_parser" in
librechat.yaml.
We now support:
- PDFs using pdfjs
- DOCX using mammoth
- XLS/XLSX using SheetJS
(The associated packages were also added to the project.)
* fix: applied Copilot code review suggestions
- Properly calculate length of text based on UTF8.
- Avoid issues with loading / blocking PDF parsing.
* fix: improved docs on parseDocument()
* chore: move to packages/api for TS support
* refactor: make document processing the default ocr strategy
- Introduced support for additional document types in the OCR strategy, including PDF, DOCX, and XLS/XLSX.
- Updated the file upload handling to dynamically select the appropriate parsing strategy based on the file type.
- Refactored the document parsing functions to use asynchronous imports for improved performance and maintainability.
* test: add unit tests for processAgentFileUpload functionality
- Introduced a new test suite for the processAgentFileUpload function in process.spec.js.
- Implemented various test cases to validate OCR strategy selection based on file types, including PDF, DOCX, XLSX, and XLS.
- Mocked dependencies to ensure isolated testing of file upload handling and strategy selection logic.
- Enhanced coverage for scenarios involving OCR capability checks and default strategy fallbacks.
* chore: update pdfjs-dist version and enhance document parsing tests
- Bumped pdfjs-dist dependency to version 5.4.624 in both api and packages/api.
- Refactored document parsing tests to use 'originalname' instead of 'filename' for file objects.
- Added a new test case for parsing XLS files to improve coverage of document types supported by the parser.
- Introduced a sample XLS file for testing purposes.
* feat: enforce text size limit and improve OCR fallback handling in processAgentFileUpload
- Added a check to ensure extracted text does not exceed the 15MB storage limit, throwing an error if it does.
- Refactored the OCR handling logic to improve fallback behavior when the configured OCR fails, ensuring a more robust document processing flow.
- Enhanced unit tests to cover scenarios for oversized text and fallback mechanisms, ensuring proper error handling and functionality.
* fix: correct OCR URL construction in performOCR function
- Updated the OCR URL construction to ensure it correctly appends '/ocr' to the base URL if not already present, improving the reliability of the OCR request.
---------
Co-authored-by: Dan Lew <daniel@mightyacorn.com>
* Added video upload support for OpenRouter
- Added VIDEO_URL content type to support video_url message format
- Implemented OpenRouter video encoding using base64 data URLs
- Extended encodeAndFormatVideos() to handle OpenRouter provider
- Updated UI to accept video uploads for OpenRouter (mp4, webm, mpeg, mov)
- Fixed case-sensitivity in provider detection for agents
- Made isDocumentSupportedProvider() and isOpenAILikeProvider() case-insensitive
Videos are now converted to data:video/mp4;base64,... format compatible
with OpenRouter's API requirements per their documentation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* refactor: change multimodal and google_multimodal to more transparent variable names of image_document and image_document_video_audio
(also google_multimodal doesn't apply as much since we are adding support for video and audio uploads for open router)
* fix: revert .toLowerCase change to isOpenAILikeProvider and isDocumentSupportedProvider which broke upload to provider detection for openAI endpoints
* wip: add audio support to openrouter
* fix: filetypes now properly parsed and sent rather than destructured mimetypes for openrouter
* refactor: Omit to Exclude for ESLint
* feat: update DragDropModal for new openrouter support
* fix: special case openrouter for lower case provider
(currently getting issues with the provider coming in as 'OpenRouter' and our enum being 'openrouter') This will probably require a larger refactor later to handle case insensitivity for all providers, but that will have to be thoroughly tested in its own isolated PR
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
* fix: increase RAG API text parsing timeout for large files
* ci: Update text.spec.ts
---------
Co-authored-by: Rosen Simov <rosen.simov@endurosat.com>
Co-authored-by: Danny Avila <danny@librechat.ai>
* fix: change google multimodal attachments to use type: 'media'
* chore: Update @librechat/agents to version 3.0.27 in package.json and package-lock.json
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* refactor: add image file size validation as part of payload build
* feat: implement file size and MIME type filtering in endpoint configuration
* chore: import order
* feat: add filterFilesByEndpointConfig to filter disabled file processing by provider
* chore: explicit define of endpointFileConfig for better debugging
* refactor: move `normalizeEndpointName` to data-provider as used app-wide
* chore: remove overrideEndpoint from useFileHandling
* refactor: improve endpoint file config selection
* refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling
* refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components
* test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling
* refactor: streamline agent endpoint assignment and improve file filtering logic
* feat: add error handling for disabled file uploads in endpoint configuration
* refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint
* refactor: streamline requestFiles handling in initializeAgent function
* fix: getEndpointFileConfig partial config merging scenarios
* refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types
* refactor: user-configured default file config in getEndpointFileConfig
* fix: prevent file handling when endpoint is disabled and file is dragged to chat
* refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks
* fix: prioritize endpointType based on agent.endpoint in file filtering logic
* fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting
* chore: correct type for ServerRequest
* chore: improve ServerRequest typing across several modules
* feat: Add PDF configured limit validation
- Introduced comprehensive tests for PDF validation across multiple providers, ensuring correct behavior for file size limits and edge cases.
- Enhanced the `validatePdf` function to accept an optional configured file size limit, allowing for stricter validation based on user configurations.
- Updated related functions to utilize the new validation logic, ensuring consistent behavior across different providers.
* chore: Update Request type to ServerRequest in audio and video encoding modules
* refactor: move `getConfiguredFileSizeLimit` utility
* feat: Add video and audio validation with configurable size limits
- Introduced `validateVideo` and `validateAudio` functions to validate media files against provider-specific size limits.
- Enhanced validation logic to consider optional configured file size limits, allowing for more flexible file handling.
- Added comprehensive tests for video and audio validation across different providers, ensuring correct behavior for various scenarios.
* refactor: Update PDF and media validation to allow higher configured limits
- Modified validation logic to accept user-configured file size limits that exceed provider defaults, ensuring correct acceptance of files within the specified range.
- Updated tests to reflect changes in validation behavior, confirming that files are accepted when within the configured limits.
- Enhanced documentation in tests to clarify expected outcomes with the new validation rules.
* chore: Add @types/node-fetch dependency to package.json and package-lock.json
- Included the @types/node-fetch package to enhance type definitions for node-fetch usage.
- Updated package-lock.json to reflect the addition of the new dependency.
* fix: Rename FileConfigInput to TFileConfig
* fix: Remove ephemeral cache control from document encoding function
* refactor: Improve document encoding types and add file context for anthropic messages api
- Added AnthropicDocumentBlock interface to define the structure for documents from the Anthropic provider.
- Updated encodeAndFormatDocuments function to utilize the new type and include optional context for filenames.
- Refactored DocumentResult to use a union type for various document formats, improving type safety and clarity.
- problem: `addImageUrls` had a side effect that was being leveraged before to populate both the `ocr` message field, now `fileContext`, and `client.options.attachments`, which would record the user's uploaded message attachments to the user message when saved to the database and returned at the end of the request lifecycle
- solution: created dedicated handling for file context, and made sure to populate `allFiles` with non-provider attachments
* 📎 feat: Direct Provider Attachment Support for Multimodal Content
* 📑 feat: Anthropic Direct Provider Upload (#9072)
* feat: implement Anthropic native PDF support with document preservation
- Add comprehensive debug logging throughout PDF processing pipeline
- Refactor attachment processing to separate image and document handling
- Create distinct addImageURLs(), addDocuments(), and processAttachments() methods
- Fix critical bugs in stream handling and parameter passing
- Add streamToBuffer utility for proper stream-to-buffer conversion
- Remove api/agents submodule from repository
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: remove out of scope formatting changes
* fix: stop duplication of file in chat on end of response stream
* chore: bring back file search and ocr options
* chore: localize upload to provider string in file menu
* refactor: change createMenuItems args to fit new pattern introduced by anthropic-native-pdf-support
* feat: add cache point for pdfs processed by anthropic endpoint since they are unlikely to change and should benefit from caching
* feat: combine Upload Image into Upload to Provider since they both perform direct upload and change provider upload icon to reflect multimodal upload
* feat: add citations support according to docs
* refactor: remove redundant 'document' check since documents are handled properly by formatMessage in the agents repo now
* refactor: change upload logic so anthropic endpoint isn't exempted from normal upload path using Agents for consistency with the rest of the upload logic
* fix: include width and height in return from uploadLocalFile so images are correctly identified when going through an AgentUpload in addImageURLs
* chore: remove client specific handling since the direct provider stuff is handled by the agent client
* feat: handle documents in AgentClient so no need for change to agents repo
* chore: removed unused changes
* chore: remove auto generated comments from OG commit
* feat: add logic for agents to use direct to provider uploads if supported (currently just anthropic)
* fix: reintroduce role check to fix render error because of undefined value for Content Part
* fix: actually fix render bug by using proper isCreatedByUser check and making sure our mutation of formattedMessage.content is consistent
---------
Co-authored-by: Andres Restrepo <andres@thelinuxkid.com>
Co-authored-by: Claude <noreply@anthropic.com>
📁 feat: Send Attachments Directly to Provider (OpenAI) (#9098)
* refactor: change references from direct upload to direct attach to better reflect functionality
since we are just using base64 encoding strategy now rather than Files/File API for sending our attachments directly to the provider, the upload nomenclature no longer makes sense. direct_attach better describes the different methods of sending attachments to providers anyways even if we later introduce direct upload support
* feat: add upload to provider option for openai (and agent) ui
* chore: move anthropic pdf validator over to packages/api
* feat: simple pdf validation according to openai docs
* feat: add provider agnostic validatePdf logic to start handling multiple endpoints
* feat: add handling for openai specific documentPart formatting
* refactor: move require statement to proper place at top of file
* chore: add in openAI endpoint for the rest of the document handling logic
* feat: add direct attach support for azureOpenAI endpoint and agents
* feat: add pdf validation for azureOpenAI endpoint
* refactor: unify all the endpoint checks with isDocumentSupportedEndpoint
* refactor: consolidate Upload to Provider vs Upload image logic for clarity
* refactor: remove anthropic from anthropic_multimodal fileType since we support multiple providers now
🗂️ feat: Send Attachments Directly to Provider (Google) (#9100)
* feat: add validation for google PDFs and add google endpoint as a document supporting endpoint
* feat: add proper pdf formatting for google endpoints (requires PR #14 in agents)
* feat: add multimodal support for google endpoint attachments
* feat: add audio file svg
* fix: refactor attachments logic so multi-attachment messages work properly
* feat: add video file svg
* fix: allows for followup questions of uploaded multimodal attachments
* fix: remove incorrect final message filtering that was breaking Attachment component rendering
fix: manualy rename 'documents' to 'Documents' in git since it wasn't picked up due to case insensitivity in dir name
fix: add logic so filepicker for a google agent has proper filetype filtering
🛫 refactor: Move Encoding Logic to packages/api (#9182)
* refactor: move audio encode over to TS
* refactor: audio encoding now functional in LC again
* refactor: move video encode over to TS
* refactor: move document encode over to TS
* refactor: video encoding now functional in LC again
* refactor: document encoding now functional in LC again
* fix: extend file type options in AttachFileMenu to include 'google_multimodal' and update dependency array to include agent?.provider
* feat: only accept pdfs if responses api is enabled for openai convos
chore: address ESLint comments
chore: add missing audio mimetype
* fix: type safety for message content parts and improve null handling
* chore: reorder AttachFileMenuProps for consistency and clarity
* chore: import order in AttachFileMenu
* fix: improve null handling for text parts in parseTextParts function
* fix: remove no longer used unsupported capability error message for file uploads
* fix: OpenAI Direct File Attachment Format
* fix: update encodeAndFormatDocuments to support OpenAI responses API and enhance document result types
* refactor: broaden providers supported for documents
* feat: enhance DragDrop context and modal to support document uploads based on provider capabilities
* fix: reorder import statements for consistency in video encoding module
---------
Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
* refactor: move `loadOCRConfig` from `packages/data-provider` to `packages/api` and return `undefined` if not explicitly configured
* fix: loadOCRConfig import from @librechat/api
* refactor: update defaultTextMimeTypes to support virtually all file types for text parsing
* fix: improve OCR capability check and error message for unsupported file types
* ci: remove unnecessary ocr expectation from AppService test
* fix: axios response logging for text parsing, remove console logging, remove jsdoc
* refactor: error logging in logAxiosError function to handle various error types with type guards
* refactor: enhance text parsing with improved error handling and async file reading
* refactor: replace synchronous file reading with asynchronous methods for improved performance and memory management
* ci: update tests
* 💻 feat: Add proxy configuration support for Mistral OCR API requests
* refactor: Implement proxy support for Mistral API requests using HttpsProxyAgent
* 🪶 feat: Add Support for Uploading Plaintext Files
feat: delineate between OCR and text handling in fileConfig field of config file
- also adds support for passing in mimetypes as just plain file extensions
feat: add showLabel bool to support future synthetic component DynamicDropdownInput
feat: add new combination dropdown-input component in params panel to support file type token limits
refactor: move hovercard to side to align with other hovercards
chore: clean up autogenerated comments
feat: add delineation to file upload path between text and ocr configured filetypes
feat: add token limit checks during file upload
refactor: move textParsing out of ocrEnabled logic
refactor: clean up types for filetype config
refactor: finish decoupling DynamicDropdownInput from fileTokenLimits
fix: move image token cost function into file to fix circular dependency causing unittest to fail and remove unused var for linter
chore: remove out of scope code following review
refactor: make fileTokenLimit conform to existing styles
chore: remove unused localization string
chore: undo changes to DynamicInput and other strays
feat: add fileTokenLimit to all provider config panels
fix: move textParsing back into ocr tool_resource block for now so that it doesn't interfere with other upload types
* 📤 feat: Add RAG API Endpoint Support for Text Parsing (#8849)
* feat: implement RAG API integration for text parsing with fallback to native parsing
* chore: remove TODO now that placeholder and fllback are implemented
* ✈️ refactor: Migrate Text Parsing to TS (#8892)
* refactor: move generateShortLivedToken to packages/api
* refactor: move textParsing logic into packages/api
* refactor: reduce nesting and dry code with createTextFile
* fix: add proper source handling
* fix: mock new parseText and parseTextNative functions in jest file
* ci: add test coverage for textParser
* 💬 feat: Add Audio File Support to Upload as Text (#8893)
* feat: add STT support for Upload as Text
* refactor: move processAudioFile to packages/api
* refactor: move textParsing from utils to files
* fix: remove audio/mp3 from unsupported mimetypes test since it is now supported
* ✂️ feat: Configurable File Token Limits and Truncation (#8911)
* feat: add configurable fileTokenLimit default value
* fix: add stt to fileConfig merge logic
* fix: add fileTokenLimit to mergeFileConfig logic so configurable value is actually respected from yaml
* feat: add token limiting to parsed text files
* fix: add extraction logic and update tests so fileTokenLimit isnt sent to LLM providers
* fix: address comments
* refactor: rename textTokenLimiter.ts to text.ts
* chore: update form-data package to address CVE-2025-7783 and update package-lock
* feat: use default supported mime types for ocr on frontend file validation
* fix: should be using logger.debug not console.debug
* fix: mock existsSync in text.spec.ts
* fix: mock logger rather than every one of its function calls
* fix: reorganize imports and streamline file upload processing logic
* refactor: update createTextFile function to use destructured parameters and improve readability
* chore: update file validation to use EToolResources for improved type safety
* chore: update import path for types in audio processing module
* fix: update file configuration access and replace console.debug with logger.debug for improved logging
---------
Co-authored-by: Dustin Healy <dustinhealy1@gmail.com>
Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
* WIP: app.locals refactoring
WIP: appConfig
fix: update memory configuration retrieval to use getAppConfig based on user role
fix: update comment for AppConfig interface to clarify purpose
🏷️ refactor: Update tests to use getAppConfig for endpoint configurations
ci: Update AppService tests to initialize app config instead of app.locals
ci: Integrate getAppConfig into remaining tests
refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests
refactor: Rename initializeAppConfig to setAppConfig and update related tests
ci: Mock getAppConfig in various tests to provide default configurations
refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests
chore: rename `Config/getAppConfig` -> `Config/app`
fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters
chore: correct parameter documentation for imageOutputType in ToolService.js
refactor: remove `getCustomConfig` dependency in config route
refactor: update domain validation to use appConfig for allowed domains
refactor: use appConfig registration property
chore: remove app parameter from AppService invocation
refactor: update AppConfig interface to correct registration and turnstile configurations
refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services
refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files
refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type
refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration
ci: update related tests
refactor: update getAppConfig call in getCustomConfigSpeech to include user role
fix: update appConfig usage to access allowedDomains from actions instead of registration
refactor: enhance AppConfig to include fileStrategies and update related file strategy logic
refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions
chore: remove deprecated unused RunManager
refactor: get balance config primarily from appConfig
refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic
refactor: remove getCustomConfig usage and use app config in file citations
refactor: consolidate endpoint loading logic into loadEndpoints function
refactor: update appConfig access to use endpoints structure across various services
refactor: implement custom endpoints configuration and streamline endpoint loading logic
refactor: update getAppConfig call to include user role parameter
refactor: streamline endpoint configuration and enhance appConfig usage across services
refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file
refactor: add type annotation for loadedEndpoints in loadEndpoints function
refactor: move /services/Files/images/parse to TS API
chore: add missing FILE_CITATIONS permission to IRole interface
refactor: restructure toolkits to TS API
refactor: separate manifest logic into its own module
refactor: consolidate tool loading logic into a new tools module for startup logic
refactor: move interface config logic to TS API
refactor: migrate checkEmailConfig to TypeScript and update imports
refactor: add FunctionTool interface and availableTools to AppConfig
refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig`
WIP: fix tests
* fix: rebase conflicts
* refactor: remove app.locals references
* refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware
* refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients
* test: add balance configuration to titleConvo method in AgentClient tests
* chore: remove unused `openai-chat-tokens` package
* chore: remove unused imports in initializeMCPs.js
* refactor: update balance configuration to use getAppConfig instead of getBalanceConfig
* refactor: integrate configMiddleware for centralized configuration handling
* refactor: optimize email domain validation by removing unnecessary async calls
* refactor: simplify multer storage configuration by removing async calls
* refactor: reorder imports for better readability in user.js
* refactor: replace getAppConfig calls with req.config for improved performance
* chore: replace getAppConfig calls with req.config in tests for centralized configuration handling
* chore: remove unused override config
* refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config
* chore: remove customConfig parameter from TTSService constructor
* refactor: pass appConfig from request to processFileCitations for improved configuration handling
* refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config`
* test: add mockAppConfig to processFileCitations tests for improved configuration handling
* fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor
* fix: type safety in useExportConversation
* refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached
* chore: change `MongoUser` typedef to `IUser`
* fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest
* fix: remove unused setAppConfig mock from Server configuration tests
* chore: Handle optional token_endpoint in OAuth metadata discovery
* chore: Simplify permission typing logic in checkAccess function
* feat: Implement `deleteMistralFile` function and integrate file cleanup in `uploadMistralOCR`
* feat: Enhance loadServiceKey to support stringified JSON input
* chore: Update GOOGLE_SERVICE_KEY_FILE_PATH to GOOGLE_SERVICE_KEY_FILE for consistency
* Implemented new uploadGoogleVertexMistralOCR function for processing OCR using Google Vertex AI.
* Added vertexMistralOCRStrategy to handle file uploads.
* Updated FileSources and OCRStrategy enums to include vertexai_mistral_ocr.
* Introduced helper functions for JWT creation and Google service account configuration loading.
* 🔧 fix: enhance client options handling in AgentClient and set default recursion limit
- Updated the recursion limit to default to 25 if not specified in agentsEConfig.
- Enhanced client options in AgentClient to include model parameters such as apiKey and anthropicApiUrl from agentModelParams.
- Updated requestOptions in the anthropic endpoint to use reverseProxyUrl as anthropicApiUrl.
* Enhance LLM configuration tests with edge case handling
* chore add return type annotation for getCustomEndpointConfig function
* fix: update modelOptions handling to use optional chaining and default to empty object in multiple endpoint initializations
* chore: update @librechat/agents to version 2.4.42
* refactor: streamline agent endpoint configuration and enhance client options handling for title generations
- Introduced a new `getProviderConfig` function to centralize provider configuration logic.
- Updated `AgentClient` to utilize the new provider configuration, improving clarity and maintainability.
- Removed redundant code related to endpoint initialization and model parameter handling.
- Enhanced error logging for missing endpoint configurations.
* fix: add abort handling for image generation and editing in OpenAIImageTools
* ci: enhance getLLMConfig tests to verify fetchOptions and dispatcher properties
* fix: use optional chaining for endpointOption properties in getOptions
* fix: increase title generation timeout from 25s to 45s, pass `endpointOption` to `getOptions`
* fix: update file filtering logic in getToolFilesByIds to ensure text field is properly checked
* fix: add error handling for empty OCR results in uploadMistralOCR and uploadAzureMistralOCR
* fix: enhance error handling in file upload to include 'No OCR result' message
* chore: update error messages in uploadMistralOCR and uploadAzureMistralOCR
* fix: enhance filtering logic in getToolFilesByIds to include context checks for OCR resources to only include files directly attached to agent
---------
Co-authored-by: Matt Burnett <matt.burnett@shopify.com>
* 🧹 chore: Remove Comments and Cleanup base64 handling for Azure Mistral OCR
* chore: Remove unnecessary await from MCP instructions formatting in AgentClient
* ci: Update document_url regex in MistralOCR tests to support PDF format
* 👁️ feat: Add Azure Mistral OCR strategy and endpoint integration
This commit introduces a new OCR strategy named 'azure_mistral_ocr', allowing the use of a Mistral OCR endpoint deployed on Azure. The configuration, schemas, and file upload strategies have been updated to support this integration, enabling seamless OCR processing via Azure-hosted Mistral services.
* 🗑️ chore: Clean up .gitignore by removing commented-out uncommon directory name
* chore: remove unused vars
* refactor: Move createAxiosInstance to packages/api/utils and update imports
- Removed the createAxiosInstance function from the config module and relocated it to a new utils module for better organization.
- Updated import paths in relevant files to reflect the new location of createAxiosInstance.
- Added tests for createAxiosInstance to ensure proper functionality and proxy configuration handling.
* chore: move axios helpers to packages/api
- Added logAxiosError function to @librechat/api for centralized error logging.
- Updated imports across various files to use the new logAxiosError function.
- Removed the old axios.js utility file as it is no longer needed.
* chore: Update Jest moduleNameMapper for improved path resolution
- Added a new mapping for '~/' to resolve module paths in Jest configuration, enhancing import handling for the project.
* feat: Implement Mistral OCR API integration in TS
* chore: Update MistralOCR tests based on new imports
* fix: Enhance MistralOCR configuration handling and tests
- Introduced helper functions for resolving configuration values from environment variables or hardcoded settings.
- Updated the uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration resolution logic.
- Improved test cases to ensure correct behavior when mixing environment variables and hardcoded values.
- Mocked file upload and signed URL responses in tests to validate functionality without external dependencies.
* feat: Enhance MistralOCR functionality with improved configuration and error handling
- Introduced helper functions for loading authentication configuration and resolving values from environment variables.
- Updated uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration logic.
- Added utility functions for processing OCR results and creating error messages.
- Improved document type determination and result aggregation for better OCR processing.
* refactor: Reorganize OCR type imports in Mistral CRUD file
- Moved OCRResult, OCRResultPage, and OCRImage imports to a more logical grouping for better readability and maintainability.
* feat: Add file exports to API and create files index
* chore: Update OCR types for enhanced structure and clarity
- Redesigned OCRImage interface to include mandatory fields and improved naming conventions.
- Added PageDimensions interface for better representation of page metrics.
- Updated OCRResultPage to include dimensions and mandatory images array.
- Refined OCRResult to include document annotation and usage information.
* refactor: use TS counterpart of uploadOCR methods
* ci: Update MistralOCR tests to reflect new OCR result structure
* chore: Bump version of @librechat/api to 1.2.3 in package.json and package-lock.json
* chore: Update CONFIG_VERSION to 1.2.8
* chore: remove unused sendEvent function from config module (now imported from '@librechat/api')
* chore: remove MistralOCR service files and tests (now in '@librechat/api')
* ci: update logger import in ModelService tests to use @librechat/data-schemas
---------
Co-authored-by: arthurolivierfortin <arthurolivier.fortin@gmail.com>