* feat: Added "document parser" OCR strategy
The document parser uses libraries to parse the text out of known document types.
This lets LibreChat handle some complex document types without having to use a
secondary service (like Mistral or standing up a RAG API server).
To enable the document parser, set the ocr strategy to "document_parser" in
librechat.yaml.
We now support:
- PDFs using pdfjs
- DOCX using mammoth
- XLS/XLSX using SheetJS
(The associated packages were also added to the project.)
* fix: applied Copilot code review suggestions
- Properly calculate length of text based on UTF8.
- Avoid issues with loading / blocking PDF parsing.
* fix: improved docs on parseDocument()
* chore: move to packages/api for TS support
* refactor: make document processing the default ocr strategy
- Introduced support for additional document types in the OCR strategy, including PDF, DOCX, and XLS/XLSX.
- Updated the file upload handling to dynamically select the appropriate parsing strategy based on the file type.
- Refactored the document parsing functions to use asynchronous imports for improved performance and maintainability.
* test: add unit tests for processAgentFileUpload functionality
- Introduced a new test suite for the processAgentFileUpload function in process.spec.js.
- Implemented various test cases to validate OCR strategy selection based on file types, including PDF, DOCX, XLSX, and XLS.
- Mocked dependencies to ensure isolated testing of file upload handling and strategy selection logic.
- Enhanced coverage for scenarios involving OCR capability checks and default strategy fallbacks.
* chore: update pdfjs-dist version and enhance document parsing tests
- Bumped pdfjs-dist dependency to version 5.4.624 in both api and packages/api.
- Refactored document parsing tests to use 'originalname' instead of 'filename' for file objects.
- Added a new test case for parsing XLS files to improve coverage of document types supported by the parser.
- Introduced a sample XLS file for testing purposes.
* feat: enforce text size limit and improve OCR fallback handling in processAgentFileUpload
- Added a check to ensure extracted text does not exceed the 15MB storage limit, throwing an error if it does.
- Refactored the OCR handling logic to improve fallback behavior when the configured OCR fails, ensuring a more robust document processing flow.
- Enhanced unit tests to cover scenarios for oversized text and fallback mechanisms, ensuring proper error handling and functionality.
* fix: correct OCR URL construction in performOCR function
- Updated the OCR URL construction to ensure it correctly appends '/ocr' to the base URL if not already present, improving the reliability of the OCR request.
---------
Co-authored-by: Dan Lew <daniel@mightyacorn.com>
* 🪣 fix: S3 path-style URL support for MinIO, R2, and custom endpoints
`extractKeyFromS3Url` now uses `AWS_BUCKET_NAME` to automatically detect and
strip the bucket prefix from path-style URLs, fixing `NoSuchKey` errors on URL
refresh for any S3-compatible provider using a custom endpoint (MinIO, Cloudflare
R2, Hetzner, Backblaze B2, etc.). No additional configuration required — the
bucket name is already a required env var for S3 to function.
`initializeS3` now passes `forcePathStyle: true` to the S3Client constructor
when `AWS_FORCE_PATH_STYLE=true` is set. Required for providers whose SSL
certificates do not support virtual-hosted-style bucket subdomains (e.g. Hetzner
Object Storage), which previously caused 401 / SignatureDoesNotMatch on upload.
Additional fixes:
- Suppress error log noise in `extractKeyFromS3Url` catch path: plain S3 keys
no longer log as errors, only inputs that start with http(s):// do
- Fix test env var ordering so module-level constants pick up `AWS_BUCKET_NAME`
and `S3_URL_EXPIRY_SECONDS` correctly before the module is required
- Add missing `deleteRagFile` mock and assertion in `deleteFileFromS3` tests
- Add `AWS_BUCKET_NAME` cleanup to `afterEach` to prevent cross-test pollution
- Add `initializeS3` unit tests covering endpoint, forcePathStyle, credentials,
singleton, and IRSA code paths
- Document `AWS_FORCE_PATH_STYLE` in `.env.example`, `dotenv.mdx`, and `s3.mdx`
* 🪣 fix: Enhance S3 URL key extraction for custom endpoints
Updated `extractKeyFromS3Url` to support precise key extraction when using custom endpoints with path-style URLs. The logic now accounts for the `AWS_ENDPOINT_URL` and `AWS_FORCE_PATH_STYLE` environment variables, ensuring correct key handling for various S3-compatible providers.
Added unit tests to verify the new functionality, including scenarios for endpoints with base paths. This improves compatibility and reduces potential errors when interacting with S3-like services.
* ✨ feat: Enhance S3 URL handling and add comprehensive tests for CRUD operations
* 🔒 fix: Improve S3 URL key extraction with enhanced logging and additional test cases
* chore: removed some duplicate testcases and fixed incorrect apostrophes
* fix: Log error for malformed URLs
* test: Add additional test case for extracting keys from S3 URLs
* fix: Enhance S3 URL key extraction logic and improve error handling with additional test cases
* test: Add test case for stripping bucket from custom endpoint URLs with forcePathStyle enabled
* refactor: Update S3 path style handling and enhance environment configuration for S3-compatible services
* refactor: Remove S3_FORCE_PATH_STYLE dependency and streamline S3 URL key extraction logic
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* feat: Add support for Apache Parquet MIME types
- Introduced 'application/x-parquet' to the full MIME types list and code interpreter MIME types list.
- Updated application MIME types regex to include 'x-parquet' and 'vnd.apache.parquet'.
- Added mapping for '.parquet' files to 'application/x-parquet' in code type mapping, enhancing file format support.
* feat: Implement atomic file claiming for code execution outputs
- Added a new `claimCodeFile` function to atomically claim a file_id for code execution outputs, preventing duplicates by using a compound key of filename and conversationId.
- Updated `processCodeOutput` to utilize the new claiming mechanism, ensuring that concurrent calls for the same filename converge on a single record.
- Refactored related tests to validate the new atomic claiming behavior and its impact on file usage tracking and versioning.
* fix: Update image file handling to use cache-busting filepath
- Modified the `processCodeOutput` function to generate a cache-busting filepath for updated image files, improving browser caching behavior.
- Adjusted related tests to reflect the change from versioned filenames to cache-busted filepaths, ensuring accurate validation of image updates.
* fix: Update step handler to prevent undefined content for non-tool call types
- Modified the condition in useStepHandler to ensure that undefined content is only assigned for specific content types, enhancing the robustness of content handling.
* fix: Update bedrockOutputParser to handle maxTokens for adaptive models
- Modified the bedrockOutputParser logic to ensure that maxTokens is not set for adaptive models when neither maxTokens nor maxOutputTokens are provided, improving the handling of adaptive thinking configurations.
- Updated related tests to reflect these changes, ensuring accurate validation of the output for adaptive models.
* chore: Update @librechat/agents to version 3.1.38 in package.json and package-lock.json
* fix: Enhance file claiming and error handling in code processing
- Updated the `processCodeOutput` function to use a consistent file ID for claiming files, preventing duplicates and improving concurrency handling.
- Refactored the `createFileMethods` to include error handling for failed file claims, ensuring robust behavior when claiming files for conversations.
- These changes enhance the reliability of file management in the application.
* fix: Update adaptive thinking test for Opus 4.6 model
- Modified the test for configuring adaptive thinking to reflect that no default maxTokens should be set for the Opus 4.6 model.
- Updated assertions to ensure that maxTokens is undefined, aligning with the expected behavior for adaptive models.
* refactor: process code output files for re-use (WIP)
* feat: file attachment handling with additional metadata for downloads
* refactor: Update directory path logic for local file saving based on basePath
* refactor: file attachment handling to support TFile type and improve data merging logic
* feat: thread filtering of code-generated files
- Introduced parentMessageId parameter in addedConvo and initialize functions to enhance thread management.
- Updated related methods to utilize parentMessageId for retrieving messages and filtering code-generated files by conversation threads.
- Enhanced type definitions to include parentMessageId in relevant interfaces for better clarity and usage.
* chore: imports/params ordering
* feat: update file model to use messageId for filtering and processing
- Changed references from 'message' to 'messageId' in file-related methods for consistency.
- Added messageId field to the file schema and updated related types.
- Enhanced file processing logic to accommodate the new messageId structure.
* feat: enhance file retrieval methods to support user-uploaded execute_code files
- Added a new method `getUserCodeFiles` to retrieve user-uploaded execute_code files, excluding code-generated files.
- Updated existing file retrieval methods to improve filtering logic and handle edge cases.
- Enhanced thread data extraction to collect both message IDs and file IDs efficiently.
- Integrated `getUserCodeFiles` into relevant endpoints for better file management in conversations.
* chore: update @librechat/agents package version to 3.0.78 in package-lock.json and related package.json files
* refactor: file processing and retrieval logic
- Added a fallback mechanism for download URLs when files exceed size limits or cannot be processed locally.
- Implemented a deduplication strategy for code-generated files based on conversationId and filename to optimize storage.
- Updated file retrieval methods to ensure proper filtering by messageIds, preventing orphaned files from being included.
- Introduced comprehensive tests for new thread data extraction functionality, covering edge cases and performance considerations.
* fix: improve file retrieval tests and handling of optional properties
- Updated tests to safely access optional properties using non-null assertions.
- Modified test descriptions for clarity regarding the exclusion of execute_code files.
- Ensured that the retrieval logic correctly reflects the expected outcomes for file queries.
* test: add comprehensive unit tests for processCodeOutput functionality
- Introduced a new test suite for the processCodeOutput function, covering various scenarios including file retrieval, creation, and processing for both image and non-image files.
- Implemented mocks for dependencies such as axios, logger, and file models to isolate tests and ensure reliable outcomes.
- Validated behavior for existing files, new file creation, and error handling, including size limits and fallback mechanisms.
- Enhanced test coverage for metadata handling and usage increment logic, ensuring robust verification of file processing outcomes.
* test: enhance file size limit enforcement in processCodeOutput tests
- Introduced a configurable file size limit for tests to improve flexibility and coverage.
- Mocked the `librechat-data-provider` to allow dynamic adjustment of file size limits during tests.
- Updated the file size limit enforcement test to validate behavior when files exceed specified limits, ensuring proper fallback to download URLs.
- Reset file size limit after tests to maintain isolation for subsequent test cases.
* refactor: move endpoint initialization methods to typescript
* refactor: move agent init to packages/api
- Introduced `initialize.ts` for agent initialization, including file processing and tool loading.
- Updated `resources.ts` to allow optional appConfig parameter.
- Enhanced endpoint configuration handling in various initialization files to support model parameters.
- Added new artifacts and prompts for React component generation.
- Refactored existing code to improve type safety and maintainability.
* refactor: streamline endpoint initialization and enhance type safety
- Updated initialization functions across various endpoints to use a consistent request structure, replacing `unknown` types with `ServerResponse`.
- Simplified request handling by directly extracting keys from the request body.
- Improved type safety by ensuring user IDs are safely accessed with optional chaining.
- Removed unnecessary parameters and streamlined model options handling for better clarity and maintainability.
* refactor: moved ModelService and extractBaseURL to packages/api
- Added comprehensive tests for the models fetching functionality, covering scenarios for OpenAI, Anthropic, Google, and Ollama models.
- Updated existing endpoint index to include the new models module.
- Enhanced utility functions for URL extraction and model data processing.
- Improved type safety and error handling across the models fetching logic.
* refactor: consolidate utility functions and remove unused files
- Merged `deriveBaseURL` and `extractBaseURL` into the `@librechat/api` module for better organization.
- Removed redundant utility files and their associated tests to streamline the codebase.
- Updated imports across various client files to utilize the new consolidated functions.
- Enhanced overall maintainability by reducing the number of utility modules.
* refactor: replace ModelService references with direct imports from @librechat/api and remove ModelService file
* refactor: move encrypt/decrypt methods and key db methods to data-schemas, use `getProviderConfig` from `@librechat/api`
* chore: remove unused 'res' from options in AgentClient
* refactor: file model imports and methods
- Updated imports in various controllers and services to use the unified file model from '~/models' instead of '~/models/File'.
- Consolidated file-related methods into a new file methods module in the data-schemas package.
- Added comprehensive tests for file methods including creation, retrieval, updating, and deletion.
- Enhanced the initializeAgent function to accept dependency injection for file-related methods.
- Improved error handling and logging in file methods.
* refactor: streamline database method references in agent initialization
* refactor: enhance file method tests and update type references to IMongoFile
* refactor: consolidate database method imports in agent client and initialization
* chore: remove redundant import of initializeAgent from @librechat/api
* refactor: move checkUserKeyExpiry utility to @librechat/api and update references across endpoints
* refactor: move updateUserPlugins logic to user.ts and simplify UserController
* refactor: update imports for user key management and remove UserService
* refactor: remove unused Anthropics and Bedrock endpoint files and clean up imports
* refactor: consolidate and update encryption imports across various files to use @librechat/data-schemas
* chore: update file model mock to use unified import from '~/models'
* chore: import order
* refactor: remove migrated to TS agent.js file and its associated logic from the endpoints
* chore: add reusable function to extract imports from source code in unused-packages workflow
* chore: enhance unused-packages workflow to include @librechat/api dependencies and improve dependency extraction
* chore: improve dependency extraction in unused-packages workflow with enhanced error handling and debugging output
* chore: add detailed debugging output to unused-packages workflow for better visibility into unused dependencies and exclusion lists
* chore: refine subpath handling in unused-packages workflow to correctly process scoped and non-scoped package imports
* chore: clean up unused debug output in unused-packages workflow and reorganize type imports in initialize.ts
- Added try-catch blocks to handle errors during document deletion from the RAG API.
- Implemented logging for 404 errors indicating that the document may have already been deleted.
- Improved error logging for other deletion errors in both Firebase and Local file services.
* 🔊 fix: Validate language format for OpenAI STT model
* fix: Normalize input language model assignment in STTService
* refactor: Enhance error logging and language validation in STT and TTS services
* fix: Improve language validation in getValidatedLanguageCode function
* 🗑️ chore: Remove @microsoft/eslint-formatter-sarif from dependencies and update ESLint CI workflow
- Removed @microsoft/eslint-formatter-sarif from package.json and package-lock.json.
- Updated ESLint CI workflow to eliminate SARIF upload logic and related environment variables.
* chore: Remove ts-jest from dependencies in jest.config and package files
* chore: Update package dependencies to latest versions
- Upgraded @rollup/plugin-commonjs from 25.0.2 to 29.0.0 across multiple packages.
- Updated rimraf from 5.0.1 to 6.1.2 in packages/api, client, data-provider, and data-schemas.
- Added new dependencies: @isaacs/balanced-match and @isaacs/brace-expansion in package-lock.json.
- Updated glob from 8.1.0 to 13.0.0 and adjusted related dependencies accordingly.
* chore: remove prettier-eslint dependency from package.json
* chore: npm audit fix
* fix: correct `getBasePath` import
* fix: add basePath pattern to support login/register and image paths
* Fix linter errors
* refactor: Update import statements for getBasePath and isEnabled, and add path utility functions with tests
- Refactored imports in addImages.js and StableDiffusion.js to use getBasePath from '@librechat/api'.
- Consolidated isEnabled and getBasePath imports in validateImageRequest.js.
- Introduced new path utility functions in path.ts and corresponding unit tests in path.spec.ts to validate base path extraction logic.
* fix: Update domain server base URL in MarkdownComponents and refactor authentication redirection logic
- Changed the domain server base URL in MarkdownComponents.tsx to use the API base URL.
- Refactored the useAuthRedirect hook to utilize React Router's navigate for redirection instead of window.location, ensuring a smoother SPA experience.
- Added unit tests for the useAuthRedirect hook to verify authentication redirection behavior.
* test: Mock isEnabled in validateImages.spec.js for improved test isolation
- Updated validateImages.spec.js to mock the isEnabled function from @librechat/api, ensuring that tests can run independently of the actual implementation.
- Cleared the DOMAIN_CLIENT environment variable before tests to avoid interference with basePath resolution.
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* refactor: add image file size validation as part of payload build
* feat: implement file size and MIME type filtering in endpoint configuration
* chore: import order
* feat: add filterFilesByEndpointConfig to filter disabled file processing by provider
* chore: explicit define of endpointFileConfig for better debugging
* refactor: move `normalizeEndpointName` to data-provider as used app-wide
* chore: remove overrideEndpoint from useFileHandling
* refactor: improve endpoint file config selection
* refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling
* refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components
* test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling
* refactor: streamline agent endpoint assignment and improve file filtering logic
* feat: add error handling for disabled file uploads in endpoint configuration
* refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint
* refactor: streamline requestFiles handling in initializeAgent function
* fix: getEndpointFileConfig partial config merging scenarios
* refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types
* refactor: user-configured default file config in getEndpointFileConfig
* fix: prevent file handling when endpoint is disabled and file is dragged to chat
* refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks
* fix: prioritize endpointType based on agent.endpoint in file filtering logic
* fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting
* chore: correct startupConfig usage in ImportConversations component
* refactor: properly process configured speechToText and textToSpeech settings in getCustomConfigSpeech
* refactor: proxy configuration by utilizing HttpsProxyAgent for OpenAI Image Edits
- problem: `addImageUrls` had a side effect that was being leveraged before to populate both the `ocr` message field, now `fileContext`, and `client.options.attachments`, which would record the user's uploaded message attachments to the user message when saved to the database and returned at the end of the request lifecycle
- solution: created dedicated handling for file context, and made sure to populate `allFiles` with non-provider attachments
* 📎 feat: Direct Provider Attachment Support for Multimodal Content
* 📑 feat: Anthropic Direct Provider Upload (#9072)
* feat: implement Anthropic native PDF support with document preservation
- Add comprehensive debug logging throughout PDF processing pipeline
- Refactor attachment processing to separate image and document handling
- Create distinct addImageURLs(), addDocuments(), and processAttachments() methods
- Fix critical bugs in stream handling and parameter passing
- Add streamToBuffer utility for proper stream-to-buffer conversion
- Remove api/agents submodule from repository
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: remove out of scope formatting changes
* fix: stop duplication of file in chat on end of response stream
* chore: bring back file search and ocr options
* chore: localize upload to provider string in file menu
* refactor: change createMenuItems args to fit new pattern introduced by anthropic-native-pdf-support
* feat: add cache point for pdfs processed by anthropic endpoint since they are unlikely to change and should benefit from caching
* feat: combine Upload Image into Upload to Provider since they both perform direct upload and change provider upload icon to reflect multimodal upload
* feat: add citations support according to docs
* refactor: remove redundant 'document' check since documents are handled properly by formatMessage in the agents repo now
* refactor: change upload logic so anthropic endpoint isn't exempted from normal upload path using Agents for consistency with the rest of the upload logic
* fix: include width and height in return from uploadLocalFile so images are correctly identified when going through an AgentUpload in addImageURLs
* chore: remove client specific handling since the direct provider stuff is handled by the agent client
* feat: handle documents in AgentClient so no need for change to agents repo
* chore: removed unused changes
* chore: remove auto generated comments from OG commit
* feat: add logic for agents to use direct to provider uploads if supported (currently just anthropic)
* fix: reintroduce role check to fix render error because of undefined value for Content Part
* fix: actually fix render bug by using proper isCreatedByUser check and making sure our mutation of formattedMessage.content is consistent
---------
Co-authored-by: Andres Restrepo <andres@thelinuxkid.com>
Co-authored-by: Claude <noreply@anthropic.com>
📁 feat: Send Attachments Directly to Provider (OpenAI) (#9098)
* refactor: change references from direct upload to direct attach to better reflect functionality
since we are just using base64 encoding strategy now rather than Files/File API for sending our attachments directly to the provider, the upload nomenclature no longer makes sense. direct_attach better describes the different methods of sending attachments to providers anyways even if we later introduce direct upload support
* feat: add upload to provider option for openai (and agent) ui
* chore: move anthropic pdf validator over to packages/api
* feat: simple pdf validation according to openai docs
* feat: add provider agnostic validatePdf logic to start handling multiple endpoints
* feat: add handling for openai specific documentPart formatting
* refactor: move require statement to proper place at top of file
* chore: add in openAI endpoint for the rest of the document handling logic
* feat: add direct attach support for azureOpenAI endpoint and agents
* feat: add pdf validation for azureOpenAI endpoint
* refactor: unify all the endpoint checks with isDocumentSupportedEndpoint
* refactor: consolidate Upload to Provider vs Upload image logic for clarity
* refactor: remove anthropic from anthropic_multimodal fileType since we support multiple providers now
🗂️ feat: Send Attachments Directly to Provider (Google) (#9100)
* feat: add validation for google PDFs and add google endpoint as a document supporting endpoint
* feat: add proper pdf formatting for google endpoints (requires PR #14 in agents)
* feat: add multimodal support for google endpoint attachments
* feat: add audio file svg
* fix: refactor attachments logic so multi-attachment messages work properly
* feat: add video file svg
* fix: allows for followup questions of uploaded multimodal attachments
* fix: remove incorrect final message filtering that was breaking Attachment component rendering
fix: manualy rename 'documents' to 'Documents' in git since it wasn't picked up due to case insensitivity in dir name
fix: add logic so filepicker for a google agent has proper filetype filtering
🛫 refactor: Move Encoding Logic to packages/api (#9182)
* refactor: move audio encode over to TS
* refactor: audio encoding now functional in LC again
* refactor: move video encode over to TS
* refactor: move document encode over to TS
* refactor: video encoding now functional in LC again
* refactor: document encoding now functional in LC again
* fix: extend file type options in AttachFileMenu to include 'google_multimodal' and update dependency array to include agent?.provider
* feat: only accept pdfs if responses api is enabled for openai convos
chore: address ESLint comments
chore: add missing audio mimetype
* fix: type safety for message content parts and improve null handling
* chore: reorder AttachFileMenuProps for consistency and clarity
* chore: import order in AttachFileMenu
* fix: improve null handling for text parts in parseTextParts function
* fix: remove no longer used unsupported capability error message for file uploads
* fix: OpenAI Direct File Attachment Format
* fix: update encodeAndFormatDocuments to support OpenAI responses API and enhance document result types
* refactor: broaden providers supported for documents
* feat: enhance DragDrop context and modal to support document uploads based on provider capabilities
* fix: reorder import statements for consistency in video encoding module
---------
Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
* chore: linting for `loadCustomConfig`
* refactor: decouple CDN init and variable/health checks from AppService
* refactor: move AppService to packages/data-schemas
* chore: update AppConfig import path to use data-schemas
* chore: update JsonSchemaType import path to use data-schemas
* refactor: update UserController to import webSearchKeys and redefine FunctionTool typedef
* chore: remove AppService.js
* refactor: update AppConfig interface to use Partial<TCustomConfig> and make paths and fileStrategies optional
* refactor: update checkConfig function to accept Partial<TCustomConfig>
* chore: fix types
* refactor: move handleRateLimits to startup checks as is an effect
* test: remove outdated rate limit tests from AppService.spec and add new handleRateLimits tests in checks.spec
* 🧹 chore: Update logger imports to use @librechat/data-schemas across multiple files and remove unused sleep function from queue.js (#9930)
* chore: Replace local isEnabled utility with @librechat/api import across multiple files, update test files
* chore: Replace local logger import with @librechat/data-schemas logger in countTokens.js and fork.js
* chore: Update logs volume path in docker-compose.yml to correct directory
* chore: import order of isEnabled in static.js
* refactor: move `loadOCRConfig` from `packages/data-provider` to `packages/api` and return `undefined` if not explicitly configured
* fix: loadOCRConfig import from @librechat/api
* refactor: update defaultTextMimeTypes to support virtually all file types for text parsing
* fix: improve OCR capability check and error message for unsupported file types
* ci: remove unnecessary ocr expectation from AppService test
* WIP: conversion of `ocr` to `context`
* refactor: make `primeResources` backwards-compatible for `ocr` tool_resources
* refactor: Convert legacy `ocr` tool resource to `context` in agent updates
- Implemented conversion logic to replace `ocr` with `context` in both incoming updates and existing agent data.
- Merged file IDs and files from `ocr` into `context` while ensuring deduplication.
- Updated tools array to reflect the change from `ocr` to `context`.
* refactor: Enhance context file handling in agent processing
- Updated the logic for managing context files by consolidating file IDs from both `ocr` and `context` resources.
- Improved backwards compatibility by ensuring that context files are correctly populated and handled.
- Simplified the iteration over context files for better readability and maintainability.
* refactor: Enhance tool_resources handling in primeResources
- Added tests to verify the deletion behavior of tool_resources fields, ensuring original objects remain unchanged.
- Implemented logic to delete `ocr` and `context` fields after fetching and re-categorizing files.
- Preserved context field when the context capability is disabled, ensuring correct behavior in various scenarios.
* refactor: Replace `ocrEnabled` with `contextEnabled` in AgentConfig
* refactor: Adjust legacy tool handling order for improved clarity
* refactor: Implement OCR to context conversion functions and remove original conversion logic in update agent handling
* refactor: Move contextEnabled declaration to maintain consistent order in capabilities
* refactor: Update localization keys for file context to improve clarity and accuracy
* chore: Update localization key for file context information to improve clarity
* Make file search citations conditional
* refactor: improve permission handling to avoid redundant checks by including it in artifact
* chore: reorder imports for better organization and clarity
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* 🔧 fix: TTS and STT Services to use AppConfig
- Updated `getProviderSchema` and `getProvider` methods to accept an optional `appConfig` parameter, allowing for more flexible configuration retrieval.
- Improved error handling by ensuring that the app configuration is checked before accessing TTS and STT schemas.
- Refactored `processTextToSpeech` and `streamAudio` methods to utilize the new `appConfig` parameter for better clarity and maintainability.
* feat: Cumulative Transcription Support for STT External
* style: fix medium-sized styling for admin settings dialogs
* 🪶 feat: Add Support for Uploading Plaintext Files
feat: delineate between OCR and text handling in fileConfig field of config file
- also adds support for passing in mimetypes as just plain file extensions
feat: add showLabel bool to support future synthetic component DynamicDropdownInput
feat: add new combination dropdown-input component in params panel to support file type token limits
refactor: move hovercard to side to align with other hovercards
chore: clean up autogenerated comments
feat: add delineation to file upload path between text and ocr configured filetypes
feat: add token limit checks during file upload
refactor: move textParsing out of ocrEnabled logic
refactor: clean up types for filetype config
refactor: finish decoupling DynamicDropdownInput from fileTokenLimits
fix: move image token cost function into file to fix circular dependency causing unittest to fail and remove unused var for linter
chore: remove out of scope code following review
refactor: make fileTokenLimit conform to existing styles
chore: remove unused localization string
chore: undo changes to DynamicInput and other strays
feat: add fileTokenLimit to all provider config panels
fix: move textParsing back into ocr tool_resource block for now so that it doesn't interfere with other upload types
* 📤 feat: Add RAG API Endpoint Support for Text Parsing (#8849)
* feat: implement RAG API integration for text parsing with fallback to native parsing
* chore: remove TODO now that placeholder and fllback are implemented
* ✈️ refactor: Migrate Text Parsing to TS (#8892)
* refactor: move generateShortLivedToken to packages/api
* refactor: move textParsing logic into packages/api
* refactor: reduce nesting and dry code with createTextFile
* fix: add proper source handling
* fix: mock new parseText and parseTextNative functions in jest file
* ci: add test coverage for textParser
* 💬 feat: Add Audio File Support to Upload as Text (#8893)
* feat: add STT support for Upload as Text
* refactor: move processAudioFile to packages/api
* refactor: move textParsing from utils to files
* fix: remove audio/mp3 from unsupported mimetypes test since it is now supported
* ✂️ feat: Configurable File Token Limits and Truncation (#8911)
* feat: add configurable fileTokenLimit default value
* fix: add stt to fileConfig merge logic
* fix: add fileTokenLimit to mergeFileConfig logic so configurable value is actually respected from yaml
* feat: add token limiting to parsed text files
* fix: add extraction logic and update tests so fileTokenLimit isnt sent to LLM providers
* fix: address comments
* refactor: rename textTokenLimiter.ts to text.ts
* chore: update form-data package to address CVE-2025-7783 and update package-lock
* feat: use default supported mime types for ocr on frontend file validation
* fix: should be using logger.debug not console.debug
* fix: mock existsSync in text.spec.ts
* fix: mock logger rather than every one of its function calls
* fix: reorganize imports and streamline file upload processing logic
* refactor: update createTextFile function to use destructured parameters and improve readability
* chore: update file validation to use EToolResources for improved type safety
* chore: update import path for types in audio processing module
* fix: update file configuration access and replace console.debug with logger.debug for improved logging
---------
Co-authored-by: Dustin Healy <dustinhealy1@gmail.com>
Co-authored-by: Dustin Healy <54083382+dustinhealy@users.noreply.github.com>
* WIP: app.locals refactoring
WIP: appConfig
fix: update memory configuration retrieval to use getAppConfig based on user role
fix: update comment for AppConfig interface to clarify purpose
🏷️ refactor: Update tests to use getAppConfig for endpoint configurations
ci: Update AppService tests to initialize app config instead of app.locals
ci: Integrate getAppConfig into remaining tests
refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests
refactor: Rename initializeAppConfig to setAppConfig and update related tests
ci: Mock getAppConfig in various tests to provide default configurations
refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests
chore: rename `Config/getAppConfig` -> `Config/app`
fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters
chore: correct parameter documentation for imageOutputType in ToolService.js
refactor: remove `getCustomConfig` dependency in config route
refactor: update domain validation to use appConfig for allowed domains
refactor: use appConfig registration property
chore: remove app parameter from AppService invocation
refactor: update AppConfig interface to correct registration and turnstile configurations
refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services
refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files
refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type
refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration
ci: update related tests
refactor: update getAppConfig call in getCustomConfigSpeech to include user role
fix: update appConfig usage to access allowedDomains from actions instead of registration
refactor: enhance AppConfig to include fileStrategies and update related file strategy logic
refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions
chore: remove deprecated unused RunManager
refactor: get balance config primarily from appConfig
refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic
refactor: remove getCustomConfig usage and use app config in file citations
refactor: consolidate endpoint loading logic into loadEndpoints function
refactor: update appConfig access to use endpoints structure across various services
refactor: implement custom endpoints configuration and streamline endpoint loading logic
refactor: update getAppConfig call to include user role parameter
refactor: streamline endpoint configuration and enhance appConfig usage across services
refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file
refactor: add type annotation for loadedEndpoints in loadEndpoints function
refactor: move /services/Files/images/parse to TS API
chore: add missing FILE_CITATIONS permission to IRole interface
refactor: restructure toolkits to TS API
refactor: separate manifest logic into its own module
refactor: consolidate tool loading logic into a new tools module for startup logic
refactor: move interface config logic to TS API
refactor: migrate checkEmailConfig to TypeScript and update imports
refactor: add FunctionTool interface and availableTools to AppConfig
refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig`
WIP: fix tests
* fix: rebase conflicts
* refactor: remove app.locals references
* refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware
* refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients
* test: add balance configuration to titleConvo method in AgentClient tests
* chore: remove unused `openai-chat-tokens` package
* chore: remove unused imports in initializeMCPs.js
* refactor: update balance configuration to use getAppConfig instead of getBalanceConfig
* refactor: integrate configMiddleware for centralized configuration handling
* refactor: optimize email domain validation by removing unnecessary async calls
* refactor: simplify multer storage configuration by removing async calls
* refactor: reorder imports for better readability in user.js
* refactor: replace getAppConfig calls with req.config for improved performance
* chore: replace getAppConfig calls with req.config in tests for centralized configuration handling
* chore: remove unused override config
* refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config
* chore: remove customConfig parameter from TTSService constructor
* refactor: pass appConfig from request to processFileCitations for improved configuration handling
* refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config`
* test: add mockAppConfig to processFileCitations tests for improved configuration handling
* fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor
* fix: type safety in useExportConversation
* refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached
* chore: change `MongoUser` typedef to `IUser`
* fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest
* fix: remove unused setAppConfig mock from Server configuration tests
WIP: Role as Permission Principal Type
WIP: add user role check optimization to user principal check, update type comparisons
WIP: cover edge cases for string vs ObjectId handling in permission granting and checking
chore: Update people picker access middleware to use PrincipalType constants
feat: Enhance people picker access control to include roles permissions
chore: add missing default role schema values for people picker perms, cleanup typing
feat: Enhance PeoplePicker component with role-specific UI and localization updates
chore: Add missing `VIEW_ROLES` permission to role schema
- Replaced string literals for principal models ('User', 'Group') with the new PrincipalModel enum across various models, services, and tests to enhance type safety and consistency.
- Updated permission handling in multiple files to utilize the PrincipalModel enum, improving maintainability and reducing potential errors.
- Ensured all relevant tests reflect these changes to maintain coverage and functionality.
- Replaced string literals for principal types ('user', 'group', 'public') with the new PrincipalType enum across various models, services, and tests for improved type safety and consistency.
- Updated permission handling in multiple files to utilize the PrincipalType enum, enhancing maintainability and reducing potential errors.
- Ensured all relevant tests reflect these changes to maintain coverage and functionality.
refactor: organize Sharing/Agent components, improve type safety for resource types and access role ids, rename enums to PascalCase
refactor: organize Sharing/Agent components, improve type safety for resource types and access role ids
chore: move sharing related components to dedicated "Sharing" directory
chore: remove PublicSharingToggle component and update index exports
chore: move non-sidepanel agent components to `~/components/Agents`
chore: move AgentCategoryDisplay component with tests
chore: remove commented out code
refactor: change PERMISSION_BITS from const to enum for better type safety
refactor: reorganize imports in GenericGrantAccessDialog and update index exports for hooks
refactor: update type definitions to use ACCESS_ROLE_IDS for improved type safety
refactor: remove unused canAccessPromptResource middleware and related code
refactor: remove unused prompt access roles from createAccessRoleMethods
refactor: update resourceType in AclEntry type definition to remove unused 'prompt' value
refactor: introduce ResourceType enum and update resourceType usage across data provider files for improved type safety
refactor: update resourceType usage to ResourceType enum across sharing and permissions components for improved type safety
refactor: standardize resourceType usage to ResourceType enum across agent and prompt models, permissions controller, and middleware for enhanced type safety
refactor: update resourceType references from PROMPT_GROUP to PROMPTGROUP for consistency across models, middleware, and components
refactor: standardize access role IDs and resource type usage across agent, file, and prompt models for improved type safety and consistency
chore: add typedefs for TUpdateResourcePermissionsRequest and TUpdateResourcePermissionsResponse to enhance type definitions
chore: move SearchPicker to PeoplePicker dir
refactor: implement debouncing for query changes in SearchPicker for improved performance
chore: fix typing, import order for agent admin settings
fix: agent admin settings, prevent agent form submission
refactor: rename `ACCESS_ROLE_IDS` to `AccessRoleIds`
refactor: replace PermissionBits with PERMISSION_BITS
refactor: replace PERMISSION_BITS with PermissionBits
bugfix: Enhance Agent and AgentCategory schemas with new fields for category, support contact, and promotion status
refactored and moved agent category methods and schema to data-schema package
🔧 fix: Merge and Rebase Conflicts
- Move AgentCategory from api/models to @packages/data-schemas structure
- Add schema, types, methods, and model following codebase conventions
- Implement auto-seeding of default categories during AppService startup
- Update marketplace controller to use new data-schemas methods
- Remove old model file and standalone seed script
refactor: unify agent marketplace to single endpoint with cursor pagination
- Replace multiple marketplace routes with unified /marketplace endpoint
- Add query string controls: category, search, limit, cursor, promoted, requiredPermission
- Implement cursor-based pagination replacing page-based system
- Integrate ACL permissions for proper access control
- Fix ObjectId constructor error in Agent model
- Update React components to use unified useGetMarketplaceAgentsQuery hook
- Enhance type safety and remove deprecated useDynamicAgentQuery
- Update tests for new marketplace architecture
-Known issues:
see more button after category switching + Unit tests
feat: add icon property to ProcessedAgentCategory interface
- Add useMarketplaceAgentsInfiniteQuery and useGetAgentCategoriesQuery to client/src/data-provider/Agents/
- Replace manual pagination in AgentGrid with infinite query pattern
- Update imports to use local data provider instead of librechat-data-provider
- Add proper permission handling with PERMISSION_BITS.VIEW/EDIT constants
- Improve agent access control by adding requiredPermission validation in backend
- Remove manual cursor/state management in favor of infinite query built-ins
- Maintain existing search and category filtering functionality
refactor: consolidate agent marketplace endpoints into main agents API and improve data management consistency
- Remove dedicated marketplace controller and routes, merging functionality into main agents v1 API
- Add countPromotedAgents function to Agent model for promoted agents count
- Enhance getListAgents handler with marketplace filtering (category, search, promoted status)
- Move getAgentCategories from marketplace to v1 controller with same functionality
- Update agent mutations to invalidate marketplace queries and handle multiple permission levels
- Improve cache management by updating all agent query variants (VIEW/EDIT permissions)
- Consolidate agent data access patterns for better maintainability and consistency
- Remove duplicate marketplace route definitions and middleware
selected view only agents injected in the drop down
fix: remove minlength validation for support contact name in agent schema
feat: add validation and error messages for agent name in AgentConfig and AgentPanel
fix: update agent permission check logic in AgentPanel to simplify condition
Fix linting WIP
Fix Unit tests WIP
ESLint fixes
eslint fix
refactor: enhance isDuplicateVersion function in Agent model for improved comparison logic
- Introduced handling for undefined/null values in array and object comparisons.
- Normalized array comparisons to treat undefined/null as empty arrays.
- Added deep comparison for objects and improved handling of primitive values.
- Enhanced projectIds comparison to ensure consistent MongoDB ObjectId handling.
refactor: remove redundant properties from IAgent interface in agent schema
chore: update localization for agent detail component and clean up imports
ci: update access middleware tests
chore: remove unused PermissionTypes import from Role model
ci: update AclEntry model tests
ci: update button accessibility labels in AgentDetail tests
refactor: update exhaustive dep. lint warning
🔧 fix: Fixed agent actions access
feat: Add role-level permissions for agent sharing people picker
- Add PEOPLE_PICKER permission type with VIEW_USERS and VIEW_GROUPS permissions
- Create custom middleware for query-aware permission validation
- Implement permission-based type filtering in PeoplePicker component
- Hide people picker UI when user lacks permissions, show only public toggle
- Support granular access: users-only, groups-only, or mixed search modes
refactor: Replace marketplace interface config with permission-based system
- Add MARKETPLACE permission type to handle marketplace access control
- Update interface configuration to use role-based marketplace settings (admin/user)
- Replace direct marketplace boolean config with permission-based checks
- Modify frontend components to use marketplace permissions instead of interface config
- Update agent query hooks to use marketplace permissions for determining permission levels
- Add marketplace configuration structure similar to peoplePicker in YAML config
- Backend now sets MARKETPLACE permissions based on interface configuration
- When marketplace enabled: users get agents with EDIT permissions in dropdown lists (builder mode)
- When marketplace disabled: users get agents with VIEW permissions in dropdown lists (browse mode)
🔧 fix: Redirect to New Chat if No Marketplace Access and Required Agent Name Placeholder (#8213)
* Fix: Fix the redirect to new chat page if access to marketplace is denied
* Fixed the required agent name placeholder
---------
Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>
chore: fix tests, remove unnecessary imports
refactor: Implement permission checks for file access via agents
- Updated `hasAccessToFilesViaAgent` to utilize permission checks for VIEW and EDIT access.
- Replaced project-based access validation with permission-based checks.
- Enhanced tests to cover new permission logic and ensure proper access control for files associated with agents.
- Cleaned up imports and initialized models in test files for consistency.
refactor: Enhance test setup and cleanup for file access control
- Introduced modelsToCleanup array to track models added during tests for proper cleanup.
- Updated afterAll hooks in test files to ensure all collections are cleared and only added models are deleted.
- Improved consistency in model initialization across test files.
- Added comments for clarity on cleanup processes and test data management.
chore: Update Jest configuration and test setup for improved timeout handling
- Added a global test timeout of 30 seconds in jest.config.js.
- Configured jest.setTimeout in jestSetup.js to allow individual test overrides if needed.
- Enhanced test reliability by ensuring consistent timeout settings across all tests.
refactor: Implement file access filtering based on agent permissions
- Introduced `filterFilesByAgentAccess` function to filter files based on user access through agents.
- Updated `getFiles` and `primeFiles` functions to utilize the new filtering logic.
- Moved `hasAccessToFilesViaAgent` function from the File model to permission services, adjusting imports accordingly
- Enhanced tests to ensure proper access control and filtering behavior for files associated with agents.
fix: make support_contact field a nested object rather than a sub-document
refactor: Update support_contact field initialization in agent model
- Removed handling for empty support_contact object in createAgent function.
- Changed default value of support_contact in agent schema to undefined.
test: Add comprehensive tests for support_contact field handling and versioning
refactor: remove unused avatar upload mutation field and add informational toast for success
chore: add missing SidePanelProvider for AgentMarketplace and organize imports
fix: resolve agent selection race condition in marketplace HandleStartChat
- Set agent in localStorage before newConversation to prevent useSelectorEffects from auto-selecting previous agent
fix: resolve agent dropdown showing raw ID instead of agent info from URL
- Add proactive agent fetching when agent_id is present in URL parameters
- Inject fetched agent into agents cache so dropdowns display proper name/avatar
- Use useAgentsMap dependency to ensure proper cache initialization timing
- Prevents raw agent IDs from showing in UI when visiting shared agent links
Fix: Agents endpoint renamed to "My Agent" for less confusion with the Marketplace agents.
chore: fix ESLint issues and Test Mocks
ci: update permissions structure in loadDefaultInterface tests
- Refactored permissions for MEMORY and added new permissions for MARKETPLACE and PEOPLE_PICKER.
- Ensured consistent structure for permissions across different types.
feat: support_contact validation to allow empty email strings
* Implemented new uploadGoogleVertexMistralOCR function for processing OCR using Google Vertex AI.
* Added vertexMistralOCRStrategy to handle file uploads.
* Updated FileSources and OCRStrategy enums to include vertexai_mistral_ocr.
* Introduced helper functions for JWT creation and Google service account configuration loading.
* 👁️ feat: Add Azure Mistral OCR strategy and endpoint integration
This commit introduces a new OCR strategy named 'azure_mistral_ocr', allowing the use of a Mistral OCR endpoint deployed on Azure. The configuration, schemas, and file upload strategies have been updated to support this integration, enabling seamless OCR processing via Azure-hosted Mistral services.
* 🗑️ chore: Clean up .gitignore by removing commented-out uncommon directory name
* chore: remove unused vars
* refactor: Move createAxiosInstance to packages/api/utils and update imports
- Removed the createAxiosInstance function from the config module and relocated it to a new utils module for better organization.
- Updated import paths in relevant files to reflect the new location of createAxiosInstance.
- Added tests for createAxiosInstance to ensure proper functionality and proxy configuration handling.
* chore: move axios helpers to packages/api
- Added logAxiosError function to @librechat/api for centralized error logging.
- Updated imports across various files to use the new logAxiosError function.
- Removed the old axios.js utility file as it is no longer needed.
* chore: Update Jest moduleNameMapper for improved path resolution
- Added a new mapping for '~/' to resolve module paths in Jest configuration, enhancing import handling for the project.
* feat: Implement Mistral OCR API integration in TS
* chore: Update MistralOCR tests based on new imports
* fix: Enhance MistralOCR configuration handling and tests
- Introduced helper functions for resolving configuration values from environment variables or hardcoded settings.
- Updated the uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration resolution logic.
- Improved test cases to ensure correct behavior when mixing environment variables and hardcoded values.
- Mocked file upload and signed URL responses in tests to validate functionality without external dependencies.
* feat: Enhance MistralOCR functionality with improved configuration and error handling
- Introduced helper functions for loading authentication configuration and resolving values from environment variables.
- Updated uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration logic.
- Added utility functions for processing OCR results and creating error messages.
- Improved document type determination and result aggregation for better OCR processing.
* refactor: Reorganize OCR type imports in Mistral CRUD file
- Moved OCRResult, OCRResultPage, and OCRImage imports to a more logical grouping for better readability and maintainability.
* feat: Add file exports to API and create files index
* chore: Update OCR types for enhanced structure and clarity
- Redesigned OCRImage interface to include mandatory fields and improved naming conventions.
- Added PageDimensions interface for better representation of page metrics.
- Updated OCRResultPage to include dimensions and mandatory images array.
- Refined OCRResult to include document annotation and usage information.
* refactor: use TS counterpart of uploadOCR methods
* ci: Update MistralOCR tests to reflect new OCR result structure
* chore: Bump version of @librechat/api to 1.2.3 in package.json and package-lock.json
* chore: Update CONFIG_VERSION to 1.2.8
* chore: remove unused sendEvent function from config module (now imported from '@librechat/api')
* chore: remove MistralOCR service files and tests (now in '@librechat/api')
* ci: update logger import in ModelService tests to use @librechat/data-schemas
---------
Co-authored-by: arthurolivierfortin <arthurolivier.fortin@gmail.com>