LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-04-04 06:47:19 +02:00

Author	SHA1	Message	Date
Danny Avila	b5c097e5c7	⚗️ feat: Agent Context Compaction/Summarization (#12287 ) * chore: imports/types Add summarization config and package-level summarize handler contracts Register summarize handlers across server controller paths Port cursor dual-read/dual-write summary support and UI status handling Selectively merge cursor branch files for BaseClient summary content block detection (last-summary-wins), dual-write persistence, summary block unit tests, and on_summarize_status SSE event handling with started/completed/failed branches. Co-authored-by: Cursor <cursoragent@cursor.com> refactor: type safety feat: add localization for summarization status messages refactor: optimize summary block detection in BaseClient Updated the logic for identifying existing summary content blocks to use a reverse loop for improved efficiency. Added a new test case to ensure the last summary content block is updated correctly when multiple summary blocks exist. chore: add runName to chainOptions in AgentClient refactor: streamline summarization configuration and handler integration Removed the deprecated summarizeNotConfigured function and replaced it with a more flexible createSummarizeFn. Updated the summarization handler setup across various controllers to utilize the new function, enhancing error handling and configuration resolution. Improved overall code clarity and maintainability by consolidating summarization logic. feat(summarization): add staged chunk-and-merge fallback feat(usage): track summarization usage separately from messages feat(summarization): resolve prompt from config in runtime fix(endpoints): use @librechat/api provider config loader refactor(agents): import getProviderConfig from @librechat/api chore: code order feat(app-config): auto-enable summarization when configured feat: summarization config refactor(summarization): streamline persist summary handling and enhance configuration validation Removed the deprecated createDeferredPersistSummary function and integrated a new createPersistSummary function for MongoDB persistence. Updated summarization handlers across various controllers to utilize the new persistence method. Enhanced validation for summarization configuration to ensure provider, model, and prompt are properly set, improving error handling and overall robustness. refactor(summarization): update event handling and remove legacy summarize handlers Replaced the deprecated summarization handlers with new event-driven handlers for summarization start and completion across multiple controllers. This change enhances the clarity of the summarization process and improves the integration of summarization events in the application. Additionally, removed unused summarization functions and streamlined the configuration loading process. refactor(summarization): standardize event names in handlers Updated event names in the summarization handlers to use constants from GraphEvents for consistency and clarity. This change improves maintainability and reduces the risk of errors related to string literals in event handling. feat(summarization): enhance usage tracking for summarization events Added logic to track summarization usage in multiple controllers by checking the current node type. If the node indicates a summarization task, the usage type is set accordingly. This change improves the granularity of usage data collected during summarization processes. feat(summarization): integrate SummarizationConfig into AppSummarizationConfig type Enhanced the AppSummarizationConfig type by extending it with the SummarizationConfig type from librechat-data-provider. This change improves type safety and consistency in the summarization configuration structure. test: add end-to-end tests for summarization functionality Introduced a comprehensive suite of end-to-end tests for the summarization feature, covering the full LibreChat pipeline from message creation to summarization. This includes a new setup file for environment configuration and a Jest configuration specifically for E2E tests. The tests utilize real API keys and ensure proper integration with the summarization process, enhancing overall test coverage and reliability. refactor(summarization): include initial summary in formatAgentMessages output Updated the formatAgentMessages function to return an initial summary alongside messages and index token count map. This change is reflected in multiple controllers and the corresponding tests, enhancing the summarization process by providing additional context for each agent's response. refactor: move hydrateMissingIndexTokenCounts to tokenMap utility Extracted the hydrateMissingIndexTokenCounts function from the AgentClient and related tests into a new tokenMap utility file. This change improves code organization and reusability, allowing for better management of token counting logic across the application. refactor(summarization): standardize step event handling and improve summary rendering Refactored the step event handling in the useStepHandler and related components to utilize constants for event names, enhancing consistency and maintainability. Additionally, improved the rendering logic in the Summary component to conditionally display the summary text based on its availability, providing a better user experience during the summarization process. feat(summarization): introduce baseContextTokens and reserveTokensRatio for improved context management Added baseContextTokens to the InitializedAgent type to calculate the context budget based on agentMaxContextNum and maxOutputTokensNum. Implemented reserveTokensRatio in the createRun function to allow configurable context token management. Updated related tests to validate these changes and ensure proper functionality. feat(summarization): add minReserveTokens, context pruning, and overflow recovery configurations Introduced new configuration options for summarization, including minReserveTokens, context pruning settings, and overflow recovery parameters. Updated the createRun function to accommodate these new options and added a comprehensive test suite to validate their functionality and integration within the summarization process. feat(summarization): add updatePrompt and reserveTokensRatio to summarization configuration Introduced an updatePrompt field for updating existing summaries with new messages, enhancing the flexibility of the summarization process. Additionally, added reserveTokensRatio to the configuration schema, allowing for improved management of token allocation during summarization. Updated related tests to validate these new features. feat(logging): add on_agent_log event handler for structured logging Implemented an on_agent_log event handler in both the agents' callbacks and responses to facilitate structured logging of agent activities. This enhancement allows for better tracking and debugging of agent interactions by logging messages with associated metadata. Updated the summarization process to ensure proper handling of log events. fix: remove duplicate IBalanceUpdate interface declaration perf(usage): single-pass partition of collectedUsage Replace two Array.filter() passes with a single for-of loop that partitions message vs. summarization usages in one iteration. fix(BaseClient): shallow-copy message content before mutating and preserve string content Avoid mutating the original message.content array in-place when appending a summary block. Also convert string content to a text content part instead of silently discarding it. fix(ui): fix Part.tsx indentation and useStepHandler summarize-complete handling - Fix SUMMARY else-if branch indentation in Part.tsx to match chain level - Guard ON_SUMMARIZE_COMPLETE with didFinalize flag to avoid unnecessary re-renders when no summarizing parts exist - Protect against undefined completeData.summary instead of unsafe spread fix(agents): use strict enabled check for summarization handlers Change summarizationConfig?.enabled !== false to === true so handlers are not registered when summarizationConfig is undefined. chore: fix initializeClient JSDoc and move DEFAULT_RESERVE_RATIO to module scope refactor(Summary): align collapse/expand behavior with Reasoning component - Single render path instead of separate streaming vs completed branches - Use useMessageContext for isSubmitting/isLatestMessage awareness so the "Summarizing..." label only shows during active streaming - Default to collapsed (matching Reasoning), user toggles to expand - Add proper aria attributes (aria-hidden, role, aria-controls, contentId) - Hide copy button while actively streaming feat(summarization): default to self-summarize using agent's own provider/model When no summarization config is provided (neither in librechat.yaml nor on the agent), automatically enable summarization using the agent's own provider and model. The agents package already provides default prompts, so no prompt configuration is needed. Also removes the dead resolveSummarizationLLMConfig in summarize.ts (and its spec) — run.ts buildAgentContext is the single source of truth for summarization config resolution. Removes the duplicate RuntimeSummarizationConfig local type in favor of the canonical SummarizationConfig from data-provider. chore: schema and type cleanup for summarization - Add trigger field to summarizationAgentOverrideSchema so per-agent trigger overrides in librechat.yaml are not silently stripped by Zod - Remove unused SummarizationStatus type from runs.ts - Make AppSummarizationConfig.enabled non-optional to reflect the invariant that loadSummarizationConfig always sets it refactor(responses): extract duplicated on_agent_log handler refactor(run): use agents package types for summarization config Import SummarizationConfig, ContextPruningConfig, and OverflowRecoveryConfig from @librechat/agents and use them to type-check the translation layer in buildAgentContext. This ensures the config object passed to the agent graph matches what it expects. - Use `satisfies AgentSummarizationConfig` on the config object - Cast contextPruningConfig and overflowRecoveryConfig to agents types - Properly narrow trigger fields from DeepPartial to required shape feat(config): add maxToolResultChars to base endpoint schema Add maxToolResultChars to baseEndpointSchema so it can be configured on any endpoint in librechat.yaml. Resolved during agent initialization using getProviderConfig's endpoint resolution: custom endpoint config takes precedence, then the provider-specific endpoint config, then the shared `all` config. Passed through to the agents package ToolNode, which uses it to cap tool result length before it enters the context window. When not configured, the agents package computes a sensible default from maxContextTokens. fix(summarization): forward agent model_parameters in self-summarize default When no explicit summarization config exists, the self-summarize default now forwards the agent's model_parameters as the summarization parameters. This ensures provider-specific settings (e.g. Bedrock region, credentials, endpoint host) are available when the agents package constructs the summarization LLM. fix(agents): register summarization handlers by default Change the enabled gate from === true to !== false so handlers register when no explicit summarization config exists. This aligns with the self-summarize default where summarization is always on unless explicitly disabled via enabled: false. refactor(summarization): let agents package inherit clientOptions for self-summarize Remove model_parameters forwarding from the self-summarize default. The agents package now reuses the agent's own clientOptions when the summarization provider matches the agent's provider, inheriting all provider-specific settings (region, credentials, proxy, etc.) automatically. refactor(summarization): use MessageContentComplex[] for summary content Unify summary content to always use MessageContentComplex[] arrays, matching the pattern used by on_message_delta. No more string \| array unions — content is always an array of typed blocks ({ type: 'text', text: '...' } for text, { type: 'reasoning_content', ... } for reasoning). Agents package: - SummaryContentBlock.content: MessageContentComplex[] (was string) - tokenCount now optional (not sent on deltas) - Removed reasoning field — reasoning is now a content block type - streamAndCollect normalizes all chunks to content block arrays - Delta events pass content blocks directly LibreChat: - SummaryContentPart.content: Agents.MessageContentComplex[] - Updated Part.tsx, Summary.tsx, useStepHandler.ts, BaseClient.js - Summary.tsx derives display text from content blocks via useMemo - Aggregator uses simple array spread refactor(summarization): enhance summary handling and text extraction - Updated BaseClient.js to improve summary text extraction, accommodating both legacy and new content formats. - Modified summarization logic to ensure consistent handling of summary content across different message formats. - Adjusted test cases in summarization.e2e.spec.js to utilize the new summary text extraction method. - Refined SSE useStepHandler to initialize summary content as an array. - Updated configuration schema by removing unused minReserveTokens field. - Cleaned up SummaryContentPart type by removing rangeHash property. These changes streamline the summarization process and ensure compatibility with various content structures. refactor(summarization): streamline usage tracking and logging - Removed direct checks for summarization nodes in ModelEndHandler and replaced them with a dedicated markSummarizationUsage function for better readability and maintainability. - Updated OpenAIChatCompletionController and responses handlers to utilize the new markSummarizationUsage function for setting usage types. - Enhanced logging functionality by ensuring the logger correctly handles different log levels. - Introduced a new useCopyToClipboard hook in the Summary component to encapsulate clipboard copy logic, improving code reusability and clarity. These changes improve the overall structure and efficiency of the summarization handling and logging processes. refactor(summarization): update summary content block documentation - Removed outdated comment regarding the last summary content block in BaseClient.js. - Added a new comment to clarify the purpose of the findSummaryContentBlock method, ensuring consistency in documentation. These changes enhance code clarity and maintainability by providing accurate descriptions of the summarization logic. refactor(summarization): update summary content structure in tests - Modified the summarization content structure in e2e tests to use an array format for text, aligning with recent changes in summary handling. - Updated test descriptions to clarify the behavior of context token calculations, ensuring consistency and clarity in the tests. These changes enhance the accuracy and maintainability of the summarization tests by reflecting the updated content structure. refactor(summarization): remove legacy E2E test setup and configuration - Deleted the e2e-setup.js and jest.e2e.config.js files, which contained legacy configurations for E2E tests using real API keys. - Introduced a new summarization.e2e.ts file that implements comprehensive E2E backend integration tests for the summarization process, utilizing real AI providers and tracking summaries throughout the run. These changes streamline the testing framework by consolidating E2E tests into a single, more robust file while removing outdated configurations. refactor(summarization): enhance E2E tests and error handling - Added a cleanup step to force exit after all tests to manage Redis connections. - Updated the summarization model to 'claude-haiku-4-5-20251001' for consistency across tests. - Improved error handling in the processStream function to capture and return processing errors. - Enhanced logging for cross-run tests and tight context scenarios to provide better insights into test execution. These changes improve the reliability and clarity of the E2E tests for the summarization process. refactor(summarization): enhance test coverage for maxContextTokens behavior - Updated run-summarization.test.ts to include a new test case ensuring that maxContextTokens does not exceed user-defined limits, even when calculated ratios suggest otherwise. - Modified summarization.e2e.ts to replace legacy UsageMetadata type with a more appropriate type for collectedUsage, improving type safety and clarity in the test setup. These changes improve the robustness of the summarization tests by validating context token constraints and refining type definitions. feat(summarization): add comprehensive E2E tests for summarization process - Introduced a new summarization.e2e.test.ts file that implements extensive end-to-end integration tests for the summarization pipeline, covering the full flow from LibreChat to agents. - The tests utilize real AI providers and include functionality to track summaries during and between runs. - Added necessary cleanup steps to manage Redis connections post-tests and ensure proper exit. These changes enhance the testing framework by providing robust coverage for the summarization process, ensuring reliability and performance under real-world conditions. fix(service): import logger from winston configuration - Removed the import statement for logger from '@librechat/data-schemas' and replaced it with an import from '~/config/winston'. - This change ensures that the logger is correctly sourced from the updated configuration, improving consistency in logging practices across the application. refactor(summary): simplify Summary component and enhance token display - Removed the unused `meta` prop from the `SummaryButton` component to streamline its interface. - Updated the token display logic to use a localized string for better internationalization support. - Adjusted the rendering of the `meta` information to improve its visibility within the `Summary` component. These changes enhance the clarity and usability of the Summary component while ensuring better localization practices. feat(summarization): add maxInputTokens configuration for summarization - Introduced a new `maxInputTokens` property in the summarization configuration schema to control the amount of conversation context sent to the summarizer, with a default value of 10000. - Updated the `createRun` function to utilize the new `maxInputTokens` setting, allowing for more flexible summarization based on agent context. These changes enhance the summarization capabilities by providing better control over input token limits, improving the overall summarization process. refactor(summarization): simplify maxInputTokens logic in createRun function - Updated the logic for the `maxInputTokens` property in the `createRun` function to directly use the agent's base context tokens when the resolved summarization configuration does not specify a value. - This change streamlines the configuration process and enhances clarity in how input token limits are determined for summarization. These modifications improve the maintainability of the summarization configuration by reducing complexity in the token calculation logic. feat(summary): enhance Summary component to display meta information - Updated the SummaryContent component to accept an optional `meta` prop, allowing for additional contextual information to be displayed above the main content. - Adjusted the rendering logic in the Summary component to utilize the new `meta` prop, improving the visibility of supplementary details. These changes enhance the user experience by providing more context within the Summary component, making it clearer and more informative. refactor(summarization): standardize reserveRatio configuration in summarization logic - Replaced instances of `reserveTokensRatio` with `reserveRatio` in the `createRun` function and related tests to unify the terminology across the codebase. - Updated the summarization configuration schema to reflect this change, ensuring consistency in how the reserve ratio is defined and utilized. - Removed the per-agent override logic for summarization configuration, simplifying the overall structure and enhancing clarity. These modifications improve the maintainability and readability of the summarization logic by standardizing the configuration parameters. * fix: circular dependency of `~/models` * chore: update logging scope in agent log handlers Changed log scope from `[agentus:${data.scope}]` to `[agents:${data.scope}]` in both the callbacks and responses controllers to ensure consistent logging format across the application. * feat: calibration ratio * refactor(tests): update summarizationConfig tests to reflect changes in enabled property Modified tests to check for the new `summarizationEnabled` property instead of the deprecated `enabled` field in the summarization configuration. This change ensures that the tests accurately validate the current configuration structure and behavior of the agents. * feat(tests): add markSummarizationUsage mock for improved test coverage Introduced a mock for the markSummarizationUsage function in the responses unit tests to enhance the testing of summarization usage tracking. This addition supports better validation of summarization-related functionalities and ensures comprehensive test coverage for the agents' response handling. * refactor(tests): simplify event handler setup in createResponse tests Removed redundant mock implementations for event handlers in the createResponse unit tests, streamlining the setup process. This change enhances test clarity and maintainability while ensuring that the tests continue to validate the correct behavior of usage tracking during on_chat_model_end events. * refactor(agents): move calibration ratio capture to finally block Reorganized the logic for capturing the calibration ratio in the AgentClient class to ensure it is executed in the finally block. This change guarantees that the ratio is captured even if the run is aborted, enhancing the reliability of the response message persistence. Removed redundant code and improved clarity in the handling of context metadata. * refactor(agents): streamline bulk write logic in recordCollectedUsage function Removed redundant bulk write operations and consolidated document handling in the recordCollectedUsage function. The logic now combines all documents into a single bulk write operation, improving efficiency and reducing error handling complexity. Updated logging to provide consistent error messages for bulk write failures. * refactor(agents): enhance summarization configuration resolution in createRun function Streamlined the summarization configuration logic by introducing a base configuration and allowing for overrides from agent-specific settings. This change improves clarity and maintainability, ensuring that the summarization configuration is consistently applied while retaining flexibility for customization. Updated the handling of summarization parameters to ensure proper integration with the agent's model and provider settings. * refactor(agents): remove unused tokenCountMap and streamline calibration ratio handling Eliminated the unused tokenCountMap variable from the AgentClient class to enhance code clarity. Additionally, streamlined the logic for capturing the calibration ratio by using optional chaining and a fallback value, ensuring that context metadata is consistently defined. This change improves maintainability and reduces potential confusion in the codebase. * refactor(agents): extract agent log handler for improved clarity and reusability Refactored the agent log handling logic by extracting it into a dedicated function, `agentLogHandler`, enhancing code clarity and reusability across different modules. Updated the event handlers in both the OpenAI and responses controllers to utilize the new handler, ensuring consistent logging behavior throughout the application. * test: add summarization event tests for useStepHandler Implemented a series of tests for the summarization events in the useStepHandler hook. The tests cover scenarios for ON_SUMMARIZE_START, ON_SUMMARIZE_DELTA, and ON_SUMMARIZE_COMPLETE events, ensuring proper handling of summarization logic, including message accumulation and finalization. This addition enhances test coverage and validates the correct behavior of the summarization process within the application. * refactor(config): update summarizationTriggerSchema to use enum for type validation Changed the type of the `type` field in the summarizationTriggerSchema from a string to an enum with a single value 'token_count'. This modification enhances type safety and ensures that only valid types are accepted in the configuration, improving overall clarity and maintainability of the schema. * test(usage): add bulk write tests for message and summarization usage Implemented tests for the bulk write functionality in the recordCollectedUsage function, covering scenarios for combined message and summarization usage, summarization-only usage, and message-only usage. These tests ensure correct document handling and token rollup calculations, enhancing test coverage and validating the behavior of the usage tracking logic. * refactor(Chat): enhance clipboard copy functionality and type definitions in Summary component Updated the Summary component to improve the clipboard copy functionality by handling clipboard permission errors. Refactored type definitions for SummaryProps to use a more specific type, enhancing type safety. Adjusted the SummaryButton and FloatingSummaryBar components to accept isCopied and onCopy props, promoting better separation of concerns and reusability. * chore(translations): remove unused "Expand Summary" key from English translations Deleted the "Expand Summary" key from the English translation file to streamline the localization resources and improve clarity in the user interface. This change helps maintain an organized and efficient translation structure. * refactor: adjust token counting for Claude model to account for API discrepancies Implemented a correction factor for token counting when using the Claude model, addressing discrepancies between Anthropic's API and local tokenizer results. This change ensures accurate token counts by applying a scaling factor, improving the reliability of token-related functionalities. * refactor(agents): implement token count adjustment for Claude model messages Added a method to adjust token counts for messages processed by the Claude model, applying a correction factor to align with API expectations. This enhancement improves the accuracy of token counting, ensuring reliable functionality when interacting with the Claude model. * refactor(agents): token counting for media content in messages Introduced a new method to estimate token costs for image and document blocks in messages, improving the accuracy of token counting. This enhancement ensures that media content is properly accounted for, particularly for the Claude model, by integrating additional token estimation logic for various content types. Updated the token counting function to utilize this new method, enhancing overall reliability and functionality. * chore: fix missing import * fix(agents): clamp baseContextTokens and document reserve ratio change Prevent negative baseContextTokens when maxOutputTokens exceeds the context window (misconfigured models). Document the 10%→5% default reserve ratio reduction introduced alongside summarization. * fix(agents): include media tokens in hydrated token counts Add estimateMediaTokensForMessage to createTokenCounter so the hydration path (used by hydrateMissingIndexTokenCounts) matches the precomputed path in AgentClient.getTokenCountForMessage. Without this, messages containing images or documents were systematically undercounted during hydration, risking context window overflow. Add 34 unit tests covering all block-type branches of estimateMediaTokensForMessage. * fix(agents): include summarization output tokens in usage return value The returned output_tokens from recordCollectedUsage now reflects all billed LLM calls (message + summarization). Previously, summarization completions were billed but excluded from the returned metadata, causing a discrepancy between what users were charged and what the response message reported. * fix(tests): replace process.exit with proper Redis cleanup in e2e test The summarization E2E test used process.exit(0) to work around a Redis connection opened at import time, which killed the Jest runner and bypassed teardown. Use ioredisClient.quit() and keyvRedisClient.disconnect() for graceful cleanup instead. * fix(tests): update getConvo imports in OpenAI and response tests Refactor test files to import getConvo from the main models module instead of the Conversation submodule. This change ensures consistency across tests and simplifies the import structure, enhancing maintainability. * fix(clients): improve summary text validation in BaseClient Refactor the summary extraction logic to ensure that only non-empty summary texts are considered valid. This change enhances the robustness of the message processing by utilizing a dedicated method for summary text retrieval, improving overall reliability. * fix(config): replace z.any() with explicit union in summarization schema Model parameters (temperature, top_p, etc.) are constrained to primitive types rather than the policy-violating z.any(). * refactor(agents): deduplicate CLAUDE_TOKEN_CORRECTION constant Export from the TS source in packages/api and import in the JS client, eliminating the static class property that could drift out of sync. * refactor(agents): eliminate duplicate selfProvider in buildAgentContext selfProvider and provider were derived from the same expression with different type casts. Consolidated to a single provider variable. * refactor(agents): extract shared SSE handlers and restrict log levels - buildSummarizationHandlers() factory replaces triplicated handler blocks across responses.js and openai.js - agentLogHandlerObj exported from callbacks.js for consistent reuse - agentLogHandler restricted to an allowlist of safe log levels (debug, info, warn, error) instead of accepting arbitrary strings * fix(SSE): batch summarize deltas, add exhaustiveness check, conditional error announcement - ON_SUMMARIZE_DELTA coalesces rapid-fire renders via requestAnimationFrame instead of calling setMessages per chunk - Exhaustive never-check on TStepEvent catches unhandled variants at compile time when new StepEvents are added - ON_SUMMARIZE_COMPLETE error announcement only fires when a summary part was actually present and removed * feat(agents): persist instruction overhead in contextMeta and seed across runs Extend contextMeta with instructionOverhead and toolCount so the provider-observed instruction overhead is persisted on the response message and seeded into the pruner on subsequent runs. This enables the pruner to use a calibrated budget from the first call instead of waiting for a provider observation, preventing the ratio collapse caused by local tokenizer overestimating tool schema tokens. The seeded overhead is only used when encoding and tool count match between runs, ensuring stale values from different configurations are discarded. * test(agents): enhance OpenAI test mocks for summarization handlers Updated the OpenAI test suite to include additional mock implementations for summarization handlers, including buildSummarizationHandlers, markSummarizationUsage, and agentLogHandlerObj. This improves test coverage and ensures consistent behavior during testing. * fix(agents): address review findings for summarization v2 Cancel rAF on unmount to prevent stale Recoil writes from dead component context. Clear orphaned summarizing:true parts when ON_SUMMARIZE_COMPLETE arrives without a summary payload. Add null guard and safe spread to agentLogHandler. Handle Anthropic-format base64 image/* documents in estimateMediaTokensForMessage. Use role="region" for expandable summary content. Add .describe() to contextMeta Zod fields. Extract duplicate usage loop into helper. * refactor: simplify contextMeta to calibrationRatio + encoding only Remove instructionOverhead and toolCount from cross-run persistence — instruction tokens change too frequently between runs (prompt edits, tool changes) for a persisted seed to be reliable. The intra-run calibration in the pruner still self-corrects via provider observations. contextMeta now stores only the tokenizer-bias ratio and encoding, which are stable across instruction changes. * test(SSE): enhance useStepHandler tests for ON_SUMMARIZE_COMPLETE behavior Updated the test for ON_SUMMARIZE_COMPLETE to clarify that it finalizes the existing part with summarizing set to false when the summary is undefined. Added assertions to verify the correct behavior of message updates and the state of summary parts. * refactor(BaseClient): remove handleContextStrategy and truncateToolCallOutputs functions Eliminated the handleContextStrategy method from BaseClient to streamline message handling. Also removed the truncateToolCallOutputs function from the prompts module, simplifying the codebase and improving maintainability. * refactor: add AGENT_DEBUG_LOGGING option and refactor token count handling in BaseClient Introduced AGENT_DEBUG_LOGGING to .env.example for enhanced debugging capabilities. Refactored token count handling in BaseClient by removing the handleTokenCountMap method and simplifying token count updates. Updated AgentClient to log detailed token count recalculations and adjustments, improving traceability during message processing. * chore: update dependencies in package-lock.json and package.json files Bumped versions of several dependencies, including @librechat/agents to ^3.1.62 and various AWS SDK packages to their latest versions. This ensures compatibility and incorporates the latest features and fixes. * chore: imports order * refactor: extract summarization config resolution from buildAgentContext * refactor: rename and simplify summarization configuration shaping function * refactor: replace AgentClient token counting methods with single-pass pure utility Extract getTokenCount() and getTokenCountForMessage() from AgentClient into countFormattedMessageTokens(), a pure function in packages/api that handles text, tool_call, image, and document content types in one loop. - Decompose estimateMediaTokensForMessage into block-level helpers (estimateImageDataTokens, estimateImageBlockTokens, estimateDocumentBlockTokens) shared by both estimateMediaTokensForMessage and the new single-pass function - Remove redundant per-call getEncoding() resolution (closure captures once) - Remove deprecated gpt-3.5-turbo-0301 model branching - Drop this.getTokenCount guard from BaseClient.sendMessage * refactor: streamline token counting in createTokenCounter function Simplified the createTokenCounter function by removing the media token estimation and directly calculating the token count. This change enhances clarity and performance by consolidating the token counting logic into a single pass, while maintaining compatibility with Claude's token correction. * refactor: simplify summarization configuration types Removed the AppSummarizationConfig type and directly used SummarizationConfig in the AppConfig interface. This change streamlines the type definitions and enhances consistency across the codebase. * chore: import order * fix: summarization event handling in useStepHandler - Cancel pending summarizeDeltaRaf in clearStepMaps to prevent stale frames firing after map reset or component unmount - Move announcePolite('summarize_completed') inside the didFinalize guard so screen readers only announce when finalization actually occurs - Remove dead cleanup closure returned from stepHandler useCallback body that was never invoked by any caller * fix: estimate tokens for non-PDF/non-image base64 document blocks Previously estimateDocumentBlockTokens returned 0 for unrecognized MIME types (e.g. text/plain, application/json), silently underestimating context budget. Fall back to character-based heuristic or countTokens. * refactor: return cloned usage from markSummarizationUsage Avoid mutating LangChain's internal usage_metadata object by returning a shallow clone with the usage_type tag. Update all call sites in callbacks, openai, and responses controllers to use the returned value. * refactor: consolidate debug logging loops in buildMessages Merge the two sequential O(n) debug-logging passes over orderedMessages into a single pass inside the map callback where all data is available. * refactor: narrow SummaryContentPart.content type Replace broad Agents.MessageContentComplex[] with the specific Array<{ type: ContentTypes.TEXT; text: string }> that all producers and consumers already use, improving compile-time safety. * refactor: use single output array in recordCollectedUsage Have processUsageGroup append to a shared array instead of returning separate arrays that are spread into a third, reducing allocations. * refactor: use for...in in hydrateMissingIndexTokenCounts Replace Object.entries with for...in to avoid allocating an intermediate tuple array during token map hydration.	2026-03-21 14:28:56 -04:00
Danny Avila	8ba2bde5c1	📦 refactor: Consolidate DB models, encapsulating Mongoose usage in `data-schemas` (#11830 ) * chore: move database model methods to /packages/data-schemas * chore: add TypeScript ESLint rule to warn on unused variables * refactor: model imports to streamline access - Consolidated model imports across various files to improve code organization and reduce redundancy. - Updated imports for models such as Assistant, Message, Conversation, and others to a unified import path. - Adjusted middleware and service files to reflect the new import structure, ensuring functionality remains intact. - Enhanced test files to align with the new import paths, maintaining test coverage and integrity. * chore: migrate database models to packages/data-schemas and refactor all direct Mongoose Model usage outside of data-schemas * test: update agent model mocks in unit tests - Added `getAgent` mock to `client.test.js` to enhance test coverage for agent-related functionality. - Removed redundant `getAgent` and `getAgents` mocks from `openai.spec.js` and `responses.unit.spec.js` to streamline test setup and reduce duplication. - Ensured consistency in agent mock implementations across test files. * fix: update types in data-schemas * refactor: enhance type definitions in transaction and spending methods - Updated type definitions in `checkBalance.ts` to use specific request and response types. - Refined `spendTokens.ts` to utilize a new `SpendTxData` interface for better clarity and type safety. - Improved transaction handling in `transaction.ts` by introducing `TransactionResult` and `TxData` interfaces, ensuring consistent data structures across methods. - Adjusted unit tests in `transaction.spec.ts` to accommodate new type definitions and enhance robustness. * refactor: streamline model imports and enhance code organization - Consolidated model imports across various controllers and services to a unified import path, improving code clarity and reducing redundancy. - Updated multiple files to reflect the new import structure, ensuring all functionalities remain intact. - Enhanced overall code organization by removing duplicate import statements and optimizing the usage of model methods. * feat: implement loadAddedAgent and refactor agent loading logic - Introduced `loadAddedAgent` function to handle loading agents from added conversations, supporting multi-convo parallel execution. - Created a new `load.ts` file to encapsulate agent loading functionalities, including `loadEphemeralAgent` and `loadAgent`. - Updated the `index.ts` file to export the new `load` module instead of the deprecated `loadAgent`. - Enhanced type definitions and improved error handling in the agent loading process. - Adjusted unit tests to reflect changes in the agent loading structure and ensure comprehensive coverage. * refactor: enhance balance handling with new update interface - Introduced `IBalanceUpdate` interface to streamline balance update operations across the codebase. - Updated `upsertBalanceFields` method signatures in `balance.ts`, `transaction.ts`, and related tests to utilize the new interface for improved type safety. - Adjusted type imports in `balance.spec.ts` to include `IBalanceUpdate`, ensuring consistency in balance management functionalities. - Enhanced overall code clarity and maintainability by refining type definitions related to balance operations. * feat: add unit tests for loadAgent functionality and enhance agent loading logic - Introduced comprehensive unit tests for the `loadAgent` function, covering various scenarios including null and empty agent IDs, loading of ephemeral agents, and permission checks. - Enhanced the `initializeClient` function by moving `getConvoFiles` to the correct position in the database method exports, ensuring proper functionality. - Improved test coverage for agent loading, including handling of non-existent agents and user permissions. * chore: reorder memory method exports for consistency - Moved `deleteAllUserMemories` to the correct position in the exported memory methods, ensuring a consistent and logical order of method exports in `memory.ts`.	2026-03-21 14:28:53 -04:00
Danny Avila	365a0dc0f6	🩺 refactor: Surface Descriptive OCR Error Messages to Client (#12344 ) * fix: pass along error message when OCR fails Right now, if OCR fails, it just says "Error processing file" which isn't very helpful. The `error.message` does has helpful information in it, but our filter wasn't including the right case to pass it along. Now it does! * fix: extract shared upload error filter, apply to images route The 'Unable to extract text from' error was only allowlisted in the files route but not the images route, which also calls processAgentFileUpload. Extract the duplicated error filter logic into a shared resolveUploadErrorMessage utility in packages/api so both routes stay in sync. --------- Co-authored-by: Dan Lew <daniel@mightyacorn.com>	2026-03-20 17:10:25 -04:00
mfish911	4e5ae28fa9	📡 feat: Support Unauthenticated SMTP Relays (#12322 ) * allow smtp server that does not have authentication * fix: align checkEmailConfig with optional SMTP credentials and add tests Remove EMAIL_USERNAME/EMAIL_PASSWORD requirements from the hasSMTPConfig predicate in checkEmailConfig() so the rest of the codebase (login, startup checks, invite-user) correctly recognizes unauthenticated SMTP as a valid email configuration. Add a warning when only one of the two credential env vars is set, in both sendEmail.js and checkEmailConfig(), to catch partial misconfigurations early. Add test coverage for both the transporter auth assembly in sendEmail.js and the checkEmailConfig predicate in packages/api. Document in .env.example that credentials are optional for unauthenticated SMTP relays. --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-03-20 13:07:39 -04:00
Danny Avila	594d9470d5	🪤 fix: Avoid express-rate-limit v8 ERR_ERL_KEY_GEN_IPV6 False Positive (#12333 ) * fix: avoid express-rate-limit v8 ERR_ERL_KEY_GEN_IPV6 false positive express-rate-limit v8 calls keyGenerator.toString() and throws ERR_ERL_KEY_GEN_IPV6 if the source contains the literal substring "req.ip" without "ipKeyGenerator". When packages/api compiles req?.ip to older JS targets, the output contains "req.ip", triggering the heuristic. Bracket notation (req?.['ip']) produces identical runtime behavior but never emits the literal "req.ip" substring regardless of compilation target. Closes #12321 * fix: add toString regression test and clean up redundant annotation Add a test that verifies removePorts.toString() does not contain "req.ip", guarding against reintroduction of the ERR_ERL_KEY_GEN_IPV6 false positive. Fix a misleading test description and remove a redundant type annotation on a trivially-inferred local.	2026-03-20 12:32:55 -04:00
Brad Russell	ecd6d76bc8	🚦 fix: ERR_ERL_INVALID_IP_ADDRESS and IPv6 Key Collisions in IP Rate Limiters (#12319 ) * fix: Add removePorts keyGenerator to all IP-based rate limiters Six IP-based rate limiters are missing the `keyGenerator: removePorts` option that is already used by the auth-related limiters (login, register, resetPassword, verifyEmail). Without it, reverse proxies that include ports in X-Forwarded-For headers cause ERR_ERL_INVALID_IP_ADDRESS errors from express-rate-limit. Fixes #12318 * fix: make removePorts IPv6-safe to prevent rate-limit key collisions The original regex `/:\d+[^:]$/` treated the last colon-delimited segment of bare IPv6 addresses as a port, mangling valid IPs (e.g. `::1` → `::`, `2001:db8::1` → `2001:db8::`). Distinct IPv6 clients could collapse into the same rate-limit bucket. Use `net.isIP()` as a fast path for already-valid IPs, then match bracketed IPv6+port and IPv4+port explicitly. Bare IPv6 addresses are now returned unchanged. Also fixes pre-existing property ordering inconsistency in ttsLimiters.js userLimiterOptions (keyGenerator before store). refactor: move removePorts to packages/api as TypeScript, fix import order - Move removePorts implementation to packages/api/src/utils/removePorts.ts with proper Express Request typing - Reduce api/server/utils/removePorts.js to a thin re-export from @librechat/api for backward compatibility - Consolidate removePorts import with limiterCache from @librechat/api in all 6 limiter files, fixing import order (package imports shortest to longest, local imports longest to shortest) - Remove narrating inline comments per code style guidelines --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-03-19 21:48:03 -04:00
Danny Avila	39f5f83a8a	🔌 fix: Isolate Code-Server HTTP Agents to Prevent Socket Pool Contamination (#12311 ) * 🔧 fix: Isolate HTTP agents for code-server axios requests Prevents socket hang up after 5s on Node 19+ when code executor has file attachments. follow-redirects (axios dep) leaks `socket.destroy` as a timeout listener on TCP sockets; with Node 19+ defaulting to keepAlive: true, tainted sockets re-enter the global pool and destroy active node-fetch requests in CodeExecutor after the idle timeout. Uses dedicated http/https agents with keepAlive: false for all axios calls targeting CODE_BASEURL in crud.js and process.js. Closes #12298 * ♻️ refactor: Extract code-server HTTP agents to shared module - Move duplicated agent construction from crud.js and process.js into a shared agents.js module to eliminate DRY violation - Switch process.js from raw `require('axios')` to `createAxiosInstance()` for proxy configuration parity with crud.js - Fix import ordering in process.js (agent constants no longer split imports) - Add 120s timeout to uploadCodeEnvFile (was the only code-server call without a timeout) * ✅ test: Add regression tests for code-server socket isolation - Add crud.spec.js covering getCodeOutputDownloadStream and uploadCodeEnvFile (agent options, timeout, URL, error handling) - Add socket pool isolation tests to process.spec.js asserting keepAlive:false agents are forwarded to axios - Update process.spec.js mocks for createAxiosInstance() migration * ♻️ refactor: Move code-server agents to packages/api Relocate agents.js from api/server/services/Files/Code/ to packages/api/src/utils/code.ts per workspace conventions. Consumers now import codeServerHttpAgent/codeServerHttpsAgent from @librechat/api.	2026-03-19 16:16:57 -04:00
Danny Avila	951d261f5c	🧯 fix: Prevent Env-Variable Exfil. via Placeholder Injection (#12260 ) * 🔒 fix: Resolve env vars before body placeholder expansion to prevent secret exfiltration Body placeholders ({{LIBRECHAT_BODY_}}) were substituted before extractEnvVariable ran, allowing user-controlled body fields containing ${SECRET} patterns to be expanded into real environment values in outbound headers. Reorder so env vars resolve first, preventing untrusted input from triggering env expansion. 🛡️ fix: Block sensitive infrastructure env vars from placeholder resolution Add isSensitiveEnvVar blocklist to extractEnvVariable so that internal infrastructure secrets (JWT_SECRET, JWT_REFRESH_SECRET, CREDS_KEY, CREDS_IV, MEILI_MASTER_KEY, MONGO_URI, REDIS_URI, REDIS_PASSWORD) can never be resolved via ${VAR} expansion — even if an attacker manages to inject a placeholder pattern. Uses exact-match set (not substring patterns) to avoid breaking legitimate operator config that references OAuth/API secrets in MCP and custom endpoint configurations. * 🧹 test: Rename ANOTHER_SECRET test fixture to ANOTHER_VALUE Avoid using SECRET-containing names for non-sensitive test fixtures to prevent confusion with the new isSensitiveEnvVar blocklist. * 🔒 fix: Resolve env vars before all user-controlled substitutions in processSingleValue Move extractEnvVariable to run on the raw admin-authored template BEFORE customUserVars, user fields, OIDC tokens, and body placeholders. Previously env resolution ran after customUserVars, so a user setting a custom MCP variable to "${SECRET}" could still trigger env expansion. Now env vars are resolved strictly on operator config, and all subsequent user-controlled substitutions cannot introduce ${VAR} patterns that would be expanded. Gated by !dbSourced so DB-stored servers continue to skip env resolution. Adds a security-invariant comment documenting the ordering requirement. * 🧪 test: Comprehensive security regression tests for placeholder injection - Cover all three body fields (conversationId, parentMessageId, messageId) - Add user-field injection test (user.name containing ${VAR}) - Add customUserVars injection test (MY_TOKEN = "${VAR}") - Add processMCPEnv injection tests for body and customUserVars paths - Remove redundant process.env setup/teardown already handled by beforeEach/afterEach * 🧹 chore: Add REDIS_PASSWORD to blocklist integration test; document customUserVars gate	2026-03-16 08:48:24 -04:00
Danny Avila	a01959b3d2	🛰️ fix: Cross-Replica Created Event Delivery (#12231 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details * fix: emit created event from metadata on cross-replica subscribe In multi-instance Redis deployments, the created event (which triggers sidebar conversation creation) was lost when the SSE subscriber connected to a different instance than the one generating. The event was only in the generating instance's local earlyEventBuffer and the Redis pub/sub message was already gone by the time the subscriber's channel was active. When subscribing cross-replica (empty buffer, Redis mode, userMessage already in job metadata), reconstruct and emit the created event directly from stored metadata. * test: add skipBufferReplay regression guard for cross-replica created event Add test asserting the resume path (skipBufferReplay: true) does NOT emit a created event on cross-replica subscribe — prevents the duplication fix from PR #12225 from regressing. Add explanatory JSDoc on the cross-replica fallback branch documenting which fields are preserved from trackUserMessage() and why sender/isCreatedByUser are hardcoded. * refactor: replace as-unknown-as casts with discriminated ServerSentEvent union Split ServerSentEvent into StreamEvent \| CreatedEvent \| FinalEvent so event shapes are statically typed. Removes all as-unknown-as casts in GenerationJobManager and test file; narrows with proper union members where properties are accessed. * fix: await trackUserMessage before PUBLISH for structural ordering trackUserMessage was fire-and-forget — the HSET for userMessage could theoretically race with the PUBLISH. Await it so the write commits before the pub/sub fires, guaranteeing any cross-replica getJob() after the pub/sub window always finds userMessage in Redis. No-op for non-created events (early return before any async work). * refactor: type CreatedEvent.message explicitly, fix JSDoc and import Give CreatedEvent.message its full known shape instead of Record<string, unknown>. Update sendEvent JSDoc to reflect the discriminated union. Use barrel import in test file. * refactor: type FinalEvent fields with explicit message and conversation shapes Replace Record<string, unknown> on requestMessage, responseMessage, conversation, and runMessages with FinalMessageFields and a typed conversation shape. Captures the known field set used by all final event constructors (abort handler in GenerationJobManager and normal completion in request.js) while allowing extension via index signature for fields contributed by the full TMessage/TConversation schemas. * refactor: narrow trackUserMessage with discriminated union, disambiguate error fields Use 'created' in event to narrow ServerSentEvent to CreatedEvent, eliminating all Record<string, unknown> casts and manual field assertions. Add JSDoc to the two distinct error fields on FinalMessageFields and FinalEvent to prevent confusion. * fix: update cross-replica test to expect created event from metadata The cross-replica subscribe fallback now correctly emits a created event reconstructed from persisted metadata when userMessage exists in the Redis job hash. Replica B receives 4 events (created + 3 deltas) instead of 3.	2026-03-15 11:11:10 -04:00
Danny Avila	35a35dc2e9	📏 refactor: Add File Size Limits to Conversation Imports (#12221 ) * fix: add file size limits to conversation import multer instance * fix: address review findings for conversation import file size limits * fix: use local jest.mock for data-schemas instead of global moduleNameMapper The global @librechat/data-schemas mock in jest.config.js only provided logger, breaking all tests that depend on createModels from the same package. Replace with a virtual jest.mock scoped to the import spec file. * fix: move import to top of file, pre-compute upload middleware, assert logger.warn in tests * refactor: move resolveImportMaxFileSize to packages/api New backend logic belongs in packages/api as TypeScript. Delete the api/server/utils/import/limits.js wrapper and import directly from @librechat/api in convos.js and importConversations.js. Resolver unit tests move to packages/api; the api/ spec retains only multer behavior tests. * chore: rename importLimits to import * fix: stale type reference and mock isolation in import tests Update typeof import path from '../importLimits' to '../import' after the rename. Clear mockLogger.warn in beforeEach to prevent cross-test accumulation. * fix: add resolveImportMaxFileSize to @librechat/api mock in convos.spec.js * fix: resolve jest.mock hoisting issue in import tests jest.mock factories are hoisted above const declarations, so the mockLogger reference was undefined at factory evaluation time. Use a direct import of the mocked logger module instead. * fix: remove virtual flag from data-schemas mock for CI compatibility virtual: true prevents the mock from intercepting the real module in CI where @librechat/data-schemas is built, causing import.ts to use the real logger while the test asserts against the mock.	2026-03-14 03:06:29 -04:00
Danny Avila	9a5d7eaa4e	⚡ refactor: Replace `tiktoken` with `ai-tokenizer` (#12175 ) Some checks failed Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Has been cancelled Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Has been cancelled Details * chore: Update dependencies by adding ai-tokenizer and removing tiktoken - Added ai-tokenizer version 1.0.6 to package.json and package-lock.json across multiple packages. - Removed tiktoken version 1.0.15 from package.json and package-lock.json in the same locations, streamlining dependency management. * refactor: replace js-tiktoken with ai-tokenizer - Added support for 'claude' encoding in the AgentClient class to improve model compatibility. - Updated Tokenizer class to utilize 'ai-tokenizer' for both 'o200k_base' and 'claude' encodings, replacing the previous 'tiktoken' dependency. - Refactored tests to reflect changes in tokenizer behavior and ensure accurate token counting for both encoding types. - Removed deprecated references to 'tiktoken' and adjusted related tests for improved clarity and functionality. * chore: remove tiktoken mocks from DALLE3 tests - Eliminated mock implementations of 'tiktoken' from DALLE3-related test files to streamline test setup and align with recent dependency updates. - Adjusted related test structures to ensure compatibility with the new tokenizer implementation. * chore: Add distinct encoding support for Anthropic Claude models - Introduced a new method `getEncoding` in the AgentClient class to handle the specific BPE tokenizer for Claude models, ensuring compatibility with the distinct encoding requirements. - Updated documentation to clarify the encoding logic for Claude and other models. * docs: Update return type documentation for getEncoding method in AgentClient - Clarified the return type of the getEncoding method to specify that it can return an EncodingName or undefined, enhancing code readability and type safety. * refactor: Tokenizer class and error handling - Exported the EncodingName type for broader usage. - Renamed encodingMap to encodingData for clarity. - Improved error handling in getTokenCount method to ensure recovery attempts are logged and return 0 on failure. - Updated countTokens function documentation to specify the use of 'o200k_base' encoding. * refactor: Simplify encoding documentation and export type - Updated the getEncoding method documentation to clarify the default behavior for non-Anthropic Claude models. - Exported the EncodingName type separately from the Tokenizer module for improved clarity and usage. * test: Update text processing tests for token limits - Adjusted test cases to handle smaller text sizes, changing scenarios from ~120k tokens to ~20k tokens for both the real tokenizer and countTokens functions. - Updated token limits in tests to reflect new constraints, ensuring tests accurately assess performance and call reduction. - Enhanced console log messages for clarity regarding token counts and reductions in the updated scenarios. * refactor: Update Tokenizer imports and exports - Moved Tokenizer and countTokens exports to the tokenizer module for better organization. - Adjusted imports in memory.ts to reflect the new structure, ensuring consistent usage across the codebase. - Updated memory.test.ts to mock the Tokenizer from the correct module path, enhancing test accuracy. * refactor: Tokenizer initialization and error handling - Introduced an async `initEncoding` method to preload tokenizers, improving performance and accuracy in token counting. - Updated `getTokenCount` to handle uninitialized tokenizers more gracefully, ensuring proper recovery and logging on errors. - Removed deprecated synchronous tokenizer retrieval, streamlining the overall tokenizer management process. * test: Enhance tokenizer tests with initialization and encoding checks - Added `beforeAll` hooks to initialize tokenizers for 'o200k_base' and 'claude' encodings before running tests, ensuring proper setup. - Updated tests to validate the loading of encodings and the correctness of token counts for both 'o200k_base' and 'claude'. - Improved test structure to deduplicate concurrent initialization calls, enhancing performance and reliability.	2026-03-10 23:14:52 -04:00
Danny Avila	a79f7cebd5	🤖 feat: GPT-5.4 and GPT-5.4-pro Context + Pricing (#12099 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * ✨ feat: Add support for new GPT-5.4 and GPT-5.4-pro models - Introduced new token values and cache settings for 'gpt-5.4' and 'gpt-5.4-pro' in the API model configurations. - Updated maximum output limits for the new models in the tokens utility. - Included 'gpt-5.4' and 'gpt-5.4-pro' in the shared OpenAI models list for consistent access across the application. * 🔧 update: Enhance GPT-5.4 and GPT-5.4-pro model configurations - Refined token pricing and cache settings for 'gpt-5.4' and 'gpt-5.4-pro' in the API model configurations. - Added tests for cache multipliers and maximum token limits for the new models. - Updated shared OpenAI models list to include 'gpt-5.4-thinking' and added a note for verifying pricing before release. * 🔧 update: Add clarification to token pricing for 'gpt-5.4-pro' - Added a comment to the 'gpt-5.4-pro' model configuration in tokens.ts to specify that it shares the same token window as 'gpt-5.4', enhancing clarity for future reference.	2026-03-06 02:11:01 -05:00
Danny Avila	956f8fb6f0	🏆 fix: Longest-or-Exact-Key Match in `findMatchingPattern`, Remove Deprecated Models (#12073 ) * 🔧 fix: Use longest-match in findMatchingPattern, remove deprecated PaLM2/Codey models findMatchingPattern now selects the longest matching key instead of the first reverse-order match, preventing cross-provider substring collisions (e.g., "gpt-5.2-chat-2025-12-11" incorrectly matching Google's "chat-" pattern instead of OpenAI's "gpt-5.2"). Adds early exit when key length equals model name length. Reorders aggregateModels spreads so OpenAI is last (preferred on same-length ties). Removes deprecated PaLM2/Codey entries from googleModels. * refactor: re-order models based on more likely usage * refactor: Improve key matching logic in findMatchingPattern Updated the findMatchingPattern function to enhance key matching by ensuring case-insensitive comparisons and maintaining the longest match priority. Clarified comments regarding key ordering and performance implications, emphasizing the importance of defining older models first for efficiency and the handling of same-length ties. This refactor aims to improve code clarity and maintainability. * test: Enhance findMatchingPattern tests for edge cases and performance Added new test cases to the findMatchingPattern function, covering scenarios such as empty model names, case-insensitive matching, and performance optimizations. Included checks for longest match priority and ensured deprecated PaLM2/Codey models are no longer present in token entries. This update aims to improve test coverage and validate the function's behavior under various conditions. * test: Update findMatchingPattern test to use last key for exact match validation Modified the test for findMatchingPattern to utilize the last key from the openAIMap for exact match checks, ensuring the test accurately reflects the expected behavior of the function. This change enhances the clarity and reliability of the test case.	2026-03-04 19:34:13 -05:00
Danny Avila	d3622844ad	💰 feat: Add gpt-5.3 context window and pricing (#12049 ) * 💰 feat: Add gpt-5.3 context window and pricing * 💰 feat: Add OpenAI cached input pricing and `gpt-5.2-pro` model - Add cached input pricing (write/read) for gpt-4o, gpt-4.1, gpt-5.x, o1, o3, o4-mini models with correct per-family discount tiers - Add gpt-5.2-pro pricing ($21/$168), context window, and max output - Pro models (gpt-5-pro, gpt-5.2-pro) correctly excluded from cache pricing as OpenAI does not support caching for these * 🔍 fix: Address review findings for OpenAI pricing - Add o1-preview to cacheTokenValues (50% discount, same as o1) - Fix comment to enumerate all models per discount tier - Add cache tests for dated variants (gpt-4o-2024-08-06, etc.) - Add gpt-5-mini/gpt-5-nano to 10% ratio invariant test - Replace forEach with for...of in new test code - Fix inconsistent test description phrasing - Add gpt-5.3-preview to context window tests	2026-03-03 20:44:05 -05:00
Danny Avila	d3c06052d7	🗝️ feat: Credential Variables for DB-Sourced MCP Servers (#12044 ) * feat: Allow Credential Variables in Headers for DB-sourced MCP Servers - Removed the hasCustomUserVars check from ToolService.js, directly retrieving userMCPAuthMap. - Updated MCPConnectionFactory and related classes to include a dbSourced flag for better handling of database-sourced configurations. - Added integration tests to ensure proper behavior of dbSourced servers, verifying that sensitive placeholders are not resolved while allowing customUserVars. - Adjusted various MCP-related files to accommodate the new dbSourced logic, ensuring consistent handling across the codebase. * chore: MCPConnectionFactory Tests with Additional Flow Metadata for typing - Updated MCPConnectionFactory tests to include new fields in flowMetadata: serverUrl and state. - Enhanced mockFlowData in multiple test cases to reflect the updated structure, ensuring comprehensive coverage of the OAuth flow scenarios. - Added authorization_endpoint to metadata in the test setup for improved validation of the OAuth process. * refactor: Simplify MCPManager Configuration Handling - Removed unnecessary type assertions and streamlined the retrieval of server configuration in MCPManager. - Enhanced the handling of OAuth and database-sourced flags for improved clarity and efficiency. - Updated tests to reflect changes in user object structure and ensure proper processing of MCP environment variables. * refactor: Optimize User MCP Auth Map Retrieval in ToolService - Introduced conditional loading of userMCPAuthMap based on the presence of MCP-delimited tools, improving efficiency by avoiding unnecessary calls. - Updated the loadToolDefinitionsWrapper and loadAgentTools functions to reflect this change, enhancing overall performance and clarity. * test: Add userMCPAuthMap gating tests in ToolService - Introduced new tests to validate the logic for determining if MCP tools are present in the agent's tool list. - Implemented various scenarios to ensure accurate detection of MCP tools, including edge cases for empty, undefined, and null tool lists. - Enhanced clarity and coverage of the ToolService capability checking logic. * refactor: Enhance MCP Environment Variable Processing - Simplified the handling of the dbSourced parameter in the processMCPEnv function. - Introduced a failsafe mechanism to derive dbSourced from options if not explicitly provided, improving robustness and clarity in MCP environment variable processing. * refactor: Update Regex Patterns for Credential Placeholders in ServerConfigsDB - Modified regex patterns to include additional credential/env placeholders that should not be allowed in user-provided configurations. - Clarified comments to emphasize the security risks associated with credential exfiltration when MCP servers are shared between users. * chore: field order * refactor: Clean Up dbSourced Parameter Handling in processMCPEnv - Reintroduced the failsafe mechanism for deriving the dbSourced parameter from options, ensuring clarity and robustness in MCP environment variable processing. - Enhanced code readability by maintaining consistent comment structure. * refactor: Update MCPOptions Type to Include Optional dbId - Modified the processMCPEnv function to extend the MCPOptions type, allowing for an optional dbId property. - Simplified the logic for deriving the dbSourced parameter by directly checking the dbId property, enhancing code clarity and maintainability.	2026-03-03 18:02:37 -05:00
Danny Avila	a2a09b556a	🤖 feat: `gemini-3.1-flash-lite-preview` Window & Pricing (#12043 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details * 🤖 feat: `gemini-3.1-flash-lite-preview` Window & Pricing - Updated `.env.example` to include `gemini-3.1-flash-lite-preview` in the list of available models. - Enhanced `tx.js` to define token values for `gemini-3.1-flash-lite`. - Adjusted `tokens.ts` to allocate input tokens for `gemini-3.1-flash-lite`. - Modified `config.ts` to include `gemini-3.1-flash-lite-preview` in the default models list. * chore: testing for `gemini-3.1-flash-lite` model, comments - Updated `tx.js` to include cache token values for `gemini-3.1-flash-lite` with specific write and read rates. - Enhanced `tx.spec.js` to include tests for the new `gemini-3.1-flash-lite-preview` model, ensuring correct rate retrieval for both prompt and completion token types.	2026-03-03 13:47:16 -05:00
Danny Avila	5be90706b0	✂️ fix: Unicode-Safe Title Truncation and Shared View Layout Polish (#12003 ) * fix: title sanitization with max length truncation and update ShareView for better text display - Added functionality to `sanitizeTitle` to truncate titles exceeding 200 characters with an ellipsis, ensuring consistent title length. - Updated `ShareView` component to apply a line clamp on the title, improving text display and preventing overflow in the UI. * refactor: Update layout and styling in MessagesView and ShareView components - Removed unnecessary padding in MessagesView to streamline the layout. - Increased bottom padding in the message container for better spacing. - Enhanced ShareView footer positioning and styling for improved visibility. - Adjusted section and div classes in ShareView for better responsiveness and visual consistency. * fix: Correct title fallback and enhance sanitization logic in sanitizeTitle - Updated the fallback title in sanitizeTitle to use DEFAULT_TITLE_FALLBACK instead of a hardcoded string. - Improved title truncation logic to ensure proper handling of maximum length and whitespace, including edge cases for emoji and whitespace-only titles. - Added tests to validate the new sanitization behavior, ensuring consistent and expected results across various input scenarios.	2026-03-01 16:44:57 -05:00
marbence101	3a079b980a	📌 fix: Populate userMessage.files Before First DB Save (#11939 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details * fix: populate userMessage.files before first DB save * fix: ESLint error fixed * fix: deduplicate file-population logic and add test coverage Extract `buildMessageFiles` helper into `packages/api/src/utils/message` to replace three near-identical loops in BaseClient and both agent controllers. Fixes set poisoning from undefined file_id entries, moves file population inside the skipSaveUserMessage guard to avoid wasted work, and adds full unit test coverage for the new behavior. * chore: reorder import statements in openIdJwtStrategy.js for consistency --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-02-26 09:16:45 -05:00
Danny Avila	a0f9782e60	🪣 fix: Prevent Memory Retention from AsyncLocalStorage Context Propagation (#11942 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details * fix: store hide_sequential_outputs before processStream clears config processStream now clears config.configurable after completion to break memory retention chains. Save hide_sequential_outputs to a local variable before calling runAgents so the post-stream filter still works. * feat: memory diagnostics * chore: expose garbage collection in backend inspect command Updated the backend inspect command in package.json to include the --expose-gc flag, enabling garbage collection diagnostics for improved memory management during development. * chore: update @librechat/agents dependency to version 3.1.52 Bumped the version of @librechat/agents in package.json and package-lock.json to ensure compatibility and access to the latest features and fixes. * fix: clear heavy config state after processStream to prevent memory leaks Break the reference chain from LangGraph's internal __pregel_scratchpad through @langchain/core RunTree.extra[lc:child_config] into the AsyncLocalStorage context captured by timers and I/O handles. After stream completion, null out symbol-keyed scratchpad properties (currentTaskInput), config.configurable, and callbacks. Also call Graph.clearHeavyState() to release config, signal, content maps, handler registry, and tool sessions. * chore: fix imports for memory utils * chore: add circular dependency check in API build step Enhanced the backend review workflow to include a check for circular dependencies during the API build process. If a circular dependency is detected, an error message is displayed, and the process exits with a failure status. * chore: update API build step to include circular dependency detection Modified the backend review workflow to rename the API package installation step to reflect its new functionality, which now includes detection of circular dependencies during the build process. * chore: add memory diagnostics option to .env.example Included a commented-out configuration option for enabling memory diagnostics in the .env.example file, which logs heap and RSS snapshots every 60 seconds when activated. * chore: remove redundant agentContexts cleanup in disposeClient function Streamlined the disposeClient function by eliminating duplicate cleanup logic for agentContexts, ensuring efficient memory management during client disposal. * refactor: move runOutsideTracing utility to utils and update its usage Refactored the runOutsideTracing function by relocating it to the utils module for better organization. Updated the tool execution handler to utilize the new import, ensuring consistent tracing behavior during tool execution. * refactor: enhance connection management and diagnostics Added a method to ConnectionsRepository for retrieving the active connection count. Updated UserConnectionManager to utilize this new method for app connection count reporting. Refined the OAuthReconnectionTracker's getStats method to improve clarity in diagnostics. Introduced a new tracing utility in the utils module to streamline tracing context management. Additionally, added a safeguard in memory diagnostics to prevent unnecessary snapshot collection for very short intervals. * refactor: enhance tracing utility and add memory diagnostics tests Refactored the runOutsideTracing function to improve warning logic when the AsyncLocalStorage context is missing. Added tests for memory diagnostics and tracing utilities to ensure proper functionality and error handling. Introduced a new test suite for memory diagnostics, covering snapshot collection and garbage collection behavior.	2026-02-25 17:41:23 -05:00
Danny Avila	7a1d2969b8	🤖 feat: Gemini 3.1 Pricing and Context Window (#11884 ) - Added support for the new Gemini 3.1 models, including 'gemini-3.1-pro-preview' and 'gemini-3.1-pro-preview-customtools'. - Updated pricing logic to apply standard and premium rates based on token usage thresholds for the new models. - Enhanced tests to validate pricing behavior for both standard and premium scenarios. - Modified configuration files to include Gemini 3.1 models in the default model lists and token value mappings. - Updated environment example file to reflect the new model options.	2026-02-20 16:21:32 -05:00
Danny Avila	0697e8cd60	🤖 feat: Claude Sonnet 4.6 support (#11829 ) Some checks are pending Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * 🤖 feat: Claude Sonnet 4.6 support - Updated .env.example to include claude-sonnet-4-6 in the list of available models. - Enhanced token value assignments in api/models/tx.js and packages/api/src/utils/tokens.ts to accommodate claude-sonnet-4-6. - Added tests in packages/data-provider/specs/bedrock.spec.ts to verify support for claude-sonnet-4-6 in adaptive thinking and context-1m functionalities. - Modified bedrock.ts to correctly parse and identify the version of claude-sonnet-4-6 for adaptive thinking checks. - Included claude-sonnet-4-6 in sharedAnthropicModels and bedrockModels for consistent model availability. * chore: additional Claude Sonnet 4.6 tests - Added unit tests for Claude Sonnet 4.6 in `tokens.spec.js` to verify context length and max output tokens. - Updated `helpers.ts` documentation to reflect adaptive thinking support for Sonnet 4.6. - Enhanced `llm.spec.ts` with tests for context headers and adaptive thinking configurations for Claude Sonnet 4.6. - Improved `bedrock.spec.ts` to ensure correct parsing and handling of Claude Sonnet 4.6 model variations with adaptive thinking.	2026-02-17 15:24:03 -05:00
Danny Avila	12f45c76ee	🎮 feat: Bedrock Parameters for OpenAI GPT-OSS models (#11798 ) Some checks failed Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Has been cancelled Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Has been cancelled Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Has been cancelled Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Has been cancelled Details Add OpenAI as a Bedrock provider so that selecting openai.gpt-oss-* models in the Bedrock agent UI renders the general parameter settings (temperature, top_p, max_tokens) instead of a blank panel. Also add token context lengths (128K) for gpt-oss-20b and gpt-oss-120b.	2026-02-14 14:10:32 -05:00
Danny Avila	276ac8d011	🛰️ feat: Add Bedrock Parameter Settings for MoonshotAI and Z.AI Models (#11783 ) - Introduced new model entries for 'moonshotai.kimi' and 'moonshotai.kimi-k2.5' in tokens.ts. - Updated parameterSettings.ts to include configurations for MoonshotAI and ZAI providers. - Enhanced schemas.ts by adding MoonshotAI and ZAI to the BedrockProviders enum for better integration.	2026-02-13 11:21:53 -05:00
Jón Levy	dc89e00039	🪙 refactor: Distinguish ID Tokens from Access Tokens in OIDC Federated Auth (#11711 ) * fix(openid): distinguish ID tokens from access tokens in federated auth Fix OpenID Connect token handling to properly distinguish ID tokens from access tokens. ID tokens and access tokens are now stored and propagated separately, preventing token placeholders from resolving to identical values. - AuthService.js: Added idToken field to session storage - openIdJwtStrategy.js: Updated to read idToken from session - openidStrategy.js: Explicitly included id_token in federatedTokens - Test suites: Added comprehensive test coverage for token distinction Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(openid): add separate openid_id_token cookie for ID token storage Store the OIDC ID token in its own cookie rather than relying solely on the access token, ensuring correct token type is used for identity verification vs API authorization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(openid): add JWT strategy cookie fallback tests Cover the token source resolution logic in openIdJwtStrategy: session-only, cookie-only, partial session fallback, raw Bearer fallback, and distinct id_token/access_token from cookies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:07:39 -05:00
Danny Avila	41e2348d47	🤖 feat: Claude Opus 4.6 - 1M Context, Premium Pricing, Adaptive Thinking (#11670 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * feat: Implement new features for Claude Opus 4.6 model - Added support for tiered pricing based on input token count for the Claude Opus 4.6 model. - Updated token value calculations to include inputTokenCount for accurate pricing. - Enhanced transaction handling to apply premium rates when input tokens exceed defined thresholds. - Introduced comprehensive tests to validate pricing logic for both standard and premium rates across various scenarios. - Updated related utility functions and models to accommodate new pricing structure. This change improves the flexibility and accuracy of token pricing for the Claude Opus 4.6 model, ensuring users are charged appropriately based on their usage. * feat: Add effort field to conversation and preset schemas - Introduced a new optional `effort` field of type `String` in both the `IPreset` and `IConversation` interfaces. - Updated the `conversationPreset` schema to include the `effort` field, enhancing the data structure for better context management. * chore: Clean up unused variable and comments in initialize function * chore: update dependencies and SDK versions - Updated @anthropic-ai/sdk to version 0.73.0 in package.json and overrides. - Updated @anthropic-ai/vertex-sdk to version 0.14.3 in packages/api/package.json. - Updated @librechat/agents to version 3.1.34 in packages/api/package.json. - Refactored imports in packages/api/src/endpoints/anthropic/vertex.ts for consistency. * chore: remove postcss-loader from dependencies * feat: Bedrock model support for adaptive thinking configuration - Updated .env.example to include new Bedrock model IDs for Claude Opus 4.6. - Refactored bedrockInputParser to support adaptive thinking for Opus models, allowing for dynamic thinking configurations. - Introduced a new function to check model compatibility with adaptive thinking. - Added an optional `effort` field to the input schemas and updated related configurations. - Enhanced tests to validate the new adaptive thinking logic and model configurations. * feat: Add tests for Opus 4.6 adaptive thinking configuration * feat: Update model references for Opus 4.6 by removing version suffix * feat: Update @librechat/agents to version 3.1.35 in package.json and package-lock.json * chore: @librechat/agents to version 3.1.36 in package.json and package-lock.json * feat: Normalize inputTokenCount for spendTokens and enhance transaction handling - Introduced normalization for promptTokens to ensure inputTokenCount does not go negative. - Updated transaction logic to reflect normalized inputTokenCount in pricing calculations. - Added comprehensive tests to validate the new normalization logic and its impact on transaction rates for both standard and premium models. - Refactored related functions to improve clarity and maintainability of token value calculations. * chore: Simplify adaptive thinking configuration in helpers.ts - Removed unnecessary type casting for the thinking property in updatedOptions. - Ensured that adaptive thinking is directly assigned when conditions are met, improving code clarity. * refactor: Replace hard-coded token values with dynamic retrieval from maxTokensMap in model tests * fix: Ensure non-negative token values in spendTokens calculations - Updated token value retrieval to use Math.max for prompt and completion tokens, preventing negative values. - Enhanced clarity in token calculations for both prompt and completion transactions. * test: Add test for normalization of negative structured token values in spendStructuredTokens - Implemented a test to ensure that negative structured token values are normalized to zero during token spending. - Verified that the transaction rates remain consistent with the expected standard values after normalization. * refactor: Bedrock model support for adaptive thinking and context handling - Added tests for various alternate naming conventions of Claude models to validate adaptive thinking and context support. - Refactored `supportsAdaptiveThinking` and `supportsContext1m` functions to utilize new parsing methods for model version extraction. - Updated `bedrockInputParser` to handle effort configurations more effectively and strip unnecessary fields for non-adaptive models. - Improved handling of anthropic model configurations in the input parser. * fix: Improve token value retrieval in getMultiplier function - Updated the token value retrieval logic to use optional chaining for better safety against undefined values. - Added a test case to ensure that the function returns the default rate when the provided valueKey does not exist in tokenValues.	2026-02-06 18:35:36 -05:00
Danny Avila	f34052c6bb	🌙 feat: Moonshot Provider Support (#11621 ) * ✨ feat: Add Moonshot Provider Support - Updated the `isKnownCustomProvider` function to include `Providers.MOONSHOT` in the list of recognized custom providers. - Enhanced the `providerConfigMap` to initialize `MOONSHOT` with the custom initialization function. - Introduced `MoonshotIcon` component for visual representation in the UI, integrated into the `UnknownIcon` component. - Updated various files across the API and client to support the new `MOONSHOT` provider, including configuration and response handling. This update expands the capabilities of the application by integrating support for the Moonshot provider, enhancing both backend and frontend functionalities. * ✨ feat: Add Moonshot/Kimi Model Pricing and Tests - Introduced new pricing configurations for Moonshot and Kimi models in `tx.js`, including various model variations and their respective prompt and completion values. - Expanded unit tests in `tx.spec.js` and `tokens.spec.js` to validate pricing and token limits for the newly added Moonshot/Kimi models, ensuring accurate calculations and handling of model variations. - Updated utility functions to support the new model structures and ensure compatibility with existing functionalities. This update enhances the pricing model capabilities and improves test coverage for the Moonshot/Kimi integration. * ✨ feat: Enhance Token Pricing Documentation and Configuration - Added comprehensive documentation for token pricing configuration in `tx.js` and `tokens.ts`, emphasizing the importance of key ordering for pattern matching. - Clarified the process for defining base and specific patterns to ensure accurate pricing retrieval based on model names. - Improved code comments to guide future additions of model families, enhancing maintainability and understanding of the pricing structure. This update improves the clarity and usability of the token pricing configuration, facilitating better integration and future enhancements. * chore: import order * chore: linting	2026-02-04 10:53:57 +01:00
Max Sanna	dd4bbd38fc	🪪 feat: Microsoft Graph Access Token Placeholder for MCP Servers (#10867 ) * feat: MCP Graph Token env var * Addressing copilot remarks * Addressed Copilot review remarks * Fixed graphtokenservice mock in MCP test suite * fix: remove unnecessary type check and cast in resolveGraphTokensInRecord * ci: add Graph Token integration tests in MCPManager * refactor: update user type definitions to use Partial<IUser> in multiple functions * test: enhance MCP tests for graph token processing and user placeholder resolution - Added comprehensive tests to validate the interaction between preProcessGraphTokens and processMCPEnv. - Ensured correct resolution of graph tokens and user placeholders in various configurations. - Mocked OIDC utilities to facilitate testing of token extraction and validation. - Verified that original options remain unchanged after processing. * chore: import order * chore: imports --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-01-28 17:44:33 -05:00
Danny Avila	75c02a1a18	🗂️ feat: Better Persistence for Code Execution Files Between Sessions (#11362 ) * refactor: process code output files for re-use (WIP) * feat: file attachment handling with additional metadata for downloads * refactor: Update directory path logic for local file saving based on basePath * refactor: file attachment handling to support TFile type and improve data merging logic * feat: thread filtering of code-generated files - Introduced parentMessageId parameter in addedConvo and initialize functions to enhance thread management. - Updated related methods to utilize parentMessageId for retrieving messages and filtering code-generated files by conversation threads. - Enhanced type definitions to include parentMessageId in relevant interfaces for better clarity and usage. * chore: imports/params ordering * feat: update file model to use messageId for filtering and processing - Changed references from 'message' to 'messageId' in file-related methods for consistency. - Added messageId field to the file schema and updated related types. - Enhanced file processing logic to accommodate the new messageId structure. * feat: enhance file retrieval methods to support user-uploaded execute_code files - Added a new method `getUserCodeFiles` to retrieve user-uploaded execute_code files, excluding code-generated files. - Updated existing file retrieval methods to improve filtering logic and handle edge cases. - Enhanced thread data extraction to collect both message IDs and file IDs efficiently. - Integrated `getUserCodeFiles` into relevant endpoints for better file management in conversations. * chore: update @librechat/agents package version to 3.0.78 in package-lock.json and related package.json files * refactor: file processing and retrieval logic - Added a fallback mechanism for download URLs when files exceed size limits or cannot be processed locally. - Implemented a deduplication strategy for code-generated files based on conversationId and filename to optimize storage. - Updated file retrieval methods to ensure proper filtering by messageIds, preventing orphaned files from being included. - Introduced comprehensive tests for new thread data extraction functionality, covering edge cases and performance considerations. * fix: improve file retrieval tests and handling of optional properties - Updated tests to safely access optional properties using non-null assertions. - Modified test descriptions for clarity regarding the exclusion of execute_code files. - Ensured that the retrieval logic correctly reflects the expected outcomes for file queries. * test: add comprehensive unit tests for processCodeOutput functionality - Introduced a new test suite for the processCodeOutput function, covering various scenarios including file retrieval, creation, and processing for both image and non-image files. - Implemented mocks for dependencies such as axios, logger, and file models to isolate tests and ensure reliable outcomes. - Validated behavior for existing files, new file creation, and error handling, including size limits and fallback mechanisms. - Enhanced test coverage for metadata handling and usage increment logic, ensuring robust verification of file processing outcomes. * test: enhance file size limit enforcement in processCodeOutput tests - Introduced a configurable file size limit for tests to improve flexibility and coverage. - Mocked the `librechat-data-provider` to allow dynamic adjustment of file size limits during tests. - Updated the file size limit enforcement test to validate behavior when files exceed specified limits, ensuring proper fallback to download URLs. - Reset file size limit after tests to maintain isolation for subsequent test cases.	2026-01-28 17:44:32 -05:00
kenzaelk98	191cd3983c	🛂 fix: Encode Non-ASCII Characters in MCP Server Headers (#11432 ) Fixes ByteString conversion errors when user names contain Unicode characters > 255 (e.g., ć, đ, ł, š, ž) in MCP server headers. - Add encodeHeaderValue() function to Base64 encode extended Unicode - Update processUserPlaceholders() to encode name/username/email in headers - Update processSingleValue() with isHeader parameter - Apply encoding in processMCPEnv() and resolveHeaders() Tested locally with MCP server using user name 'Đorđe' (contains đ=272). Headers are correctly encoded as base64, preventing ByteString errors. Co-authored-by: kenzaelk98 <kenzaelk98@leoninestudios.com> Co-authored-by: heptapod <164861708+leondape@users.noreply.github.com>	2026-01-21 14:00:25 -05:00
Joseph Licata	200098d992	🍌 feat: Gemini Image Generation Tool (Nano Banana) (#10676 ) * Added fully functioning Agent Tool supporting Google's Nano Banana * 🔧 refactor: Update Google credentials handling in GeminiImageGen.js * Refactored the credentials path to follow a consistent pattern with other Google service integrations, allowing for an environment variable override. * Updated documentation in README-GeminiNanoBanana.md to reflect the new credentials handling approach and removed references to hardcoded paths. * 🛠️ refactor: Remove unnecessary whitespace in handleTools.js * 🔧 feat: Update Gemini Image Generation Tool - Bump @google/genai package version to ^1.19.0 for improved functionality. - Refactor GeminiImageGen to createGeminiImageTool for better clarity and consistency. - Enhance manifest.json for Gemini Image Tools with updated descriptions and icon. - Add SVG icon for Gemini Image Tools. - Implement progress tracking for Gemini image generation in the UI. - Introduce new toolkit and context handling for image generation tools. This update improves the Gemini image generation capabilities and user experience. * 🗑️ chore: Remove outdated Gemini image generation PNG and update SVG icon - Deleted the obsolete PNG file for Gemini image generation. - Updated the SVG icon with a new design featuring a gradient and shadow effect, enhancing visual appeal and consistency. * fix: ESLint formatting and unused variable in GeminiImageGen * fix: Update default model to gemini-2.5-flash-image * ✨ feat: Enhance Gemini Image Generation Configuration - Updated .env.example to include new environment variables for Google Cloud region, service account configuration, and Gemini API key options. - Modified GeminiImageGen.js to support both user-provided API keys and Vertex AI service accounts, improving flexibility in client initialization. - Updated manifest.json to reflect changes in authentication methods for the Gemini Image Tools. - Bumped @google/genai package version to 1.19.0 in package-lock.json for compatibility with new features. * 🔧 fix: Format Default Service Key Path in GeminiImageGen.js - Adjusted the return statement in getDefaultServiceKeyPath function for improved readability by formatting it across multiple lines. This change enhances code clarity without altering functionality. * ✨ feat: Enhance Gemini Image Generation with Token Usage Tracking - Added `recordTokenUsage` function to track token usage for balance management. - Integrated token recording into the image generation process. - Updated Gemini image generation tool to accept optional `aspectRatio` and `imageSize` parameters for improved image customization. - Updated token values for new Gemini models in the transaction model. - Improved documentation for image generation tool descriptions and parameters. * ✨ feat: Add new Gemini models for image generation token limits - Introduced token limits for 'gemini-3-pro-image' and 'gemini-2.5-flash-image' models. - Updated token values to enhance the Gemini image generation capabilities. * 🔧 fix: Update Google Service Key Path for Consistency in Initialization (#11001) * 🔧 refactor: Update GeminiImageGen for improved file handling and path resolution - Changed the default service key path to use process.cwd() for better compatibility. - Replaced synchronous file system operations with asynchronous promises for mkdir and writeFile, enhancing performance and error handling. - Added error handling for credential file access to prevent crashes when the file does not exist. * 🔧 refactor: Update GeminiImageGen to streamline API key handling - Refactored API key checks to improve clarity and consistency. - Removed redundant checks for user-provided keys, enhancing code readability. - Ensured proper logging for API key usage across different configurations. * 🔧 fix: Update GeminiImageGen to handle imageSize support conditionally - Added a check to ensure imageSize is only applied if the gemini model does not include 'gemini-2.5-flash-image', improving compatibility. - Enhanced the logic for setting imageConfig to prevent potential issues with unsupported configurations. * 🔧 refactor: Simplify local storage condition in createGeminiImageTool function * 🔧 feat: Enhance image format handling in GeminiImageGen with conversion support * 🔧 refactor: Streamline API key initialization in GeminiImageGen - Simplified the handling of API keys by removing redundant checks for user-provided keys. - Updated logging to reflect the new priority order for API key usage, enhancing clarity and consistency. - Improved code readability by consolidating key retrieval logic. --------- Co-authored-by: Dev Bhanushali <dev.bhanushali@hingehealth.com> Co-authored-by: Danny Avila <danny@librechat.ai>	2026-01-03 11:26:46 -05:00
Danny Avila	d7a765ac4c	🪙 feat: Update GPT-5.1 and GPT-5.2 Token Pricing (#11101 )	2025-12-25 16:08:49 -05:00
Danny Avila	6ffb176056	🧮 refactor: Replace Eval with Safe Math Expression Parser (#11098 ) * chore: Add mathjs dependency * refactor: Replace eval with mathjs for safer expression evaluation and improve session expiry handling to not environment variables from data-schemas package * test: Add integration tests for math function with environment variable expressions * refactor: Update test description for clarity on expiresIn behavior * refactor: Update test cases to clarify default expiration behavior for token generation * refactor: Improve error handling in math function for clearer evaluation errors	2025-12-25 12:25:41 -05:00
Danny Avila	f9060fa25f	🔧 chore: Update ESLint Config & Run Linter (#10986 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details	2025-12-15 17:55:25 -05:00
Atef Bellaaj	e15d37b399	🔐 feat: Add API key authentication support for MCP servers (#10936 ) * 🔐 feat: Add API key authentication support for MCP servers * Chore: Copilot comments fixes --------- Co-authored-by: Atef Bellaaj <slalom.bellaaj@external.daimlertruck.com>	2025-12-12 13:51:49 -05:00
Danny Avila	04a4a2aa44	🧵 refactor: Migrate Endpoint Initialization to TypeScript (#10794 ) * refactor: move endpoint initialization methods to typescript * refactor: move agent init to packages/api - Introduced `initialize.ts` for agent initialization, including file processing and tool loading. - Updated `resources.ts` to allow optional appConfig parameter. - Enhanced endpoint configuration handling in various initialization files to support model parameters. - Added new artifacts and prompts for React component generation. - Refactored existing code to improve type safety and maintainability. * refactor: streamline endpoint initialization and enhance type safety - Updated initialization functions across various endpoints to use a consistent request structure, replacing `unknown` types with `ServerResponse`. - Simplified request handling by directly extracting keys from the request body. - Improved type safety by ensuring user IDs are safely accessed with optional chaining. - Removed unnecessary parameters and streamlined model options handling for better clarity and maintainability. * refactor: moved ModelService and extractBaseURL to packages/api - Added comprehensive tests for the models fetching functionality, covering scenarios for OpenAI, Anthropic, Google, and Ollama models. - Updated existing endpoint index to include the new models module. - Enhanced utility functions for URL extraction and model data processing. - Improved type safety and error handling across the models fetching logic. * refactor: consolidate utility functions and remove unused files - Merged `deriveBaseURL` and `extractBaseURL` into the `@librechat/api` module for better organization. - Removed redundant utility files and their associated tests to streamline the codebase. - Updated imports across various client files to utilize the new consolidated functions. - Enhanced overall maintainability by reducing the number of utility modules. * refactor: replace ModelService references with direct imports from @librechat/api and remove ModelService file * refactor: move encrypt/decrypt methods and key db methods to data-schemas, use `getProviderConfig` from `@librechat/api` * chore: remove unused 'res' from options in AgentClient * refactor: file model imports and methods - Updated imports in various controllers and services to use the unified file model from '~/models' instead of '~/models/File'. - Consolidated file-related methods into a new file methods module in the data-schemas package. - Added comprehensive tests for file methods including creation, retrieval, updating, and deletion. - Enhanced the initializeAgent function to accept dependency injection for file-related methods. - Improved error handling and logging in file methods. * refactor: streamline database method references in agent initialization * refactor: enhance file method tests and update type references to IMongoFile * refactor: consolidate database method imports in agent client and initialization * chore: remove redundant import of initializeAgent from @librechat/api * refactor: move checkUserKeyExpiry utility to @librechat/api and update references across endpoints * refactor: move updateUserPlugins logic to user.ts and simplify UserController * refactor: update imports for user key management and remove UserService * refactor: remove unused Anthropics and Bedrock endpoint files and clean up imports * refactor: consolidate and update encryption imports across various files to use @librechat/data-schemas * chore: update file model mock to use unified import from '~/models' * chore: import order * refactor: remove migrated to TS agent.js file and its associated logic from the endpoints * chore: add reusable function to extract imports from source code in unused-packages workflow * chore: enhance unused-packages workflow to include @librechat/api dependencies and improve dependency extraction * chore: improve dependency extraction in unused-packages workflow with enhanced error handling and debugging output * chore: add detailed debugging output to unused-packages workflow for better visibility into unused dependencies and exclusion lists * chore: refine subpath handling in unused-packages workflow to correctly process scoped and non-scoped package imports * chore: clean up unused debug output in unused-packages workflow and reorganize type imports in initialize.ts	2025-12-11 16:37:16 -05:00
Danny Avila	2d536dd0fa	📦 refactor: Request Message Sanitization for Smaller Final Response (#10792 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * refactor: implement sanitizeFileForTransmit and sanitizeMessageForTransmit functions for smaller payload to client transmission * refactor: enhance sanitizeMessageForTransmit to preserve empty files array and avoid mutating original message * refactor: update sanitizeMessageForTransmit to ensure immutability of files array and improve test clarity	2025-12-03 14:26:49 -05:00
Danny Avila	8bdc808074	⚡ refactor: Optimize & Standardize Tokenizer Usage (#10777 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * refactor: Token Limit Processing with Enhanced Efficiency - Added a new test suite for `processTextWithTokenLimit`, ensuring comprehensive coverage of various scenarios including under, at, and exceeding token limits. - Refactored the `processTextWithTokenLimit` function to utilize a ratio-based estimation method, significantly reducing the number of token counting function calls compared to the previous binary search approach. - Improved handling of edge cases and variable token density, ensuring accurate truncation and performance across diverse text inputs. - Included direct comparisons with the old implementation to validate correctness and efficiency improvements. * refactor: Remove Tokenizer Route and Related References - Deleted the tokenizer route from the server and removed its references from the routes index and server files, streamlining the API structure. - This change simplifies the routing configuration by eliminating unused endpoints. * refactor: Migrate countTokens Utility to API Module - Removed the local countTokens utility and integrated it into the @librechat/api module for centralized access. - Updated various files to reference the new countTokens import from the API module, ensuring consistent usage across the application. - Cleaned up unused references and imports related to the previous countTokens implementation. * refactor: Centralize escapeRegExp Utility in API Module - Moved the escapeRegExp function from local utility files to the @librechat/api module for consistent usage across the application. - Updated imports in various files to reference the new centralized escapeRegExp function, ensuring cleaner code and reducing redundancy. - Removed duplicate implementations of escapeRegExp from multiple files, streamlining the codebase. * refactor: Enhance Token Counting Flexibility in Text Processing - Updated the `processTextWithTokenLimit` function to accept both synchronous and asynchronous token counting functions, improving its versatility. - Introduced a new `TokenCountFn` type to define the token counting function signature. - Added comprehensive tests to validate the behavior of `processTextWithTokenLimit` with both sync and async token counting functions, ensuring consistent results. - Implemented a wrapper to track call counts for the `countTokens` function, optimizing performance and reducing unnecessary calls. - Enhanced existing tests to compare the performance of the new implementation against the old one, demonstrating significant improvements in efficiency. * chore: documentation for Truncation Safety Buffer in Token Processing - Added a safety buffer multiplier to the character position estimates during text truncation to prevent overshooting token limits. - Updated the `processTextWithTokenLimit` function to utilize the new `TRUNCATION_SAFETY_BUFFER` constant, enhancing the accuracy of token limit processing. - Improved documentation to clarify the rationale behind the buffer and its impact on performance and efficiency in token counting.	2025-12-02 12:22:04 -05:00
Danny Avila	4202db1c99	🤖 feat: Tool Calling Support for DeepSeek V3.2 + OpenRouter Reasoning (#10752 ) Some checks are pending Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * 🔧 chore: Update @librechat/agents to version 3.0.35 * ✨ feat: Add DeepSeek Model Pricing and Token Handling - Introduced pricing and token limits for 'deepseek-chat' and 'deepseek-reasoner' models, including prompt and completion rates. - Enhanced tests to validate pricing and token limits for DeepSeek models, ensuring correct handling of model variations and provider prefixes. - Updated cache multipliers for DeepSeek models to reflect new pricing structure. - Improved max output token handling for DeepSeek models, ensuring consistency across different endpoints.	2025-12-01 14:27:08 -05:00
Danny Avila	d7ce19e15a	🤖 feat: Latest Grok Model Pricing & Context Rates (#10727 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * 🤖 feat: Latest Grok Model Pricing & Context Rates - Introduced 'grok-4-fast', 'grok-4-1-fast', and 'grok-code-fast' models with their respective prompt and completion rates. - Enhanced unit tests to validate prompt and completion rates for the new models, including variations with prefixes. - Updated token limits for the new models in the tokens utility, ensuring accurate handling in tests. * 🔧 refactor: Optimize JSON Export Logic in useExportConversation Hook Updated the export logic to create a Blob from the JSON string before downloading, improving compatibility and performance for file downloads. This change enhances the handling of deeply nested exports while maintaining the file size reduction achieved in previous updates.	2025-11-30 17:10:26 -05:00
Danny Avila	bdc65c5713	🪵 chore: Clean up Debug Logs in OpenID Token Extraction (#10687 ) Removed unnecessary debug logging statements in the extractOpenIDTokenInfo function to streamline the code and improve readability. This change enhances the clarity of the function's logic without altering its functionality.	2025-11-26 11:29:10 -05:00
Danny Avila	9211d59388	🤖 feat: Claude Opus 4.5 Token Rates and Window Limits (#10653 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions Details * 🤖 feat: Claude Opus 4.5 Token Rates and Window Limits - Introduced new model 'claude-opus-4-5' with defined prompt and completion values in tokenValues and cacheTokenValues. - Updated tests to validate prompt, completion, and cache rates for the new model. - Enhanced model name handling to accommodate variations for 'claude-opus-4-5' across different contexts. - Adjusted schemas to ensure correct max output token limits for the new model. * ci: Add tests for "prompt-caching" beta header in Claude Opus 4.5 models - Implemented tests to verify the addition of the "prompt-caching" beta header for the 'claude-opus-4-5' model and its variations. - Updated future-proofing logic to ensure correct max token limits for Claude 4.x and 5.x Opus models, adjusting defaults to 64K where applicable. - Enhanced existing tests to reflect changes in expected max token values for future Claude models. * chore: Remove redundant max output check for Anthropic settings - Eliminated the unnecessary check for ANTHROPIC_MAX_OUTPUT in the anthropicSettings schema, streamlining the logic for handling max output values.	2025-11-24 16:30:56 -05:00
Danny Avila	35319c1354	🔧 fix: Remove Bedrock Config Transform introduced in #9931 (#10628 ) Some checks failed Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Has been cancelled Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Has been cancelled Details Publish `@librechat/client` to NPM / build-and-publish (push) Has been cancelled Details Publish `librechat-data-provider` to NPM / build (push) Has been cancelled Details Publish `@librechat/data-schemas` to NPM / build-and-publish (push) Has been cancelled Details Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Has been cancelled Details Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Has been cancelled Details Publish `librechat-data-provider` to NPM / publish-npm (push) Has been cancelled Details Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Has been cancelled Details * fix: Header and Environment Variable Handling Bug from #9931 * refactor: Remove warning log for missing tokens in extractOpenIDTokenInfo function * feat: Enhance resolveNestedObject function for improved placeholder processing - Added a new function `resolveNestedObject` to recursively process nested objects, replacing placeholders in string values while preserving the original structure. - Updated `createTestUser` to use `IUser` type and modified user ID generation. - Added comprehensive unit tests for `resolveNestedObject` to cover various scenarios, including nested structures, arrays, and custom user variables. - Improved type handling in `processMCPEnv` to ensure correct processing of mixed numeric and placeholder values. * refactor: Remove unnecessary manipulation of Bedrock options introduced in #9931 - Eliminated the resolveHeaders function call from the getOptions method in options.js, as it was no longer necessary for processing additional model request fields. - This change simplifies the code and improves maintainability.	2025-11-21 16:42:28 -05:00
Danny Avila	1814c81888	🕸️ fix: Minor Type Issues & Anthropic Web Search (#10618 ) Some checks are pending Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run Details Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run Details * fix: update @librechat/agents dependency to version 3.0.29 * chore: fix typing by replacing TUser with IUser * chore: import order * fix: replace TUser with IUser in run and OAuthReconnectionManager modules * fix: update @librechat/agents dependency to version 3.0.30	2025-11-21 14:25:05 -05:00
catmeme	7aa8d49f3a	🧭 fix: Add Base Path Support for Login/Register and Image Paths (#10116 ) * fix: add basePath pattern to support login/register and image paths * Fix linter errors * refactor: Update import statements for getBasePath and isEnabled, and add path utility functions with tests - Refactored imports in addImages.js and StableDiffusion.js to use getBasePath from '@librechat/api'. - Consolidated isEnabled and getBasePath imports in validateImageRequest.js. - Introduced new path utility functions in path.ts and corresponding unit tests in path.spec.ts to validate base path extraction logic. * fix: Update domain server base URL in MarkdownComponents and refactor authentication redirection logic - Changed the domain server base URL in MarkdownComponents.tsx to use the API base URL. - Refactored the useAuthRedirect hook to utilize React Router's navigate for redirection instead of window.location, ensuring a smoother SPA experience. - Added unit tests for the useAuthRedirect hook to verify authentication redirection behavior. * test: Mock isEnabled in validateImages.spec.js for improved test isolation - Updated validateImages.spec.js to mock the isEnabled function from @librechat/api, ensuring that tests can run independently of the actual implementation. - Cleared the DOMAIN_CLIENT environment variable before tests to avoid interference with basePath resolution. --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2025-11-21 11:25:14 -05:00
Jón Levy	ef3bf0a932	🆔 feat: Add OpenID Connect Federated Provider Token Support (#9931 ) * feat: Add OpenID Connect federated provider token support Implements support for passing federated provider tokens (Cognito, Azure AD, Auth0) as variables in LibreChat's librechat.yaml configuration for both custom endpoints and MCP servers. Features: - New LIBRECHAT_OPENID_* template variables for federated provider tokens - JWT claims parsing from ID tokens without verification (for claim extraction) - Token validation with expiration checking - Support for multiple token storage locations (federatedTokens, openidTokens) - Integration with existing template variable system - Comprehensive test suite with Cognito-specific scenarios - Provider-agnostic design supporting Cognito, Azure AD, Auth0, etc. Security: - Server-side only token processing - Automatic token expiration validation - Graceful fallbacks for missing/invalid tokens - No client-side token exposure 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Add federated token propagation to OIDC authentication strategies Adds federatedTokens object to user during authentication to enable federated provider token template variables in LibreChat configuration. Changes: - OpenID JWT Strategy: Extract raw JWT from Authorization header and attach as federatedTokens.access_token to enable {{LIBRECHAT_OPENID_TOKEN}} placeholder resolution - OpenID Strategy: Attach tokenset tokens as federatedTokens object to standardize token access across both authentication strategies This enables proper token propagation for custom endpoints and MCP servers that require federated provider tokens for authorization. Resolves missing token issue reported by @ramden in PR #9931 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Denis Ramic <denis.ramic@nfon.com> Co-Authored-By: Claude <noreply@anthropic.com> * test: Add federatedTokens validation tests for OIDC strategies Adds comprehensive test coverage for the federated token propagation feature implemented in the authentication strategies. Tests added: - Verify federatedTokens object is attached to user with correct structure (access_token, refresh_token, expires_at) - Verify both tokenset and federatedTokens are present in user object - Ensure tokens from OIDC provider are correctly propagated Also fixes existing test suite by adding missing mocks: - isEmailDomainAllowed function mock - findOpenIDUser function mock These tests validate the fix from commit `5874ba29f` that enables {{LIBRECHAT_OPENID_TOKEN}} template variable functionality. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * docs: Remove implementation documentation file The PR description already contains all necessary implementation details. This documentation file is redundant and was requested to be removed. * fix: skip s256 check * fix(openid): handle missing refresh token in Cognito token refresh response When OPENID_REUSE_TOKENS=true, the token refresh flow was failing because Cognito (and most OAuth providers) don't return a new refresh token in the refresh grant response - they only return new access and ID tokens. Changes: - Modified setOpenIDAuthTokens() to accept optional existingRefreshToken parameter - Updated validation to only require access_token (refresh_token now optional) - Added logic to reuse existing refresh token when not provided in tokenset - Updated refreshController to pass original refresh token as fallback - Added comments explaining standard OAuth 2.0 refresh token behavior This fixes the "Token is not present. User is not authenticated." error that occurred during silent token refresh with Cognito as the OpenID provider. Fixes: Authentication loop with OPENID_REUSE_TOKENS=true and AWS Cognito * fix(openid): extract refresh token from cookies for template variable replacement When OPENID_REUSE_TOKENS=true, the openIdJwtStrategy populates user.federatedTokens to enable template variable replacement (e.g., {{LIBRECHAT_OPENID_ACCESS_TOKEN}}). However, the refresh_token field was incorrectly sourced from payload.refresh_token, which is always undefined because: 1. JWTs don't contain refresh tokens in their payload 2. The JWT itself IS the access token 3. Refresh tokens are separate opaque tokens stored in HTTP-only cookies This caused extractOpenIDTokenInfo() to receive incomplete federatedTokens, resulting in template variables remaining unreplaced in headers. Root Cause: - Line 90: `refresh_token: payload.refresh_token` (always undefined) - JWTs only contain access token data in their claims - Refresh tokens are separate, stored securely in cookies Solution: - Import `cookie` module to parse cookies from request - Extract refresh token from `refreshToken` cookie - Populate federatedTokens with both access token (JWT) and refresh token (from cookie) Impact: - Template variables like {{LIBRECHAT_OPENID_ACCESS_TOKEN}} now work correctly - Headers in librechat.yaml are properly replaced with actual tokens - MCP server authentication with federated tokens now functional Technical Details: - passReqToCallback=true in JWT strategy provides req object access - Refresh token extracted via cookies.parse(req.headers.cookie).refreshToken - Falls back gracefully if cookie header or refreshToken is missing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: re-resolve headers on each request to pick up fresh federatedTokens - OpenAIClient now re-resolves headers in chatCompletion() before each API call - This ensures template variables like {{LIBRECHAT_OPENID_TOKEN}} are replaced with actual token values from req.user.federatedTokens - initialize.js now stores original template headers instead of pre-resolved ones - Fixes template variable replacement when OPENID_REUSE_TOKENS=true The issue was that headers were only resolved once during client initialization, before openIdJwtStrategy had populated user.federatedTokens. Now headers are re-resolved on every request with the current user's fresh tokens. * debug: add logging to track header resolution in OpenAIClient * debug: log tokenset structure after refresh to diagnose missing access_token * fix: set federatedTokens on user object after OAuth refresh - After successful OAuth token refresh, the user object was not being updated with federatedTokens - This caused template variable resolution to fail on subsequent requests - Now sets user.federatedTokens with access_token, id_token, refresh_token and expires_at from the refreshed tokenset - Fixes template variables like {{LIBRECHAT_OPENID_TOKEN}} not being replaced after token refresh - Related to PR #9931 (OpenID federated token support) * fix(openid): pass user object through agent chain for template variable resolution Root cause: buildAgentContext in agents/run.ts called resolveHeaders without the user parameter, preventing OpenID federated token template variables from being resolved in agent runtime parameters. Changes: - packages/api/src/agents/run.ts: Add user parameter to createRun signature - packages/api/src/agents/run.ts: Pass user to resolveHeaders in buildAgentContext - api/server/controllers/agents/client.js: Pass user when calling createRun - api/server/services/Endpoints/bedrock/options.js: Add resolveHeaders call with debug logging - api/server/services/Endpoints/custom/initialize.js: Add debug logging - packages/api/src/utils/env.ts: Add comprehensive debug logging and stack traces - packages/api/src/utils/oidc.ts: Fix eslint errors (unused type, explicit any) This ensures template variables like {{LIBRECHAT_OPENID_TOKEN}} and {{LIBRECHAT_USER_OPENIDID}} are properly resolved in both custom endpoint headers and Bedrock AgentCore runtime parameters. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: remove debug logging from OpenID token template feature Removed excessive debug logging that was added during development to make the PR more suitable for upstream review: - Removed 7 debug statements from OpenAIClient.js - Removed all console.log statements from packages/api/src/utils/env.ts - Removed debug logging from bedrock/options.js - Removed debug logging from custom/initialize.js - Removed debug statement from AuthController.js This reduces the changeset by ~50 lines while maintaining full functionality of the OpenID federated token template variable feature. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * test(openid): add comprehensive unit tests for template variable substitution - Add 34 unit tests for OIDC token utilities (oidc.spec.ts) - Test coverage for token extraction, validation, and placeholder processing - Integration tests for full OpenID token flow - All tests pass with comprehensive edge case coverage 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com> * test: fix OpenID federated tokens test failures - Add serverMetadata() mock to openid-client mock configuration * Fixes TypeError in openIdJwtStrategy.js where serverMetadata() was being called * Mock now returns jwks_uri and end_session_endpoint as expected by the code - Update outdated initialize.spec.js test * Remove test expecting resolveHeaders call during initialization * Header resolution was refactored to be deferred until LLM request time * Update test to verify options are returned correctly with useLegacyContent flag Fixes #9931 CI failures for backend unit tests 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: fix package-lock.json conflict * chore: sync package-log with upstream * chore: cleanup * fix: use createSafeUser * fix: fix createSafeUser signature * chore: remove comments * chore: purge comments * fix: update Jest testPathPattern to testPathPatterns for Jest 30+ compatibility --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Denis Ramic <denis.ramic@nfon.com> Co-authored-by: kristjanaapro <kristjana@apro.is> chore: import order and add back JSDoc for OpenID JWT callback	2025-11-21 09:51:11 -05:00
Danny Avila	8b9afd5965	🤖 feat: Gemini 3 Support (#10584 ) * feat: Add support for model in token configurations and tests * chore: Update @librechat/agents to version 3.0.26 in package.json and package-lock.json	2025-11-19 15:05:37 -05:00
Danny Avila	cabc8afeac	🔧 fix: Await MCP Instructions and Filter Malformed Tool Calls (#10485 ) * fix: Await MCP instructions formatting in AgentClient * fix: don't render or aggregate malformed tool calls * fix: implement filter for malformed tool call content parts and add tests	2025-11-13 14:17:47 -05:00
Danny Avila	3f62ce054f	🔢 fix: Unescape LaTeX Numbers in Artifact Content Edit (#10476 )	2025-11-13 08:19:19 -05:00
Danny Avila	2524d33362	📂 refactor: Cleanup File Filtering Logic, Improve Validation (#10414 ) * feat: add filterFilesByEndpointConfig to filter disabled file processing by provider * chore: explicit define of endpointFileConfig for better debugging * refactor: move `normalizeEndpointName` to data-provider as used app-wide * chore: remove overrideEndpoint from useFileHandling * refactor: improve endpoint file config selection * refactor: update filterFilesByEndpointConfig to accept structured parameters and improve endpoint file config handling * refactor: replace defaultFileConfig with getEndpointFileConfig for improved file configuration handling across components * test: add comprehensive unit tests for getEndpointFileConfig to validate endpoint configuration handling * refactor: streamline agent endpoint assignment and improve file filtering logic * feat: add error handling for disabled file uploads in endpoint configuration * refactor: update encodeAndFormat functions to accept structured parameters for provider and endpoint * refactor: streamline requestFiles handling in initializeAgent function * fix: getEndpointFileConfig partial config merging scenarios * refactor: enhance mergeWithDefault function to support document-supported providers with comprehensive MIME types * refactor: user-configured default file config in getEndpointFileConfig * fix: prevent file handling when endpoint is disabled and file is dragged to chat * refactor: move `getEndpointField` to `data-provider` and update usage across components and hooks * fix: prioritize endpointType based on agent.endpoint in file filtering logic * fix: prioritize agent.endpoint in file filtering logic and remove unnecessary endpointType defaulting	2025-11-10 19:05:30 -05:00
Rakshit	772b706e20	🎙️ fix: Azure OpenAI Speech-to-Text 400 Bad Request Error (#10355 )	2025-11-05 10:27:34 -05:00

1 2

80 commits