* feat: Implement new features for Claude Opus 4.6 model
- Added support for tiered pricing based on input token count for the Claude Opus 4.6 model.
- Updated token value calculations to include inputTokenCount for accurate pricing.
- Enhanced transaction handling to apply premium rates when input tokens exceed defined thresholds.
- Introduced comprehensive tests to validate pricing logic for both standard and premium rates across various scenarios.
- Updated related utility functions and models to accommodate new pricing structure.
This change improves the flexibility and accuracy of token pricing for the Claude Opus 4.6 model, ensuring users are charged appropriately based on their usage.
* feat: Add effort field to conversation and preset schemas
- Introduced a new optional `effort` field of type `String` in both the `IPreset` and `IConversation` interfaces.
- Updated the `conversationPreset` schema to include the `effort` field, enhancing the data structure for better context management.
* chore: Clean up unused variable and comments in initialize function
* chore: update dependencies and SDK versions
- Updated @anthropic-ai/sdk to version 0.73.0 in package.json and overrides.
- Updated @anthropic-ai/vertex-sdk to version 0.14.3 in packages/api/package.json.
- Updated @librechat/agents to version 3.1.34 in packages/api/package.json.
- Refactored imports in packages/api/src/endpoints/anthropic/vertex.ts for consistency.
* chore: remove postcss-loader from dependencies
* feat: Bedrock model support for adaptive thinking configuration
- Updated .env.example to include new Bedrock model IDs for Claude Opus 4.6.
- Refactored bedrockInputParser to support adaptive thinking for Opus models, allowing for dynamic thinking configurations.
- Introduced a new function to check model compatibility with adaptive thinking.
- Added an optional `effort` field to the input schemas and updated related configurations.
- Enhanced tests to validate the new adaptive thinking logic and model configurations.
* feat: Add tests for Opus 4.6 adaptive thinking configuration
* feat: Update model references for Opus 4.6 by removing version suffix
* feat: Update @librechat/agents to version 3.1.35 in package.json and package-lock.json
* chore: @librechat/agents to version 3.1.36 in package.json and package-lock.json
* feat: Normalize inputTokenCount for spendTokens and enhance transaction handling
- Introduced normalization for promptTokens to ensure inputTokenCount does not go negative.
- Updated transaction logic to reflect normalized inputTokenCount in pricing calculations.
- Added comprehensive tests to validate the new normalization logic and its impact on transaction rates for both standard and premium models.
- Refactored related functions to improve clarity and maintainability of token value calculations.
* chore: Simplify adaptive thinking configuration in helpers.ts
- Removed unnecessary type casting for the thinking property in updatedOptions.
- Ensured that adaptive thinking is directly assigned when conditions are met, improving code clarity.
* refactor: Replace hard-coded token values with dynamic retrieval from maxTokensMap in model tests
* fix: Ensure non-negative token values in spendTokens calculations
- Updated token value retrieval to use Math.max for prompt and completion tokens, preventing negative values.
- Enhanced clarity in token calculations for both prompt and completion transactions.
* test: Add test for normalization of negative structured token values in spendStructuredTokens
- Implemented a test to ensure that negative structured token values are normalized to zero during token spending.
- Verified that the transaction rates remain consistent with the expected standard values after normalization.
* refactor: Bedrock model support for adaptive thinking and context handling
- Added tests for various alternate naming conventions of Claude models to validate adaptive thinking and context support.
- Refactored `supportsAdaptiveThinking` and `supportsContext1m` functions to utilize new parsing methods for model version extraction.
- Updated `bedrockInputParser` to handle effort configurations more effectively and strip unnecessary fields for non-adaptive models.
- Improved handling of anthropic model configurations in the input parser.
* fix: Improve token value retrieval in getMultiplier function
- Updated the token value retrieval logic to use optional chaining for better safety against undefined values.
- Added a test case to ensure that the function returns the default rate when the provided valueKey does not exist in tokenValues.
* WIP: app.locals refactoring
WIP: appConfig
fix: update memory configuration retrieval to use getAppConfig based on user role
fix: update comment for AppConfig interface to clarify purpose
🏷️ refactor: Update tests to use getAppConfig for endpoint configurations
ci: Update AppService tests to initialize app config instead of app.locals
ci: Integrate getAppConfig into remaining tests
refactor: Update multer storage destination to use promise-based getAppConfig and improve error handling in tests
refactor: Rename initializeAppConfig to setAppConfig and update related tests
ci: Mock getAppConfig in various tests to provide default configurations
refactor: Update convertMCPToolsToPlugins to use mcpManager for server configuration and adjust related tests
chore: rename `Config/getAppConfig` -> `Config/app`
fix: streamline OpenAI image tools configuration by removing direct appConfig dependency and using function parameters
chore: correct parameter documentation for imageOutputType in ToolService.js
refactor: remove `getCustomConfig` dependency in config route
refactor: update domain validation to use appConfig for allowed domains
refactor: use appConfig registration property
chore: remove app parameter from AppService invocation
refactor: update AppConfig interface to correct registration and turnstile configurations
refactor: remove getCustomConfig dependency and use getAppConfig in PluginController, multer, and MCP services
refactor: replace getCustomConfig with getAppConfig in STTService, TTSService, and related files
refactor: replace getCustomConfig with getAppConfig in Conversation and Message models, update tempChatRetention functions to use AppConfig type
refactor: update getAppConfig calls in Conversation and Message models to include user role for temporary chat expiration
ci: update related tests
refactor: update getAppConfig call in getCustomConfigSpeech to include user role
fix: update appConfig usage to access allowedDomains from actions instead of registration
refactor: enhance AppConfig to include fileStrategies and update related file strategy logic
refactor: update imports to use normalizeEndpointName from @librechat/api and remove redundant definitions
chore: remove deprecated unused RunManager
refactor: get balance config primarily from appConfig
refactor: remove customConfig dependency for appConfig and streamline loadConfigModels logic
refactor: remove getCustomConfig usage and use app config in file citations
refactor: consolidate endpoint loading logic into loadEndpoints function
refactor: update appConfig access to use endpoints structure across various services
refactor: implement custom endpoints configuration and streamline endpoint loading logic
refactor: update getAppConfig call to include user role parameter
refactor: streamline endpoint configuration and enhance appConfig usage across services
refactor: replace getMCPAuthMap with getUserMCPAuthMap and remove unused getCustomConfig file
refactor: add type annotation for loadedEndpoints in loadEndpoints function
refactor: move /services/Files/images/parse to TS API
chore: add missing FILE_CITATIONS permission to IRole interface
refactor: restructure toolkits to TS API
refactor: separate manifest logic into its own module
refactor: consolidate tool loading logic into a new tools module for startup logic
refactor: move interface config logic to TS API
refactor: migrate checkEmailConfig to TypeScript and update imports
refactor: add FunctionTool interface and availableTools to AppConfig
refactor: decouple caching and DB operations from AppService, make part of consolidated `getAppConfig`
WIP: fix tests
* fix: rebase conflicts
* refactor: remove app.locals references
* refactor: replace getBalanceConfig with getAppConfig in various strategies and middleware
* refactor: replace appConfig?.balance with getBalanceConfig in various controllers and clients
* test: add balance configuration to titleConvo method in AgentClient tests
* chore: remove unused `openai-chat-tokens` package
* chore: remove unused imports in initializeMCPs.js
* refactor: update balance configuration to use getAppConfig instead of getBalanceConfig
* refactor: integrate configMiddleware for centralized configuration handling
* refactor: optimize email domain validation by removing unnecessary async calls
* refactor: simplify multer storage configuration by removing async calls
* refactor: reorder imports for better readability in user.js
* refactor: replace getAppConfig calls with req.config for improved performance
* chore: replace getAppConfig calls with req.config in tests for centralized configuration handling
* chore: remove unused override config
* refactor: add configMiddleware to endpoint route and replace getAppConfig with req.config
* chore: remove customConfig parameter from TTSService constructor
* refactor: pass appConfig from request to processFileCitations for improved configuration handling
* refactor: remove configMiddleware from endpoint route and retrieve appConfig directly in getEndpointsConfig if not in `req.config`
* test: add mockAppConfig to processFileCitations tests for improved configuration handling
* fix: pass req.config to hasCustomUserVars and call without await after synchronous refactor
* fix: type safety in useExportConversation
* refactor: retrieve appConfig using getAppConfig in PluginController and remove configMiddleware from plugins route, to avoid always retrieving when plugins are cached
* chore: change `MongoUser` typedef to `IUser`
* fix: Add `user` and `config` fields to ServerRequest and update JSDoc type annotations from Express.Request to ServerRequest
* fix: remove unused setAppConfig mock from Server configuration tests
* refactor: move model definitions and database-related methods to packages/data-schemas
* ci: update tests due to new DB structure
fix: disable mocking `librechat-data-provider`
feat: Add schema exports to data-schemas package
- Introduced a new schema module that exports various schemas including action, agent, and user schemas.
- Updated index.ts to include the new schema exports for better modularity and organization.
ci: fix appleStrategy tests
fix: Agent.spec.js
ci: refactor handleTools tests to use MongoMemoryServer for in-memory database
fix: getLogStores imports
ci: update banViolation tests to use MongoMemoryServer and improve session mocking
test: refactor samlStrategy tests to improve mock configurations and user handling
ci: fix crypto mock in handleText tests for improved accuracy
ci: refactor spendTokens tests to improve model imports and setup
ci: refactor Message model tests to use MongoMemoryServer and improve database interactions
* refactor: streamline IMessage interface and move feedback properties to types/message.ts
* refactor: use exported initializeRoles from `data-schemas`, remove api workspace version (this serves as an example of future migrations that still need to happen)
* refactor: update model imports to use destructuring from `~/db/models` for consistency and clarity
* refactor: remove unused mongoose imports from model files for cleaner code
* refactor: remove unused mongoose imports from Share, Prompt, and Transaction model files for cleaner code
* refactor: remove unused import in Transaction model for cleaner code
* ci: update deploy workflow to reference new Docker Dev Branch Images Build and add new workflow for building Docker images on dev branch
* chore: cleanup imports
* fix: Implement optimistic concurrency control for balance updates in Transaction model to allow for documentdb compatibility
* test: Add concurrent balance increase test for auto refill transactions
* 🏗️ refactor: Improve spendTokens logic to handle zero completion tokens and enhance test coverage
* 🏗️ test: Add tests to ensure balance does not go below zero when spending tokens
* 🏗️ fix: Ensure proper continuation in AgentClient when handling errors
* fix: spend token race conditions
* 🏗️ test: Add test for handling multiple concurrent transactions with high balance
* fix: Handle Omni models prompt prefix handling for user messages with array content in OpenAIClient
* refactor: Update checkBalance import paths to use new balanceMethods module
* refactor: Update checkBalance imports and implement updateBalance function for atomic balance updates
* fix: import from replace method
* feat: Add createAutoRefillTransaction method to handle non-balance updating transactions
* refactor: Move auto-refill logic to balanceMethods and enhance checkBalance functionality
* feat: Implement logging for auto-refill transactions in balance checks
* refactor: Remove logRefill calls from multiple client and handler files
* refactor: Move balance checking and auto-refill logic to balanceMethods for improved structure
* refactor: Simplify balance check calls by removing unnecessary balanceRecord assignments
* fix: Prevent negative rawAmount in spendTokens when promptTokens is zero
* fix: Update balanceMethods to use Balance model for findOneAndUpdate
* chore: import order
* refactor: remove unused txMethods file to streamline codebase
* feat: enhance updateBalance and createAutoRefillTransaction methods to support additional parameters for improved balance management
* 🚀 feat: Add automatic refill settings to balance schema
* 🚀 feat: Refactor balance feature to use global interface configuration
* 🚀 feat: Implement auto-refill functionality for balance management
* 🚀 feat: Enhance auto-refill logic and configuration for balance management
* 🚀 chore: Bump version to 0.7.74 in package.json and package-lock.json
* 🚀 chore: Bump version to 0.0.5 in package.json and package-lock.json
* 🚀 docs: Update comment for balance settings in librechat.example.yaml
* chore: space in `.env.example`
* 🚀 feat: Implement balance configuration loading and refactor related components
* 🚀 test: Refactor tests to use custom config for balance feature
* 🚀 fix: Update balance response handling in Transaction.js to use Balance model
* 🚀 test: Update AppService tests to include balance configuration in mock setup
* 🚀 test: Enhance AppService tests with complete balance configuration scenarios
* 🚀 refactor: Rename balanceConfig to balance and update related tests for clarity
* 🚀 refactor: Remove loadDefaultBalance and update balance handling in AppService
* 🚀 test: Update AppService tests to reflect new balance structure and defaults
* 🚀 test: Mock getCustomConfig in BaseClient tests to control balance configuration
* 🚀 test: Add get method to mockCache in OpenAIClient tests for improved cache handling
* 🚀 test: Mock getCustomConfig in OpenAIClient tests to control balance configuration
* 🚀 test: Remove mock for getCustomConfig in OpenAIClient tests to streamline configuration handling
* 🚀 fix: Update balance configuration reference in config.js for consistency
* refactor: Add getBalanceConfig function to retrieve balance configuration
* chore: Comment out example balance settings in librechat.example.yaml
* refactor: Replace getCustomConfig with getBalanceConfig for balance handling
* fix: tests
* refactor: Replace getBalanceConfig call with balance from request locals
* refactor: Update balance handling to use environment variables for configuration
* refactor: Replace getBalanceConfig calls with balance from request locals
* refactor: Simplify balance configuration logic in getBalanceConfig
---------
Co-authored-by: Danny Avila <danny@librechat.ai>
* feat: Refactor ModelEndHandler to collect usage metadata only if it exists
* feat: google tool end handling, custom anthropic class for better token ux
* refactor: differentiate between client <> request options
* feat: initial support for google agents
* feat: only cache messages with non-empty text
* feat: Cache non-empty messages in chatV2 controller
* fix: anthropic llm client options llmConfig
* refactor: streamline client options handling in LLM configuration
* fix: VertexAI Agent Auth & Tool Handling
* fix: additional fields for llmConfig, however customHeaders are not supported by langchain, requires PR
* feat: set default location for vertexai LLM configuration
* fix: outdated OpenAI Client options for getLLMConfig
* chore: agent provider options typing
* chore: add note about currently unsupported customHeaders in langchain GenAI client
* fix: skip transaction creation when rawAmount is NaN
* wip: initial cache control implementation, add typing for transactions handling
* feat: first pass of Anthropic Prompt Caching
* feat: standardize stream usage as pass in when calculating token counts
* feat: Add getCacheMultiplier function to calculate cache multiplier for different valueKeys and cacheTypes
* chore: imports order
* refactor: token usage recording in AnthropicClient, no need to "correct" as we have the correct amount
* feat: more accurate token counting using stream usage data
* feat: Improve token counting accuracy with stream usage data
* refactor: ensure more accurate than not token estimations if custom instructions or files are not being resent with every request
* refactor: cleanup updateUserMessageTokenCount to allow transactions to be as accurate as possible even if we shouldn't update user message token counts
* ci: fix tests
* fix(processModelData): handle `openrouter/auto` edge case
* fix(Tx.create): prevent negative multiplier edge case and prevent balance from becoming negative
* fix(NavLinks): render 0 balance properly
* refactor(NavLinks): show only up to 2 decimal places for balance
* fix(OpenAIClient/titleConvo): fix cohere condition and record token usage for `this.options.titleMethod === 'completion'`
* WIP: basic route for file downloads and file strategy for generating readablestream to pipe as res
* chore(DALLE3): add typing for OpenAI client
* chore: add `CONSOLE_JSON` notes to dotenv.md
* WIP: first pass OpenAI Assistants File Output handling
* feat: first pass assistants output file download from openai
* chore: yml vs. yaml variation to .gitignore for `librechat.yml`
* refactor(retrieveAndProcessFile): remove redundancies
* fix(syncMessages): explicit sort of apiMessages to fix message order on abort
* chore: add logs for warnings and errors, show toast on frontend
* chore: add logger where console was still being used
* chore: add assistants to supportsBalanceCheck
* feat(Transaction): getTransactions and refactor export of model
* refactor: use enum: ViolationTypes.TOKEN_BALANCE
* feat(assistants): check balance
* refactor(assistants): only add promptBuffer if new convo (for title), and remove endpoint definition
* refactor(assistants): Count tokens up to the current context window
* fix(Switcher): make Select list explicitly controlled
* feat(assistants): use assistant's default model when no model is specified instead of the last selected assistant, prevent assistant_id from being recorded in non-assistant endpoints
* chore(assistants/chat): import order
* chore: bump librechat-data-provider due to changes
* chore: bump anthropic SDK
* chore: update anthropic config settings (fileSupport, default models)
* feat: anthropic multi modal formatting
* refactor: update vision models and use endpoint specific max long side resizing
* feat(anthropic): multimodal messages, retry logic, and messages payload
* chore: add more safety to trimming content due to whitespace error for assistant messages
* feat(anthropic): token accounting and resending multiple images in progress
* chore: bump data-provider
* feat(anthropic): resendImages feature
* chore: optimize Edit/Ask controllers, switch model back to req model
* fix: false positive of invalid model
* refactor(validateVisionModel): use object as arg, pass in additional/available models
* refactor(validateModel): use helper function, `getModelsConfig`
* feat: add modelsConfig to endpointOption so it gets passed to all clients, use for properly validating vision models
* refactor: initialize default vision model and make sure it's available before assigning it
* refactor(useSSE): avoid resetting model if user selected a new model between request and response
* feat: show rate in transaction logging
* fix: return tokenCountMap regardless of payload shape
* feat: add GOOGLE_MODELS env var
* feat: add gemini vision support
* refactor(GoogleClient): adjust clientOptions handling depending on model
* fix(logger): fix redact logic and redact errors only
* fix(GoogleClient): do not allow non-multiModal messages when gemini-pro-vision is selected
* refactor(OpenAIClient): use `isVisionModel` client property to avoid calling validateVisionModel multiple times
* refactor: better debug logging by correctly traversing, redacting sensitive info, and logging condensed versions of long values
* refactor(GoogleClient): allow response errors to be thrown/caught above client handling so user receives meaningful error message
debug orderedMessages, parentMessageId, and buildMessages result
* refactor(AskController): use model from client.modelOptions.model when saving intermediate messages, which requires for the progress callback to be initialized after the client is initialized
* feat(useSSE): revert to previous model if the model was auto-switched by backend due to message attachments
* docs: update with google updates, notes about Gemini Pro Vision
* fix: redis should not be initialized without USE_REDIS and increase max listeners to 20
* refactor(Chains/llms): allow passing callbacks
* refactor(BaseClient): accurately count completion tokens as generation only
* refactor(OpenAIClient): remove unused getTokenCountForResponse, pass streaming var and callbacks in initializeLLM
* wip: summary prompt tokens
* refactor(summarizeMessages): new cut-off strategy that generates a better summary by adding context from beginning, truncating the middle, and providing the end
wip: draft out relevant providers and variables for token tracing
* refactor(createLLM): make streaming prop false by default
* chore: remove use of getTokenCountForResponse
* refactor(agents): use BufferMemory as ConversationSummaryBufferMemory token usage not easy to trace
* chore: remove passing of streaming prop, also console log useful vars for tracing
* feat: formatFromLangChain helper function to count tokens for ChatModelStart
* refactor(initializeLLM): add role for LLM tracing
* chore(formatFromLangChain): update JSDoc
* feat(formatMessages): formats langChain messages into OpenAI payload format
* chore: install openai-chat-tokens
* refactor(formatMessage): optimize conditional langChain logic
fix(formatFromLangChain): fix destructuring
* feat: accurate prompt tokens for ChatModelStart before generation
* refactor(handleChatModelStart): move to callbacks dir, use factory function
* refactor(initializeLLM): rename 'role' to 'context'
* feat(Balance/Transaction): new schema/models for tracking token spend
refactor(Key): factor out model export to separate file
* refactor(initializeClient): add req,res objects to client options
* feat: add-balance script to add to an existing users' token balance
refactor(Transaction): use multiplier map/function, return balance update
* refactor(Tx): update enum for tokenType, return 1 for multiplier if no map match
* refactor(Tx): add fair fallback value multiplier incase the config result is undefined
* refactor(Balance): rename 'tokens' to 'tokenCredits'
* feat: balance check, add tx.js for new tx-related methods and tests
* chore(summaryPrompts): update prompt token count
* refactor(callbacks): pass req, res
wip: check balance
* refactor(Tx): make convoId a String type, fix(calculateTokenValue)
* refactor(BaseClient): add conversationId as client prop when assigned
* feat(RunManager): track LLM runs with manager, track token spend from LLM,
refactor(OpenAIClient): use RunManager to create callbacks, pass user prop to langchain api calls
* feat(spendTokens): helper to spend prompt/completion tokens
* feat(checkBalance): add helper to check, log, deny request if balance doesn't have enough funds
refactor(Balance): static check method to return object instead of boolean now
wip(OpenAIClient): implement use of checkBalance
* refactor(initializeLLM): add token buffer to assure summary isn't generated when subsequent payload is too large
refactor(OpenAIClient): add checkBalance
refactor(createStartHandler): add checkBalance
* chore: remove prompt and completion token logging from route handler
* chore(spendTokens): add JSDoc
* feat(logTokenCost): record transactions for basic api calls
* chore(ask/edit): invoke getResponseSender only once per API call
* refactor(ask/edit): pass promptTokens to getIds and include in abort data
* refactor(getIds -> getReqData): rename function
* refactor(Tx): increase value if incomplete message
* feat: record tokenUsage when message is aborted
* refactor: subtract tokens when payload includes function_call
* refactor: add namespace for token_balance
* fix(spendTokens): only execute if corresponding token type amounts are defined
* refactor(checkBalance): throws Error if not enough token credits
* refactor(runTitleChain): pass and use signal, spread object props in create helpers, and use 'call' instead of 'run'
* fix(abortMiddleware): circular dependency, and default to empty string for completionTokens
* fix: properly cancel title requests when there isn't enough tokens to generate
* feat(predictNewSummary): custom chain for summaries to allow signal passing
refactor(summaryBuffer): use new custom chain
* feat(RunManager): add getRunByConversationId method, refactor: remove run and throw llm error on handleLLMError
* refactor(createStartHandler): if summary, add error details to runs
* fix(OpenAIClient): support aborting from summarization & showing error to user
refactor(summarizeMessages): remove unnecessary operations counting summaryPromptTokens and note for alternative, pass signal to summaryBuffer
* refactor(logTokenCost -> recordTokenUsage): rename
* refactor(checkBalance): include promptTokens in errorMessage
* refactor(checkBalance/spendTokens): move to models dir
* fix(createLanguageChain): correctly pass config
* refactor(initializeLLM/title): add tokenBuffer of 150 for balance check
* refactor(openAPIPlugin): pass signal and memory, filter functions by the one being called
* refactor(createStartHandler): add error to run if context is plugins as well
* refactor(RunManager/handleLLMError): throw error immediately if plugins, don't remove run
* refactor(PluginsClient): pass memory and signal to tools, cleanup error handling logic
* chore: use absolute equality for addTitle condition
* refactor(checkBalance): move checkBalance to execute after userMessage and tokenCounts are saved, also make conditional
* style: icon changes to match official
* fix(BaseClient): getTokenCountForResponse -> getTokenCount
* fix(formatLangChainMessages): add kwargs as fallback prop from lc_kwargs, update JSDoc
* refactor(Tx.create): does not update balance if CHECK_BALANCE is not enabled
* fix(e2e/cleanUp): cleanup new collections, import all model methods from index
* fix(config/add-balance): add uncaughtException listener
* fix: circular dependency
* refactor(initializeLLM/checkBalance): append new generations to errorMessage if cost exceeds balance
* fix(handleResponseMessage): only record token usage in this method if not error and completion is not skipped
* fix(createStartHandler): correct condition for generations
* chore: bump postcss due to moderate severity vulnerability
* chore: bump zod due to low severity vulnerability
* chore: bump openai & data-provider version
* feat(types): OpenAI Message types
* chore: update bun lockfile
* refactor(CodeBlock): add error block formatting
* refactor(utils/Plugin): factor out formatJSON and cn to separate files (json.ts and cn.ts), add extractJSON
* chore(logViolation): delete user_id after error is logged
* refactor(getMessageError -> Error): change to React.FC, add token_balance handling, use extractJSON to determine JSON instead of regex
* fix(DALL-E): use latest openai SDK
* chore: reorganize imports, fix type issue
* feat(server): add balance route
* fix(api/models): add auth
* feat(data-provider): /api/balance query
* feat: show balance if checking is enabled, refetch on final message or error
* chore: update docs, .env.example with token_usage info, add balance script command
* fix(Balance): fallback to empty obj for balance query
* style: slight adjustment of balance element
* docs(token_usage): add PR notes