Commit graph

47 commits

Author SHA1 Message Date
Danny Avila
430557676d
feat: GPT-5 Token Limits, Rates, Icon, Reasoning Support 2025-08-07 16:23:11 -04:00
Danny Avila
d95d8032cc
feat: GPT-OSS models Token Limits & Rates 2025-08-07 16:23:10 -04:00
Danny Avila
1e4f1f780c
🔑 feat: Grok 4 Pricing and Token Limits (#8395)
* 🔑 feat: Grok 4 Pricing and Token Limits

* 🔑 feat: Update Grok 3 Pricing for Mini and Fast Models
2025-07-11 03:24:13 -04:00
Robin Anderson
dba0ec4320
🔧 chore: update pricing for OpenAI o3 (#7948)
`o3` is now 80% cheaper, at $2/Mt input and $8/Mt output.
https://openai.com/api/pricing/
2025-06-17 21:27:31 -04:00
Danny Avila
a2f330e6ca
🦾 feat: Claude-4 Support (#7509)
* refactor: Update AnthropicClient to support Claude model naming changes

* Renamed `isClaude3` to `isClaudeLatest` to accommodate newer Claude models.
* Updated logic to determine if the model is part of the Claude family.
* Adjusted `useMessages` property to reflect the new model naming convention.
* Cleaned up client properties during disposal to match the updated naming.

* feat: Claude-4 Support

* feat: Add Thinking and Prompt caching support for Claude 4

* chore: Update ANTHROPIC_MODELS in .env.example for latest model versions
2025-05-22 15:00:44 -04:00
Danny Avila
66093b1eb3
💬 refactor: MCP Chat Visibility Option, Google Rates, Remove OpenAPI Plugins (#7286)
* fix: Update Gemini 2.5 Pro Preview Model Name in Token Values

* refactor: Update DeleteButton to close menu when deletion is successful

* refactor: Add unmountOnHide prop to DropdownPopup in multiple components

* chore: linting

* chore: linting

* feat: Add `chatMenu` option for MCP Servers to control visibility in MCPSelect dropdown

* refactor: Update loadManifestTools to return combined tool manifest with MCP tools first

* chore: remove deprecated openapi plugins

* chore: linting

* chore(AgentClient): linting, remove unnecessary `checkVisionRequest` logger

* refactor(AuthService): change logoutUser logging from error to debug level

* chore: new Gemini models token values and rates

* chore(AskController): linting
2025-05-08 12:12:36 -04:00
Danny Avila
37b50736bc
🔧 fix: Google Gemma Support & OpenAI Reasoning Instructions (#7196)
* 🔄 chore: Update @langchain/google-vertexai to version 0.2.5 in package.json and package-lock.json

* chore: temp remove agents

* 🔄 chore: Update @langchain/google-genai to version 0.2.5 in package.json and package-lock.json

* 🔄 chore: Update @langchain/community to version 0.3.42 in package.json and package-lock.json

* 🔄 chore: Add license information for @langchain/textsplitters in package-lock.json

* 🔄 chore: Update @langchain/core to version 0.3.51 in package.json and package-lock.json

* 🔄 chore: Update openai dependency to version 4.96.2 in package.json and package-lock.json

* chore: @librechat/agents to v2.4.30

* fix: streaming condition in ModelEndHandler to account for boundModel `disableStreaming` setting

* fix: update regex for noSystemModel and refactor message handling in AgentClient

* feat: Google Gemma models

* chore: remove unnecessary empty JSX fragment in PopoverButtons component
2025-05-02 15:11:50 -04:00
Danny Avila
7f1d01c35a
🔀 fix: MCP Improvements, Auto-Save Drafts, Artifact Markup (#7040)
* feat: Update MCP tool creation to use lowercase provider name

* refactor: handle MCP image output edge cases where tool outputs must contain string responses

* feat: Drop 'anyOf' and 'oneOf' fields from JSON schema conversion

* feat: Transform 'oneOf' and 'anyOf' fields to Zod union in JSON schema conversion

* fix: artifactPlugin to replace textDirective with expected text, closes #7029

* fix: auto-save functionality to handle conversation transitions from pending drafts, closes #7027

* refactor: improve async handling during user disconnection process

* fix: use correct user ID variable for MCP tool calling

* fix: improve handling of pending drafts in auto-save functionality

* fix: add support for additional model names in getValueKey function

* fix: reset form values on agent deletion when no agents remain
2025-04-23 18:56:06 -04:00
Danny Avila
52f146dd97
🤖 feat: Support o4-mini and o3 Models (#6928)
* feat: Add support for new OpenAI models (o4-mini, o3) and update related logic

* 🔧 fix: Rename 'resubmitFiles' to 'isResubmission' for consistency across types and hooks

* 🔧 fix: Replace hardcoded 'pending_req' with CacheKeys.PENDING_REQ for consistency in cache handling

* 🔧 fix: Update cache handling to use Time.ONE_MINUTE instead of hardcoded TTL and streamline imports

* 🔧 fix: Enhance message handling logic to correctly identify parent messages and streamline imports in useSSE
2025-04-17 00:40:26 -04:00
Danny Avila
52b3ed54ca
🤖 feat: GPT-4.1 (#6880)
* fix: Agent Builder setting not applying in useSideNavLinks

* fix: Remove unused type imports in useSideNavLinks

* feat: gpt-4.1

* fix: Update getCacheMultiplier and getMultiplier tests to use dynamic token values

* feat: Add gpt-4.1 to the list of vision models

* chore: Bump version of librechat-data-provider to 0.7.792
2025-04-14 14:55:59 -04:00
Danny Avila
37964975c1
🤖 refactor: Improve Agents Memory Usage, Bump Keyv, Grok 3 (#6850)
* chore: remove unused redis file

* chore: bump keyv dependencies, and update related imports

* refactor: Implement IoRedis client for rate limiting across middleware, as node-redis via keyv not compatible

* fix: Set max listeners to expected amount

* WIP: memory improvements

* refactor: Simplify getAbortData assignment in createAbortController

* refactor: Update getAbortData to use WeakRef for content management

* WIP: memory improvements in agent chat requests

* refactor: Enhance memory management with finalization registry and cleanup functions

* refactor: Simplify domainParser calls by removing unnecessary request parameter

* refactor: Update parameter types for action tools and agent loading functions to use minimal configs

* refactor: Simplify domainParser tests by removing unnecessary request parameter

* refactor: Simplify domainParser call by removing unnecessary request parameter

* refactor: Enhance client disposal by nullifying additional properties to improve memory management

* refactor: Improve title generation by adding abort controller and timeout handling, consolidate request cleanup

* refactor: Update checkIdleConnections to skip current user when checking for idle connections if passed

* refactor: Update createMCPTool to derive userId from config and handle abort signals

* refactor: Introduce createTokenCounter function and update tokenCounter usage; enhance disposeClient to reset Graph values

* refactor: Update getMCPManager to accept userId parameter for improved idle connection handling

* refactor: Extract logToolError function for improved error handling in AgentClient

* refactor: Update disposeClient to clear handlerRegistry and graphRunnable references in client.run

* refactor: Extract createHandleNewToken function to streamline token handling in initializeClient

* chore: bump @librechat/agents

* refactor: Improve timeout handling in addTitle function for better error management

* refactor: Introduce createFetch instead of using class method

* refactor: Enhance client disposal and request data handling in AskController and EditController

* refactor: Update import statements for AnthropicClient and OpenAIClient to use specific paths

* refactor: Use WeakRef for response handling in SplitStreamHandler to prevent memory leaks

* refactor: Simplify client disposal and rename getReqData to processReqData in AskController and EditController

* refactor: Improve logging structure and parameter handling in OpenAIClient

* refactor: Remove unused GraphEvents and improve stream event handling in AnthropicClient and OpenAIClient

* refactor: Simplify client initialization in AskController and EditController

* refactor: Remove unused mock functions and implement in-memory store for KeyvMongo

* chore: Update dependencies in package-lock.json to latest versions

* refactor: Await token usage recording in OpenAIClient to ensure proper async handling

* refactor: Remove handleAbort route from multiple endpoints and enhance client disposal logic

* refactor: Enhance abort controller logic by managing abortKey more effectively

* refactor: Add newConversation handling in useEventHandlers for improved conversation management

* fix: dropparams

* refactor: Use optional chaining for safer access to request properties in BaseClient

* refactor: Move client disposal and request data processing logic to cleanup module for better organization

* refactor: Remove aborted request check from addTitle function for cleaner logic

* feat: Add Grok 3 model pricing and update tests for new models

* chore: Remove trace warnings and inspect flags from backend start script used for debugging

* refactor: Replace user identifier handling with userId for consistency across controllers, use UserId in clientRegistry

* refactor: Enhance client disposal logic to prevent memory leaks by clearing additional references

* chore: Update @librechat/agents to version 2.4.14 in package.json and package-lock.json
2025-04-12 18:46:36 -04:00
RedwindA
93e679e173
🪙 chore: Update Gemini Pricing (#6731) 2025-04-04 11:55:07 -04:00
Danny Avila
a6f062e468
🚀 feat: Add Gemini 2.5 Token/Context Values, Increase Max Possible Output to 64k (#6563)
* feat: Add Gemini 2.5 token values, increase max output param, context window

* 🔧 fix: Update Gemini API model names in .env.example

* 🔧 fix: Add button type attribute to AttachFile component
2025-03-27 11:09:20 -04:00
Danny Avila
7ca5650840
🔧 fix: Mistral type strictness for usage & update token values/windows (#6562)
* 🔧 fix: Resolve Mistral type strictness for OpenAI usage field

* chore: Enable usage tracking for Mistral endpoint in OpenAI configuration

* chore: Add new token values and context windows for latest premier Mistral models
2025-03-27 01:57:25 -04:00
Danny Avila
d6a17784dc
🔗 feat: Agent Chain (Mixture-of-Agents) (#6374)
* wip: first pass, dropdown for selecting sequential agents

* refactor: Improve agent selection logic and enhance performance in SequentialAgents component

* wip: seq. agents working ideas

* wip: sequential agents style change

* refactor: move agent form options/submission outside of AgentConfig

* refactor: prevent repeating code

* refactor: simplify current agent display in SequentialAgents component

* feat: persist  form value handling in AgentSelect component for agent_ids

* feat: first pass, sequential agnets agent update

* feat: enhance message display with agent updates and empty text handling

* chore: update Icon component to use EModelEndpoint for agent endpoints

* feat: update content type checks in BaseClient to use constants for better readability

* feat: adjust max context tokens calculation to use 90% of the model's max tokens

* feat: first pass, agent run message pruning

* chore: increase max listeners for abort controller to prevent memory leaks

* feat: enhance runAgent function to include current index count map for improved token tracking

* chore: update @librechat/agents dependency to version 2.2.5

* feat: update icons and style of SequentialAgents component for improved UI consistency

* feat: add AdvancedButton and AdvancedPanel components for enhanced agent settings navigation, update styling for agent form

* chore: adjust minimum height of AdvancedPanel component for better layout consistency

* chore: update @librechat/agents dependency to version 2.2.6

* feat: enhance message formatting by incorporating tool set into agent message processing, in order to allow better mix/matching of agents (as tool calls for tools not found in set will be stringified)

* refactor: reorder components in AgentConfig for improved readability and maintainability

* refactor: enhance layout of AgentUpdate component for improved visual structure

* feat: add DeepSeek provider to Bedrock settings and schemas

* feat: enhance link styling in mobile.css for better visibility and accessibility

* fix: update banner model import in update banner script; export Banner model

* refactor: `duplicateAgentHandler` to include tool_resources only for OCR context files

* feat: add 'qwen-vl' to visionModels for enhanced model support

* fix: change image format from JPEG to PNG in DALLE3 response

* feat: reorganize Advanced components and add localizations

* refactor: simplify JSX structure in AgentChain component to defer container styling to parent

* feat: add FormInput component for reusable input handling

* feat: make agent recursion limit configurable from builder

* feat: add support for agent capabilities chain in AdvancedPanel and update data-provider version

* feat: add maxRecursionLimit configuration for agents and update related documentation

* fix: update CONFIG_VERSION to 1.2.3 in data provider configuration

* feat: replace recursion limit input with MaxAgentSteps component and enhance input handling

* feat: enhance AgentChain component with hover card for additional information and update related labels

* fix: pass request and response objects to `createActionTool` when using assistant actions to prevent auth error

* feat: update AgentChain component layout to include agent count display

* feat: increase default max listeners and implement capability check function for agent chain

* fix: update link styles in mobile.css for better visibility in dark mode

* chore: temp. remove agents package while bumping shared packages

* chore: update @langchain/google-genai package to version 0.1.11

* chore: update @langchain/google-vertexai package to version 0.2.2

* chore: add @librechat/agents package at version 2.2.8

* feat: add deepseek.r1 model with token rate and context values for bedrock
2025-03-17 16:43:44 -04:00
Danny Avila
2293cd667e
🚀 feat: GPT-4.5, Anthropic Tool Header, and OpenAPI Ref Resolution (#6118)
* 🔧 refactor: Update settings to use 'as const' for improved type safety and make gpt-4o-mini default model (cheapest)

* 📖 docs: Update README to reflect support for GPT-4.5 in image analysis feature

* 🔧 refactor: Update model handling to use default settings and improve encoding logic

* 🔧 refactor: Enhance model version extraction logic for improved compatibility with future GPT and omni models

* feat: GPT-4.5 tx/token update, vision support

* fix: $ref resolution logic in OpenAPI handling

* feat: add new 'anthropic-beta' header for Claude 3.7 to include token-efficient tools; ref: https://docs.anthropic.com/en/docs/build-with-claude/tool-use/token-efficient-tool-use
2025-02-28 12:19:21 -05:00
Danny Avila
50e8769340
🚀 feat: Claude 3.7 Support + Reasoning (#6008)
* fix: missing console color methods for admin scripts

* feat: Anthropic Claude 3.7 Sonnet Support

* feat: update eventsource to version 3.0.2 and upgrade @modelcontextprotocol/sdk to 1.4.1

* fix: update DynamicInput to handle number type and improve initial value logic

* feat: first pass Anthropic Reasoning (Claude 3.7)

* feat: implement streaming support in AnthropicClient with reasoning UI handling

* feat: add missing xAI (grok) models
2025-02-24 20:08:55 -05:00
Danny Avila
63afb317c6
🚀 fix: Resolve Google Client Issues, CDN Screenshots, Update Models (#5703)
* 🤖 refactor: streamline model selection logic for title model in GoogleClient

* refactor: add options for empty object schemas in convertJsonSchemaToZod

* refactor: add utility function to check for empty object schemas in convertJsonSchemaToZod

* fix: Google MCP Tool errors, and remove Object Unescaping as Google fixed this

* fix: google safetySettings

* feat: add safety settings exclusion via GOOGLE_EXCLUDE_SAFETY_SETTINGS environment variable

* fix: rename environment variable for console JSON string length

* fix: disable portal for dropdown in ExportModal component

* fix: screenshot functionality to use image placeholder for remote images

* feat: add visionMode property to BaseClient and initialize in GoogleClient to fix resendFiles issue

* fix: enhance formatMessages to include image URLs in message content for Vertex AI

* fix: safety settings for titleChatCompletion

* fix: remove deprecated model assignment in GoogleClient and streamline title model retrieval

* fix: remove unused image preloading logic in ScreenshotContext

* chore: update default google models to latest models shared by vertex ai and gen ai

* refactor: enhance Google error messaging

* fix: update token values and model limits for Gemini models

* ci: fix model matching

* chore: bump version of librechat-data-provider to 0.7.699
2025-02-06 18:13:18 -05:00
Danny Avila
33f6093775
🤖 feat: o3-mini (#5581)
* 🤖 feat: `o3-mini`

* chore: re-order vision models list to prioritize gpt-4o as a vision model over o1
2025-01-31 16:49:01 -05:00
Danny Avila
8b31f255f5
🪙 fix: Deepseek Pricing 2025-01-25 10:13:46 -05:00
Danny Avila
60c846b679
🪙 fix: Deepseek Pricing & Titling (#5459) 2025-01-25 10:10:53 -05:00
Danny Avila
87383fec27
🔧 chore: Update Deepseek Pricing, Google Safety Settings (#5409)
* fix: google-thinking model safety settings fix

* chore: update pricing/context for deepseek models

* ci: update Deepseek model token limits to use dynamic mapping
2025-01-22 07:50:09 -05:00
Danny Avila
16eed5f32d
🦙 feat: update AWS Bedrock pricing and token metadata for Meta models (#5024) 2024-12-17 17:18:49 -05:00
Danny Avila
b5c9144127
🚀 feat: Add Gemini 2.0 Support, Update Packages and Deprecations (#4951)
* chore: Comment out deprecated MongoDB connection options in connectDb.js

* replaced deprecrated MongoDB count() function with countDocuments()

* npm audit fix (package-lock cleanup)

* chore: Specify .env file in launch configuration

* feat: gemini-2.0

* chore: bump express to 4.21.2 to address CVE-2024-52798

* chore: remove redundant comment for .env file specification in launch configuration

---------

Co-authored-by: neturmel <neturmel@gmx.de>
2024-12-11 14:11:27 -05:00
Danny Avila
ebae494337
🤖 feat: Support for new AWS Nova Models & Updated Anthropic Rates (#4852)
* updated Claude 3.5 Haiku pricing

https://www.anthropic.com/pricing#anthropic-api
Claude 3.5 Haiku
$0.80 / MTok
Input
$1 / MTok
Prompt caching write
$0.08 / MTok
Prompt caching read
$4 / MTok
Output

* Update tx.js

* refactor: fix tests for cache multiplier and add new AWS models

---------

Co-authored-by: khfung <68192841+khfung@users.noreply.github.com>
2024-12-03 22:25:15 -05:00
Danny Avila
fc41032923
🤖 feat: Claude 3.5 Haiku (#4629) 2024-11-04 15:10:24 -05:00
Danny Avila
95011ce349
🚧 WIP: Merge Dev Build (#4611)
* refactor: Agent CodeFiles, abortUpload WIP

* feat: code environment file upload

* refactor: useLazyEffect

* refactor:
- Add `watch` from `useFormContext` to check if code execution is enabled
- Disable file upload button if `agent_id` is not selected or code execution is disabled

* WIP: primeCodeFiles; refactor: rename sessionId to session_id for uniformity

* Refactor: Rename session_id to sessionId for uniformity in AuthService.js

* chore: bump @librechat/agents to version 1.7.1

* WIP: prime code files

* refactor: Update code env file upload method to use read stream

* feat: reupload code env file if no longer active

* refactor: isAssistantTool -> isEntityTool + address type issues

* feat: execute code tool hook

* refactor: Rename isPluginAuthenticated to checkPluginAuth in PluginController.js

* refactor: Update PluginController.js to use AuthType constant for comparison

* feat: verify tool authentication (execute_code)

* feat: enter librechat_code_api_key

* refactor: Remove unused imports in BookmarkForm.tsx

* feat: authenticate code tool

* refactor: Update Action.tsx to conditionally render the key and revoke key buttons

* refactor(Code/Action): prevent uncheck-able 'Run Code' capability when key is revoked

* refactor(Code/Action): Update Action.tsx to conditionally render the key and revoke key buttons

* fix: agent file upload edge cases

* chore: bump @librechat/agents

* fix: custom endpoint providerValue icon

* feat: ollama meta modal token values + context

* feat: ollama agents

* refactor: Update token models for Ollama models

* chore: Comment out CodeForm

* refactor: Update token models for Ollama and Meta models
2024-11-01 18:36:39 -04:00
Hongkai Ye
bf5b87e0b2
🪙 feat: Update token value for gpt-4o (#4387) 2024-10-11 08:27:29 -04:00
Danny Avila
45b42830a5
🚀 feat: o1 (#4019)
* feat: o1 default response sender string

* feat: add o1 models to default openai models list, add `no_system_messages` error type; refactor: use error type as localization key

* refactor(MessageEndpointIcon): differentiate openAI icon model color for o1 models

* refactor(AnthropicClient): use new input/output tokens keys; add prompt caching for claude-3-opus

* refactor(BaseClient): to use new input/output tokens keys; update typedefs

* feat: initial o1 model handling, including token cost complexity

* EXPERIMENTAL: special handling for o1 model with custom instructions
2024-09-12 18:15:43 -04:00
Danny Avila
d59b62174f
🪨 feat: AWS Bedrock support (#3935)
* feat: Add BedrockIcon component to SVG library

* feat: EModelEndpoint.bedrock

* feat: first pass, bedrock chat. note: AgentClient is returning `agents` as conversation.endpoint

* fix: declare endpoint in initialization step

* chore: Update @librechat/agents dependency to version 1.4.5

* feat: backend content aggregation for agents/bedrock

* feat: abort agent requests

* feat: AWS Bedrock icons

* WIP: agent provider schema parsing

* chore: Update EditIcon props type

* refactor(useGenerationsByLatest): make agents and bedrock editable

* refactor: non-assistant message content, parts

* fix: Bedrock response `sender`

* fix: use endpointOption.model_parameters not endpointOption.modelOptions

* fix: types for step handler

* refactor: Update Agents.ToolCallDelta type

* refactor: Remove unnecessary assignment of parentMessageId in AskController

* refactor: remove unnecessary assignment of parentMessageId (agent request handler)

* fix(bedrock/agents): message regeneration

* refactor: dynamic form elements using react-hook-form Controllers

* fix: agent icons/labels for messages

* fix: agent actions

* fix: use of new dynamic tags causing application crash

* refactor: dynamic settings touch-ups

* refactor: update Slider component to allow custom track class name

* refactor: update DynamicSlider component styles

* refactor: use Constants value for GLOBAL_PROJECT_NAME (enum)

* feat: agent share global methods/controllers

* fix: agents query

* fix: `getResponseModel`

* fix: share prompt a11y issue

* refactor: update SharePrompt dialog theme styles

* refactor: explicit typing for SharePrompt

* feat: add agent roles/permissions

* chore: update @librechat/agents dependency to version 1.4.7 for tool_call_ids edge case

* fix(Anthropic): messages.X.content.Y.tool_use.input: Input should be a valid dictionary

* fix: handle text parts with tool_call_ids and empty text

* fix: role initialization

* refactor: don't make instructions required

* refactor: improve typing of Text part

* fix: setShowStopButton for agents route

* chore: remove params for now

* fix: add streamBuffer and streamRate to help prevent 'Overloaded' errors from Anthropic API

* refactor: remove console.log statement in ContentRender component

* chore: typing, rename Context to Delete Button

* chore(DeleteButton): logging

* refactor(Action): make accessible

* style(Action): improve a11y again

* refactor: remove use/mention of mongoose sessions

* feat: first pass, sharing agents

* feat: visual indicator for global agent, remove author when serving to non-author

* wip: params

* chore: fix typing issues

* fix(schemas): typing

* refactor: improve accessibility of ListCard component and fix console React warning

* wip: reset templates for non-legacy new convos

* Revert "wip: params"

This reverts commit f8067e91d4.

* Revert "refactor: dynamic form elements using react-hook-form Controllers"

This reverts commit 2150c4815d.

* fix(Parameters): types and parameter effect update to only update local state to parameters

* refactor: optimize useDebouncedInput hook for better performance

* feat: first pass, anthropic bedrock params

* chore: paramEndpoints check for endpointType too

* fix: maxTokens to use coerceNumber.optional(),

* feat: extra chat model params

* chore: reduce code repetition

* refactor: improve preset title handling in SaveAsPresetDialog component

* refactor: improve preset handling in HeaderOptions component

* chore: improve typing, replace legacy dialog for SaveAsPresetDialog

* feat: save as preset from parameters panel

* fix: multi-search in select dropdown when using Option type

* refactor: update default showDefault value to false in Dynamic components

* feat: Bedrock presets settings

* chore: config, fix agents schema, update config version

* refactor: update AWS region variable name in bedrock options endpoint to BEDROCK_AWS_DEFAULT_REGION

* refactor: update baseEndpointSchema in config.ts to include baseURL property

* refactor: update createRun function to include req parameter and set streamRate based on provider

* feat: availableRegions via config

* refactor: remove unused demo agent controller file

* WIP: title

* Update @librechat/agents to version 1.5.0

* chore: addTitle.js to handle empty responseText

* feat: support images and titles

* feat: context token updates

* Refactor BaseClient test to use expect.objectContaining

* refactor: add model select, remove header options params, move side panel params below prompts

* chore: update models list, catch title error

* feat: model service for bedrock models (env)

* chore: Remove verbose debug log in AgentClient class following stream

* feat(bedrock): track token spend; fix: token rates, value key mapping for AWS models

* refactor: handle streamRate in `handleLLMNewToken` callback

* chore: AWS Bedrock example config in `.env.example`

* refactor: Rename bedrockMeta to bedrockGeneral in settings.ts and use for AI21 and Amazon Bedrock providers

* refactor: Update `.env.example` with AWS Bedrock model IDs URL and additional notes

* feat: titleModel support for bedrock

* refactor: Update `.env.example` with additional notes for AWS Bedrock model IDs
2024-09-09 12:06:59 -04:00
Danny Avila
f742b9972e
🏷️ chore: Add Unofficial Naming Variation for Claude-3.5-Sonnet (#3800) 2024-08-27 09:07:04 -04:00
Danny Avila
a45b384bbc
💾 feat: Anthropic Prompt Caching (#3670)
* wip: initial cache control implementation, add typing for transactions handling

* feat: first pass of Anthropic Prompt Caching

* feat: standardize stream usage as pass in when calculating token counts

* feat: Add getCacheMultiplier function to calculate cache multiplier for different valueKeys and cacheTypes

* chore: imports order

* refactor: token usage recording in AnthropicClient, no need to "correct" as we have the correct amount

* feat: more accurate token counting using stream usage data

* feat: Improve token counting accuracy with stream usage data

* refactor: ensure more accurate than not token estimations if custom instructions or files are not being resent with every request

* refactor: cleanup updateUserMessageTokenCount to allow transactions to be as accurate as possible even if we shouldn't update user message token counts

* ci: fix tests
2024-08-17 03:24:09 -04:00
Danny Avila
e05a6d306d
🪙 feat: Update Token Values for gpt-4o-2024-08-06 and AWS Models (#3594)
* feat: gpt-4o-2024-08-06 pricing for tx

* feat: add AWS models to tokenValues in tx.js for pricing transactions

* feat: Update tokenValues in tx.js for AWS models pricing

* refactor: add bedrock prefix values as well (temporary until we update value keys which includes context)
2024-08-08 23:31:07 -04:00
Marco Beretta
ee4dd1b2e9
🚀 feat: gpt-4o-mini (#3384)
* feat: `gpt-4o-mini`

* feat: retrival

* fix: Update order of model token values for 'gpt-4o' and 'gpt-4o-mini'

* fix: Update order of model token values for 'gpt-4o' and 'gpt-4o-mini'

* fix: Update order of model token values for 'gpt-4o' and 'gpt-4o-mini'

* fix: add jsdoc

* fix: Update order of model token values for 'gpt-4o' and 'gpt-4o-mini'

---------

Co-authored-by: Danny Avila <danny@librechat.ai>
2024-07-19 07:59:07 -04:00
Danny Avila
54db67449a
🧠 feat: claude-3-5-sonnet (#3135)
* 🧠 feat: claude-3-5-sonnet

* chore: bump data-provider
2024-06-20 20:48:15 -04:00
Danny Avila
638ac5bba6
🚀 feat: gpt-4o (#2692)
* 🚀 feat: gpt-4o

* update readme.md

* feat: Add new test case for getMultiplier function

* feat: Refactor getMultiplier function to use valueKey variable
2024-05-13 14:25:02 -04:00
Danny Avila
3df4fac118
v0.7.1 (#2502)
* chore: make openai package definition explicit

*  v0.7.1

* chore: gpt-4-vision correct context length

* add `llava` to vision models list
2024-04-23 08:57:20 -04:00
Danny Avila
9d854dac07
🤖 feat: Gemini 1.5 Support (+Vertex AI) (#2383)
* WIP: gemini-1.5 support

* feat: extended vertex ai support

* fix: handle possibly undefined modelName

* fix: gpt-4-turbo-preview invalid vision model

* feat: specify `fileConfig.imageOutputType` and make PNG default image conversion type

* feat: better truncation for errors including base64 strings

* fix: gemini inlineData formatting

* feat: RAG augmented prompt for gemini-1.5

* feat: gemini-1.5 rates and token window

* chore: adjust tokens, update docs, update vision Models

* chore: add back `ChatGoogleVertexAI` for chat models via vertex ai

* refactor: ask/edit controllers to not use `unfinished` field for google endpoint

* chore: remove comment

* chore(ci): fix AppService test

* chore: remove comment

* refactor(GoogleSearch): use `GOOGLE_SEARCH_API_KEY` instead, issue warning for old variable

* chore: bump data-provider to 0.5.4

* chore: update docs

* fix: condition for gemini-1.5 using generative ai lib

* chore: update docs

* ci: add additional AppService test for `imageOutputType`

* refactor: optimize new config value `imageOutputType`

* chore: bump CONFIG_VERSION

* fix(assistants): avatar upload
2024-04-16 08:32:40 -04:00
Danny Avila
cd7f3a51e1
🧠 feat: Cohere support as Custom Endpoint (#2328)
* chore: bump cohere-ai, fix firebase vulnerabilities by going down versions

* feat: cohere rates and context windows

* feat(createCoherePayload): transform openai payload for cohere compatibility

* feat: cohere backend support

* refactor(UnknownIcon): optimize icon render and add cohere

* docs: add cohere to Compatible AI Endpoints

* Update ai_endpoints.md
2024-04-05 15:19:41 -04:00
Danny Avila
8263ddda3f
🤖 feat(Anthropic): Claude 3 & Vision Support (#1984)
* chore: bump anthropic SDK

* chore: update anthropic config settings (fileSupport, default models)

* feat: anthropic multi modal formatting

* refactor: update vision models and use endpoint specific max long side resizing

* feat(anthropic): multimodal messages, retry logic, and messages payload

* chore: add more safety to trimming content due to whitespace error for assistant messages

* feat(anthropic): token accounting and resending multiple images in progress

* chore: bump data-provider

* feat(anthropic): resendImages feature

* chore: optimize Edit/Ask controllers, switch model back to req model

* fix: false positive of invalid model

* refactor(validateVisionModel): use object as arg, pass in additional/available models

* refactor(validateModel): use helper function, `getModelsConfig`

* feat: add modelsConfig to endpointOption so it gets passed to all clients, use for properly validating vision models

* refactor: initialize default vision model and make sure it's available before assigning it

* refactor(useSSE): avoid resetting model if user selected a new model between request and response

* feat: show rate in transaction logging

* fix: return tokenCountMap regardless of payload shape
2024-03-06 00:04:52 -05:00
Danny Avila
8479ac7293
🚀 feat: Support for GPT-3.5 Turbo/0125 Model (#1704)
* 🚀 feat: Support for GPT-3.5 Turbo/0125 Model

* ci: fix tx test
2024-02-02 01:01:11 -05:00
Danny Avila
30e143e96d
🪙 feat: Use OpenRouter Model Data for Token Cost and Context (#1703)
* feat: use openrouter data for model token cost/context

* chore: add ttl for tokenConfig and refetch models if cache expired
2024-02-02 00:42:11 -05:00
Danny Avila
fcbaa74e4a
🚀 feat: Support for GPT-4 Turbo/0125 Models (#1643) 2024-01-25 22:57:18 -05:00
Danny Avila
583e978a82
feat(Google): Support all Text/Chat Models, Response streaming, PaLM -> Google 🤖 (#1316)
* feat: update PaLM icons

* feat: add additional google models

* POC: formatting inputs for Vertex AI streaming

* refactor: move endpoints services outside of /routes dir to /services/Endpoints

* refactor: shorten schemas import

* refactor: rename PALM to GOOGLE

* feat: make Google editable endpoint

* feat: reusable Ask and Edit controllers based off Anthropic

* chore: organize imports/logic

* fix(parseConvo): include examples in googleSchema

* fix: google only allows odd number of messages to be sent

* fix: pass proxy to AnthropicClient

* refactor: change `google` altName to `Google`

* refactor: update getModelMaxTokens and related functions to handle maxTokensMap with nested endpoint model key/values

* refactor: google Icon and response sender changes (Codey and Google logo instead of PaLM in all cases)

* feat: google support for maxTokensMap

* feat: google updated endpoints with Ask/Edit controllers, buildOptions, and initializeClient

* feat(GoogleClient): now builds prompt for text models and supports real streaming from Vertex AI through langchain

* chore(GoogleClient): remove comments, left before for reference in git history

* docs: update google instructions (WIP)

* docs(apis_and_tokens.md): add images to google instructions

* docs: remove typo apis_and_tokens.md

* Update apis_and_tokens.md

* feat(Google): use default settings map, fully support context for both text and chat models, fully support examples for chat models

* chore: update more PaLM references to Google

* chore: move playwright out of workflows to avoid failing tests
2023-12-10 14:54:13 -05:00
Danny Avila
48c087cc06
chore: add token rate support for 11/06 models (#1146)
* chore: update model rates with 11/06 rates

* chore: add new models to env.example for OPENAI_MODELS

* chore: reference actual maxTokensMap in ci tests
2023-11-06 15:26:16 -05:00
Danny Avila
599d70f1de fix(getMultiplier): correct rate for gpt-4 context 2023-10-06 14:01:08 -04:00
Danny Avila
365c39c405
feat: Accurate Token Usage Tracking & Optional Balance (#1018)
* refactor(Chains/llms): allow passing callbacks

* refactor(BaseClient): accurately count completion tokens as generation only

* refactor(OpenAIClient): remove unused getTokenCountForResponse, pass streaming var and callbacks in initializeLLM

* wip: summary prompt tokens

* refactor(summarizeMessages): new cut-off strategy that generates a better summary by adding context from beginning, truncating the middle, and providing the end
wip: draft out relevant providers and variables for token tracing

* refactor(createLLM): make streaming prop false by default

* chore: remove use of getTokenCountForResponse

* refactor(agents): use BufferMemory as ConversationSummaryBufferMemory token usage not easy to trace

* chore: remove passing of streaming prop, also console log useful vars for tracing

* feat: formatFromLangChain helper function to count tokens for ChatModelStart

* refactor(initializeLLM): add role for LLM tracing

* chore(formatFromLangChain): update JSDoc

* feat(formatMessages): formats langChain messages into OpenAI payload format

* chore: install openai-chat-tokens

* refactor(formatMessage): optimize conditional langChain logic
fix(formatFromLangChain): fix destructuring

* feat: accurate prompt tokens for ChatModelStart before generation

* refactor(handleChatModelStart): move to callbacks dir, use factory function

* refactor(initializeLLM): rename 'role' to 'context'

* feat(Balance/Transaction): new schema/models for tracking token spend
refactor(Key): factor out model export to separate file

* refactor(initializeClient): add req,res objects to client options

* feat: add-balance script to add to an existing users' token balance
refactor(Transaction): use multiplier map/function, return balance update

* refactor(Tx): update enum for tokenType, return 1 for multiplier if no map match

* refactor(Tx): add fair fallback value multiplier incase the config result is undefined

* refactor(Balance): rename 'tokens' to 'tokenCredits'

* feat: balance check, add tx.js for new tx-related methods and tests

* chore(summaryPrompts): update prompt token count

* refactor(callbacks): pass req, res
wip: check balance

* refactor(Tx): make convoId a String type, fix(calculateTokenValue)

* refactor(BaseClient): add conversationId as client prop when assigned

* feat(RunManager): track LLM runs with manager, track token spend from LLM,
refactor(OpenAIClient): use RunManager to create callbacks, pass user prop to langchain api calls

* feat(spendTokens): helper to spend prompt/completion tokens

* feat(checkBalance): add helper to check, log, deny request if balance doesn't have enough funds
refactor(Balance): static check method to return object instead of boolean now
wip(OpenAIClient): implement use of checkBalance

* refactor(initializeLLM): add token buffer to assure summary isn't generated when subsequent payload is too large
refactor(OpenAIClient): add checkBalance
refactor(createStartHandler): add checkBalance

* chore: remove prompt and completion token logging from route handler

* chore(spendTokens): add JSDoc

* feat(logTokenCost): record transactions for basic api calls

* chore(ask/edit): invoke getResponseSender only once per API call

* refactor(ask/edit): pass promptTokens to getIds and include in abort data

* refactor(getIds -> getReqData): rename function

* refactor(Tx): increase value if incomplete message

* feat: record tokenUsage when message is aborted

* refactor: subtract tokens when payload includes function_call

* refactor: add namespace for token_balance

* fix(spendTokens): only execute if corresponding token type amounts are defined

* refactor(checkBalance): throws Error if not enough token credits

* refactor(runTitleChain): pass and use signal, spread object props in create helpers, and use 'call' instead of 'run'

* fix(abortMiddleware): circular dependency, and default to empty string for completionTokens

* fix: properly cancel title requests when there isn't enough tokens to generate

* feat(predictNewSummary): custom chain for summaries to allow signal passing
refactor(summaryBuffer): use new custom chain

* feat(RunManager): add getRunByConversationId method, refactor: remove run and throw llm error on handleLLMError

* refactor(createStartHandler): if summary, add error details to runs

* fix(OpenAIClient): support aborting from summarization & showing error to user
refactor(summarizeMessages): remove unnecessary operations counting summaryPromptTokens and note for alternative, pass signal to summaryBuffer

* refactor(logTokenCost -> recordTokenUsage): rename

* refactor(checkBalance): include promptTokens in errorMessage

* refactor(checkBalance/spendTokens): move to models dir

* fix(createLanguageChain): correctly pass config

* refactor(initializeLLM/title): add tokenBuffer of 150 for balance check

* refactor(openAPIPlugin): pass signal and memory, filter functions by the one being called

* refactor(createStartHandler): add error to run if context is plugins as well

* refactor(RunManager/handleLLMError): throw error immediately if plugins, don't remove run

* refactor(PluginsClient): pass memory and signal to tools, cleanup error handling logic

* chore: use absolute equality for addTitle condition

* refactor(checkBalance): move checkBalance to execute after userMessage and tokenCounts are saved, also make conditional

* style: icon changes to match official

* fix(BaseClient): getTokenCountForResponse -> getTokenCount

* fix(formatLangChainMessages): add kwargs as fallback prop from lc_kwargs, update JSDoc

* refactor(Tx.create): does not update balance if CHECK_BALANCE is not enabled

* fix(e2e/cleanUp): cleanup new collections, import all model methods from index

* fix(config/add-balance): add uncaughtException listener

* fix: circular dependency

* refactor(initializeLLM/checkBalance): append new generations to errorMessage if cost exceeds balance

* fix(handleResponseMessage): only record token usage in this method if not error and completion is not skipped

* fix(createStartHandler): correct condition for generations

* chore: bump postcss due to moderate severity vulnerability

* chore: bump zod due to low severity vulnerability

* chore: bump openai & data-provider version

* feat(types): OpenAI Message types

* chore: update bun lockfile

* refactor(CodeBlock): add error block formatting

* refactor(utils/Plugin): factor out formatJSON and cn to separate files (json.ts and cn.ts), add extractJSON

* chore(logViolation): delete user_id after error is logged

* refactor(getMessageError -> Error): change to React.FC, add token_balance handling, use extractJSON to determine JSON instead of regex

* fix(DALL-E): use latest openai SDK

* chore: reorganize imports, fix type issue

* feat(server): add balance route

* fix(api/models): add auth

* feat(data-provider): /api/balance query

* feat: show balance if checking is enabled, refetch on final message or error

* chore: update docs, .env.example with token_usage info, add balance script command

* fix(Balance): fallback to empty obj for balance query

* style: slight adjustment of balance element

* docs(token_usage): add PR notes
2023-10-05 18:34:10 -04:00