LibreChat/docs/install/configuration/ai_setup.md
Danny Avila ecd63eb9f1
feat: Assistants API, General File Support, Side Panel, File Explorer (#1696)
2024-02-13 20:42:27 -05:00


title: 🤖 AI Setup
description: This doc explains how to set up your AI providers, their APIs and credentials.
weight: -8

AI Setup

This doc explains how to set up your AI providers, their APIs and credentials.

"Endpoints" refer to the AI provider, configuration or API to use, which determines what models and settings are available for the current chat request.

For example, OpenAI, Google, Plugins, Azure OpenAI, and Anthropic are all different "endpoints". Since OpenAI was the first supported endpoint, it's listed first by default.

Using the default environment values from /.env.example will enable several endpoints, with credentials to be provided on a per-user basis from the web app. Alternatively, you can provide credentials for all users of your instance.

This guide will walk you through setting up each Endpoint as needed.

For custom endpoint configuration, such as adding Mistral AI or OpenRouter, refer to the librechat.yaml configuration guide.

Reminder: If you use Docker, you should rebuild the Docker image (here's how) each time you update your credentials.

Note: Configuring pre-made Endpoint/model/conversation settings as singular options for your users is a planned feature. See the related discussion here: System-wide custom model settings (lightweight GPTs) #1291

General

Free AI APIs

Setting a Default Endpoint

If you have multiple endpoints set up but want a specific one to be first in the order, set the following environment variable.

# .env file
# No spaces between values
ENDPOINTS=azureOpenAI,openAI,google

Note that LibreChat will use your last selected endpoint when creating a new conversation. So if Azure OpenAI is first in the order, but you last used or viewed an OpenAI conversation, when you hit "New Chat," OpenAI will be selected with its default conversation settings.

To override this behavior, you need a preset and you need to set that specific preset as the default one to use on every new chat.
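The selection order described above can be sketched in Python. This is only an illustration of the documented behavior, not LibreChat's actual code: the function names and the `last_selected` and `default_preset_endpoint` parameters are hypothetical.

```python
from typing import List, Optional


def parse_endpoints(value: str) -> List[str]:
    """Split an ENDPOINTS value (comma-separated, no spaces) into an ordered list."""
    return [name for name in value.split(",") if name]


def endpoint_for_new_chat(
    endpoints: List[str],
    last_selected: Optional[str] = None,
    default_preset_endpoint: Optional[str] = None,
) -> str:
    """Pick the endpoint a new conversation starts with (hypothetical helper)."""
    # A default preset, if one is set, always wins.
    if default_preset_endpoint in endpoints:
        return default_preset_endpoint
    # Otherwise the last selected endpoint is reused.
    if last_selected in endpoints:
        return last_selected
    # With no history, the first endpoint in the ENDPOINTS order is used.
    return endpoints[0]
```

With `ENDPOINTS=azureOpenAI,openAI,google`, a user whose last conversation used `openAI` starts a new chat on `openAI` unless a default preset points elsewhere.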

Setting a Default Preset

See the Presets Guide for more details

A preset refers to a specific set of Endpoint/Model/Conversation settings that you can save.

The default preset will always be used when creating a new conversation.

Here's a video to demonstrate: Setting a Default Preset


OpenAI

To get your OpenAI API key, you need to:

  • Go to https://platform.openai.com/account/api-keys
  • Create an account or log in with your existing one
  • Add a payment method to your account (this is not free, sorry 😬)
  • Copy your secret key (sk-...) and save it in ./.env as OPENAI_API_KEY

Notes:

  • Selecting a vision model for messages with attachments is not necessary, as it will be switched behind the scenes for you. If you didn't explicitly select a vision model, it will only be used for the vision request, and you should still see the non-vision model you had selected after the request succeeds.
  • OpenAI Vision models allow for messages without attachments.

Assistants

ASSISTANTS_API_KEY=your-key
  • You can determine which models you would like to have available with ASSISTANTS_MODELS; otherwise, the models list fetched from OpenAI will be used (only Assistants API compatible models will be shown).
# without spaces
ASSISTANTS_MODELS=gpt-3.5-turbo-0125,gpt-3.5-turbo-16k-0613,gpt-3.5-turbo-16k,gpt-3.5-turbo,gpt-4,gpt-4-0314,gpt-4-32k-0314,gpt-4-0613,gpt-3.5-turbo-0613,gpt-3.5-turbo-1106,gpt-4-0125-preview,gpt-4-turbo-preview,gpt-4-1106-preview
  • If necessary, you can also set an alternate base URL instead of the official one with ASSISTANTS_BASE_URL, which is similar to the OpenAI counterpart OPENAI_REVERSE_PROXY
ASSISTANTS_BASE_URL=http://your-alt-baseURL:3080/
  • Additional, optional configuration, such as disabling the assistant builder UI, is available via the librechat.yaml custom config file, depending on your needs:
    • Control the visibility and use of the builder interface for assistants. More info
    • Specify the polling interval in milliseconds for checking run updates or changes in assistant run states. More info
    • Set the timeout period in milliseconds for assistant runs. Helps manage system load by limiting total run operation time. More info
    • Specify which assistant IDs are supported or excluded. More info

Notes:

  • At the time of writing, only the following models support the Retrieval capability:
    • gpt-3.5-turbo-0125
    • gpt-4-0125-preview
    • gpt-4-turbo-preview
    • gpt-4-1106-preview
    • gpt-3.5-turbo-1106
  • Vision capability is not yet supported.

Anthropic


Google

For the Google Endpoint, you can either use the Generative Language API (for Gemini models), or the Vertex AI API (for PaLM2 & Codey models, Gemini support coming soon).

The Generative Language API uses an API key, which you can get from Google AI Studio.

For Vertex AI, you need a Service Account JSON key file, with appropriate access configured.

Instructions for both are given below.

Generative Language API (Gemini)

Up to 60 Gemini requests per minute are currently free, until the API enters general availability early next year.

⚠️ Google will be using that free input/output to help improve the model, with data de-identified from your Google Account and API key. ⚠️ During this period, your messages “may be accessible to trained reviewers.”

To use Gemini models, you'll need an API key. If you don't already have one, create a key in Google AI Studio.

Get an API key here: makersuite.google.com

Once you have your key, provide the key in your .env file, which allows all users of your instance to use it.

GOOGLE_KEY=mY_SeCreT_w9347w8_kEY

Or, you can make users provide it from the frontend by setting the following:

GOOGLE_KEY=user_provided
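The two modes above (one shared key versus a key supplied by each user) can be sketched as follows. This is a minimal illustration, not LibreChat's implementation; `resolve_google_key` and its `user_key` parameter are hypothetical names.

```python
from typing import Mapping, Optional


def resolve_google_key(env: Mapping[str, str], user_key: Optional[str] = None) -> str:
    """Return the key to use for a request (hypothetical helper)."""
    configured = env.get("GOOGLE_KEY", "")
    if configured == "user_provided":
        # Each user must supply their own key from the frontend.
        if not user_key:
            raise ValueError("GOOGLE_KEY is user_provided but no user key was given")
        return user_key
    if not configured:
        raise ValueError("GOOGLE_KEY is not set")
    # A key in the .env file is shared by all users of the instance.
    return configured
```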

Notes:

  • PaLM2 and Codey models cannot be accessed through the Generative Language API, only through Vertex AI.
  • Selecting gemini-pro-vision for messages with attachments is not necessary as it will be switched behind the scenes for you
  • Since gemini-pro-vision does not accept non-attachment messages, messages without attachments are automatically switched to use gemini-pro (otherwise, Google responds with an error)
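The automatic switching described in the notes can be pictured with a small Python sketch. It is an assumption-laden illustration rather than the actual code: the function name is hypothetical, and leaving every other model untouched is an assumption.

```python
def google_model_for_request(selected_model: str, has_attachments: bool) -> str:
    """Choose the Gemini model for one message (hypothetical helper)."""
    vision_model = "gemini-pro-vision"
    text_model = "gemini-pro"
    # Messages with attachments are sent to the vision model.
    if has_attachments and selected_model == text_model:
        return vision_model
    # The vision model rejects messages without attachments.
    if not has_attachments and selected_model == vision_model:
        return text_model
    # Assumption: any other model is used as selected.
    return selected_model
```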

Setting GOOGLE_KEY=user_provided in your .env file will configure both the Vertex AI Service Account JSON key file and the Generative Language API key to be provided from the frontend like so:

image

Vertex AI (PaLM 2 & Codey)

To set up Google LLMs (via Google Cloud Vertex AI), first sign up for Google Cloud: cloud.google.com

You can usually get $300 starting credit, which makes this option free for 90 days.

1. Once signed up, enable the Vertex AI API on Google Cloud:

2. Create a Service Account with Vertex AI role:

  • Click here to create a Service Account
  • Select or create a project
  • Enter a service account ID (required), name and description are optional

    • image
  • Click on "Create and Continue" to give at least the "Vertex AI User" role

    • image
  • Click on "Continue/Done"

3. Create a JSON key to Save in your Project Directory:

  • Go back to the Service Accounts page
  • Select your service account
  • Click on "Keys"

    • image
  • Click on "Add Key" and then "Create new key"

    • image
  • Choose JSON as the key type and click on "Create"
  • Download the key file and rename it as 'auth.json'
  • Save it within the project directory, in /api/data/
    • image

Saving your JSON key file in the project directory allows all users of your LibreChat instance to use it.

Alternatively, you can make users provide it from the frontend by setting the following:

# Note: this configures both the Vertex AI Service Account JSON key file
# and the Generative Language API key to be provided from the frontend.
GOOGLE_KEY=user_provided

Note: Using Gemini models through Vertex AI is possible but not yet supported.


Azure OpenAI

In order to use Azure OpenAI with this project, specific environment variables must be set in your .env file. These variables will be used for constructing the API URLs.

The variables needed are outlined below:

Required Variables

These variables construct the API URL for Azure OpenAI.

  • AZURE_API_KEY: Your Azure OpenAI API key.
  • AZURE_OPENAI_API_INSTANCE_NAME: The instance name of your Azure OpenAI API.
  • AZURE_OPENAI_API_DEPLOYMENT_NAME: The deployment name of your Azure OpenAI API.
  • AZURE_OPENAI_API_VERSION: The version of your Azure OpenAI API.

For example, with these variables, the URL for chat completion would look something like:

https://{AZURE_OPENAI_API_INSTANCE_NAME}.openai.azure.com/openai/deployments/{AZURE_OPENAI_API_DEPLOYMENT_NAME}/chat/completions?api-version={AZURE_OPENAI_API_VERSION}
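As a minimal sketch, the URL template above can be filled in from the environment variables like this. The helper name and the sample values in the usage note are hypothetical. The API key is sent separately with the request and is not part of the URL.

```python
from typing import Mapping


def azure_chat_completions_url(env: Mapping[str, str]) -> str:
    """Build the chat completions URL from the Azure variables (hypothetical helper)."""
    instance = env["AZURE_OPENAI_API_INSTANCE_NAME"]
    deployment = env["AZURE_OPENAI_API_DEPLOYMENT_NAME"]
    version = env["AZURE_OPENAI_API_VERSION"]
    return (
        f"https://{instance}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={version}"
    )
```

For example, passing `os.environ` (after `import os`) builds the URL from the variables set in your `.env` file, provided they have been loaded into the process environment.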

You should also consider changing the AZURE_OPENAI_MODELS variable to the models available in your deployment.

# .env file
AZURE_OPENAI_MODELS=gpt-4-1106-preview,gpt-4,gpt-3.5-turbo,gpt-3.5-turbo-1106,gpt-4-vision-preview

Overriding the construction of the API URL will be possible but is not yet implemented. Follow progress on this feature here: Issue #1266

Model Deployments

Note: a change will be developed to improve the current configuration settings, so that multiple deployment/model configurations can be set up with ease: #1390

As of 2023-12-18, the Azure API allows only one model per deployment.

It's highly recommended to name your deployments after the model name (e.g., "gpt-3.5-turbo") for easy deployment switching.

When you do so, LibreChat will correctly switch the deployment, while associating the correct max context per model, if you have the following environment variable set:

AZURE_USE_MODEL_AS_DEPLOYMENT_NAME=TRUE

For example, when you have set AZURE_USE_MODEL_AS_DEPLOYMENT_NAME=TRUE, the following deployment configuration provides the most seamless, error-free experience for LibreChat, including Vision support and tracking the correct max context tokens:

Screenshot 2023-12-18 111742

Alternatively, you can use custom deployment names and set AZURE_OPENAI_DEFAULT_MODEL for expected functionality.

  • AZURE_OPENAI_MODELS: List the available models, separated by commas without spaces. The first listed model will be the default. If left blank, internal settings will be used. Note that deployment names can't have periods, which are removed when generating the endpoint.

Example use:

# .env file
AZURE_OPENAI_MODELS=gpt-3.5-turbo,gpt-4,gpt-5

  • AZURE_USE_MODEL_AS_DEPLOYMENT_NAME: Enable using the model name as the deployment name for the API URL.

Example use:

# .env file
AZURE_USE_MODEL_AS_DEPLOYMENT_NAME=TRUE

Setting a Default Model for Azure

This section is relevant when you are not naming deployments after model names as shown above.

Important: The Azure OpenAI API does not use the model field in the payload, but the field is a necessary identifier for LibreChat. If your deployment names do not correspond to the model names, and you're having issues with the model not being recognized, you should set this field to explicitly tell LibreChat to treat your Azure OpenAI API requests as if the specified model was selected.

If AZURE_USE_MODEL_AS_DEPLOYMENT_NAME is enabled, the model you set with AZURE_OPENAI_DEFAULT_MODEL will not be recognized and will not be used as the deployment name; instead, it will use the model selected by the user as the "deployment" name.

  • AZURE_OPENAI_DEFAULT_MODEL: Override the model setting for Azure, useful if using custom deployment names.

Example use:

# .env file
# MUST be a real OpenAI model, named exactly how it is recognized by OpenAI API (not Azure)
AZURE_OPENAI_DEFAULT_MODEL=gpt-3.5-turbo # do include periods in the model name here
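The interaction between AZURE_USE_MODEL_AS_DEPLOYMENT_NAME and AZURE_OPENAI_DEFAULT_MODEL can be summarized in a short Python sketch. It follows the rules stated in this guide but is not LibreChat's code: `resolve_azure_names` and `is_enabled` are hypothetical helpers, and the case-insensitive "true" check is an assumption.

```python
from typing import Mapping, Optional, Tuple


def is_enabled(value: Optional[str]) -> bool:
    """Treat "TRUE"/"true" as enabled (assumed parsing rule)."""
    return (value or "").strip().lower() == "true"


def resolve_azure_names(selected_model: str, env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (deployment_name, model) for a request (hypothetical helper)."""
    if is_enabled(env.get("AZURE_USE_MODEL_AS_DEPLOYMENT_NAME")):
        # The selected model doubles as the deployment name; periods are
        # removed because deployment names cannot contain them.
        # AZURE_OPENAI_DEFAULT_MODEL is ignored in this mode.
        return selected_model.replace(".", ""), selected_model
    # Custom deployment name: the default model, if set, tells LibreChat
    # which model the deployment corresponds to.
    deployment = env["AZURE_OPENAI_API_DEPLOYMENT_NAME"]
    model = env.get("AZURE_OPENAI_DEFAULT_MODEL") or selected_model
    return deployment, model
```

With model-as-deployment-name enabled, selecting `gpt-3.5-turbo` targets a deployment named `gpt-35-turbo`. With a custom deployment name, the default model only changes which model LibreChat assumes, not which deployment is called.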

Using a Specified Base URL with Azure

The base URL for Azure OpenAI API requests can be dynamically configured. This is useful for proxying services such as Cloudflare AI Gateway, or if you wish to explicitly override the baseURL handling of the app.

LibreChat will use the AZURE_OPENAI_BASEURL environment variable, which can include placeholders for the Azure OpenAI API instance and deployment names.

In the application's environment configuration, the base URL is set like this:

# .env file
AZURE_OPENAI_BASEURL=https://example.azure-api.net/${INSTANCE_NAME}/${DEPLOYMENT_NAME}

# OR
AZURE_OPENAI_BASEURL=https://${INSTANCE_NAME}.openai.azure.com/openai/deployments/${DEPLOYMENT_NAME}

# Cloudflare example
AZURE_OPENAI_BASEURL=https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/azure-openai/${INSTANCE_NAME}/${DEPLOYMENT_NAME}

The application replaces ${INSTANCE_NAME} and ${DEPLOYMENT_NAME} in AZURE_OPENAI_BASEURL with the instance and deployment names, which are determined according to the other settings discussed in this guide.
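To preview what a templated base URL would look like after substitution, the replacement can be sketched in shell. This is only an illustration of the behavior described above, not LibreChat's own code: the function name resolve_base_url is hypothetical, and the instance and deployment values are made-up examples.

```shell
# Hypothetical helper (not part of LibreChat): fills the two documented
# placeholders in a base URL template.
#   $1 = template, $2 = instance name, $3 = deployment name
# Assumes the names contain no "|" or "&" characters, which sed would
# treat specially here.
resolve_base_url() {
  printf '%s\n' "$1" |
    sed -e 's|\${INSTANCE_NAME}|'"$2"'|g' \
        -e 's|\${DEPLOYMENT_NAME}|'"$3"'|g'
}

# Single quotes keep the shell from expanding the placeholders itself.
resolve_base_url 'https://example.azure-api.net/${INSTANCE_NAME}/${DEPLOYMENT_NAME}' instance-1 deployment-1
# prints: https://example.azure-api.net/instance-1/deployment-1
```

Comparing the printed URL against the endpoint shown in your Azure portal or gateway dashboard is a quick way to catch a misplaced path segment before restarting LibreChat.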

You can also omit the placeholders completely and simply construct the baseURL with your credentials:

# .env file
AZURE_OPENAI_BASEURL=https://instance-1.openai.azure.com/openai/deployments/deployment-1

# Cloudflare example
AZURE_OPENAI_BASEURL=https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/azure-openai/instance-1/deployment-1

Setting these values will override all of the application's internal handling of the instance and deployment names and use your specified base URL.

Notes:

  • You should still provide the AZURE_OPENAI_API_VERSION and AZURE_API_KEY via the .env file as they are programmatically added to the requests.
  • When specifying instance and deployment names in the AZURE_OPENAI_BASEURL, their respective environment variables can be omitted (AZURE_OPENAI_API_INSTANCE_NAME and AZURE_OPENAI_API_DEPLOYMENT_NAME) except for use with Plugins.
  • Specifying instance and deployment names in the AZURE_OPENAI_BASEURL instead of placeholders creates conflicts with "plugins," "vision," "default-model," and "model-as-deployment-name" support.
  • Due to the conflicts that arise with other features, it is recommended to use placeholders for the instance and deployment names in the AZURE_OPENAI_BASEURL.

Enabling Auto-Generated Titles with Azure

The default titling model is set to gpt-3.5-turbo.

If you're using AZURE_USE_MODEL_AS_DEPLOYMENT_NAME and have "gpt-35-turbo" set up as a deployment name, this should work out of the box.

In any case, you can adjust the title model as such: OPENAI_TITLE_MODEL=your-title-model

Using GPT-4 Vision with Azure

Currently, the best way to set up Vision is to use your deployment names as the model names, as shown above with AZURE_USE_MODEL_AS_DEPLOYMENT_NAME.

This will work as seamlessly as it does with the OpenAI endpoint: there is no need to select the vision model, as it will be switched behind the scenes.

Alternatively, you can set the required variables to explicitly use your vision deployment, but this may limit you to exclusively using your vision deployment for all Azure chat settings.

Notes:

  • If using AZURE_OPENAI_BASEURL, you should not specify instance and deployment names instead of placeholders as the vision request will fail.
  • As of December 18th, 2023, Vision models seem to have degraded performance with Azure OpenAI when compared to OpenAI.

Note: A change is planned to improve the current configuration settings so that multiple deployment/model configurations can be set up with ease: #1390

Generate images with Azure OpenAI Service (DALL-E)

| Model ID | Feature Availability | Max Request (characters) |
|----------|----------------------|--------------------------|
| dalle2   | East US              | 1000                     |
| dalle3   | Sweden Central       | 4000                     |

  • First you need to create an Azure resource that hosts DALL-E
    • At the time of writing, dall-e-3 is available in the SwedenCentral region, dall-e-2 in the EastUS region.
  • Then, you need to deploy the image generation model in one of the above regions.
  • Configure your environment variables based on Azure credentials:

- For DALL-E-3:

DALLE3_AZURE_API_VERSION=the-api-version # e.g.: 2023-12-01-preview
DALLE3_BASEURL=https://<AZURE_OPENAI_API_INSTANCE_NAME>.openai.azure.com/openai/deployments/<DALLE3_DEPLOYMENT_NAME>/
DALLE3_API_KEY=your-azure-api-key-for-dall-e-3

- For DALL-E-2:

DALLE2_AZURE_API_VERSION=the-api-version # e.g.: 2023-12-01-preview
DALLE2_BASEURL=https://<AZURE_OPENAI_API_INSTANCE_NAME>.openai.azure.com/openai/deployments/<DALLE2_DEPLOYMENT_NAME>/
DALLE2_API_KEY=your-azure-api-key-for-dall-e-2
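As a convenience, the base URL line can be assembled in a shell before pasting it into the .env file. The sketch below uses made-up values (my-instance, my-dalle3) and treats DALLE3_DEPLOYMENT_NAME purely as a scratch variable mirroring the placeholder in the template above; it prints the finished line rather than assuming the .env loader expands variables.

```shell
# Hypothetical example values; replace them with your own Azure resource names.
AZURE_OPENAI_API_INSTANCE_NAME=my-instance
DALLE3_DEPLOYMENT_NAME=my-dalle3   # scratch variable for this sketch only

# Build the URL in the same shape as the DALLE3_BASEURL template above.
DALLE3_BASEURL="https://${AZURE_OPENAI_API_INSTANCE_NAME}.openai.azure.com/openai/deployments/${DALLE3_DEPLOYMENT_NAME}/"

# Print the line to copy into the .env file.
printf 'DALLE3_BASEURL=%s\n' "$DALLE3_BASEURL"
```

The same pattern applies to DALLE2_BASEURL with the DALL-E-2 deployment name.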

DALL-E Notes:

  • For DALL-E-3, the default system prompt has the LLM prefer the "vivid" style parameter, which seems to be the preferred setting for ChatGPT as "natural" can sometimes produce lackluster results.
  • See official prompt for reference: DALL-E System Prompt
  • You can adjust the system prompts to your liking:
DALLE3_SYSTEM_PROMPT="Your DALL-E-3 System Prompt here"
DALLE2_SYSTEM_PROMPT="Your DALL-E-2 System Prompt here"
  • The DALLE_REVERSE_PROXY environment variable is ignored when Azure credentials (DALLEx_AZURE_API_VERSION and DALLEx_BASEURL) for DALL-E are configured.

Optional Variables

These two variables are optional and currently not used by LibreChat, but they may be used in future updates of this project.

  • AZURE_OPENAI_API_COMPLETIONS_DEPLOYMENT_NAME: The deployment name for completion.
  • AZURE_OPENAI_API_EMBEDDINGS_DEPLOYMENT_NAME: The deployment name for embedding.

Using Plugins with Azure

Note: To use the Plugins endpoint with Azure OpenAI, you need a deployment that supports function calling. Otherwise, you need to set "Functions" off in the Agent settings. When you are not using "functions" mode, it's recommended to turn "skip completion" off as well, so that the completion step, which reviews what the agent generated, still runs.

To use Azure with the Plugins endpoint, make sure the following environment variables are set:

  • PLUGINS_USE_AZURE: If set to "true" or any truthy value, this will enable the program to use Azure with the Plugins endpoint.
  • AZURE_API_KEY: Your Azure API key must be set with an environment variable.
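Put together, a minimal .env fragment for the two variables listed above might look like the following. The key value is a placeholder, and your remaining Azure settings (instance, deployment, and API version) are assumed to be configured as described earlier in this guide.

```shell
# .env file
PLUGINS_USE_AZURE="true"             # any truthy value enables Azure for the Plugins endpoint
AZURE_API_KEY=your-azure-api-key     # placeholder, use your real key
```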

Important:

  • If using AZURE_OPENAI_BASEURL, you should not specify instance and deployment names instead of placeholders as the plugin request will fail.

OpenRouter

OpenRouter is a legitimate proxy service to a multitude of LLMs, both closed and open source, including:

  • OpenAI models (great if you are barred from their API for whatever reason)
  • Anthropic Claude models (same as above)
  • Meta's Llama models
  • pygmalionai/mythalion-13b
  • and many more open source models. Newer integrations are usually discounted, too!

See their available models and pricing here: Supported Models

OpenRouter is integrated into LibreChat by overriding the OpenAI endpoint.

Important: As of v0.6.6, you can use OpenRouter as its own standalone endpoint:

Review the Custom Config Guide to add an OpenRouter endpoint.

Setup (legacy):

  • Sign up for OpenRouter and create a key. You should name it and set a limit as well.
  • Set the environment variable OPENROUTER_API_KEY in your .env file to the key you just created.
  • Set OPENAI_API_KEY to some value. It can be anything, but do not leave it blank or set it to user_provided.
  • Restart your LibreChat server and use the OpenAI or Plugins endpoints.
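The legacy steps above amount to a .env fragment along these lines. Both values are placeholders chosen for illustration; the only requirement taken from the steps is that OPENAI_API_KEY is non-empty and not user_provided.

```shell
# .env file (legacy OpenRouter setup)
OPENROUTER_API_KEY=your-openrouter-key   # placeholder, use the key you created
OPENAI_API_KEY=placeholder               # any non-empty value except user_provided
```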

Notes (legacy):

  • This will override the official OpenAI API or your reverse proxy settings for both Plugins and OpenAI.
  • On initial setup, you may need to refresh your page twice to see all their supported models populate automatically.
  • Plugins: Functions Agent works with OpenRouter when using OpenAI models.
  • Plugins: Turn functions off to try plugins with non-OpenAI models (ChatGPT plugins will not work and others may not work as expected).
  • Plugins: Make sure PLUGINS_USE_AZURE is not set in your .env file when wanting to use OpenRouter and you have Azure configured.

Unofficial APIs

Important: Stability for Unofficial APIs is not guaranteed. Access methods to these APIs are hacky, prone to errors and patching, and are marked lowest in priority in LibreChat's development.

ChatGPTBrowser

Backend Access to https://chat.openai.com/api

This is not to be confused with OpenAI's Official API!

Note that this is disabled by default and requires additional configuration to work. Also, using this may expose your data to third parties if you use a proxy, and OpenAI may flag your account. See: ChatGPT Reverse Proxy

To get your Access token for ChatGPT Browser Access, you need to:

Warning: There may be a chance of your account being banned if you deploy the app to multiple users with this method. Use at your own risk.


BingAI

I recommend using Microsoft Edge for this:

  • Navigate to Bing Chat
  • Login if you haven't already
  • Initiate a conversation with Bing
  • Open Dev Tools, usually with F12 or Ctrl + Shift + C
  • Navigate to the Network tab
  • Look for lsp.asx (if it's not there look into the other entries for one with a very long cookie)
  • Copy the whole cookie value. (Yes it's very long 😉)
  • Use this "full cookie string" for your "BingAI Token"

copilot-gpt4-service

For this setup, an additional Docker container will need to be set up.

It is necessary to obtain your token first.

Follow the instructions provided at copilot-gpt4-service#obtaining-token and keep your token for use within the service. More detailed instructions for setting up copilot-gpt4-service are available at the GitHub repo.

It is not recommended to use the Copilot token directly; instead, use the SUPER_TOKEN variable. (You can generate your own SUPER_TOKEN with the OpenSSL command openssl rand -hex 16 and set the ENABLE_SUPER_TOKEN variable to true.)
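For instance, the token can be generated and printed in the form expected by the service's environment settings. This is a small sketch that assumes the openssl command-line tool is installed; the printed variable names are the ones described above.

```shell
# Generate a random 32-character hex token (16 random bytes).
SUPER_TOKEN="$(openssl rand -hex 16)"

# Print the values to reuse in the docker run command or compose file.
printf 'SUPER_TOKEN=%s\n' "$SUPER_TOKEN"
printf 'ENABLE_SUPER_TOKEN=true\n'
```

Keep the generated value somewhere safe, since the same token is later used as the apiKey in librechat.yaml.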

  1. Once your Docker environment is ready and your tokens are generated, proceed with this Docker run command to start the service:

    docker run -d \
      --name copilot-gpt4-service \
      -e HOST=0.0.0.0 \
      -e COPILOT_TOKEN=ghp_xxxxxxx \
      -e SUPER_TOKEN=your_super_token \
      -e ENABLE_SUPER_TOKEN=true \
      --restart always \
      -p 8080:8080 \
      aaamoon/copilot-gpt4-service:latest
    
  2. For Docker Compose users, use the equivalent yaml configuration provided below:

    version: '3.8'
    services:
      copilot-gpt4-service:
        image: aaamoon/copilot-gpt4-service:latest
        environment:
          - HOST=0.0.0.0
          - COPILOT_TOKEN=ghp_xxxxxxx # Default GitHub Copilot Token, if this item is set, the Token carried with the request will be ignored. Default is empty.
          - SUPER_TOKEN=your_super_token # Super Token is a user-defined standalone token that can access COPILOT_TOKEN above. This allows you to share the service without exposing your COPILOT_TOKEN. Multiple tokens are separated by commas. Default is empty.
          - ENABLE_SUPER_TOKEN=true # Whether to enable SUPER_TOKEN, default is false. If false, but COPILOT_TOKEN is not empty, COPILOT_TOKEN will be used without any authentication for all requests.
        ports:
          - 8080:8080
        restart: unless-stopped
        container_name: copilot-gpt4-service
    
  3. After setting up the Docker container for copilot-gpt4-service, you can add it to your librechat.yaml configuration. Here is an example configuration:

    version: 1.0.1
    cache: true
    endpoints:
      custom:
        - name: "OpenAI via Copilot"
          apiKey: "your_super_token"
          baseURL: "http://[copilotgpt4service_host_ip]:8080/v1"
          models:
            default: ["gpt-4", "gpt-3.5-turbo"] # *See Notes
          titleConvo: true
          titleModel: "gpt-3.5-turbo"
          summarize: true
          summaryModel: "gpt-3.5-turbo"
          forcePrompt: false
          modelDisplayLabel: "OpenAI"
          dropParams: ["user"]
    

    Replace your_super_token with the token you obtained following the instructions highlighted above and [copilotgpt4service_host_ip] with the IP of your Docker host. **See Notes

    Restart LibreChat after adding the needed configuration, and select OpenAI via Copilot to start using it!

    Notes:

    • *Only allowed models are gpt-4 and gpt-3.5-turbo.
    • **Advanced users can add this to their existing docker-compose file/existing docker network and avoid having to expose port 8080 (or any port) to the copilot-gpt4-service container.

Conclusion

That's it! You're all set. 🎉


⚠️ Note: If you're having trouble, before creating a new issue, please search for similar ones in the #issues thread on our Discord or in the troubleshooting discussion on our Discussions page. If you don't find a relevant issue, feel free to create a new one and provide as much detail as possible.