LibreChat/docs/install/configuration/litellm.md

---
title: 🚅 LiteLLM
description: Using LibreChat with LiteLLM Proxy 
weight: -7
---

# Using LibreChat with LiteLLM Proxy 
Use **[LiteLLM Proxy](https://docs.litellm.ai/docs/simple_proxy)** for: 
* Calling 100+ LLMs Huggingface/Bedrock/TogetherAI/etc. in the OpenAI ChatCompletions & Completions format
* Load balancing - between Multiple Models + Deployments of the same model LiteLLM proxy can handle 1k+ requests/second during load tests
* Authentication & Spend Tracking Virtual Keys

## Start LiteLLM Proxy Server 
### Pip install litellm 
```shell
pip install litellm
```

### Create a config.yaml for litellm proxy 
More information on LiteLLM configurations here: **[docs.litellm.ai/docs/simple_proxy](https://docs.litellm.ai/docs/simple_proxy)**

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-eu
      api_base: https://my-endpoint-europe-berri-992.openai.azure.com/
      api_key: 
      rpm: 6      # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: 
      rpm: 6
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-large
      api_base: https://openai-france-1234.openai.azure.com/
      api_key: 
      rpm: 1440
```

### Start the proxy
```shell
litellm --config /path/to/config.yaml

#INFO: Proxy running on http://0.0.0.0:8000
```

## Use LiteLLM Proxy Server with LibreChat


#### 1. Clone the repo
```shell
git clone https://github.com/danny-avila/LibreChat.git
```


#### 2. Modify Librechat's `docker-compose.yml`
```yaml
OPENAI_REVERSE_PROXY=http://host.docker.internal:8000/v1/chat/completions
```

**Important**: As of v0.6.6, it's recommend you use the `librechat.yaml` [Configuration file (guide here)](./custom_config.md) to add Reverse Proxies as separate endpoints.

#### 3. Save fake OpenAI key in Librechat's `.env` 

Copy Librechat's `.env.example` to `.env` and overwrite the default OPENAI_API_KEY (by default it requires the user to pass a key).
```env
OPENAI_API_KEY=sk-1234
```

#### 4. Run LibreChat: 
```shell
docker compose up
```

---

### Why use LiteLLM?

1. **Access to Multiple LLMs**: It allows calling over 100 LLMs from platforms like Huggingface, Bedrock, TogetherAI, etc., using OpenAI's ChatCompletions and Completions format.

2. **Load Balancing**: Capable of handling over 1,000 requests per second during load tests, it balances load across various models and deployments.

3. **Authentication & Spend Tracking**: The server supports virtual keys for authentication and tracks spending.

Key components and features include:

- **Installation**: Easy installation.
- **Testing**: Testing features to route requests to specific models.
- **Server Endpoints**: Offers multiple endpoints for chat completions, completions, embeddings, model lists, and key generation.
- **Supported LLMs**: Supports a wide range of LLMs, including AWS Bedrock, Azure OpenAI, Huggingface, AWS Sagemaker, Anthropic, and more.
- **Proxy Configurations**: Allows setting various parameters like model list, server settings, environment variables, and more.
- **Multiple Models Management**: Configurations can be set up for managing multiple models with fallbacks, cooldowns, retries, and timeouts.
- **Embedding Models Support**: Special configurations for embedding models.
- **Authentication Management**: Features for managing authentication through virtual keys, model upgrades/downgrades, and tracking spend.
- **Custom Configurations**: Supports setting model-specific parameters, caching responses, and custom prompt templates.
- **Debugging Tools**: Options for debugging and logging proxy input/output.
- **Deployment and Performance**: Information on deploying LiteLLM Proxy and its performance metrics.
- **Proxy CLI Arguments**: A wide range of command-line arguments for customization.

Overall, LiteLLM Server offers a comprehensive suite of tools for managing, deploying, and interacting with a variety of LLMs, making it a versatile choice for large-scale AI applications.
🧹📚 docs: refactor and clean up (#1392) * 📑 update mkdocs * rename docker override file and add to gitignore * update .env.example - GOOGLE_MODELS * update index.md * doc refactor: split installation and configuration in two sub-folders * doc update: installation guides * doc update: configuration guides * doc: new docker override guide * doc: new beginner's guide for contributions - Thanks @Berry-13 * doc: update documentation_guidelines.md * doc: update testing.md * doc: update deployment guides * doc: update /dev readme * doc: update general_info * doc: add 0 value to doc weight * doc: add index.md to every doc folders * doc: add weight to index.md and move openrouter from free_ai_apis.md to ai_setup.md * doc: update toc so they display properly on the right had side in mkdocs * doc: update pandoranext.md * doc: index logging_system.md * doc: update readme.md * doc: update litellm.md * doc: update ./dev/readme.md * doc:🔖 new presets.md * doc: minor corrections * doc update: user_auth_system.md and presets.md, doc feat: add mermaid support to mkdocs * doc update: add screenshots to presets.md * doc update: add screenshots to - OpenID with AWS Cognito * doc update: BingAI cookie instruction * doc update: discord auth * doc update: facebook auth * doc: corrections to user_auth_system.md * doc update: github auth * doc update: google auth * doc update: auth clean up * doc organization: installation * doc organization: configuration * doc organization: features+plugins & update:plugins screenshots * doc organization: deploymend + general_info & update: tech_stack.md * doc organization: contributions * doc: minor fixes * doc: minor fixes 2023-12-22 08:36:42 -05:00			`---`
			`title: 🚅 LiteLLM`
🪪mkdocs: social cards (#1428) * mkdocs plugins: add plugin for social cards and plugin that allow to exclude a folder * docs: fix hyperlinks * mkdocs: social cards (descriptions) for 'contributions' and 'deployment' guides * mkdocs: social cards (descriptions) for all 'index.md' * mkdocs: social cards (descriptions) for 'features' and 'plugins' * mkdocs: social cards (descriptions) for 'general_info' * mkdocs: social cards (descriptions) for 'configuration' * mkdocs: social cards (descriptions) for 'installation' * mkdocs: minor fixes * update librechat.svg * update how_to_contribute.md add reference to the official GitHub documentation 2023-12-28 17:10:06 -05:00			`description: Using LibreChat with LiteLLM Proxy`
🧹📚 docs: refactor and clean up (#1392) * 📑 update mkdocs * rename docker override file and add to gitignore * update .env.example - GOOGLE_MODELS * update index.md * doc refactor: split installation and configuration in two sub-folders * doc update: installation guides * doc update: configuration guides * doc: new docker override guide * doc: new beginner's guide for contributions - Thanks @Berry-13 * doc: update documentation_guidelines.md * doc: update testing.md * doc: update deployment guides * doc: update /dev readme * doc: update general_info * doc: add 0 value to doc weight * doc: add index.md to every doc folders * doc: add weight to index.md and move openrouter from free_ai_apis.md to ai_setup.md * doc: update toc so they display properly on the right had side in mkdocs * doc: update pandoranext.md * doc: index logging_system.md * doc: update readme.md * doc: update litellm.md * doc: update ./dev/readme.md * doc:🔖 new presets.md * doc: minor corrections * doc update: user_auth_system.md and presets.md, doc feat: add mermaid support to mkdocs * doc update: add screenshots to presets.md * doc update: add screenshots to - OpenID with AWS Cognito * doc update: BingAI cookie instruction * doc update: discord auth * doc update: facebook auth * doc: corrections to user_auth_system.md * doc update: github auth * doc update: google auth * doc update: auth clean up * doc organization: installation * doc organization: configuration * doc organization: features+plugins & update:plugins screenshots * doc organization: deploymend + general_info & update: tech_stack.md * doc organization: contributions * doc: minor fixes * doc: minor fixes 2023-12-22 08:36:42 -05:00			`weight: -7`
			`---`

📚 docs: Add LiteLLM Proxy - Load balance 100+ LLMs & Spend Tracking ⚖️🤖📈 (#1249) * (docs) add instructions on using litellm * Update litellm.md --------- Co-authored-by: Danny Avila <110412045+danny-avila@users.noreply.github.com> 2023-11-30 10:59:16 -08:00			`# Using LibreChat with LiteLLM Proxy`
🪪mkdocs: social cards (#1428) * mkdocs plugins: add plugin for social cards and plugin that allow to exclude a folder * docs: fix hyperlinks * mkdocs: social cards (descriptions) for 'contributions' and 'deployment' guides * mkdocs: social cards (descriptions) for all 'index.md' * mkdocs: social cards (descriptions) for 'features' and 'plugins' * mkdocs: social cards (descriptions) for 'general_info' * mkdocs: social cards (descriptions) for 'configuration' * mkdocs: social cards (descriptions) for 'installation' * mkdocs: minor fixes * update librechat.svg * update how_to_contribute.md add reference to the official GitHub documentation 2023-12-28 17:10:06 -05:00			`Use [LiteLLM Proxy](https://docs.litellm.ai/docs/simple_proxy) for:`
📚 docs: Add LiteLLM Proxy - Load balance 100+ LLMs & Spend Tracking ⚖️🤖📈 (#1249) * (docs) add instructions on using litellm * Update litellm.md --------- Co-authored-by: Danny Avila <110412045+danny-avila@users.noreply.github.com> 2023-11-30 10:59:16 -08:00			`* Calling 100+ LLMs Huggingface/Bedrock/TogetherAI/etc. in the OpenAI ChatCompletions & Completions format`
			`* Load balancing - between Multiple Models + Deployments of the same model LiteLLM proxy can handle 1k+ requests/second during load tests`
			`* Authentication & Spend Tracking Virtual Keys`

			`## Start LiteLLM Proxy Server`
			`### Pip install litellm`
			```shell
			`pip install litellm`
			```

			`### Create a config.yaml for litellm proxy`
🪪mkdocs: social cards (#1428) * mkdocs plugins: add plugin for social cards and plugin that allow to exclude a folder * docs: fix hyperlinks * mkdocs: social cards (descriptions) for 'contributions' and 'deployment' guides * mkdocs: social cards (descriptions) for all 'index.md' * mkdocs: social cards (descriptions) for 'features' and 'plugins' * mkdocs: social cards (descriptions) for 'general_info' * mkdocs: social cards (descriptions) for 'configuration' * mkdocs: social cards (descriptions) for 'installation' * mkdocs: minor fixes * update librechat.svg * update how_to_contribute.md add reference to the official GitHub documentation 2023-12-28 17:10:06 -05:00			`More information on LiteLLM configurations here: [docs.litellm.ai/docs/simple_proxy](https://docs.litellm.ai/docs/simple_proxy)`
📚 docs: Add LiteLLM Proxy - Load balance 100+ LLMs & Spend Tracking ⚖️🤖📈 (#1249) * (docs) add instructions on using litellm * Update litellm.md --------- Co-authored-by: Danny Avila <110412045+danny-avila@users.noreply.github.com> 2023-11-30 10:59:16 -08:00
			```yaml
			`model_list:`
			`- model_name: gpt-3.5-turbo`
			`litellm_params:`
			`model: azure/gpt-turbo-small-eu`
			`api_base: https://my-endpoint-europe-berri-992.openai.azure.com/`
			`api_key:`
			`rpm: 6 # Rate limit for this deployment: in requests per minute (rpm)`
			`- model_name: gpt-3.5-turbo`
			`litellm_params:`
			`model: azure/gpt-turbo-small-ca`
			`api_base: https://my-endpoint-canada-berri992.openai.azure.com/`
			`api_key:`
			`rpm: 6`
			`- model_name: gpt-3.5-turbo`
			`litellm_params:`
			`model: azure/gpt-turbo-large`
			`api_base: https://openai-france-1234.openai.azure.com/`
			`api_key:`
			`rpm: 1440`
			```

			`### Start the proxy`
			```shell
			`litellm --config /path/to/config.yaml`

			`#INFO: Proxy running on http://0.0.0.0:8000`
			```

			`## Use LiteLLM Proxy Server with LibreChat`


			`#### 1. Clone the repo`
			```shell
			`git clone https://github.com/danny-avila/LibreChat.git`
			```


			#### 2. Modify Librechat's `docker-compose.yml`
			```yaml
			`OPENAI_REVERSE_PROXY=http://host.docker.internal:8000/v1/chat/completions`
			```

💫 feat: Config File & Custom Endpoints (#1474) * WIP(backend/api): custom endpoint * WIP(frontend/client): custom endpoint * chore: adjust typedefs for configs * refactor: use data-provider for cache keys and rename enums and custom endpoint for better clarity and compatibility * feat: loadYaml utility * refactor: rename back to from and proof-of-concept for creating schemas from user-defined defaults * refactor: remove custom endpoint from default endpointsConfig as it will be exclusively managed by yaml config * refactor(EndpointController): rename variables for clarity * feat: initial load custom config * feat(server/utils): add simple `isUserProvided` helper * chore(types): update TConfig type * refactor: remove custom endpoint handling from model services as will be handled by config, modularize fetching of models * feat: loadCustomConfig, loadConfigEndpoints, loadConfigModels * chore: reorganize server init imports, invoke loadCustomConfig * refactor(loadConfigEndpoints/Models): return each custom endpoint as standalone endpoint * refactor(Endpoint/ModelController): spread config values after default (temporary) * chore(client): fix type issues * WIP: first pass for multiple custom endpoints - add endpointType to Conversation schema - add update zod schemas for both convo/presets to allow non-EModelEndpoint value as endpoint (also using type assertion) - use `endpointType` value as `endpoint` where mapping to type is necessary using this field - use custom defined `endpoint` value and not type for mapping to modelsConfig - misc: add return type to `getDefaultEndpoint` - in `useNewConvo`, add the endpointType if it wasn't already added to conversation - EndpointsMenu: use user-defined endpoint name as Title in menu - TODO: custom icon via custom config, change unknown to robot icon * refactor(parseConvo): pass args as an object and change where used accordingly; chore: comment out 'create schema' code * chore: remove unused availableModels field in TConfig type * refactor(parseCompactConvo): pass args as an object and change where used accordingly * feat: chat through custom endpoint * chore(message/convoSchemas): avoid saving empty arrays * fix(BaseClient/saveMessageToDatabase): save endpointType * refactor(ChatRoute): show Spinner if endpointsQuery or modelsQuery are still loading, which is apparent with slow fetching of models/remote config on first serve * fix(useConversation): assign endpointType if it's missing * fix(SaveAsPreset): pass real endpoint and endpointType when saving Preset) * chore: recorganize types order for TConfig, add `iconURL` * feat: custom endpoint icon support: - use UnknownIcon in all icon contexts - add mistral and openrouter as known endpoints, and add their icons - iconURL support * fix(presetSchema): move endpointType to default schema definitions shared between convoSchema and defaults * refactor(Settings/OpenAI): remove legacy `isOpenAI` flag * fix(OpenAIClient): do not invoke abortCompletion on completion error * feat: add responseSender/label support for custom endpoints: - use defaultModelLabel field in endpointOption - add model defaults for custom endpoints in `getResponseSender` - add `useGetSender` hook which uses EndpointsQuery to determine `defaultModelLabel` - include defaultModelLabel from endpointConfig in custom endpoint client options - pass `endpointType` to `getResponseSender` * feat(OpenAIClient): use custom options from config file * refactor: rename `defaultModelLabel` to `modelDisplayLabel` * refactor(data-provider): separate concerns from `schemas` into `parsers`, `config`, and fix imports elsewhere * feat: `iconURL` and extract environment variables from custom endpoint config values * feat: custom config validation via zod schema, rename and move to `./projectRoot/librechat.yaml` * docs: custom config docs and examples * fix(OpenAIClient/mistral): mistral does not allow singular system message, also add `useChatCompletion` flag to use openai-node for title completions * fix(custom/initializeClient): extract env var and use `isUserProvided` function * Update librechat.example.yaml * feat(InputWithLabel): add className props, and forwardRef * fix(streamResponse): handle error edge case where either messages or convos query throws an error * fix(useSSE): handle errorHandler edge cases where error response is and is not properly formatted from API, especially when a conversationId is not yet provided, which ensures stream is properly closed on error * feat: user_provided keys for custom endpoints * fix(config/endpointSchema): do not allow default endpoint values in custom endpoint `name` * feat(loadConfigModels): extract env variables and optimize fetching models * feat: support custom endpoint iconURL for messages and Nav * feat(OpenAIClient): add/dropParams support * docs: update docs with default params, add/dropParams, and notes to use config file instead of `OPENAI_REVERSE_PROXY` * docs: update docs with additional notes * feat(maxTokensMap): add mistral models (32k context) * docs: update openrouter notes * Update ai_setup.md * docs(custom_config): add table of contents and fix note about custom name * docs(custom_config): reorder ToC * Update custom_config.md * Add note about `max_tokens` field in custom_config.md 2024-01-03 09:22:48 -05:00			Important: As of v0.6.6, it's recommend you use the `librechat.yaml` [Configuration file (guide here)](./custom_config.md) to add Reverse Proxies as separate endpoints.

📚 docs: Add LiteLLM Proxy - Load balance 100+ LLMs & Spend Tracking ⚖️🤖📈 (#1249) * (docs) add instructions on using litellm * Update litellm.md --------- Co-authored-by: Danny Avila <110412045+danny-avila@users.noreply.github.com> 2023-11-30 10:59:16 -08:00			#### 3. Save fake OpenAI key in Librechat's `.env`

			Copy Librechat's `.env.example` to `.env` and overwrite the default OPENAI_API_KEY (by default it requires the user to pass a key).
			```env
			`OPENAI_API_KEY=sk-1234`
			```

			`#### 4. Run LibreChat:`
			```shell
			`docker compose up`
			```

			`---`

			`### Why use LiteLLM?`

			`1. Access to Multiple LLMs: It allows calling over 100 LLMs from platforms like Huggingface, Bedrock, TogetherAI, etc., using OpenAI's ChatCompletions and Completions format.`

			`2. Load Balancing: Capable of handling over 1,000 requests per second during load tests, it balances load across various models and deployments.`

			`3. Authentication & Spend Tracking: The server supports virtual keys for authentication and tracks spending.`

			`Key components and features include:`

			`- Installation: Easy installation.`
			`- Testing: Testing features to route requests to specific models.`
			`- Server Endpoints: Offers multiple endpoints for chat completions, completions, embeddings, model lists, and key generation.`
			`- Supported LLMs: Supports a wide range of LLMs, including AWS Bedrock, Azure OpenAI, Huggingface, AWS Sagemaker, Anthropic, and more.`
			`- Proxy Configurations: Allows setting various parameters like model list, server settings, environment variables, and more.`
			`- Multiple Models Management: Configurations can be set up for managing multiple models with fallbacks, cooldowns, retries, and timeouts.`
			`- Embedding Models Support: Special configurations for embedding models.`
			`- Authentication Management: Features for managing authentication through virtual keys, model upgrades/downgrades, and tracking spend.`
			`- Custom Configurations: Supports setting model-specific parameters, caching responses, and custom prompt templates.`
			`- Debugging Tools: Options for debugging and logging proxy input/output.`
			`- Deployment and Performance: Information on deploying LiteLLM Proxy and its performance metrics.`
			`- Proxy CLI Arguments: A wide range of command-line arguments for customization.`

			`Overall, LiteLLM Server offers a comprehensive suite of tools for managing, deploying, and interacting with a variety of LLMs, making it a versatile choice for large-scale AI applications.`