mirror of
https://github.com/danny-avila/LibreChat.git
synced 2025-09-22 08:12:00 +02:00
🤗 feat: Known Endpoints: HuggingFace (#2646)
* endpoints: huggingface * Update ai_endpoints.md * huggingface: update icon
This commit is contained in:
parent
83bae9e9d9
commit
d73ea8e1f2
7 changed files with 137 additions and 35 deletions
|
@ -68,6 +68,7 @@ PROXY=
|
|||
# APIPIE_API_KEY=
|
||||
# FIREWORKS_API_KEY=
|
||||
# GROQ_API_KEY=
|
||||
# HUGGINGFACE_TOKEN=
|
||||
# MISTRAL_API_KEY=
|
||||
# OPENROUTER_KEY=
|
||||
# PERPLEXITY_API_KEY=
|
||||
|
|
8
client/public/assets/huggingface.svg
Normal file
8
client/public/assets/huggingface.svg
Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 34 KiB |
|
@ -8,6 +8,7 @@ const knownEndpointAssets = {
|
|||
[KnownEndpoints.cohere]: '/assets/cohere.png',
|
||||
[KnownEndpoints.fireworks]: '/assets/fireworks.png',
|
||||
[KnownEndpoints.groq]: '/assets/groq.png',
|
||||
[KnownEndpoints.huggingface]: '/assets/huggingface.svg',
|
||||
[KnownEndpoints.mistral]: '/assets/mistral.png',
|
||||
[KnownEndpoints.mlx]: '/assets/mlx.png',
|
||||
[KnownEndpoints.ollama]: '/assets/ollama.png',
|
||||
|
|
|
@ -123,6 +123,39 @@ Some of the endpoints are marked as **Known,** which means they might have speci
|
|||
|
||||

|
||||
|
||||
## Apple MLX
|
||||
> MLX API key: ignored - [MLX OpenAI Compatibility](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)
|
||||
|
||||
**Notes:**
|
||||
|
||||
- **Known:** icon provided.
|
||||
|
||||
- API is mostly strict with unrecognized parameters.
|
||||
- Supports only one model at a time; otherwise, you'll need to run a different endpoint with a different `baseURL`.
|
||||
|
||||
```yaml
|
||||
- name: "MLX"
|
||||
apiKey: "mlx"
|
||||
baseURL: "http://localhost:8080/v1/"
|
||||
models:
|
||||
default: [
|
||||
"Meta-Llama-3-8B-Instruct-4bit"
|
||||
]
|
||||
fetch: false # fetching list of models is not supported
|
||||
titleConvo: true
|
||||
titleModel: "current_model"
|
||||
summarize: false
|
||||
summaryModel: "current_model"
|
||||
forcePrompt: false
|
||||
modelDisplayLabel: "Apple MLX"
|
||||
addParams:
|
||||
max_tokens: 2000
|
||||
"stop": [
|
||||
"<|eot_id|>"
|
||||
]
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Cohere
|
||||
> Cohere API key: [dashboard.cohere.com](https://dashboard.cohere.com/)
|
||||
|
@ -204,7 +237,6 @@ Some of the endpoints are marked as **Known,** which means they might have speci
|
|||
default: [
|
||||
"llama3-70b-8192",
|
||||
"llama3-8b-8192",
|
||||
"llama2-70b-4096",
|
||||
"mixtral-8x7b-32768",
|
||||
"gemma-7b-it",
|
||||
]
|
||||
|
@ -216,6 +248,97 @@ Some of the endpoints are marked as **Known,** which means they might have speci
|
|||
|
||||

|
||||
|
||||
## HuggingFace
|
||||
> HuggingFace Token: [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
|
||||
|
||||
**Notes:**
|
||||
|
||||
- **Known:** icon provided.
|
||||
|
||||
- The provided models are free, but rate limited.
|
||||
|
||||
- The use of [`dropParams`](./custom_config.md#dropparams) to drop "top_p" params is required.
|
||||
- Fetching the list of models isn't supported.
|
||||
- Note: Some models currently work better than others; answers are very short (at least when using the free tier).
|
||||
|
||||
- The example includes a model list, which was last updated on May 09, 2024, for your convenience.
|
||||
|
||||
```yaml
|
||||
- name: 'HuggingFace'
|
||||
apiKey: '${HUGGINGFACE_TOKEN}'
|
||||
baseURL: 'https://api-inference.huggingface.co/v1'
|
||||
models:
|
||||
default: [
|
||||
"codellama/CodeLlama-34b-Instruct-hf",
|
||||
"google/gemma-1.1-2b-it",
|
||||
"google/gemma-1.1-7b-it",
|
||||
"HuggingFaceH4/starchat2-15b-v0.1",
|
||||
"HuggingFaceH4/zephyr-7b-beta",
|
||||
"meta-llama/Meta-Llama-3-8B-Instruct",
|
||||
"microsoft/Phi-3-mini-4k-instruct",
|
||||
"mistralai/Mistral-7B-Instruct-v0.1",
|
||||
"mistralai/Mistral-7B-Instruct-v0.2",
|
||||
"mistralai/Mixtral-8x7B-Instruct-v0.1",
|
||||
"NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
|
||||
]
|
||||
fetch: false # fetching list of models is not supported
|
||||
titleConvo: true
|
||||
titleModel: "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO"
|
||||
dropParams: ["top_p"]
|
||||
modelDisplayLabel: "HuggingFace"
|
||||
```
|
||||
|
||||
??? warning "Other Model Errors"
|
||||
|
||||
Here’s a list of the other models that were tested along with their corresponding errors
|
||||
|
||||
```yaml
|
||||
models:
|
||||
default: [
|
||||
"CohereForAI/c4ai-command-r-plus", # Model requires a Pro subscription
|
||||
"HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1", # Model requires a Pro subscription
|
||||
"meta-llama/Llama-2-7b-hf", # Model requires a Pro subscription
|
||||
"meta-llama/Meta-Llama-3-70B-Instruct", # Model requires a Pro subscription
|
||||
"meta-llama/Llama-2-13b-chat-hf", # Model requires a Pro subscription
|
||||
"meta-llama/Llama-2-13b-hf", # Model requires a Pro subscription
|
||||
"meta-llama/Llama-2-70b-chat-hf", # Model requires a Pro subscription
|
||||
"meta-llama/Llama-2-7b-chat-hf", # Model requires a Pro subscription
|
||||
"------",
|
||||
"bigcode/octocoder", # template not found
|
||||
"bigcode/santacoder", # template not found
|
||||
"bigcode/starcoder2-15b", # template not found
|
||||
"bigcode/starcoder2-3b", # template not found
|
||||
"codellama/CodeLlama-13b-hf", # template not found
|
||||
"codellama/CodeLlama-7b-hf", # template not found
|
||||
"google/gemma-2b", # template not found
|
||||
"google/gemma-7b", # template not found
|
||||
"HuggingFaceH4/starchat-beta", # template not found
|
||||
"HuggingFaceM4/idefics-80b-instruct", # template not found
|
||||
"HuggingFaceM4/idefics-9b-instruct", # template not found
|
||||
"HuggingFaceM4/idefics2-8b", # template not found
|
||||
"kashif/stack-llama-2", # template not found
|
||||
"lvwerra/starcoderbase-gsm8k", # template not found
|
||||
"tiiuae/falcon-7b", # template not found
|
||||
"timdettmers/guanaco-33b-merged", # template not found
|
||||
"------",
|
||||
"bigscience/bloom", # 404 status code (no body)
|
||||
"------",
|
||||
"google/gemma-2b-it", # `stream` is not supported for this model / unknown error
|
||||
"------",
|
||||
"google/gemma-7b-it", # AI Response error likely caused by Google censor/filter
|
||||
"------",
|
||||
"bigcode/starcoder", # Service Unavailable
|
||||
"google/flan-t5-xxl", # Service Unavailable
|
||||
"HuggingFaceH4/zephyr-7b-alpha", # Service Unavailable
|
||||
"mistralai/Mistral-7B-v0.1", # Service Unavailable
|
||||
"OpenAssistant/oasst-sft-1-pythia-12b", # Service Unavailable
|
||||
"OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", # Service Unavailable
|
||||
]
|
||||
```
|
||||
|
||||
|
||||

|
||||
|
||||
|
||||
## LiteLLM
|
||||
> LiteLLM API key: master_key value [LiteLLM](./litellm.md)
|
||||
|
@ -271,39 +394,6 @@ Some of the endpoints are marked as **Known,** which means they might have speci
|
|||
|
||||

|
||||
|
||||
## Apple MLX
|
||||
> MLX API key: ignored - [MLX OpenAI Compatibility](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)
|
||||
|
||||
**Notes:**
|
||||
|
||||
- **Known:** icon provided.
|
||||
|
||||
- API is mostly strict with unrecognized parameters.
|
||||
- Supports only one model at a time; otherwise, you'll need to run a different endpoint with a different `baseURL`.
|
||||
|
||||
```yaml
|
||||
- name: "MLX"
|
||||
apiKey: "mlx"
|
||||
baseURL: "http://localhost:8080/v1/"
|
||||
models:
|
||||
default: [
|
||||
"Meta-Llama-3-8B-Instruct-4bit"
|
||||
]
|
||||
fetch: false # fetching list of models is not supported
|
||||
titleConvo: true
|
||||
titleModel: "current_model"
|
||||
summarize: false
|
||||
summaryModel: "current_model"
|
||||
forcePrompt: false
|
||||
modelDisplayLabel: "Apple MLX"
|
||||
addParams:
|
||||
max_tokens: 2000
|
||||
"stop": [
|
||||
"<|eot_id|>"
|
||||
]
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Ollama
|
||||
> Ollama API key: Required but ignored - [Ollama OpenAI Compatibility](https://github.com/ollama/ollama/blob/main/docs/openai.md)
|
||||
|
|
|
@ -166,6 +166,7 @@ ANYSCALE_API_KEY=
|
|||
APIPIE_API_KEY=
|
||||
FIREWORKS_API_KEY=
|
||||
GROQ_API_KEY=
|
||||
HUGGINGFACE_TOKEN=
|
||||
MISTRAL_API_KEY=
|
||||
OPENROUTER_KEY=
|
||||
PERPLEXITY_API_KEY=
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
# https://docs.librechat.ai/install/configuration/custom_config.html
|
||||
|
||||
# Configuration version (required)
|
||||
version: 1.0.6
|
||||
version: 1.0.9
|
||||
|
||||
# Cache settings: Set to true to enable caching
|
||||
cache: true
|
||||
|
|
|
@ -306,6 +306,7 @@ export enum KnownEndpoints {
|
|||
cohere = 'cohere',
|
||||
fireworks = 'fireworks',
|
||||
groq = 'groq',
|
||||
huggingface = 'huggingface',
|
||||
mistral = 'mistral',
|
||||
mlx = 'mlx',
|
||||
ollama = 'ollama',
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue