📸 feat: Gemini Vision, Improved Logs and Multi-modal Handling (#1368)

* feat: add GOOGLE_MODELS env var

* feat: add gemini vision support

* refactor(GoogleClient): adjust clientOptions handling depending on model

* fix(logger): fix redact logic and redact errors only

* fix(GoogleClient): do not allow non-multiModal messages when gemini-pro-vision is selected

* refactor(OpenAIClient): use `isVisionModel` client property to avoid calling validateVisionModel multiple times
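The caching pattern described above can be sketched as follows. This is an illustrative example, not the actual OpenAIClient code; `validateVisionModel` here is a stand-in for the real helper:

```javascript
// Illustrative sketch: validate the model once in the constructor and cache
// the result as a client property, instead of re-running validation on
// every message build. The real validateVisionModel may differ.
const validateVisionModel = (model) =>
  typeof model === 'string' && model.includes('vision');

class ExampleClient {
  constructor(modelOptions = {}) {
    this.modelOptions = modelOptions;
    // computed once; later code reads `this.isVisionModel` directly
    this.isVisionModel = validateVisionModel(modelOptions.model);
  }
}
```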

* refactor: better debug logging by correctly traversing, redacting sensitive info, and logging condensed versions of long values
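A minimal sketch of the logging approach described above (hypothetical helper names, not the actual logger code): recursively walk an object, redact keys that look sensitive, and condense long string values:

```javascript
// Keys matching this pattern are redacted; the real pattern may differ.
const SENSITIVE = /key|token|secret|password/i;

// Condense long strings so debug logs stay readable.
function condense(value, max = 100) {
  return typeof value === 'string' && value.length > max
    ? `${value.slice(0, max)}... [truncated ${value.length - max} chars]`
    : value;
}

// Return a deep copy with sensitive values redacted and long values condensed.
function redactedClone(obj) {
  if (Array.isArray(obj)) return obj.map(redactedClone);
  if (obj === null || typeof obj !== 'object') return condense(obj);
  const out = {};
  for (const [k, v] of Object.entries(obj)) {
    out[k] = SENSITIVE.test(k) ? '[REDACTED]' : redactedClone(v);
  }
  return out;
}
```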

* refactor(GoogleClient): allow response errors to be thrown/caught above client handling so the user receives a meaningful error message; debug-log orderedMessages, parentMessageId, and the buildMessages result

* refactor(AskController): use the model from client.modelOptions.model when saving intermediate messages, which requires the progress callback to be initialized after the client is initialized

* feat(useSSE): revert to previous model if the model was auto-switched by backend due to message attachments
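The revert behavior can be sketched like this (illustrative function and parameter names, not the actual useSSE hook): keep the model the user selected, and if the backend auto-switched it to handle attachments (e.g. to gemini-pro-vision), restore the previous model afterwards:

```javascript
// Decide which model the conversation should show after a response:
// if the response came back on a different model than the user chose,
// treat it as an auto-switch and revert to the user's selection.
function nextConversationModel({ userSelectedModel, responseModel }) {
  const autoSwitched =
    Boolean(responseModel) && responseModel !== userSelectedModel;
  return autoSwitched ? userSelectedModel : responseModel ?? userSelectedModel;
}
```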

* docs: update with google updates, notes about Gemini Pro Vision

* fix: do not initialize Redis unless USE_REDIS is set, and increase max listeners to 20
Danny Avila 2023-12-16 20:45:27 -05:00 committed by GitHub
parent 676f133545
commit 0c326797dd
21 changed files with 356 additions and 210 deletions


@@ -31,7 +31,7 @@
# Features
- 🖥️ UI matching ChatGPT, including Dark mode, Streaming, and 11-2023 updates
- 💬 Multimodal Chat:
- Upload and analyze images with GPT-4-Vision 📸
- Upload and analyze images with GPT-4 and Gemini Vision 📸
- More filetypes and Assistants API integration in Active Development 🚧
- 🌎 Multilingual UI:
- English, 中文, Deutsch, Español, Français, Italiano, Polski, Português Brasileiro, Русский


@@ -70,10 +70,6 @@ For Vertex AI, you need a Service Account JSON key file, with appropriate access
Instructions for both are given below.
Setting `GOOGLE_KEY=user_provided` in your .env file will configure both values to be provided from the client (or frontend) like so:
![image](https://github.com/danny-avila/LibreChat/assets/110412045/728cbc04-4180-45a8-848c-ae5de2b02996)
### Generative Language API (Gemini)
**60 Gemini requests/minute are currently free until early next year when it enters general availability.**
@@ -85,21 +81,22 @@ To use Gemini models, you'll need an API key. If you don't already have one, cre
<p><a class="button button-primary" href="https://makersuite.google.com/app/apikey" target="_blank" rel="noopener noreferrer">Get an API key here</a></p>
Once you have your key, you can either provide it from the frontend by setting the following:
```bash
GOOGLE_KEY=user_provided
```
Or, provide the key in your .env file, which allows all users of your instance to use it.
Once you have your key, provide the key in your .env file, which allows all users of your instance to use it.
```bash
GOOGLE_KEY=mY_SeCreT_w9347w8_kEY
```
> Notes:
> - As of 12/15/23, Gemini Pro Vision is not yet supported but is planned.
> - PaLM2 and Codey models cannot be accessed through the Generative Language API.
Or, you can make users provide it from the frontend by setting the following:
```bash
GOOGLE_KEY=user_provided
```
Note: PaLM2 and Codey models cannot be accessed through the Generative Language API, only through Vertex AI.
Setting `GOOGLE_KEY=user_provided` in your .env file will configure both the Vertex AI Service Account JSON key file and the Generative Language API key to be provided from the frontend like so:
![image](https://github.com/danny-avila/LibreChat/assets/110412045/728cbc04-4180-45a8-848c-ae5de2b02996)
### Vertex AI (PaLM 2 & Codey)
@@ -132,14 +129,15 @@ You can usually get **$300 starting credit**, which makes this option free for 9
**Saving your JSON key file in the project directory allows all users of your LibreChat instance to use it.**
Alternatively, Once you have your JSON key file, you can also provide it from the frontend on a user-basis by setting the following:
Alternatively, you can make users provide it from the frontend by setting the following:
```bash
# Note: this configures both the Vertex AI Service Account JSON key file
# and the Generative Language API key to be provided from the frontend.
GOOGLE_KEY=user_provided
```
> Notes:
> - As of 12/15/23, Gemini and Gemini Pro Vision are not yet supported through Vertex AI but are planned.
Note: Accessing Gemini models through Vertex AI is possible but not yet supported by LibreChat.
## Azure OpenAI


@@ -199,6 +199,15 @@ GOOGLE_KEY=user_provided
GOOGLE_REVERSE_PROXY=
```
- Customize the available models, separated by commas, **without spaces**.
- The first model listed will be the default.
- Leave it blank or commented out to use internal settings (default: all listed below).
```bash
# all available models as of 12/16/23
GOOGLE_MODELS=gemini-pro,gemini-pro-vision,chat-bison,chat-bison-32k,codechat-bison,codechat-bison-32k,text-bison,text-bison-32k,text-unicorn,code-gecko,code-bison,code-bison-32k
```
### OpenAI
- To get your OpenAI API key, you need to: