📸 feat: Gemini Vision, Improved Logs and Multi-modal Handling (#1368)

* feat: add GOOGLE_MODELS env var

* feat: add gemini vision support

* refactor(GoogleClient): adjust clientOptions handling depending on model

* fix(logger): fix redact logic and redact errors only

* fix(GoogleClient): do not allow non-multiModal messages when gemini-pro-vision is selected

* refactor(OpenAIClient): use `isVisionModel` client property to avoid calling validateVisionModel multiple times
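The caching pattern described above can be sketched as follows. This is an illustrative example, not the actual OpenAIClient code; `validateVisionModel` here is a stand-in for the real helper:

```javascript
// Illustrative sketch: validate the model once in the constructor and cache
// the result as a client property, instead of re-running validation on
// every message build. The real validateVisionModel may differ.
const validateVisionModel = (model) =>
  typeof model === 'string' && model.includes('vision');

class ExampleClient {
  constructor(modelOptions = {}) {
    this.modelOptions = modelOptions;
    // computed once; later code reads `this.isVisionModel` directly
    this.isVisionModel = validateVisionModel(modelOptions.model);
  }
}
```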

* refactor: better debug logging by correctly traversing, redacting sensitive info, and logging condensed versions of long values
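A minimal sketch of the logging approach described above (hypothetical helper names, not the actual logger code): recursively walk an object, redact keys that look sensitive, and condense long string values:

```javascript
// Keys matching this pattern are redacted; the real pattern may differ.
const SENSITIVE = /key|token|secret|password/i;

// Condense long strings so debug logs stay readable.
function condense(value, max = 100) {
  return typeof value === 'string' && value.length > max
    ? `${value.slice(0, max)}... [truncated ${value.length - max} chars]`
    : value;
}

// Return a deep copy with sensitive values redacted and long values condensed.
function redactedClone(obj) {
  if (Array.isArray(obj)) return obj.map(redactedClone);
  if (obj === null || typeof obj !== 'object') return condense(obj);
  const out = {};
  for (const [k, v] of Object.entries(obj)) {
    out[k] = SENSITIVE.test(k) ? '[REDACTED]' : redactedClone(v);
  }
  return out;
}
```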

* refactor(GoogleClient): allow response errors to be thrown/caught above client handling so the user receives a meaningful error message; debug-log orderedMessages, parentMessageId, and the buildMessages result

* refactor(AskController): use the model from client.modelOptions.model when saving intermediate messages, which requires the progress callback to be initialized after the client is initialized

* feat(useSSE): revert to previous model if the model was auto-switched by backend due to message attachments
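The revert behavior can be sketched like this (illustrative function and parameter names, not the actual useSSE hook): keep the model the user selected, and if the backend auto-switched it to handle attachments (e.g. to gemini-pro-vision), restore the previous model afterwards:

```javascript
// Decide which model the conversation should show after a response:
// if the response came back on a different model than the user chose,
// treat it as an auto-switch and revert to the user's selection.
function nextConversationModel({ userSelectedModel, responseModel }) {
  const autoSwitched =
    Boolean(responseModel) && responseModel !== userSelectedModel;
  return autoSwitched ? userSelectedModel : responseModel ?? userSelectedModel;
}
```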

* docs: update with google updates, notes about Gemini Pro Vision

* fix: do not initialize Redis unless USE_REDIS is set, and increase max listeners to 20
Danny Avila 2023-12-16 20:45:27 -05:00 committed by GitHub
parent 676f133545
commit 0c326797dd
21 changed files with 356 additions and 210 deletions


@@ -31,7 +31,7 @@
# Features
- 🖥️ UI matching ChatGPT, including Dark mode, Streaming, and 11-2023 updates
- 💬 Multimodal Chat:
- Upload and analyze images with GPT-4-Vision 📸
- Upload and analyze images with GPT-4 and Gemini Vision 📸
- More filetypes and Assistants API integration in Active Development 🚧
- 🌎 Multilingual UI:
- English, 中文, Deutsch, Español, Français, Italiano, Polski, Português Brasileiro, Русский


@@ -70,10 +70,6 @@ For Vertex AI, you need a Service Account JSON key file, with appropriate access
Instructions for both are given below.
Setting `GOOGLE_KEY=user_provided` in your .env file will configure both values to be provided from the client (or frontend) like so:
![image](https://github.com/danny-avila/LibreChat/assets/110412045/728cbc04-4180-45a8-848c-ae5de2b02996)
### Generative Language API (Gemini)
**60 Gemini requests/minute are currently free until early next year when it enters general availability.**
@@ -85,21 +81,22 @@ To use Gemini models, you'll need an API key. If you don't already have one, cre
<p><a class="button button-primary" href="https://makersuite.google.com/app/apikey" target="_blank" rel="noopener noreferrer">Get an API key here</a></p>
Once you have your key, you can either provide it from the frontend by setting the following:
```bash
GOOGLE_KEY=user_provided
```
Or, provide the key in your .env file, which allows all users of your instance to use it.
Once you have your key, provide the key in your .env file, which allows all users of your instance to use it.
```bash
GOOGLE_KEY=mY_SeCreT_w9347w8_kEY
```
> Notes:
> - As of 12/15/23, Gemini Pro Vision is not yet supported but is planned.
> - PaLM2 and Codey models cannot be accessed through the Generative Language API.
Or, you can make users provide it from the frontend by setting the following:
```bash
GOOGLE_KEY=user_provided
```
Note: PaLM2 and Codey models cannot be accessed through the Generative Language API, only through Vertex AI.
Setting `GOOGLE_KEY=user_provided` in your .env file will configure both the Vertex AI Service Account JSON key file and the Generative Language API key to be provided from the frontend like so:
![image](https://github.com/danny-avila/LibreChat/assets/110412045/728cbc04-4180-45a8-848c-ae5de2b02996)
### Vertex AI (PaLM 2 & Codey)
@@ -132,14 +129,15 @@ You can usually get **$300 starting credit**, which makes this option free for 9
**Saving your JSON key file in the project directory allows all users of your LibreChat instance to use it.**
Alternatively, Once you have your JSON key file, you can also provide it from the frontend on a user-basis by setting the following:
Alternatively, you can make users provide it from the frontend by setting the following:
```bash
# Note: this configures both the Vertex AI Service Account JSON key file
# and the Generative Language API key to be provided from the frontend.
GOOGLE_KEY=user_provided
```
> Notes:
> - As of 12/15/23, Gemini and Gemini Pro Vision are not yet supported through Vertex AI but are planned.
Note: Accessing Gemini models through Vertex AI is possible but not yet supported by LibreChat.
## Azure OpenAI


@@ -199,6 +199,15 @@ GOOGLE_KEY=user_provided
GOOGLE_REVERSE_PROXY=
```
- Customize the available models, separated by commas, **without spaces**.
- The first model listed will be the default.
- Leave it blank or commented out to use internal settings (default: all listed below).
```bash
# all available models as of 12/16/23
GOOGLE_MODELS=gemini-pro,gemini-pro-vision,chat-bison,chat-bison-32k,codechat-bison,codechat-bison-32k,text-bison,text-bison-32k,text-unicorn,code-gecko,code-bison,code-bison-32k
```
### OpenAI
- To get your OpenAI API key, you need to: