feat: Google Gemini ❇️ (#1355)

* refactor: add gemini-pro to google Models list; use defaultModels for central model listing
* refactor(SetKeyDialog): create useMultipleKeys hook to use for Azure, export `isJson` from utils, use EModelEndpoint
* refactor(useUserKey): change variable names to make keyName setting more clear
* refactor(FileUpload): allow passing container className string
* feat(GoogleClient): Gemini support
* refactor(GoogleClient): alternate stream speed for Gemini models
* feat(Gemini): styling/settings configuration for Gemini
* refactor(GoogleClient): subtract max response tokens from max context tokens if context is above 32k (I/O max is combined between the two)
* refactor(tokens): correct google max token counts and subtract max response tokens when input/output counts are combined towards max context count
* feat(google/initializeClient): handle both local and user_provided credentials and write tests
* fix(GoogleClient): catch if credentials are undefined, handle if serviceKey is string or object correctly, handle no examples passed, throw error if not a Generative Language model and no service account JSON key is provided, throw error if it is a Generative model but no Google API key was provided
* refactor(loadAsyncEndpoints/google): activate Google endpoint if either the service key JSON file is provided in /api/data, or a GOOGLE_KEY is defined.
* docs: updated Google configuration
* fix(ci): Mock import of Service Account Key JSON file (auth.json)
* Update apis_and_tokens.md
* feat: increase max output tokens slider for gemini pro
* refactor(GoogleSettings): handle max and default maxOutputTokens on model change
* chore: add sensitive redact regex
* docs: add warning about data privacy
* Update apis_and_tokens.md
2026-02-02 15:51:49 +01:00 · 2023-12-15 02:18:07 -05:00 · 561ce8e86a
commit 561ce8e86a
parent d259431316
37 changed files with 702 additions and 219 deletions
--- a/api/utils/tokens.js
+++ b/api/utils/tokens.js
@@ -56,11 +56,12 @@ const maxTokensMap = {
    'gpt-4-1106': 127995, // -5 from max
  },
  [EModelEndpoint.google]: {
-    /* Max I/O is 32k combined, so -1000 to leave room for response */
-    'text-bison-32k': 31000,
-    'chat-bison-32k': 31000,
-    'code-bison-32k': 31000,
-    'codechat-bison-32k': 31000,
+    /* Max I/O is combined so we subtract the amount from max response tokens for actual total */
+    gemini: 32750, // -10 from max
+    'text-bison-32k': 32758, // -10 from max
+    'chat-bison-32k': 32758, // -10 from max
+    'code-bison-32k': 32758, // -10 from max
+    'codechat-bison-32k': 32758,
    /* Codey, -5 from max: 6144 */
    'code-': 6139,
    'codechat-': 6139,
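
The hunk above only changes the stored limits; the commit message says the max response tokens are then subtracted when the context is above 32k, because input and output share one budget. A minimal sketch of that arithmetic follows. The helper names `lookupMaxContextTokens` and `maxPromptTokens`, the substring-matching rule, and the `32000` threshold are assumptions made for illustration, not code taken from this commit.

```javascript
// Values copied from the google section of the hunk above.
const googleMaxTokens = {
  gemini: 32750,
  'text-bison-32k': 32758,
  'chat-bison-32k': 32758,
  'code-bison-32k': 32758,
  'codechat-bison-32k': 32758,
  'code-': 6139,
  'codechat-': 6139,
};

// Hypothetical helper: try an exact key first, then fall back to the
// longest key that appears inside the model name (e.g. "gemini-pro" -> "gemini").
function lookupMaxContextTokens(modelName, map = googleMaxTokens) {
  if (map[modelName] !== undefined) {
    return map[modelName];
  }
  const matches = Object.keys(map)
    .filter((key) => modelName.includes(key))
    .sort((a, b) => b.length - a.length);
  return matches.length > 0 ? map[matches[0]] : undefined;
}

// Hypothetical helper: when the limit is a combined input/output budget
// (assumed here to mean "above 32000"), reserve room for the response.
function maxPromptTokens(modelName, maxResponseTokens, combinedThreshold = 32000) {
  const maxContext = lookupMaxContextTokens(modelName);
  if (maxContext === undefined) {
    throw new Error(`No token limit known for model "${modelName}"`);
  }
  return maxContext > combinedThreshold ? maxContext - maxResponseTokens : maxContext;
}
```

Under these assumptions, `maxPromptTokens('gemini-pro', 1024)` evaluates to `31726` (32750 minus 1024), while `maxPromptTokens('code-bison', 1024)` stays at `6139` because that limit is below the threshold.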