🤖 feat: OpenAI Assistants v2 (initial support) (#2781)

* 🤖 Assistants V2 Support: Part 1

  - Separated Azure Assistants to its own endpoint
  - File Search / Vector Store integration is incomplete, but can toggle and use storage from playground
  - Code Interpreter resource files can be added but not deleted
  - GPT-4o is supported
  - Many improvements to the Assistants Endpoint overall

  data-provider v2 changes
  copy existing route as v1
  chore: rename new endpoint to reduce comparison operations and add new azure filesource
  api: add azureAssistants part 1
  force use of version for assistants/assistantsAzure
  chore: switch name back to azureAssistants
  refactor type version: string | number
  Ensure assistants endpoints have version set
  fix: isArchived type issue in ConversationListParams
  refactor: update assistants mutations/queries with endpoint/version definitions, update Assistants Map structure
  chore: FilePreview component ExtendedFile type assertion
  feat: isAssistantsEndpoint helper
  chore: remove unused useGenerations
  chore(buildTree): type issue
  chore(Advanced): type issue (unused component, maybe in future)
  first pass for multi-assistant endpoint rewrite
  fix(listAssistants): pass params correctly
  feat: list separate assistants by endpoint
  fix(useTextarea): access assistantMap correctly
  fix: assistant endpoint switching, resetting ID
  fix: broken during rewrite, selecting assistant mention
  fix: set/invalidate assistants endpoint query data correctly
  feat: Fix issue with assistant ID not being reset correctly
  getOpenAIClient helper function
  feat: add toast for assistant deletion
  fix: assistants delete right after create issue for azure
  fix: assistant patching
  refactor: actions to use getOpenAIClient
  refactor: consolidate logic into helpers file
  fix: issue where conversation data was not initially available
  v1 chat support
  refactor(spendTokens): only early return if completionTokens isNaN
  fix(OpenAIClient): ensure spendTokens has all necessary params
  refactor: route/controller logic
  fix(assistants/initializeClient): use defaultHeaders field
  fix: sanitize default operation id
  chore: bump openai package
  first pass v2 action service
  feat: retroactive domain parsing for actions added via v1
  feat: delete db records of actions/assistants on openai assistant deletion
  chore: remove vision tools from v2 assistants
  feat: v2 upload and delete assistant vision images
  WIP first pass, thread attachments
  fix: show assistant vision files (save local/firebase copy)
  v2 image continue
  fix: annotations
  fix: refine annotations
  show analyze as error if is no longer submitting before progress reaches 1 and show file_search as retrieval tool
  fix: abort run, undefined endpoint issue
  refactor: consolidate capabilities logic and anticipate versioning
  frontend version 2 changes
  fix: query selection and filter
  add endpoint to unknown filepath
  add file ids to resource, deleting in progress
  enable/disable file search
  remove version log

* 🤖 Assistants V2 Support: Part 2

  🎹 fix: Autocompletion Chrome Bug on Action API Key Input
  chore: remove `useOriginNavigate`
  chore: set correct OpenAI Storage Source
  fix: azure file deletions, instantiate clients by source for deletion
  update code interpret files info
  feat: deleteResourceFileId
  chore: increase poll interval as azure easily rate limits
  fix: openai file deletions, TODO: evaluate rejected deletion settled promises to determine which to delete from db records
  file source icons
  update table file filters
  chore: file search info and versioning
  fix: retrieval update with necessary tool_resources if specified
  fix(useMentions): add optional chaining in case listMap value is undefined
  fix: force assistant avatar roundedness
  fix: azure assistants, check correct flag
  chore: bump data-provider

* fix: merge conflict
* ci: fix backend tests due to new updates
* chore: update .env.example
* meilisearch improvements
* localization updates
* chore: update comparisons
* feat: add additional metadata: endpoint, author ID
* chore: azureAssistants ENDPOINTS exclusion warning
2025-12-17 17:00:15 +01:00 · 2024-05-19 12:56:55 -04:00 · 2024-05-19 12:56:55 -04:00 · 1a452121fa
commit 1a452121fa
parent af8bcb08d6
158 changed files with 4184 additions and 1204 deletions
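
Note on the TODO in the Part 2 message ("evaluate rejected deletion settled promises to determine which to delete from db records"): a minimal sketch of one way this could be approached, assuming the remote deletions are collected with `Promise.allSettled`. The names `partitionSettledDeletions`, `deleteFilesSafely`, `deleteRemote`, and `deleteRecords` are hypothetical and do not appear in this commit; the actual file service may be structured differently.

```javascript
/**
 * Hypothetical helper (not part of this commit): split the results of
 * Promise.allSettled into the IDs whose remote deletion succeeded and
 * the IDs whose deletion was rejected. `settled[i]` must correspond to
 * `fileIds[i]`, which Promise.allSettled guarantees for its input order.
 */
function partitionSettledDeletions(fileIds, settled) {
  const deleted = [];
  const failed = [];
  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      deleted.push(fileIds[i]);
    } else {
      failed.push({ file_id: fileIds[i], reason: result.reason });
    }
  });
  return { deleted, failed };
}

/**
 * Hypothetical usage: `deleteRemote(id)` removes one file from the provider
 * and `deleteRecords(ids)` removes the matching db records. Both are
 * stand-ins supplied by the caller.
 */
async function deleteFilesSafely(fileIds, deleteRemote, deleteRecords) {
  // The async wrapper turns synchronous throws into rejections as well.
  const settled = await Promise.allSettled(fileIds.map(async (id) => deleteRemote(id)));
  const { deleted, failed } = partitionSettledDeletions(fileIds, settled);
  if (deleted.length) {
    // Only drop records for files that were actually removed remotely.
    await deleteRecords(deleted);
  }
  return { deleted, failed };
}
```

Keeping the partition step separate from the I/O makes it easy to unit test, and the `failed` list could be logged or retried instead of silently leaving orphaned records.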
--- a/api/server/routes/assistants/actions.js
+++ b/api/server/routes/assistants/actions.js
@@ -2,7 +2,7 @@ const { v4 } = require('uuid');
 const express = require('express');
 const { encryptMetadata, domainParser } = require('~/server/services/ActionService');
 const { actionDelimiter, EModelEndpoint } = require('librechat-data-provider');
-const { initializeClient } = require('~/server/services/Endpoints/assistants');
+const { getOpenAIClient } = require('~/server/controllers/assistants/helpers');
 const { updateAction, getActions, deleteAction } = require('~/models/Action');
 const { updateAssistant, getAssistant } = require('~/models/Assistant');
 const { logger } = require('~/config');
@@ -45,7 +45,6 @@ router.post('/:assistant_id', async (req, res) => {
    let metadata = encryptMetadata(_metadata);

    let { domain } = metadata;
-    /* Azure doesn't support periods in function names */
    domain = await domainParser(req, domain, true);

    if (!domain) {
@@ -55,8 +54,7 @@ router.post('/:assistant_id', async (req, res) => {
    const action_id = _action_id ?? v4();
    const initialPromises = [];

-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
+    const { openai } = await getOpenAIClient({ req, res });

    initialPromises.push(getAssistant({ assistant_id }));
    initialPromises.push(openai.beta.assistants.retrieve(assistant_id));
@@ -157,9 +155,7 @@ router.delete('/:assistant_id/:action_id/:model', async (req, res) => {
  try {
    const { assistant_id, action_id, model } = req.params;
    req.body.model = model;
-
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
+    const { openai } = await getOpenAIClient({ req, res });

    const initialPromises = [];
    initialPromises.push(getAssistant({ assistant_id }));
--- a/api/server/routes/assistants/assistants.js
+++ b/api/server/routes/assistants/assistants.js
@@ -1,271 +0,0 @@
-const multer = require('multer');
-const express = require('express');
-const { FileContext, EModelEndpoint } = require('librechat-data-provider');
-const {
-  initializeClient,
-  listAssistantsForAzure,
-  listAssistants,
-} = require('~/server/services/Endpoints/assistants');
-const { getStrategyFunctions } = require('~/server/services/Files/strategies');
-const { uploadImageBuffer } = require('~/server/services/Files/process');
-const { updateAssistant, getAssistants } = require('~/models/Assistant');
-const { deleteFileByFilter } = require('~/models/File');
-const { logger } = require('~/config');
-const actions = require('./actions');
-const tools = require('./tools');
-
-const upload = multer();
-const router = express.Router();
-
-/**
- * Assistant actions route.
- * @route GET|POST /assistants/actions
- */
-router.use('/actions', actions);
-
-/**
- * Create an assistant.
- * @route GET /assistants/tools
- * @returns {TPlugin[]} 200 - application/json
- */
-router.use('/tools', tools);
-
-/**
- * Create an assistant.
- * @route POST /assistants
- * @param {AssistantCreateParams} req.body - The assistant creation parameters.
- * @returns {Assistant} 201 - success response - application/json
- */
-router.post('/', async (req, res) => {
-  try {
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
-
-    const { tools = [], ...assistantData } = req.body;
-    assistantData.tools = tools
-      .map((tool) => {
-        if (typeof tool !== 'string') {
-          return tool;
-        }
-
-        return req.app.locals.availableTools[tool];
-      })
-      .filter((tool) => tool);
-
-    if (openai.locals?.azureOptions) {
-      assistantData.model = openai.locals.azureOptions.azureOpenAIApiDeploymentName;
-    }
-
-    const assistant = await openai.beta.assistants.create(assistantData);
-    logger.debug('/assistants/', assistant);
-    res.status(201).json(assistant);
-  } catch (error) {
-    logger.error('[/assistants] Error creating assistant', error);
-    res.status(500).json({ error: error.message });
-  }
-});
-
-/**
- * Retrieves an assistant.
- * @route GET /assistants/:id
- * @param {string} req.params.id - Assistant identifier.
- * @returns {Assistant} 200 - success response - application/json
- */
-router.get('/:id', async (req, res) => {
-  try {
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
-
-    const assistant_id = req.params.id;
-    const assistant = await openai.beta.assistants.retrieve(assistant_id);
-    res.json(assistant);
-  } catch (error) {
-    logger.error('[/assistants/:id] Error retrieving assistant', error);
-    res.status(500).json({ error: error.message });
-  }
-});
-
-/**
- * Modifies an assistant.
- * @route PATCH /assistants/:id
- * @param {string} req.params.id - Assistant identifier.
- * @param {AssistantUpdateParams} req.body - The assistant update parameters.
- * @returns {Assistant} 200 - success response - application/json
- */
-router.patch('/:id', async (req, res) => {
-  try {
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
-
-    const assistant_id = req.params.id;
-    const updateData = req.body;
-    updateData.tools = (updateData.tools ?? [])
-      .map((tool) => {
-        if (typeof tool !== 'string') {
-          return tool;
-        }
-
-        return req.app.locals.availableTools[tool];
-      })
-      .filter((tool) => tool);
-
-    if (openai.locals?.azureOptions && updateData.model) {
-      updateData.model = openai.locals.azureOptions.azureOpenAIApiDeploymentName;
-    }
-
-    const updatedAssistant = await openai.beta.assistants.update(assistant_id, updateData);
-    res.json(updatedAssistant);
-  } catch (error) {
-    logger.error('[/assistants/:id] Error updating assistant', error);
-    res.status(500).json({ error: error.message });
-  }
-});
-
-/**
- * Deletes an assistant.
- * @route DELETE /assistants/:id
- * @param {string} req.params.id - Assistant identifier.
- * @returns {Assistant} 200 - success response - application/json
- */
-router.delete('/:id', async (req, res) => {
-  try {
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
-
-    const assistant_id = req.params.id;
-    const deletionStatus = await openai.beta.assistants.del(assistant_id);
-    res.json(deletionStatus);
-  } catch (error) {
-    logger.error('[/assistants/:id] Error deleting assistant', error);
-    res.status(500).json({ error: 'Error deleting assistant' });
-  }
-});
-
-/**
- * Returns a list of assistants.
- * @route GET /assistants
- * @param {AssistantListParams} req.query - The assistant list parameters for pagination and sorting.
- * @returns {AssistantListResponse} 200 - success response - application/json
- */
-router.get('/', async (req, res) => {
-  try {
-    const { limit = 100, order = 'desc', after, before } = req.query;
-    const query = { limit, order, after, before };
-
-    const azureConfig = req.app.locals[EModelEndpoint.azureOpenAI];
-    /** @type {AssistantListResponse} */
-    let body;
-
-    if (azureConfig?.assistants) {
-      body = await listAssistantsForAzure({ req, res, azureConfig, query });
-    } else {
-      ({ body } = await listAssistants({ req, res, query }));
-    }
-
-    if (req.app.locals?.[EModelEndpoint.assistants]) {
-      /** @type {Partial<TAssistantEndpoint>} */
-      const assistantsConfig = req.app.locals[EModelEndpoint.assistants];
-      const { supportedIds, excludedIds } = assistantsConfig;
-      if (supportedIds?.length) {
-        body.data = body.data.filter((assistant) => supportedIds.includes(assistant.id));
-      } else if (excludedIds?.length) {
-        body.data = body.data.filter((assistant) => !excludedIds.includes(assistant.id));
-      }
-    }
-
-    res.json(body);
-  } catch (error) {
-    logger.error('[/assistants] Error listing assistants', error);
-    res.status(500).json({ message: 'Error listing assistants' });
-  }
-});
-
-/**
- * Returns a list of the user's assistant documents (metadata saved to database).
- * @route GET /assistants/documents
- * @returns {AssistantDocument[]} 200 - success response - application/json
- */
-router.get('/documents', async (req, res) => {
-  try {
-    res.json(await getAssistants({ user: req.user.id }));
-  } catch (error) {
-    logger.error('[/assistants/documents] Error listing assistant documents', error);
-    res.status(500).json({ error: error.message });
-  }
-});
-
-/**
- * Uploads and updates an avatar for a specific assistant.
- * @route POST /avatar/:assistant_id
- * @param {string} req.params.assistant_id - The ID of the assistant.
- * @param {Express.Multer.File} req.file - The avatar image file.
- * @param {string} [req.body.metadata] - Optional metadata for the assistant's avatar.
- * @returns {Object} 200 - success response - application/json
- */
-router.post('/avatar/:assistant_id', upload.single('file'), async (req, res) => {
-  try {
-    const { assistant_id } = req.params;
-    if (!assistant_id) {
-      return res.status(400).json({ message: 'Assistant ID is required' });
-    }
-
-    let { metadata: _metadata = '{}' } = req.body;
-    /** @type {{ openai: OpenAI }} */
-    const { openai } = await initializeClient({ req, res });
-
-    const image = await uploadImageBuffer({
-      req,
-      context: FileContext.avatar,
-      metadata: {
-        buffer: req.file.buffer,
-      },
-    });
-
-    try {
-      _metadata = JSON.parse(_metadata);
-    } catch (error) {
-      logger.error('[/avatar/:assistant_id] Error parsing metadata', error);
-      _metadata = {};
-    }
-
-    if (_metadata.avatar && _metadata.avatar_source) {
-      const { deleteFile } = getStrategyFunctions(_metadata.avatar_source);
-      try {
-        await deleteFile(req, { filepath: _metadata.avatar });
-        await deleteFileByFilter({ filepath: _metadata.avatar });
-      } catch (error) {
-        logger.error('[/avatar/:assistant_id] Error deleting old avatar', error);
-      }
-    }
-
-    const metadata = {
-      ..._metadata,
-      avatar: image.filepath,
-      avatar_source: req.app.locals.fileStrategy,
-    };
-
-    const promises = [];
-    promises.push(
-      updateAssistant(
-        { assistant_id },
-        {
-          avatar: {
-            filepath: image.filepath,
-            source: req.app.locals.fileStrategy,
-          },
-          user: req.user.id,
-        },
-      ),
-    );
-    promises.push(openai.beta.assistants.update(assistant_id, { metadata }));
-
-    const resolved = await Promise.all(promises);
-    res.status(201).json(resolved[1]);
-  } catch (error) {
-    const message = 'An error occurred while updating the Assistant Avatar';
-    logger.error(message, error);
-    res.status(500).json({ message });
-  }
-});
-
-module.exports = router;
--- a/api/server/routes/assistants/chat.js
+++ b/api/server/routes/assistants/chat.js
@@ -1,660 +0,0 @@
-const { v4 } = require('uuid');
-const express = require('express');
-const {
-  Constants,
-  RunStatus,
-  CacheKeys,
-  FileSources,
-  ContentTypes,
-  EModelEndpoint,
-  ViolationTypes,
-  ImageVisionTool,
-  AssistantStreamEvents,
-} = require('librechat-data-provider');
-const {
-  initThread,
-  recordUsage,
-  saveUserMessage,
-  checkMessageGaps,
-  addThreadMetadata,
-  saveAssistantMessage,
-} = require('~/server/services/Threads');
-const { sendResponse, sendMessage, sleep, isEnabled, countTokens } = require('~/server/utils');
-const { runAssistant, createOnTextProgress } = require('~/server/services/AssistantService');
-const { addTitle, initializeClient } = require('~/server/services/Endpoints/assistants');
-const { formatMessage, createVisionPrompt } = require('~/app/clients/prompts');
-const { createRun, StreamRunManager } = require('~/server/services/Runs');
-const { getTransactions } = require('~/models/Transaction');
-const checkBalance = require('~/models/checkBalance');
-const { getConvo } = require('~/models/Conversation');
-const getLogStores = require('~/cache/getLogStores');
-const { getModelMaxTokens } = require('~/utils');
-const { logger } = require('~/config');
-
-const router = express.Router();
-const {
-  setHeaders,
-  handleAbort,
-  validateModel,
-  handleAbortError,
-  // validateEndpoint,
-  buildEndpointOption,
-} = require('~/server/middleware');
-
-router.post('/abort', handleAbort());
-
-const ten_minutes = 1000 * 60 * 10;
-
-/**
- * @route POST /
- * @desc Chat with an assistant
- * @access Public
- * @param {express.Request} req - The request object, containing the request data.
- * @param {express.Response} res - The response object, used to send back a response.
- * @returns {void}
- */
-router.post('/', validateModel, buildEndpointOption, setHeaders, async (req, res) => {
-  logger.debug('[/assistants/chat/] req.body', req.body);
-
-  const {
-    text,
-    model,
-    files = [],
-    promptPrefix,
-    assistant_id,
-    instructions,
-    thread_id: _thread_id,
-    messageId: _messageId,
-    conversationId: convoId,
-    parentMessageId: _parentId = Constants.NO_PARENT,
-  } = req.body;
-
-  /** @type {Partial<TAssistantEndpoint>} */
-  const assistantsConfig = req.app.locals?.[EModelEndpoint.assistants];
-
-  if (assistantsConfig) {
-    const { supportedIds, excludedIds } = assistantsConfig;
-    const error = { message: 'Assistant not supported' };
-    if (supportedIds?.length && !supportedIds.includes(assistant_id)) {
-      return await handleAbortError(res, req, error, {
-        sender: 'System',
-        conversationId: convoId,
-        messageId: v4(),
-        parentMessageId: _messageId,
-        error,
-      });
-    } else if (excludedIds?.length && excludedIds.includes(assistant_id)) {
-      return await handleAbortError(res, req, error, {
-        sender: 'System',
-        conversationId: convoId,
-        messageId: v4(),
-        parentMessageId: _messageId,
-      });
-    }
-  }
-
-  /** @type {OpenAIClient} */
-  let openai;
-  /** @type {string|undefined} - the current thread id */
-  let thread_id = _thread_id;
-  /** @type {string|undefined} - the current run id */
-  let run_id;
-  /** @type {string|undefined} - the parent messageId */
-  let parentMessageId = _parentId;
-  /** @type {TMessage[]} */
-  let previousMessages = [];
-  /** @type {import('librechat-data-provider').TConversation | null} */
-  let conversation = null;
-  /** @type {string[]} */
-  let file_ids = [];
-  /** @type {Set<string>} */
-  let attachedFileIds = new Set();
-  /** @type {TMessage | null} */
-  let requestMessage = null;
-  /** @type {undefined | Promise<ChatCompletion>} */
-  let visionPromise;
-
-  const userMessageId = v4();
-  const responseMessageId = v4();
-
-  /** @type {string} - The conversation UUID - created if undefined */
-  const conversationId = convoId ?? v4();
-
-  const cache = getLogStores(CacheKeys.ABORT_KEYS);
-  const cacheKey = `${req.user.id}:${conversationId}`;
-
-  /** @type {Run | undefined} - The completed run, undefined if incomplete */
-  let completedRun;
-
-  const handleError = async (error) => {
-    const defaultErrorMessage =
-      'The Assistant run failed to initialize. Try sending a message in a new conversation.';
-    const messageData = {
-      thread_id,
-      assistant_id,
-      conversationId,
-      parentMessageId,
-      sender: 'System',
-      user: req.user.id,
-      shouldSaveMessage: false,
-      messageId: responseMessageId,
-      endpoint: EModelEndpoint.assistants,
-    };
-
-    if (error.message === 'Run cancelled') {
-      return res.end();
-    } else if (error.message === 'Request closed' && completedRun) {
-      return;
-    } else if (error.message === 'Request closed') {
-      logger.debug('[/assistants/chat/] Request aborted on close');
-    } else if (/Files.*are invalid/.test(error.message)) {
-      const errorMessage = `Files are invalid, or may not have uploaded yet.${
-        req.app.locals?.[EModelEndpoint.azureOpenAI].assistants
-          ? ' If using Azure OpenAI, files are only available in the region of the assistant\'s model at the time of upload.'
-          : ''
-      }`;
-      return sendResponse(res, messageData, errorMessage);
-    } else if (error?.message?.includes('string too long')) {
-      return sendResponse(
-        res,
-        messageData,
-        'Message too long. The Assistants API has a limit of 32,768 characters per message. Please shorten it and try again.',
-      );
-    } else if (error?.message?.includes(ViolationTypes.TOKEN_BALANCE)) {
-      return sendResponse(res, messageData, error.message);
-    } else {
-      logger.error('[/assistants/chat/]', error);
-    }
-
-    if (!openai || !thread_id || !run_id) {
-      return sendResponse(res, messageData, defaultErrorMessage);
-    }
-
-    await sleep(2000);
-
-    try {
-      const status = await cache.get(cacheKey);
-      if (status === 'cancelled') {
-        logger.debug('[/assistants/chat/] Run already cancelled');
-        return res.end();
-      }
-      await cache.delete(cacheKey);
-      const cancelledRun = await openai.beta.threads.runs.cancel(thread_id, run_id);
-      logger.debug('[/assistants/chat/] Cancelled run:', cancelledRun);
-    } catch (error) {
-      logger.error('[/assistants/chat/] Error cancelling run', error);
-    }
-
-    await sleep(2000);
-
-    let run;
-    try {
-      run = await openai.beta.threads.runs.retrieve(thread_id, run_id);
-      await recordUsage({
-        ...run.usage,
-        model: run.model,
-        user: req.user.id,
-        conversationId,
-      });
-    } catch (error) {
-      logger.error('[/assistants/chat/] Error fetching or processing run', error);
-    }
-
-    let finalEvent;
-    try {
-      const runMessages = await checkMessageGaps({
-        openai,
-        run_id,
-        thread_id,
-        conversationId,
-        latestMessageId: responseMessageId,
-      });
-
-      const errorContentPart = {
-        text: {
-          value:
-            error?.message ?? 'There was an error processing your request. Please try again later.',
-        },
-        type: ContentTypes.ERROR,
-      };
-
-      if (!Array.isArray(runMessages[runMessages.length - 1]?.content)) {
-        runMessages[runMessages.length - 1].content = [errorContentPart];
-      } else {
-        const contentParts = runMessages[runMessages.length - 1].content;
-        for (let i = 0; i < contentParts.length; i++) {
-          const currentPart = contentParts[i];
-          /** @type {CodeToolCall | RetrievalToolCall | FunctionToolCall | undefined} */
-          const toolCall = currentPart?.[ContentTypes.TOOL_CALL];
-          if (
-            toolCall &&
-            toolCall?.function &&
-            !(toolCall?.function?.output || toolCall?.function?.output?.length)
-          ) {
-            contentParts[i] = {
-              ...currentPart,
-              [ContentTypes.TOOL_CALL]: {
-                ...toolCall,
-                function: {
-                  ...toolCall.function,
-                  output: 'error processing tool',
-                },
-              },
-            };
-          }
-        }
-        runMessages[runMessages.length - 1].content.push(errorContentPart);
-      }
-
-      finalEvent = {
-        final: true,
-        conversation: await getConvo(req.user.id, conversationId),
-        runMessages,
-      };
-    } catch (error) {
-      logger.error('[/assistants/chat/] Error finalizing error process', error);
-      return sendResponse(res, messageData, 'The Assistant run failed');
-    }
-
-    return sendResponse(res, finalEvent);
-  };
-
-  try {
-    res.on('close', async () => {
-      if (!completedRun) {
-        await handleError(new Error('Request closed'));
-      }
-    });
-
-    if (convoId && !_thread_id) {
-      completedRun = true;
-      throw new Error('Missing thread_id for existing conversation');
-    }
-
-    if (!assistant_id) {
-      completedRun = true;
-      throw new Error('Missing assistant_id');
-    }
-
-    const checkBalanceBeforeRun = async () => {
-      if (!isEnabled(process.env.CHECK_BALANCE)) {
-        return;
-      }
-      const transactions =
-        (await getTransactions({
-          user: req.user.id,
-          context: 'message',
-          conversationId,
-        })) ?? [];
-
-      const totalPreviousTokens = Math.abs(
-        transactions.reduce((acc, curr) => acc + curr.rawAmount, 0),
-      );
-
-      // TODO: make promptBuffer a config option; buffer for titles, needs buffer for system instructions
-      const promptBuffer = parentMessageId === Constants.NO_PARENT && !_thread_id ? 200 : 0;
-      // 5 is added for labels
-      let promptTokens = (await countTokens(text + (promptPrefix ?? ''))) + 5;
-      promptTokens += totalPreviousTokens + promptBuffer;
-      // Count tokens up to the current context window
-      promptTokens = Math.min(promptTokens, getModelMaxTokens(model));
-
-      await checkBalance({
-        req,
-        res,
-        txData: {
-          model,
-          user: req.user.id,
-          tokenType: 'prompt',
-          amount: promptTokens,
-        },
-      });
-    };
-
-    /** @type {{ openai: OpenAIClient }} */
-    const { openai: _openai, client } = await initializeClient({
-      req,
-      res,
-      endpointOption: req.body.endpointOption,
-      initAppClient: true,
-    });
-
-    openai = _openai;
-
-    if (previousMessages.length) {
-      parentMessageId = previousMessages[previousMessages.length - 1].messageId;
-    }
-
-    let userMessage = {
-      role: 'user',
-      content: text,
-      metadata: {
-        messageId: userMessageId,
-      },
-    };
-
-    /** @type {CreateRunBody | undefined} */
-    const body = {
-      assistant_id,
-      model,
-    };
-
-    if (promptPrefix) {
-      body.additional_instructions = promptPrefix;
-    }
-
-    if (instructions) {
-      body.instructions = instructions;
-    }
-
-    const getRequestFileIds = async () => {
-      let thread_file_ids = [];
-      if (convoId) {
-        const convo = await getConvo(req.user.id, convoId);
-        if (convo && convo.file_ids) {
-          thread_file_ids = convo.file_ids;
-        }
-      }
-
-      file_ids = files.map(({ file_id }) => file_id);
-      if (file_ids.length || thread_file_ids.length) {
-        userMessage.file_ids = file_ids;
-        attachedFileIds = new Set([...file_ids, ...thread_file_ids]);
-      }
-    };
-
-    const addVisionPrompt = async () => {
-      if (!req.body.endpointOption.attachments) {
-        return;
-      }
-
-      /** @type {MongoFile[]} */
-      const attachments = await req.body.endpointOption.attachments;
-      if (
-        attachments &&
-        attachments.every((attachment) => attachment.source === FileSources.openai)
-      ) {
-        return;
-      }
-
-      const assistant = await openai.beta.assistants.retrieve(assistant_id);
-      const visionToolIndex = assistant.tools.findIndex(
-        (tool) => tool?.function && tool?.function?.name === ImageVisionTool.function.name,
-      );
-
-      if (visionToolIndex === -1) {
-        return;
-      }
-
-      let visionMessage = {
-        role: 'user',
-        content: '',
-      };
-      const files = await client.addImageURLs(visionMessage, attachments);
-      if (!visionMessage.image_urls?.length) {
-        return;
-      }
-
-      const imageCount = visionMessage.image_urls.length;
-      const plural = imageCount > 1;
-      visionMessage.content = createVisionPrompt(plural);
-      visionMessage = formatMessage({ message: visionMessage, endpoint: EModelEndpoint.openAI });
-
-      visionPromise = openai.chat.completions.create({
-        model: 'gpt-4-vision-preview',
-        messages: [visionMessage],
-        max_tokens: 4000,
-      });
-
-      const pluralized = plural ? 's' : '';
-      body.additional_instructions = `${
-        body.additional_instructions ? `${body.additional_instructions}\n` : ''
-      }The user has uploaded ${imageCount} image${pluralized}.
-      Use the \`${ImageVisionTool.function.name}\` tool to retrieve ${
-  plural ? '' : 'a '
-}detailed text description${pluralized} for ${plural ? 'each' : 'the'} image${pluralized}.`;
-
-      return files;
-    };
-
-    const initializeThread = async () => {
-      /** @type {[ undefined | MongoFile[]]}*/
-      const [processedFiles] = await Promise.all([addVisionPrompt(), getRequestFileIds()]);
-      // TODO: may allow multiple messages to be created beforehand in a future update
-      const initThreadBody = {
-        messages: [userMessage],
-        metadata: {
-          user: req.user.id,
-          conversationId,
-        },
-      };
-
-      if (processedFiles) {
-        for (const file of processedFiles) {
-          if (file.source !== FileSources.openai) {
-            attachedFileIds.delete(file.file_id);
-            const index = file_ids.indexOf(file.file_id);
-            if (index > -1) {
-              file_ids.splice(index, 1);
-            }
-          }
-        }
-
-        userMessage.file_ids = file_ids;
-      }
-
-      const result = await initThread({ openai, body: initThreadBody, thread_id });
-      thread_id = result.thread_id;
-
-      createOnTextProgress({
-        openai,
-        conversationId,
-        userMessageId,
-        messageId: responseMessageId,
-        thread_id,
-      });
-
-      requestMessage = {
-        user: req.user.id,
-        text,
-        messageId: userMessageId,
-        parentMessageId,
-        // TODO: make sure client sends correct format for `files`, use zod
-        files,
-        file_ids,
-        conversationId,
-        isCreatedByUser: true,
-        assistant_id,
-        thread_id,
-        model: assistant_id,
-      };
-
-      previousMessages.push(requestMessage);
-
-      /* asynchronous */
-      saveUserMessage({ ...requestMessage, model });
-
-      conversation = {
-        conversationId,
-        endpoint: EModelEndpoint.assistants,
-        promptPrefix: promptPrefix,
-        instructions: instructions,
-        assistant_id,
-        // model,
-      };
-
-      if (file_ids.length) {
-        conversation.file_ids = file_ids;
-      }
-    };
-
-    const promises = [initializeThread(), checkBalanceBeforeRun()];
-    await Promise.all(promises);
-
-    const sendInitialResponse = () => {
-      sendMessage(res, {
-        sync: true,
-        conversationId,
-        // messages: previousMessages,
-        requestMessage,
-        responseMessage: {
-          user: req.user.id,
-          messageId: openai.responseMessage.messageId,
-          parentMessageId: userMessageId,
-          conversationId,
-          assistant_id,
-          thread_id,
-          model: assistant_id,
-        },
-      });
-    };
-
-    /** @type {RunResponse | typeof StreamRunManager | undefined} */
-    let response;
-
-    const processRun = async (retry = false) => {
-      if (req.app.locals[EModelEndpoint.azureOpenAI]?.assistants) {
-        body.model = openai._options.model;
-        openai.attachedFileIds = attachedFileIds;
-        openai.visionPromise = visionPromise;
-        if (retry) {
-          response = await runAssistant({
-            openai,
-            thread_id,
-            run_id,
-            in_progress: openai.in_progress,
-          });
-          return;
-        }
-
-        /* NOTE:
-         * By default, a Run will use the model and tools configuration specified in Assistant object,
-         * but you can override most of these when creating the Run for added flexibility:
-         */
-        const run = await createRun({
-          openai,
-          thread_id,
-          body,
-        });
-
-        run_id = run.id;
-        await cache.set(cacheKey, `${thread_id}:${run_id}`, ten_minutes);
-        sendInitialResponse();
-
-        // todo: retry logic
-        response = await runAssistant({ openai, thread_id, run_id });
-        return;
-      }
-
-      /** @type {{[AssistantStreamEvents.ThreadRunCreated]: (event: ThreadRunCreated) => Promise<void>}} */
-      const handlers = {
-        [AssistantStreamEvents.ThreadRunCreated]: async (event) => {
-          await cache.set(cacheKey, `${thread_id}:${event.data.id}`, ten_minutes);
-          run_id = event.data.id;
-          sendInitialResponse();
-        },
-      };
-
-      const streamRunManager = new StreamRunManager({
-        req,
-        res,
-        openai,
-        handlers,
-        thread_id,
-        visionPromise,
-        attachedFileIds,
-        responseMessage: openai.responseMessage,
-        // streamOptions: {
-
-        // },
-      });
-
-      await streamRunManager.runAssistant({
-        thread_id,
-        body,
-      });
-
-      response = streamRunManager;
-    };
-
-    await processRun();
-    logger.debug('[/assistants/chat/] response', {
-      run: response.run,
-      steps: response.steps,
-    });
-
-    if (response.run.status === RunStatus.CANCELLED) {
-      logger.debug('[/assistants/chat/] Run cancelled, handled by `abortRun`');
-      return res.end();
-    }
-
-    if (response.run.status === RunStatus.IN_PROGRESS) {
-      processRun(true);
-    }
-
-    completedRun = response.run;
-
-    /** @type {ResponseMessage} */
-    const responseMessage = {
-      ...(response.responseMessage ?? response.finalMessage),
-      parentMessageId: userMessageId,
-      conversationId,
-      user: req.user.id,
-      assistant_id,
-      thread_id,
-      model: assistant_id,
-    };
-
-    sendMessage(res, {
-      final: true,
-      conversation,
-      requestMessage: {
-        parentMessageId,
-        thread_id,
-      },
-    });
-    res.end();
-
-    await saveAssistantMessage({ ...responseMessage, model });
-
-    if (parentMessageId === Constants.NO_PARENT && !_thread_id) {
-      addTitle(req, {
-        text,
-        responseText: response.text,
-        conversationId,
-        client,
-      });
-    }
-
-    await addThreadMetadata({
-      openai,
-      thread_id,
-      messageId: responseMessage.messageId,
-      messages: response.messages,
-    });
-
-    if (!response.run.usage) {
-      await sleep(3000);
-      completedRun = await openai.beta.threads.runs.retrieve(thread_id, response.run.id);
-      if (completedRun.usage) {
-        await recordUsage({
-          ...completedRun.usage,
-          user: req.user.id,
-          model: completedRun.model ?? model,
-          conversationId,
-        });
-      }
-    } else {
-      await recordUsage({
-        ...response.run.usage,
-        user: req.user.id,
-        model: response.run.model ?? model,
-        conversationId,
-      });
-    }
-  } catch (error) {
-    await handleError(error);
-  }
-});
-
-module.exports = router;
--- a/api/server/routes/assistants/chatV1.js
+++ b/api/server/routes/assistants/chatV1.js
@@ -0,0 +1,25 @@
+const express = require('express');
+
+const router = express.Router();
+const {
+  setHeaders,
+  handleAbort,
+  validateModel,
+  // validateEndpoint,
+  buildEndpointOption,
+} = require('~/server/middleware');
+const chatController = require('~/server/controllers/assistants/chatV1');
+
+router.post('/abort', handleAbort());
+
+/**
+ * @route POST /
+ * @desc Chat with an assistant
+ * @access Private (the parent router applies requireJwtAuth)
+ * @param {express.Request} req - The request object, containing the request data.
+ * @param {express.Response} res - The response object, used to send back a response.
+ * @returns {void}
+ */
+router.post('/', validateModel, buildEndpointOption, setHeaders, chatController);
+
+module.exports = router;
--- a/api/server/routes/assistants/chatV2.js
+++ b/api/server/routes/assistants/chatV2.js
@@ -0,0 +1,25 @@
+const express = require('express');
+
+const router = express.Router();
+const {
+  setHeaders,
+  handleAbort,
+  validateModel,
+  // validateEndpoint,
+  buildEndpointOption,
+} = require('~/server/middleware');
+const chatController = require('~/server/controllers/assistants/chatV2');
+
+router.post('/abort', handleAbort());
+
+/**
+ * @route POST /
+ * @desc Chat with an assistant
+ * @access Private (the parent router applies requireJwtAuth)
+ * @param {express.Request} req - The request object, containing the request data.
+ * @param {express.Response} res - The response object, used to send back a response.
+ * @returns {void}
+ */
+router.post('/', validateModel, buildEndpointOption, setHeaders, chatController);
+
+module.exports = router;
--- a/api/server/routes/assistants/index.js
+++ b/api/server/routes/assistants/index.js
@@ -7,16 +7,19 @@ const {
  // concurrentLimiter,
  // messageIpLimiter,
  // messageUserLimiter,
-} = require('../../middleware');
+} = require('~/server/middleware');

-const assistants = require('./assistants');
-const chat = require('./chat');
+const v1 = require('./v1');
+const chatV1 = require('./chatV1');
+const v2 = require('./v2');
+const chatV2 = require('./chatV2');

 router.use(requireJwtAuth);
 router.use(checkBan);
 router.use(uaParser);
-
-router.use('/', assistants);
-router.use('/chat', chat);
+router.use('/v1/', v1);
+router.use('/v1/chat', chatV1);
+router.use('/v2/', v2);
+router.use('/v2/chat', chatV2);

 module.exports = router;
--- a/api/server/routes/assistants/v1.js
+++ b/api/server/routes/assistants/v1.js
@@ -0,0 +1,81 @@
+const multer = require('multer');
+const express = require('express');
+const controllers = require('~/server/controllers/assistants/v1');
+const actions = require('./actions');
+const tools = require('./tools');
+
+const upload = multer();
+const router = express.Router();
+
+/**
+ * Assistant actions route.
+ * @route GET|POST /assistants/actions
+ */
+router.use('/actions', actions);
+
+/**
+ * Returns a list of available assistant tools.
+ * @route GET /assistants/tools
+ * @returns {TPlugin[]} 200 - application/json
+ */
+router.use('/tools', tools);
+
+/**
+ * Create an assistant.
+ * @route POST /assistants
+ * @param {AssistantCreateParams} req.body - The assistant creation parameters.
+ * @returns {Assistant} 201 - success response - application/json
+ */
+router.post('/', controllers.createAssistant);
+
+/**
+ * Retrieves an assistant.
+ * @route GET /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.get('/:id', controllers.retrieveAssistant);
+
+/**
+ * Modifies an assistant.
+ * @route PATCH /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @param {AssistantUpdateParams} req.body - The assistant update parameters.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.patch('/:id', controllers.patchAssistant);
+
+/**
+ * Deletes an assistant.
+ * @route DELETE /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.delete('/:id', controllers.deleteAssistant);
+
+/**
+ * Returns a list of assistants.
+ * @route GET /assistants
+ * @param {AssistantListParams} req.query - The assistant list parameters for pagination and sorting.
+ * @returns {AssistantListResponse} 200 - success response - application/json
+ */
+router.get('/', controllers.listAssistants);
+
+/**
+ * Returns a list of the user's assistant documents (metadata saved to database).
+ * @route GET /assistants/documents
+ * @returns {AssistantDocument[]} 200 - success response - application/json
+ */
+router.get('/documents', controllers.getAssistantDocuments);
+
+/**
+ * Uploads and updates an avatar for a specific assistant.
+ * @route POST /assistants/avatar/:assistant_id
+ * @param {string} req.params.assistant_id - The ID of the assistant.
+ * @param {Express.Multer.File} req.file - The avatar image file.
+ * @param {string} [req.body.metadata] - Optional metadata for the assistant's avatar.
+ * @returns {Object} 200 - success response - application/json
+ */
+router.post('/avatar/:assistant_id', upload.single('file'), controllers.uploadAssistantAvatar);
+
+module.exports = router;
--- a/api/server/routes/assistants/v2.js
+++ b/api/server/routes/assistants/v2.js
@@ -0,0 +1,82 @@
+const multer = require('multer');
+const express = require('express');
+const v1 = require('~/server/controllers/assistants/v1');
+const v2 = require('~/server/controllers/assistants/v2');
+const actions = require('./actions');
+const tools = require('./tools');
+
+const upload = multer();
+const router = express.Router();
+
+/**
+ * Assistant actions route.
+ * @route GET|POST /assistants/actions
+ */
+router.use('/actions', actions);
+
+/**
+ * Returns a list of available assistant tools.
+ * @route GET /assistants/tools
+ * @returns {TPlugin[]} 200 - application/json
+ */
+router.use('/tools', tools);
+
+/**
+ * Create an assistant.
+ * @route POST /assistants
+ * @param {AssistantCreateParams} req.body - The assistant creation parameters.
+ * @returns {Assistant} 201 - success response - application/json
+ */
+router.post('/', v2.createAssistant);
+
+/**
+ * Retrieves an assistant.
+ * @route GET /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.get('/:id', v1.retrieveAssistant);
+
+/**
+ * Modifies an assistant.
+ * @route PATCH /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @param {AssistantUpdateParams} req.body - The assistant update parameters.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.patch('/:id', v2.patchAssistant);
+
+/**
+ * Deletes an assistant.
+ * @route DELETE /assistants/:id
+ * @param {string} req.params.id - Assistant identifier.
+ * @returns {Assistant} 200 - success response - application/json
+ */
+router.delete('/:id', v1.deleteAssistant);
+
+/**
+ * Returns a list of assistants.
+ * @route GET /assistants
+ * @param {AssistantListParams} req.query - The assistant list parameters for pagination and sorting.
+ * @returns {AssistantListResponse} 200 - success response - application/json
+ */
+router.get('/', v1.listAssistants);
+
+/**
+ * Returns a list of the user's assistant documents (metadata saved to database).
+ * @route GET /assistants/documents
+ * @returns {AssistantDocument[]} 200 - success response - application/json
+ */
+router.get('/documents', v1.getAssistantDocuments);
+
+/**
+ * Uploads and updates an avatar for a specific assistant.
+ * @route POST /assistants/avatar/:assistant_id
+ * @param {string} req.params.assistant_id - The ID of the assistant.
+ * @param {Express.Multer.File} req.file - The avatar image file.
+ * @param {string} [req.body.metadata] - Optional metadata for the assistant's avatar.
+ * @returns {Object} 200 - success response - application/json
+ */
+router.post('/avatar/:assistant_id', upload.single('file'), v1.uploadAssistantAvatar);
+
+module.exports = router;