👁️ feat: Azure Mistral OCR Strategy (#7888)

* 👁️ feat: Add Azure Mistral OCR strategy and endpoint integration

This commit introduces a new OCR strategy named 'azure_mistral_ocr', allowing the use of a Mistral OCR endpoint deployed on Azure. The configuration, schemas, and file upload strategies have been updated to support this integration, enabling seamless OCR processing via Azure-hosted Mistral services.

* 🗑️ chore: Clean up .gitignore by removing commented-out uncommon directory name

* chore: remove unused vars

* refactor: Move createAxiosInstance to packages/api/utils and update imports

- Removed the createAxiosInstance function from the config module and relocated it to a new utils module for better organization.
- Updated import paths in relevant files to reflect the new location of createAxiosInstance.
- Added tests for createAxiosInstance to ensure proper functionality and proxy configuration handling.

* chore: move axios helpers to packages/api

- Added logAxiosError function to @librechat/api for centralized error logging.
- Updated imports across various files to use the new logAxiosError function.
- Removed the old axios.js utility file as it is no longer needed.

* chore: Update Jest moduleNameMapper for improved path resolution

- Added a new mapping for '~/' to resolve module paths in Jest configuration, enhancing import handling for the project.

* feat: Implement Mistral OCR API integration in TS

* chore: Update MistralOCR tests based on new imports

* fix: Enhance MistralOCR configuration handling and tests

- Introduced helper functions for resolving configuration values from environment variables or hardcoded settings.
- Updated the uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration resolution logic.
- Improved test cases to ensure correct behavior when mixing environment variables and hardcoded values.
- Mocked file upload and signed URL responses in tests to validate functionality without external dependencies.

* feat: Enhance MistralOCR functionality with improved configuration and error handling

- Introduced helper functions for loading authentication configuration and resolving values from environment variables.
- Updated uploadMistralOCR and uploadAzureMistralOCR functions to utilize the new configuration logic.
- Added utility functions for processing OCR results and creating error messages.
- Improved document type determination and result aggregation for better OCR processing.

* refactor: Reorganize OCR type imports in Mistral CRUD file

- Moved OCRResult, OCRResultPage, and OCRImage imports to a more logical grouping for better readability and maintainability.

* feat: Add file exports to API and create files index

* chore: Update OCR types for enhanced structure and clarity

- Redesigned OCRImage interface to include mandatory fields and improved naming conventions.
- Added PageDimensions interface for better representation of page metrics.
- Updated OCRResultPage to include dimensions and mandatory images array.
- Refined OCRResult to include document annotation and usage information.

* refactor: use TS counterpart of uploadOCR methods

* ci: Update MistralOCR tests to reflect new OCR result structure

* chore: Bump version of @librechat/api to 1.2.3 in package.json and package-lock.json

* chore: Update CONFIG_VERSION to 1.2.8

* chore: remove unused sendEvent function from config module (now imported from '@librechat/api')

* chore: remove MistralOCR service files and tests (now in '@librechat/api')

* ci: update logger import in ModelService tests to use @librechat/data-schemas

---------

Co-authored-by: arthurolivierfortin <arthurolivier.fortin@gmail.com>
This commit is contained in:
Danny Avila 2025-06-13 15:14:57 -04:00 committed by GitHub
parent 46ff008b07
commit 5f2d1c5dc9
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
37 changed files with 2245 additions and 1235 deletions

View file

@ -1,10 +1,11 @@
const { z } = require('zod');
const axios = require('axios');
const { Ollama } = require('ollama');
const { sleep } = require('@librechat/agents');
const { logAxiosError } = require('@librechat/api');
const { logger } = require('@librechat/data-schemas');
const { Constants } = require('librechat-data-provider');
const { deriveBaseURL, logAxiosError } = require('~/utils');
const { sleep } = require('~/server/utils');
const { logger } = require('~/config');
const { deriveBaseURL } = require('~/utils');
const ollamaPayloadSchema = z.object({
mirostat: z.number().optional(),
@ -67,7 +68,7 @@ class OllamaClient {
return models;
} catch (error) {
const logMessage =
'Failed to fetch models from Ollama API. If you are not using Ollama directly, and instead, through some aggregator or reverse proxy that handles fetching via OpenAI spec, ensure the name of the endpoint doesn\'t start with `ollama` (case-insensitive).';
"Failed to fetch models from Ollama API. If you are not using Ollama directly, and instead, through some aggregator or reverse proxy that handles fetching via OpenAI spec, ensure the name of the endpoint doesn't start with `ollama` (case-insensitive).";
logAxiosError({ message: logMessage, error });
return [];
}

View file

@ -4,12 +4,13 @@ const { v4 } = require('uuid');
const OpenAI = require('openai');
const FormData = require('form-data');
const { tool } = require('@langchain/core/tools');
const { logAxiosError } = require('@librechat/api');
const { logger } = require('@librechat/data-schemas');
const { HttpsProxyAgent } = require('https-proxy-agent');
const { ContentTypes, EImageOutputType } = require('librechat-data-provider');
const { getStrategyFunctions } = require('~/server/services/Files/strategies');
const { logAxiosError, extractBaseURL } = require('~/utils');
const { extractBaseURL } = require('~/utils');
const { getFiles } = require('~/models/File');
const { logger } = require('~/config');
/** Default descriptions for image generation tool */
const DEFAULT_IMAGE_GEN_DESCRIPTION = `