Mirror of https://github.com/danny-avila/LibreChat.git, synced 2025-09-21 21:50:49 +02:00

* 🧠 feat: User Memories for Conversational Context
chore: mcp typing, use `t`
WIP: first pass, Memories UI
- Added MemoryViewer component for displaying, editing, and deleting user memories.
- Integrated data provider hooks for fetching, updating, and deleting memories.
- Implemented pagination and loading states for better user experience.
- Created unit tests for MemoryViewer to ensure functionality and interaction with data provider.
- Updated translation files to include new UI strings related to memories.
chore: move mcp-related files to own directory
chore: rename librechat-mcp to librechat-api
WIP: first pass, memory processing and data schemas
chore: linting in fileSearch.js query description
chore: rename librechat-api to @librechat/api across the project
WIP: first pass, functional memory agent
feat: add MemoryEditDialog and MemoryViewer components for managing user memories
- Introduced MemoryEditDialog for editing memory entries with validation and toast notifications.
- Updated MemoryViewer to support editing and deleting memories, including pagination and loading states.
- Enhanced data provider to handle memory updates with optional original key for better management.
- Added new localization strings for memory-related UI elements.
feat: add memory permissions management
- Implemented memory permissions in the backend, allowing roles to have specific permissions for using, creating, updating, and reading memories.
- Added new API endpoints for updating memory permissions associated with roles.
- Created a new AdminSettings component for managing memory permissions in the frontend.
- Integrated memory permissions into the existing roles and permissions schemas.
- Updated the interface to include memory settings and permissions.
- Enhanced the MemoryViewer component to conditionally render admin settings based on user roles.
- Added localization support for memory permissions in the translation files.
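The permission model described above can be sketched roughly as follows; the permission names, the `permissions.memories` map on the role, and the middleware shape are illustrative assumptions, not the exact librechat-data-provider identifiers:
```js
// Illustrative permission flags for the memories feature; the real identifiers may differ.
const MemoryPermissions = {
  USE: 'USE',       // memories may be injected into conversations
  CREATE: 'CREATE', // user can add new entries
  UPDATE: 'UPDATE', // user can edit existing entries
  READ: 'READ',     // user can view entries in the memory viewer
};

// Express-style guard: assumes the user's role carries a `permissions.memories` map.
function requireMemoryPermission(permission) {
  return (req, res, next) => {
    const granted = req.user?.role?.permissions?.memories?.[permission] === true;
    if (!granted) {
      return res.status(403).json({ message: 'Insufficient memory permissions' });
    }
    next();
  };
}
```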
feat: move AdminSettings component to a new position in MemoryViewer for better visibility
refactor: clean up commented code in MemoryViewer component
feat: enhance MemoryViewer with search functionality and improve MemoryEditDialog integration
- Added a search input to filter memories in the MemoryViewer component.
- Refactored MemoryEditDialog to accept children for better customization.
- Updated MemoryViewer to utilize the new EditMemoryButton and DeleteMemoryButton components for editing and deleting memories.
- Improved localization support by adding new strings for memory filtering and deletion confirmation.
refactor: optimize memory filtering in MemoryViewer using match-sorter
- Replaced manual filtering logic with match-sorter for improved search functionality.
- Enhanced performance and readability of the filteredMemories computation.
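The match-sorter refactor above reduces to a call like the following; the `memories` array and its `key`/`value` fields are assumptions about the data shape:
```js
const { matchSorter } = require('match-sorter');

// Rank and filter memories against the search box input in one pass.
function filterMemories(memories, searchQuery) {
  if (!searchQuery) {
    return memories;
  }
  return matchSorter(memories, searchQuery, { keys: ['key', 'value'] });
}
```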
feat: enhance MemoryEditDialog with triggerRef and improve updateMemory mutation handling
feat: implement access control for MemoryEditDialog and MemoryViewer components
refactor: remove commented-out code and create runMemory method
refactor: rename role-based files
feat: implement access control for memory usage in AgentClient
refactor: simplify checkVisionRequest method in AgentClient by removing commented-out code
refactor: make `agents` dir in api package
refactor: migrate Azure utilities to TypeScript and consolidate imports
refactor: move sanitizeFilename function to a new file and update imports, add related tests
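For context, a sanitizeFilename along these lines keeps only a safe basename; the actual implementation in @librechat/api may differ:
```js
const path = require('path');

// Illustrative only: drop any directory components, then strip characters
// that are unsafe in file names.
function sanitizeFilename(filename) {
  const base = path.basename(filename);
  return base.replace(/[^a-zA-Z0-9._-]/g, '_');
}
```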
refactor: update LLM configuration types and consolidate Azure options in the API package
chore: linting
chore: import order
refactor: replace getLLMConfig with getOpenAIConfig and remove unused LLM configuration file
chore: update winston-daily-rotate-file to version 5.0.0 and add object-hash dependency in package-lock.json
refactor: move primeResources and optionalChainWithEmptyCheck functions to resources.ts and update imports
refactor: move createRun function to a new run.ts file and update related imports
fix: ensure safeAttachments is correctly typed as an array of TFile
chore: add node-fetch dependency and refactor fetch-related functions into packages/api/utils, removing the old generators file
refactor: enhance TEndpointOption type by using Pick to streamline endpoint fields and add new properties for model parameters and client options
feat: implement initializeOpenAIOptions function and update OpenAI types for enhanced configuration handling
fix: update types due to new TEndpointOption typing
fix: ensure safe access to group parameters in initializeOpenAIOptions function
fix: remove redundant API key validation comment in initializeOpenAIOptions function
refactor: rename initializeOpenAIOptions to initializeOpenAI for consistency and update related documentation
refactor: decouple req.body fields and tool loading from initializeAgentOptions
chore: linting
refactor: adjust column widths in MemoryViewer for improved layout
refactor: simplify agent initialization by creating loadAgent function and removing unused code
feat: add memory configuration loading and validation functions
WIP: first pass, memory processing with config
feat: implement memory callback and artifact handling
feat: implement memory artifacts display and processing updates
feat: add memory configuration options and schema validation for validKeys
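Since zod is already used in the codebase, the validKeys validation might be expressed roughly like this; the schema shape and helper name are assumptions:
```js
const { z } = require('zod');

// Hedged sketch: restrict which memory keys an agent is allowed to write.
const memoryConfigSchema = z.object({
  validKeys: z.array(z.string()).nonempty().optional(),
});

function assertValidMemoryKey(config, key) {
  const { validKeys } = memoryConfigSchema.parse(config);
  if (validKeys && !validKeys.includes(key)) {
    throw new Error(`Memory key "${key}" is not in validKeys`);
  }
}
```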
fix: update MemoryEditDialog and MemoryViewer to handle memory state and display improvements
refactor: remove padding from BookmarkTable and MemoryViewer headers for consistent styling
WIP: initial tokenLimit config and move Tokenizer to @librechat/api
refactor: update mongoMeili plugin methods to use callback for better error handling
feat: enhance memory management with token tracking and usage metrics
- Added token counting for memory entries to enforce limits and provide usage statistics.
- Updated memory retrieval and update routes to include total token usage and limit.
- Enhanced MemoryEditDialog and MemoryViewer components to display memory usage and token information.
- Refactored memory processing functions to handle token limits and provide feedback on memory capacity.
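The token accounting described above boils down to something like the following; the token counter is passed in as a parameter because the concrete Tokenizer API is not shown here:
```js
// Compute per-entry token counts plus overall usage against the configured limit.
function getMemoryUsage(memories, tokenLimit, countTokens) {
  const entries = memories.map((memory) => ({
    ...memory,
    tokenCount: countTokens(`${memory.key}: ${memory.value}`),
  }));
  const totalTokens = entries.reduce((sum, entry) => sum + entry.tokenCount, 0);
  return { entries, totalTokens, tokenLimit, exceeded: totalTokens > tokenLimit };
}
```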
feat: implement memory artifact handling in attachment handler
- Enhanced useAttachmentHandler to process memory artifacts when receiving updates.
- Introduced handleMemoryArtifact utility to manage memory updates and deletions.
- Updated query client to reflect changes in memory state based on incoming data.
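In spirit, applying a memory artifact to the client cache looks like this; the query key and the artifact shape are assumptions rather than the exact handleMemoryArtifact contract:
```js
// Merge a memory artifact (create/update/delete) into the cached memories list.
function applyMemoryArtifact(queryClient, artifact) {
  queryClient.setQueryData(['memories'], (previous = []) => {
    const withoutKey = previous.filter((memory) => memory.key !== artifact.key);
    if (artifact.type === 'delete') {
      return withoutKey;
    }
    return [...withoutKey, { key: artifact.key, value: artifact.value }];
  });
}
```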
refactor: restructure web search key extraction logic
- Moved the logic for extracting API keys from the webSearchAuth configuration into a dedicated function, getWebSearchKeys.
- Updated webSearchKeys to utilize the new function for improved clarity and maintainability.
- Prevents build-time errors (see the sketch below).
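The extraction helper can be as small as the following; the webSearchAuth shape (services mapping to named keys) is inferred from the commit message:
```js
// Collect every API key name declared in the webSearchAuth configuration.
// Doing this inside a function (rather than at module load) avoids build-time
// failures when the configuration is absent.
function getWebSearchKeys(webSearchAuth = {}) {
  const keys = [];
  for (const service of Object.values(webSearchAuth)) {
    keys.push(...Object.keys(service ?? {}));
  }
  return keys;
}
```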
feat: add personalization settings and memory preferences management
- Introduced a new Personalization tab in settings to manage user memory preferences.
- Implemented API endpoints and client-side logic for updating memory preferences.
- Enhanced user interface components to reflect personalization options and memory usage.
- Updated permissions to allow users to opt out of memory features.
- Added localization support for new settings and messages related to personalization.
style: personalization switch class
feat: add PersonalizationIcon and align Side Panel UI
feat: implement memory creation functionality
- Added a new API endpoint for creating memory entries, including validation for key and value.
- Introduced MemoryCreateDialog component for user interface to facilitate memory creation.
- Integrated token limit checks to prevent exceeding user memory capacity.
- Updated MemoryViewer to include a button for opening the memory creation dialog.
- Enhanced localization support for new messages related to memory creation.
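Validation for the creation path reduces to checks like these; the function name, argument shape, and error handling are illustrative:
```js
// Validate a new memory entry and enforce the user's remaining token budget.
function validateNewMemory({ key, value, currentTotalTokens, tokenLimit, countTokens }) {
  if (!key || !value) {
    throw new Error('Both `key` and `value` are required');
  }
  const tokenCount = countTokens(`${key}: ${value}`);
  if (currentTotalTokens + tokenCount > tokenLimit) {
    throw new Error('Creating this memory would exceed the memory token limit');
  }
  return { key, value, tokenCount };
}
```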
feat: enhance message processing with configurable window size
- Updated AgentClient to use a configurable message window size for processing messages.
- Introduced messageWindowSize option in memory configuration schema with a default value of 5.
- Improved logic for selecting messages to process based on the configured window size.
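The windowing itself is a small slice over the ordered message list; the default of 5 comes from the schema option above, while the function name and config shape are assumptions:
```js
// Take only the most recent N messages (oldest-to-newest ordering assumed).
function selectMessagesToProcess(messages, memoryConfig = {}) {
  const windowSize = memoryConfig.messageWindowSize ?? 5;
  return messages.slice(-windowSize);
}
```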
chore: update librechat-data-provider version to 0.7.87 in package.json and package-lock.json
chore: remove OpenAPIPlugin and its associated tests
chore: remove MIGRATION_README.md as migration tasks are completed
ci: fix backend tests
chore: remove unused translation keys from localization file
chore: remove problematic test file and unused var in AgentClient
chore: remove unused import and import directly for JSDoc
* feat: add api package build stage in Dockerfile for improved modularity
* docs: reorder build steps in contributing guide for clarity
228 lines
9.5 KiB
JavaScript
const { z } = require('zod');
const path = require('path');
const OpenAI = require('openai');
const fetch = require('node-fetch');
const { v4: uuidv4 } = require('uuid');
const { Tool } = require('@langchain/core/tools');
const { HttpsProxyAgent } = require('https-proxy-agent');
const { FileContext, ContentTypes } = require('librechat-data-provider');
const { getImageBasename } = require('~/server/services/Files/images');
const extractBaseURL = require('~/utils/extractBaseURL');
const logger = require('~/config/winston');

const displayMessage =
  "DALL-E displayed an image. All generated images are already plainly visible, so don't repeat the descriptions in detail. Do not list download links as they are available in the UI already. The user may download the images by clicking on them, but do not mention anything about downloading to the user.";
class DALLE3 extends Tool {
  constructor(fields = {}) {
    super();
    /** @type {boolean} Used to initialize the Tool without necessary variables. */
    this.override = fields.override ?? false;
    /** @type {boolean} Necessary for output to contain all image metadata. */
    this.returnMetadata = fields.returnMetadata ?? false;

    this.userId = fields.userId;
    this.fileStrategy = fields.fileStrategy;
    /** @type {boolean} */
    this.isAgent = fields.isAgent;
    if (fields.processFileURL) {
      /** @type {processFileURL} Necessary for output to contain all image metadata. */
      this.processFileURL = fields.processFileURL.bind(this);
    }

    let apiKey = fields.DALLE3_API_KEY ?? fields.DALLE_API_KEY ?? this.getApiKey();
    const config = { apiKey };
    if (process.env.DALLE_REVERSE_PROXY) {
      config.baseURL = extractBaseURL(process.env.DALLE_REVERSE_PROXY);
    }

    if (process.env.DALLE3_AZURE_API_VERSION && process.env.DALLE3_BASEURL) {
      config.baseURL = process.env.DALLE3_BASEURL;
      config.defaultQuery = { 'api-version': process.env.DALLE3_AZURE_API_VERSION };
      config.defaultHeaders = {
        'api-key': process.env.DALLE3_API_KEY,
        'Content-Type': 'application/json',
      };
      config.apiKey = process.env.DALLE3_API_KEY;
    }

    if (process.env.PROXY) {
      config.httpAgent = new HttpsProxyAgent(process.env.PROXY);
    }

    /** @type {OpenAI} */
    this.openai = new OpenAI(config);
    this.name = 'dalle';
    this.description = `Use DALLE to create images from text descriptions.
- It requires prompts to be in English, detailed, and to specify image type and human features for diversity.
- Create only one image, without repeating or listing descriptions outside the "prompts" field.
- Maintains the original intent of the description, with parameters for image style, quality, and size to tailor the output.`;
    this.description_for_model =
      process.env.DALLE3_SYSTEM_PROMPT ??
      `// Whenever a description of an image is given, generate prompts (following these rules), and use dalle to create the image. If the user does not ask for a specific number of images, default to creating 2 prompts to send to dalle that are written to be as diverse as possible. All prompts sent to dalle must abide by the following policies:
// 1. Prompts must be in English. Translate to English if needed.
// 2. One image per function call. Create only 1 image per request unless explicitly told to generate more than 1 image.
// 3. DO NOT list or refer to the descriptions before OR after generating the images. They should ONLY ever be written out ONCE, in the \`"prompts"\` field of the request. You do not need to ask for permission to generate, just do it!
// 4. Always mention the image type (photo, oil painting, watercolor painting, illustration, cartoon, drawing, vector, render, etc.) at the beginning of the caption. Unless the captions suggests otherwise, make one of the images a photo.
// 5. Diversify depictions of ALL images with people to always include always DESCENT and GENDER for EACH person using direct terms. Adjust only human descriptions.
// - EXPLICITLY specify these attributes, not abstractly reference them. The attributes should be specified in a minimal way and should directly describe their physical form.
// - Your choices should be grounded in reality. For example, all of a given OCCUPATION should not be the same gender or race. Additionally, focus on creating diverse, inclusive, and exploratory scenes via the properties you choose during rewrites. Make choices that may be insightful or unique sometimes.
// - Use "various" or "diverse" ONLY IF the description refers to groups of more than 3 people. Do not change the number of people requested in the original description.
// - Don't alter memes, fictional character origins, or unseen people. Maintain the original prompt's intent and prioritize quality.
// The prompt must intricately describe every part of the image in concrete, objective detail. THINK about what the end goal of the description is, and extrapolate that to what would make satisfying images.
// All descriptions sent to dalle should be a paragraph of text that is extremely descriptive and detailed. Each should be more than 3 sentences long.
// - The "vivid" style is HIGHLY preferred, but "natural" is also supported.`;
    this.schema = z.object({
      prompt: z
        .string()
        .max(4000)
        .describe(
          'A text description of the desired image, following the rules, up to 4000 characters.',
        ),
      style: z
        .enum(['vivid', 'natural'])
        .describe(
          'Must be one of `vivid` or `natural`. `vivid` generates hyper-real and dramatic images, `natural` produces more natural, less hyper-real looking images',
        ),
      quality: z
        .enum(['hd', 'standard'])
        .describe('The quality of the generated image. Only `hd` and `standard` are supported.'),
      size: z
        .enum(['1024x1024', '1792x1024', '1024x1792'])
        .describe(
          'The size of the requested image. Use 1024x1024 (square) as the default, 1792x1024 if the user requests a wide image, and 1024x1792 for full-body portraits. Always include this parameter in the request.',
        ),
    });
  }

  getApiKey() {
    const apiKey = process.env.DALLE3_API_KEY ?? process.env.DALLE_API_KEY ?? '';
    if (!apiKey && !this.override) {
      throw new Error('Missing DALLE_API_KEY environment variable.');
    }
    return apiKey;
  }

  replaceUnwantedChars(inputString) {
    return inputString
      .replace(/\r\n|\r|\n/g, ' ')
      .replace(/"/g, '')
      .trim();
  }

  wrapInMarkdown(imageUrl) {
    return `![generated image](${imageUrl})`;
  }

  returnValue(value) {
    if (this.isAgent === true && typeof value === 'string') {
      return [value, {}];
    } else if (this.isAgent === true && typeof value === 'object') {
      return [displayMessage, value];
    }

    return value;
  }

  async _call(data) {
    const { prompt, quality = 'standard', size = '1024x1024', style = 'vivid' } = data;
    if (!prompt) {
      throw new Error('Missing required field: prompt');
    }

    let resp;
    try {
      resp = await this.openai.images.generate({
        model: 'dall-e-3',
        quality,
        style,
        size,
        prompt: this.replaceUnwantedChars(prompt),
        n: 1,
      });
    } catch (error) {
      logger.error('[DALL-E-3] Problem generating the image:', error);
      return this
        .returnValue(`Something went wrong when trying to generate the image. The DALL-E API may be unavailable:
Error Message: ${error.message}`);
    }

    if (!resp) {
      return this.returnValue(
        'Something went wrong when trying to generate the image. The DALL-E API may be unavailable',
      );
    }

    const theImageUrl = resp.data[0].url;

    if (!theImageUrl) {
      return this.returnValue(
        'No image URL returned from OpenAI API. There may be a problem with the API or your configuration.',
      );
    }

    if (this.isAgent) {
      let fetchOptions = {};
      if (process.env.PROXY) {
        fetchOptions.agent = new HttpsProxyAgent(process.env.PROXY);
      }
      const imageResponse = await fetch(theImageUrl, fetchOptions);
      const arrayBuffer = await imageResponse.arrayBuffer();
      const base64 = Buffer.from(arrayBuffer).toString('base64');
      const content = [
        {
          type: ContentTypes.IMAGE_URL,
          image_url: {
            url: `data:image/png;base64,${base64}`,
          },
        },
      ];

      const response = [
        {
          type: ContentTypes.TEXT,
          text: displayMessage,
        },
      ];
      return [response, { content }];
    }

    const imageBasename = getImageBasename(theImageUrl);
    const imageExt = path.extname(imageBasename);

    const extension = imageExt.startsWith('.') ? imageExt.slice(1) : imageExt;
    const imageName = `img-${uuidv4()}.${extension}`;

    logger.debug('[DALL-E-3]', {
      imageName,
      imageBasename,
      imageExt,
      extension,
      theImageUrl,
      data: resp.data[0],
    });

    try {
      const result = await this.processFileURL({
        URL: theImageUrl,
        basePath: 'images',
        userId: this.userId,
        fileName: imageName,
        fileStrategy: this.fileStrategy,
        context: FileContext.image_generation,
      });

      if (this.returnMetadata) {
        this.result = result;
      } else {
        this.result = this.wrapInMarkdown(result.filepath);
      }
    } catch (error) {
      logger.error('Error while saving the image:', error);
      this.result = `Failed to save the image locally. ${error.message}`;
    }

    return this.returnValue(this.result);
  }
}

module.exports = DALLE3;